# Profile-Guided Optimization

`rustc` supports profile-guided optimization (PGO).
This chapter describes what PGO is, what it is good for, and how it can be used.

## What Is Profile-Guided Optimization?

The basic concept of PGO is to collect data about the typical execution of
a program (e.g. which branches it is likely to take) and then use this data
to inform optimizations such as inlining, machine-code layout,
register allocation, etc.

There are different ways of collecting data about a program's execution.
One is to run the program inside a profiler (such as `perf`) and another
is to create an instrumented binary, that is, a binary that has data
collection built into it, and run that.
The latter usually provides more accurate data and it is also what is
supported by `rustc`.

## Usage

Generating a PGO-optimized program involves following a workflow with four steps:

1. Compile the program with instrumentation enabled
   (e.g. `rustc -Cprofile-generate=/tmp/pgo-data main.rs`)
2. Run the instrumented program (e.g. `./main`) which generates a
   `default_<id>.profraw` file
3. Convert the `.profraw` file into a `.profdata` file using
   LLVM's `llvm-profdata` tool
4. Compile the program again, this time making use of the profiling data
   (for example `rustc -Cprofile-use=merged.profdata main.rs`)

An instrumented program will create one or more `.profraw` files, one for each
instrumented binary. E.g. an instrumented executable that loads two instrumented
dynamic libraries at runtime will generate three `.profraw` files. Running an
instrumented binary multiple times, on the other hand, will re-use the
respective `.profraw` files, updating them in place.
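
As a quick sanity check after running the instrumented binaries, you can count the raw profiles in the output directory and compare the number against how many instrumented binaries took part in the run. The snippet below is only a minimal sketch: `count_profraw` is a hypothetical helper name, and `/tmp/pgo-data` is simply the directory used in the examples of this chapter.

```bash
# Hypothetical helper: count the `.profraw` files directly inside a directory.
count_profraw() {
    find "$1" -maxdepth 1 -type f -name '*.profraw' | wc -l | tr -d ' '
}

# Only inspect the example directory if it exists.
if [ -d /tmp/pgo-data ]; then
    echo "raw profiles: $(count_profraw /tmp/pgo-data)"
fi
```

Because repeated runs update the existing files in place, the count should stay the same no matter how often the same set of binaries is executed.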

These `.profraw` files have to be post-processed before they can be fed back
into the compiler. This is done by the `llvm-profdata` tool. This tool
is most easily installed via

```bash
rustup component add llvm-tools-preview
```

Note that installing the `llvm-tools-preview` component won't add
`llvm-profdata` to the `PATH`. Rather, the tool can be found in:

```bash
~/.rustup/toolchains/<toolchain>/lib/rustlib/<target-triple>/bin/
```

Alternatively, an `llvm-profdata` coming with a recent LLVM or Clang
version usually works too.
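
Instead of filling in `<toolchain>` and `<target-triple>` by hand, the location can be derived from the active toolchain. The following is a sketch under the assumption that the tools were installed for the host triple; `profdata_path` is a hypothetical helper name, while `rustc --print sysroot` and the `host:` line of `rustc -vV` are existing `rustc` outputs.

```bash
# Hypothetical helper: build the expected llvm-profdata location from a
# toolchain sysroot ($1) and a target triple ($2).
profdata_path() {
    printf '%s/lib/rustlib/%s/bin/llvm-profdata\n' "$1" "$2"
}

# Only query the toolchain if rustc is available.
if command -v rustc >/dev/null 2>&1; then
    sysroot="$(rustc --print sysroot)"
    host="$(rustc -vV | sed -n 's/^host: //p')"
    tool="$(profdata_path "$sysroot" "$host")"
    if [ -x "$tool" ]; then
        echo "found: $tool"
    else
        echo "not found at $tool (is llvm-tools-preview installed?)"
    fi
fi
```

The resulting path can be used directly in place of a bare `llvm-profdata` in the commands of this chapter.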

The `llvm-profdata` tool merges multiple `.profraw` files into a single
`.profdata` file that can then be fed back into the compiler via
`-Cprofile-use`:

```bash
# STEP 1: Compile the binary with instrumentation
rustc -Cprofile-generate=/tmp/pgo-data -O ./main.rs

# STEP 2: Run the binary a few times, maybe with common sets of args.
#         Each run will create or update `.profraw` files in /tmp/pgo-data
./main mydata1.csv
./main mydata2.csv
./main mydata3.csv

# STEP 3: Merge and post-process all the `.profraw` files in /tmp/pgo-data
llvm-profdata merge -o ./merged.profdata /tmp/pgo-data

# STEP 4: Use the merged `.profdata` file during optimization. All `rustc`
#         flags have to be the same.
rustc -Cprofile-use=./merged.profdata -O ./main.rs
```

### A Complete Cargo Workflow

Using this feature with Cargo works very similarly to using it with `rustc`
directly. Again, we generate an instrumented binary, run it to produce data,
merge the data, and feed it back into the compiler. Some things of note:

- We use the `RUSTFLAGS` environment variable in order to pass the PGO compiler
  flags to the compilation of all crates in the program.

- We pass the `--target` flag to Cargo, which prevents the `RUSTFLAGS`
  arguments from being passed to Cargo build scripts. We don't want the build
  scripts to generate a bunch of `.profraw` files.

- We pass `--release` to Cargo because that's where PGO makes the most sense.
  In theory, PGO can also be done on debug builds but there is little reason
  to do so.

- It is recommended to use *absolute paths* for the argument of
  `-Cprofile-generate` and `-Cprofile-use`. Cargo can invoke `rustc` with
  varying working directories, meaning that `rustc` might not be able to find
  the supplied `.profdata` file if a relative path is used. With absolute
  paths this is not an issue.

- It is good practice to make sure that there is no left-over profiling data
  from previous compilation sessions. Just deleting the directory is a simple
  way of doing so (see `STEP 0` below).
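
To follow the advice about absolute paths when the profile directory lives inside the project, the path can be resolved once before it is put into `RUSTFLAGS`. This is a minimal sketch, not part of any tool: `abs_path` is a hypothetical helper, and `pgo-data` is an assumed directory name.

```bash
# Hypothetical helper: turn a possibly relative path into an absolute one.
# It does not normalize `.` or `..` components.
abs_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "$PWD" "$1" ;;
    esac
}

# Resolve the directory once, in the directory where the build is started.
PGO_DIR="$(abs_path pgo-data)"
echo "RUSTFLAGS=-Cprofile-generate=$PGO_DIR"
```

The printed value is what would be exported as `RUSTFLAGS` for the instrumented build, so every `rustc` invocation sees the same location regardless of its working directory.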

This is what the entire workflow looks like:

```bash
# STEP 0: Make sure there is no left-over profiling data from previous runs
rm -rf /tmp/pgo-data

# STEP 1: Build the instrumented binaries
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" \
    cargo build --release --target=x86_64-unknown-linux-gnu

# STEP 2: Run the instrumented binaries with some typical data
./target/x86_64-unknown-linux-gnu/release/myprogram mydata1.csv
./target/x86_64-unknown-linux-gnu/release/myprogram mydata2.csv
./target/x86_64-unknown-linux-gnu/release/myprogram mydata3.csv

# STEP 3: Merge the `.profraw` files into a `.profdata` file
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data

# STEP 4: Use the `.profdata` file for guiding optimizations
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" \
    cargo build --release --target=x86_64-unknown-linux-gnu
```

### Troubleshooting

- It is recommended to pass `-Cllvm-args=-pgo-warn-missing-function` during the
  `-Cprofile-use` phase. LLVM by default does not warn if it cannot find
  profiling data for a given function. Enabling this warning will make it
  easier to spot errors in your setup.

- There is a [known issue](https://github.com/rust-lang/cargo/issues/7416) in
  Cargo prior to version 1.39 that will prevent PGO from working correctly. Be
  sure to use Cargo 1.39 or newer when doing PGO.
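
The missing-function warning can be combined with `-Cprofile-use` in a single `RUSTFLAGS` value. The sketch below only assembles that string; `pgo_use_flags` is a hypothetical helper name, and the `.profdata` path is the one used in the Cargo example of this chapter.

```bash
# Hypothetical helper: print the flags for the profile-use build,
# including the LLVM warning for functions without profiling data.
pgo_use_flags() {
    printf '%s%s %s\n' '-Cprofile-use=' "$1" '-Cllvm-args=-pgo-warn-missing-function'
}

pgo_use_flags /tmp/pgo-data/merged.profdata

# Example invocation (not run here):
#   RUSTFLAGS="$(pgo_use_flags /tmp/pgo-data/merged.profdata)" \
#       cargo build --release --target=x86_64-unknown-linux-gnu
```

With the warning enabled, functions that were never reached during the profiling runs show up in the build output. That is often expected for rarely used code, but a warning for nearly every function suggests that the wrong `.profdata` file was supplied.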

## Further Reading

`rustc`'s PGO support relies entirely on LLVM's implementation of the feature
and is equivalent to what Clang offers via the `-fprofile-generate` /
`-fprofile-use` flags. The [Profile Guided Optimization][clang-pgo] section
in Clang's documentation is therefore an interesting read for anyone who wants
to use PGO with Rust.

[clang-pgo]: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization