Alex Crichton [Sun, 23 Jul 2017 15:14:38 +0000 (08:14 -0700)]
rustc: Implement ThinLTO
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
bors [Thu, 5 Oct 2017 10:50:11 +0000 (10:50 +0000)]
Auto merge of #45019 - aidanhs:aphs-no-trans-worker-panic, r=alexcrichton
Don't unwrap work item results as the panic trace is useless
Fixes #43402 now there's no multithreaded panic printouts
Also update a comment
--------
Likely regressed in #43506, where the code was changed to panic in worker threads on error.
Unwrapping gives zero extra information since the stack trace is so short, so we may as well just surface that there was an error and exit the thread properly. Because there are then no multithreaded printouts, I think it should mean the output of the test for #26199 is deterministic and not interleaved (thanks to @philipc https://github.com/rust-lang/rust/issues/43402#issuecomment-333835271 for a hint).
Sadly the output is now:
```
thread '<unnamed>' panicked at 'aborting due to worker thread panic', src/librustc_trans/back/write.rs:1643:20
note: Run with `RUST_BACKTRACE=1` for a backtrace.
error: could not write output to : No such file or directory
error: aborting due to previous error
```
but it's an improvement over the multi-panic situation before.
bors [Wed, 4 Oct 2017 19:14:41 +0000 (19:14 +0000)]
Auto merge of #44901 - michaelwoerister:on-demand-eval, r=nikomatsakis
incr.comp.: Switch to red/green change tracking, remove legacy system.
This PR finally switches incremental compilation to [red/green tracking](https://github.com/rust-lang/rust/issues/42293) and completely removes the legacy dependency graph implementation -- which includes a few quite costly passes that are simply not needed with the new system anymore.
There's still some documentation to be done and there's certainly still lots of optimizing and tuning ahead -- but the foundation for red/green is in place with this PR. This has been in the making for a long time `:)`
r? @nikomatsakis
cc @alexcrichton, @rust-lang/compiler
bors [Wed, 4 Oct 2017 15:14:15 +0000 (15:14 +0000)]
Auto merge of #44890 - nvzqz:str-box-transmute, r=alexcrichton
Remove mem::transmute used in Box<str> conversions
Given that https://github.com/rust-lang/rust/pull/44877 is failing, I decided to make a separate PR. This is done with the same motivation: to avoid `mem::transmute`-ing non `#[repr(C)]` types.
bors [Wed, 4 Oct 2017 03:28:24 +0000 (03:28 +0000)]
Auto merge of #44999 - pnkfelix:mir-borrowck-fix-assert-left-right, r=nikomatsakis
Overlapping borrows can point to different lvalues.
Overlapping borrows can point to different lvalues.
There's always a basis for the overlap, so instead of removing the assert entirely, I instead pass in the prefix that we found and check that it actually is a prefix of both lvalues.
bors [Wed, 4 Oct 2017 00:57:30 +0000 (00:57 +0000)]
Auto merge of #44979 - hinaria:master, r=dtolnay
make `backtrace = false` compile for windows targets.
when building for windows with `backtrace = false`, `libstd` fails to compile because some modules that use items from `sys_common::backtrace::*` are still included, even though those modules aren't used or referenced by anything.
`sys_common::backtrace` doesn't exist when the backtrace feature is turned off.
--
i've also added `#[cfg(feature = "backtrace")]` to various items that exist exclusively to support `mod backtrace` since the compilation would fail since they would be unused in a configuration with backtraces turned off.
bors [Tue, 3 Oct 2017 22:19:40 +0000 (22:19 +0000)]
Auto merge of #44895 - stephaneyfx:master, r=dtolnay
Made `fs::copy` return the length of the main stream
On Windows with the NTFS filesystem, `fs::copy` would return the sum of the
lengths of all streams, which can be different from the length reported by
`metadata` and thus confusing for users unaware of this NTFS peculiarity.
This makes `fs::copy` return the same length `metadata` reports which is the
value it used to return before PR #26751. Note that alternate streams are still
copied; their length is just not included in the returned value.
This change relies on the assumption that the stream with index 1 is always the
main stream in the `CopyFileEx` callback. I could not find any official
document confirming this but empirical testing has shown this to be true,
regardless of whether the alternate stream is created before or after the main
stream.
bors [Tue, 3 Oct 2017 19:38:33 +0000 (19:38 +0000)]
Auto merge of #44949 - QuietMisdreavus:rustdoctest-dirs, r=nikomatsakis
let htmldocck.py check for directories
Since i messed this up during https://github.com/rust-lang/rust/pull/44613, i wanted to codify this into the rustdoc tests to make sure that doesn't happen again.
bors [Tue, 3 Oct 2017 12:18:13 +0000 (12:18 +0000)]
Auto merge of #44896 - qmx:move-resolve-to-librustc, r=arielb1
Move monomorphize::resolve() to librustc
this moves `monomorphize::resolve(..)` to librustc, and re-enables inlining for some trait methods, fixing #44389
@nikomatsakis I've kept the calls to the new `ty::Instance::resolve(....)` always `.unwrap()`-ing for the moment, how/do you want to add more debugging info via `.unwrap_or()` or something like this?
we still have some related `resolve_*` functions on monomorphize, but I wasn't sure moving them was into the scope for this PR too.
Overlapping borrows can point to different lvalues.
There's always a basis for the overlap, so instead of removing the
assert entirely, I instead pass in the prefix that we found and check
that it actually is a prefix of both lvalues.
bors [Sun, 1 Oct 2017 17:32:34 +0000 (17:32 +0000)]
Auto merge of #44897 - Havvy:doc-size_of, r=steveklabnik
Docs for size_of::<#[repr(C)]> items.
Most of this info comes from camlorn's blog post on optimizing struct layout and the Rustonomicon.
I don't really like my wording in the first paragraph.
I also cannot find a definition of what `#[repr(C)]` does for enums that have variants with fields. They're allowed, unlike `#[repr(C)] enum`s with no variants.
bors [Sun, 1 Oct 2017 09:14:53 +0000 (09:14 +0000)]
Auto merge of #44906 - dkl:main-signature, r=nagisa
Fix native main() signature on 64bit
Hello,
in LLVM-IR produced by rustc on x86_64-linux-gnu, the native main() function had incorrect types for the function result and argc parameter: i64, while it should be i32 (really c_int). See also #20064, #29633.
So I've attempted a fix here. I tested it by checking the LLVM IR produced with --target x86_64-unknown-linux-gnu and i686-unknown-linux-gnu. Also I tried running the tests (`./x.py test`), however I'm getting two failures with and without the patch, which I'm guessing is unrelated.
Auto merge of #44944 - dbrgn:trace-macros-docs, r=QuietMisdreavus
Docs: Add trace_macros! to unstable book
As TIL'd at Rustfest :)
Note: This is unfortunately untested, since I'm on my laptop battery, and compiling LLVM would probably eat at least 50% of it on my dual core CPU. (Is there a way to build docs without compiling LLVM?)
Auto merge of #44783 - alexcrichton:lto-codegen-units, r=michaelwoerister
rustc: Enable LTO and multiple codegen units
This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little
Nikolai Vazquez [Sat, 30 Sep 2017 14:01:41 +0000 (10:01 -0400)]
Cast inner type in OsStr::bytes
The innermost type is not [u8] on all platforms but is assumed to have
the same memory layout as [u8] since this conversion was done via
mem::transmute before.
Alex Crichton [Sun, 23 Jul 2017 15:14:38 +0000 (08:14 -0700)]
rustc: Enable LTO and multiple codegen units
This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little
Mark Simulacrum [Fri, 29 Sep 2017 23:59:04 +0000 (17:59 -0600)]
Rollup merge of #44900 - Havvy:normalize-lang-attribute-spacing, r=sfackler
Normalize spaces in lang attributes.
So, like, I grepped for all `lang` attributes for *reasons* and I noticed that they all share the same spacing of `#[lang = "item_name"]` except these five instances. So I decided to fix that. So enjoy this PR of exactly ten spaces.
Mark Simulacrum [Fri, 29 Sep 2017 23:59:00 +0000 (17:59 -0600)]
Rollup merge of #44840 - steveklabnik:fix-wording, r=BurntSushi
Improve wording for StepBy
No other iterator makes the distinction between an iterator and an iterator adapter
in its summary line, so change it to be consistent with all other adapters.