Dylan DPC [Wed, 12 Feb 2020 13:21:08 +0000 (14:21 +0100)]
Rollup merge of #68994 - Keruspe:sanitizers-conflict, r=Mark-Simulacrum
rustbuild: include channel in sanitizers installed name
Allows parallel install of different rust channels.
I'm not sure if the channel is the right thing to use there, but currently both beta and nightly try to install e.g. `/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_rt.asan.a` when before (and in current stable) it used to be `/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_asan-45a4390180e83d28.rlib` which contained a hash, making it unique.
With this patch, `/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc-nightly_rt.asan.a` gets installed
Dylan DPC [Wed, 12 Feb 2020 13:21:06 +0000 (14:21 +0100)]
Rollup merge of #68914 - nnethercote:speed-up-SipHasher128, r=michaelwoerister
Speed up `SipHasher128`.
The current code in `SipHasher128::short_write` is inefficient. It uses
`u8to64_le` (which is complex and slow) to extract just the right number of
bytes of the input into a u64 and pad the result with zeroes. It then
left-shifts that value in order to bitwise-OR it with `self.tail`.
For example, imagine we have a u32 input `0xIIHH_GGFF` and only need three bytes
to fill up `self.tail`. The current code uses `u8to64_le` to construct
`0x0000_0000_00HH_GGFF`, which is just `0xIIHH_GGFF` with the `0xII` removed and
zero-extended to a u64. The code then left-shifts that value by five bytes --
discarding the `0x00` byte that replaced the `0xII` byte! -- to give
`0xHHGG_FF00_0000_0000`. It then then ORs that value with `self.tail`.
There's a much simpler way to do it: zero-extend to u64 first, then left shift.
E.g. `0xIIHH_GGFF` is zero-extended to `0x0000_0000_IIHH_GGFF`, and then
left-shifted to `0xHHGG_FF00_0000_0000`. We don't have to take time to exclude
the unneeded `0xII` byte, because it just gets shifted out anyway! It also avoids
multiple occurrences of `unsafe`.
There's a similar story with the setting of `self.tail` at the method's end.
The current code uses `u8to64_le` to extract the remaining part of the input,
but the same effect can be achieved more quickly with a right shift on the
zero-extended input.
This commit changes `SipHasher128` to use the simpler shift-based approach. The
code is also smaller, which means that `short_write` is now inlined where
previously it wasn't, which makes things faster again. This gives big
speed-ups for all incremental builds, especially "baseline" incremental
builds.
Dylan DPC [Wed, 12 Feb 2020 13:21:05 +0000 (14:21 +0100)]
Rollup merge of #67585 - ranma42:fix/char-is-ascii-codegen, r=Amanieu
Improve `char::is_ascii_*` codegen
This PR is an attempt to fix https://github.com/rust-lang/rust/issues/65127
A couple of warnings:
1. the generated code might be further improved (in LLVM and/or MIR) by emitting better comparison sequences; in particular, this would improve the performance of "complex" checks such as those in `is_ascii_punctuation`
2. the second commit is currently marked "DO NOT MERGE", because it regresses SIMD on `u8` slices; this could likely be fixed by improving the computation/usage of demanded bits in LLVM
An alternative approach to remove the code duplication might be the use of macros, but currently most of the duplication is actually in the doc comments, so maybe just keeping the redundancy could be ok
Yuki Okushi [Wed, 12 Feb 2020 09:55:48 +0000 (18:55 +0900)]
Rollup merge of #69058 - TimDiekmann:box, r=Amanieu
Preparation for allocator aware `Box`
This cleans up the `Box` code a bit, and uses `Box::from_raw(ptr)` instead of `Box(ptr)`.
Additionally, `box_free` and `exchange_malloc` now uses the `AllocRef` trait and a comment was added on how `box_free` is tied to `Box`.
This a preparation for an upcoming PR, which makes `Box` aware of an allocator.
Yuki Okushi [Wed, 12 Feb 2020 09:55:46 +0000 (18:55 +0900)]
Rollup merge of #69027 - TimDiekmann:zeroed-alloc, r=Amanieu
Add missing `_zeroed` varants to `AllocRef`
The majority of the allocator wg has decided to add the missing `_zeroed` variants to `AllocRef`:
> these should be added since they can be efficiently implemented with the `mremap` system call on Linux. `mremap` allows you to move/grow/shrink a memory mapping, and any new pages added for growth are guaranteed to be zeroed.
>
> If `AllocRef` does not have these methods then the user will have to manually write zeroes to the added memory since the API makes no guarantees on their contents.
For the full discussion please see https://github.com/rust-lang/wg-allocators/issues/14.
This PR provides default implementations for `realloc_zeroed`, `alloc_excess_zeroed`, `realloc_excess_zeroed`, and `grow_in_place_zeroed`.
Yuki Okushi [Wed, 12 Feb 2020 09:55:44 +0000 (18:55 +0900)]
Rollup merge of #69026 - TimDiekmann:common-usage, r=Amanieu
Remove common usage pattern from `AllocRef`
This removes the common usage patterns from `AllocRef`:
- `alloc_one`
- `dealloc_one`
- `alloc_array`
- `realloc_array`
- `dealloc_array`
Actually, they add nothing to `AllocRef` except a [convenience wrapper around `Layout` and other methods in this trait](https://doc.rust-lang.org/1.41.0/src/core/alloc.rs.html#1076-1240) but have a major flaw: The documentation of `AllocRefs` notes, that
> some higher-level allocation methods (`alloc_one`, `alloc_array`) are well-defined on zero-sized types and can optionally support them: it is left up to the implementor whether to return `Err`, or to return `Ok` with some pointer.
With the current API, `GlobalAlloc` does not have those methods, so they cannot be overridden for `liballoc::Global`, which means that even if the global allocator would support zero-sized allocations, `alloc_one`, `alloc_array`, and `realloc_array` for `liballoc::Global` will error, while calling `alloc` with a zeroed-size `Layout` could succeed. Even worse: allocating with `alloc` and deallocating with `dealloc_{one,array}` could end up with not calling `dealloc` at all!
For the full discussion please see https://github.com/rust-lang/wg-allocators/issues/18
Yuki Okushi [Wed, 12 Feb 2020 09:55:41 +0000 (18:55 +0900)]
Rollup merge of #68947 - chrissimpkins:python-fmt, r=alexcrichton
Python script PEP8 style guide space formatting and minor Python source cleanup
This PR includes the following changes in the Python sources based on a flake8 3.7.9 (mccabe: 0.6.1, pycodestyle: 2.5.0, pyflakes: 2.1.1) CPython 3.7.6 on Darwin lint:
- PEP8 style guide spacing updates *without* line length changes
- removal of unused local variable assignments in context managers and exception handling
- removal of unused Python import statements
- removal of unnecessary semicolons
Yuki Okushi [Wed, 12 Feb 2020 09:55:36 +0000 (18:55 +0900)]
Rollup merge of #68487 - 0dvictor:nolink, r=tmandry
[experiment] Support linking from a .rlink file
Flag `-Z no-link` was previously introduced, which allows creating an `.rlink` file to perform compilation without linking. This change enables linking from an `.rlink` file.
This makes it faster and also changes it to a safe function. (Thanks to
Michael Woerister for the suggestion.) `load_int_le!` is also no longer
necessary.
Dylan DPC [Tue, 11 Feb 2020 15:37:04 +0000 (16:37 +0100)]
Rollup merge of #69047 - ehuss:rustfmt-vendor, r=Centril
Don't rustfmt check the vendor directory.
I need to be able to run `x.py tidy` to do license checks (which requires vendored dependencies). However, when vendoring is enabled, it wants to rustfmt check the entire vendor directory, which doesn't work.
Dylan DPC [Tue, 11 Feb 2020 15:37:02 +0000 (16:37 +0100)]
Rollup merge of #69044 - jonas-schievink:dont-run-coherence-twice, r=davidtwco
Don't run coherence twice for future-compat lints
This fixes the regression introduced by https://github.com/rust-lang/rust/pull/65232 (which I mentioned in https://github.com/rust-lang/rust/pull/65232#issuecomment-583739037).
Old algorithm:
* Run coherence with all future-incompatible checks off, reporting errors on any overlap.
* If there's no overlap (common case), run it *again*, with the future-incompatible checks on. Report warnings for any overlap found.
New algorithm:
* Run coherence with all additional future-incompatible checks *on*, which means that we'll find *all* potentially overlapping impls immediately.
* If this found overlap, run coherence again, with the future-incompatible checks off. If that *still* gives an error, we report it. If not, it ought to be a warning.
This reduces time spent in coherence checking for the nrf52810-pac by roughly 50% relative to current master.
Dylan DPC [Tue, 11 Feb 2020 15:36:54 +0000 (16:36 +0100)]
Rollup merge of #66498 - bjorn3:less_feature_flags, r=Dylan-DPC
Remove unused feature gates
I think many of the remaining unstable things can be easily be replaced with stable things. I have kept the `#![feature(nll)]` even though it is only necessary in `libstd`, to make regressions of it harder.
bors [Tue, 11 Feb 2020 14:45:00 +0000 (14:45 +0000)]
Auto merge of #68725 - jumbatm:invert-control-in-struct_lint_level, r=Centril
Invert control in struct_lint_level.
Closes #67927
Changes the `struct_lint*` methods to take a `decorate` function instead of a message string. This decorate function is also responsible for eventually stashing, emitting or cancelling the diagnostic. If the lint was allowed after all, the decorate function is not run at all, saving us from spending time formatting messages (and potentially other expensive work) for lints that don't end up being emitted.
jumbatm [Sun, 2 Feb 2020 09:41:14 +0000 (19:41 +1000)]
Make cx.span_lint methods lazy
- Make report_unsafe take decorate function
- Remove span_lint, replacing calls with struct_span_lint, as caller is
now responsible for emitting.
- Remove lookup_and_emit, replacing with just lookup which takes a
decorate function.
- Remove span_lint_note, span_lint_help. These methods aren't easily
made lazy as standalone methods, private, and unused. If this
functionality is needed, to be lazy, they can easily be made into
Fn(&mut DiagnosticBuilder) that are meant to be called _within_ the
decorate function.
- Rename lookup_and_emit_with_diagnostics to lookup_with_diagnostics to
better reflect the fact that it doesn't emit for you.
Andrea Canciani [Tue, 24 Dec 2019 11:47:51 +0000 (12:47 +0100)]
Improve `char::is_ascii_*` code
These methods explicitly check if a char is in a specific ASCII range,
therefore the `is_ascii()` check is not needed, but LLVM seems to be
unable to remove it.
WARNING: this change improves the performance on ASCII `char`s, but
complex checks such as `is_ascii_punctuation` become slower on
non-ASCII `char`s.
Victor Ding [Thu, 23 Jan 2020 10:48:48 +0000 (21:48 +1100)]
Support linking from a .rlink file
Flag `-Z no-link` was previously introduced, which allows creating
an `.rlink` file to perform compilation without linking.
This change enables linking from an `.rlink` file.
bors [Tue, 11 Feb 2020 07:36:59 +0000 (07:36 +0000)]
Auto merge of #68961 - eddyb:dbg-stack-dunk, r=nagisa
rustc_codegen_ssa: only "spill" SSA-like values to the stack for debuginfo.
This is an implementation of the idea described in https://github.com/rust-lang/rust/issues/68817#issuecomment-583719182.
In short, instead of debuginfo forcing otherwise-SSA-like MIR locals into `alloca`s, and requiring a `load` for each use (or two, for scalar pairs), the `alloca` is now *only* used for attaching debuginfo with `llvm.dbg.declare`: the `OperandRef` is stored to the `alloca`, but *never loaded* from it.
Outside of `debug_introduce_local`, nothing cares about the debuginfo-only `alloca`, and instead works with `OperandRef` the same as MIR locals without debuginfo before this PR.
This should have some of the benefits of `llvm.dbg.value`, while working today.
Changes:
````
Update jobserver.
Update tar.
Emit report on error with Ztimings.
Do not run `formats_source` if `rustfmt` is not available
Fix rebuild_sub_package_then_while_package on HFS.
Remove likely brittle test.
Install rustfmt for testing in CI
Fix build-std collisions.
Fix BuildScriptOutput when a build script is run multiple times.
Fix required-features using renamed dependencies.
Fix using global options before an alias.
Update changelog for 1.42.
Bump to 0.44.0.
Log rustfmt output if it fails; also do not check that rustfmt exists
Update pretty_env_logger requirement from 0.3 to 0.4
Swap std::sync::mpsc channel with crossbeam_channel
Fix tests on Linux/MacOS
Fix typo.
Add tests
Deduplicate warnings about missing rustfmt
Log entry 2: first implementation
Refactor code
Stabilize config-profile.
Log entry 1
Format code
Remove tempdir after install
Keep existing package with git install
Use non-ephemeral workspace
Test that git install reads virtual manifest
Fix failing test
Search for root manifest with ephemeral workspaces
Support out-dir in build section of Cargo configuration file
````
This repr-hint makes a struct/enum hide any niche within from its
surrounding type-construction context.
It is meant (at least initially) as an implementation detail for
resolving issue 68303. We will not stabilize the repr-hint unless
someone finds motivation for doing so.
(So, declaration of `no_niche` feature lives in section of file
where other internal implementation details are grouped, and
deliberately leaves out the tracking issue number.)
incorporated review feedback, and fixed post-rebase.
Dylan DPC [Mon, 10 Feb 2020 16:29:03 +0000 (17:29 +0100)]
Rollup merge of #69014 - dwrensha:fix-68890, r=Centril
change an instance of span_bug() to struct_span_err() to avoid ICE
After #67148, the `span_bug()` in `parse_ty_tuple_or_parens()` is reachable because `parse_paren_comma_seq()` can return an `Ok()` even in cases where it encounters an error.
This pull request prevents an ICE in such cases by replacing the `span_bug()` with `struct_span_error()`.
Dylan DPC [Mon, 10 Feb 2020 16:28:57 +0000 (17:28 +0100)]
Rollup merge of #68932 - michaelwoerister:self-profile-generic-activity-args, r=wesleywiser
self-profile: Support arguments for generic_activities.
This PR adds support for recording arguments of "generic activities". The most notable use case is LLVM module names, which should be very interesting for `crox` profiles. In the future it might be interesting to add more fine-grained events for pre-query passes like macro expansion.
I tried to judiciously de-duplicate existing self-profile events with `extra_verbose_generic_activity`, now that the latter also generates self-profile events.
Dylan DPC [Mon, 10 Feb 2020 16:28:56 +0000 (17:28 +0100)]
Rollup merge of #68908 - jwhite927:E0637, r=Dylan-DPC
Add long error code explanation message for E0637
Reference issue [#61137](https://github.com/rust-lang/rust/issues/61137)
To incorporate a long error description for E0637, I have made the necessary modification to error_codes.rs and added error_codes/E0637.md, and blessed the relevant .stderror files. ~~, however when I build rustc stage 1, I am unable to make `$ rustc --explain E0637` work even though rustc appears to be able to call up the long error explanations for other errors. I wanted to guarantee this would work before moving on the blessing the various ui tests that have been affected. @GuillaumeGomez Do you know the most likely reason(s) why this would be the case?~~
Update: `$ rustc --explain E0637` works now.
bors [Mon, 10 Feb 2020 15:24:59 +0000 (15:24 +0000)]
Auto merge of #68835 - CAD97:sound-range-inclusive, r=Mark-Simulacrum
Remove problematic specialization from RangeInclusive
Fixes #67194 using the approach [outlined by Mark-Simulacrum](https://github.com/rust-lang/rust/issues/67194#issuecomment-581669549).
> I believe the property we want is that if `PartialEq(&self, &other) == true`, then `self.next() == other.next()`. It is true that this is satisfied by removing the specialization and always doing `is_empty.unwrap_or_default()`; the "wrong" behavior there arises from calling `next()` having an effect on initially empty ranges, as we should be in `is_empty = true` but are not (yet) there. It might be possible to detect that the current state is always empty (i.e., `start > end`) and then not fill in the empty slot. I think this might solve the problem without regressing tests; however, this could have performance implications.
> That approach essentially states that we only use the `is_empty` slot for cases where `start <= end`. That means that `Idx: !Step` and `start > end` would both behave the same, and correctly -- we do not need the boolean if we're not ever going to emit any values from the iterator.
This is implemented here by replacing the `is_empty: Option<bool>` slot with an `exhausted: bool` slot. This flag is
- `false` upon construction,
- `false` when iteration has not yielded an element -- importantly, this means it is always `false` for an iterator empty by construction,
- `false` when iteration has yielded an element and the iterator is not exhausted, and
- only `true` when iteration has been used to exhaust the iterator.
For completeness, this also adds a note to the `Debug` representation to note when the range is exhausted.
bors [Mon, 10 Feb 2020 10:47:39 +0000 (10:47 +0000)]
Auto merge of #69012 - Dylan-DPC:rollup-13qn0fq, r=Dylan-DPC
Rollup of 6 pull requests
Successful merges:
- #68694 (Reduce the number of `RefCell`s in `InferCtxt`.)
- #68966 (Improve performance of coherence checks)
- #68976 (Make `num::NonZeroX::new` an unstable `const fn`)
- #68992 (Correctly parse `mut a @ b`)
- #69005 (Small graphviz improvements for the new dataflow framework)
- #69006 (parser: Keep current and previous tokens precisely)
The current code in `SipHasher128::short_write` is inefficient. It uses
`u8to64_le` (which is complex and slow) to extract just the right number of
bytes of the input into a u64 and pad the result with zeroes. It then
left-shifts that value in order to bitwise-OR it with `self.tail`.
For example, imagine we have a u32 input 0xIIHH_GGFF and only need three bytes
to fill up `self.tail`. The current code uses `u8to64_le` to construct
0x0000_0000_00HH_GGFF, which is just 0xIIHH_GGFF with the 0xII removed and
zero-extended to a u64. The code then left-shifts that value by five bytes --
discarding the 0x00 byte that replaced the 0xII byte! -- to give
0xHHGG_FF00_0000_0000. It then then ORs that value with self.tail.
There's a much simpler way to do it: zero-extend to u64 first, then left shift.
E.g. 0xIIHH_GGFF is zero-extended to 0x0000_0000_IIHH_GGFF, and then
left-shifted to 0xHHGG_FF00_0000_0000. We don't have to take time to exclude
the unneeded 0xII byte, because it just gets shifted out anyway! It also avoids
multiple occurrences of `unsafe`.
There's a similar story with the setting of `self.tail` at the method's end.
The current code uses `u8to64_le` to extract the remaining part of the input,
but the same effect can be achieved more quickly with a right shift on the
zero-extended input.
All that works on little-endian. It doesn't work for big-endian, but we
can just do a `to_le` before calling `short_write` and then it works.
This commit changes `SipHasher128` to use the simpler shift-based approach. The
code is also smaller, which means that `short_write` is now inlined where
previously it wasn't, which makes things faster again. This gives big
speed-ups for all incremental builds, especially "baseline" incremental
builds.
Dylan DPC [Mon, 10 Feb 2020 00:54:21 +0000 (01:54 +0100)]
Rollup merge of #69005 - ecstatic-morse:unified-dataflow-graphviz, r=wesleywiser
Small graphviz improvements for the new dataflow framework
Split out from #68241. This prints the correct effect diff for each line (the before-effect and the after-effect) and makes marginal improvements to the graphviz output for the new dataflow framework including using monospaced font and better line breaking. This will only be useful when #68241 is merged and the graphviz output changes from this:
Dylan DPC [Mon, 10 Feb 2020 00:54:17 +0000 (01:54 +0100)]
Rollup merge of #68966 - jonas-schievink:coherence-perf, r=varkor
Improve performance of coherence checks
The biggest perf improvement in here is expected to come from the removal of the remaining #43355 warning code since that effectively runs the expensive parts of coherence *twice*, which can even be visualized by obtaining a flamegraph:
Dylan DPC [Mon, 10 Feb 2020 00:54:15 +0000 (01:54 +0100)]
Rollup merge of #68694 - nnethercote:reduce-RefCells-in-InferCtxt, r=varkor
Reduce the number of `RefCell`s in `InferCtxt`.
`InferCtxt` contains six structures within `RefCell`s. Every time we
create and dispose of (commit or rollback) a snapshot we have to
`borrow_mut` each one of them.
This commit moves the six structures under a single `RefCell`, which
gives significant speed-ups by reducing the number of `borrow_mut`
calls. To avoid runtime errors I had to reduce the lifetimes of dynamic
borrows in a couple of places.