bors [Thu, 9 Dec 2021 15:01:42 +0000 (15:01 +0000)]
Auto merge of #85157 - the8472:drain-drop-in-place, r=Mark-Simulacrum
replace vec::Drain drop loops with drop_in_place
The `Drain::drop` implementation came up in https://github.com/rust-lang/rust/pull/82185#issuecomment-789584796 as potentially interfering with other optimization work due its widespread use somewhere in `println!`
bors [Thu, 9 Dec 2021 07:08:32 +0000 (07:08 +0000)]
Auto merge of #91692 - matthiaskrgr:rollup-u7dvh0n, r=matthiaskrgr
Rollup of 6 pull requests
Successful merges:
- #87599 (Implement concat_bytes!)
- #89999 (Update std::env::temp_dir to use GetTempPath2 on Windows when available.)
- #90796 (Remove the reg_thumb register class for asm! on ARM)
- #91042 (Use Vec extend instead of repeated pushes on several places)
- #91634 (Do not attempt to suggest help for overly malformed struct/function call)
- #91685 (Install llvm tools to sysroot when assembling local toolchain)
Matthias Krüger [Thu, 9 Dec 2021 04:08:34 +0000 (05:08 +0100)]
Rollup merge of #91685 - Aaron1011:install-llvm-tools, r=Mark-Simulacrum
Install llvm tools to sysroot when assembling local toolchain
Some projects (e.g. the `bootimage` crate) may require the
user to install the `llvm-tools-preview` rustup component.
However, this cannot be easily done with a locally built toolchain.
To allow a local toolchain to be used a drop-in replacement for
a normal rustup toolchain in more cases, this PR copies the built
LLVM tools to the sysoot. From the perspective a tool looking
at the sysroot, this is equivalent to installing `llvm-tools-preview`.
Matthias Krüger [Thu, 9 Dec 2021 04:08:33 +0000 (05:08 +0100)]
Rollup merge of #91042 - Kobzol:vec-extend-cleanup, r=nagisa
Use Vec extend instead of repeated pushes on several places
Inspired by https://github.com/rust-lang/rust/pull/90813, I tried to use a simple regex (`for .*in.*\{\n.*push\(.*\);\n\s+}`) to search for more places that would use `Vec::push` in a loop and replace them with `Vec::extend`.
These probably won't have as much perf. impact as the original PR (if any), but it would probably be better to do a perf run to see if there are not any regressions.
Matthias Krüger [Thu, 9 Dec 2021 04:08:31 +0000 (05:08 +0100)]
Rollup merge of #89999 - talagrand:GetTempPath2, r=m-ou-se
Update std::env::temp_dir to use GetTempPath2 on Windows when available.
As a security measure, Windows 11 introduces a new temporary directory API, GetTempPath2.
When the calling process is running as SYSTEM, a separate temporary directory
will be returned inaccessible to non-SYSTEM processes. For non-SYSTEM processes
the behavior will be the same as before.
This can help mitigate against attacks such as this one:
https://medium.com/csis-techblog/cve-2020-1088-yet-another-arbitrary-delete-eop-a00b97d8c3e2
Compatibility risk: Software which relies on temporary files to communicate between SYSTEM and non-SYSTEM
processes may be affected by this change. In many cases, such patterns may be vulnerable to the very
attacks the new API was introduced to harden against.
I'm unclear on the Rust project's tolerance for such change-of-behavior in the standard library. If anything,
this PR is meant to raise awareness of the issue and hopefully start the conversation.
How tested: Taking the example code from the documentation and running it through psexec (from SysInternals) on
Win10 and Win11.
On Win10:
C:\test>psexec -s C:\test\main.exe
<...>
Temporary directory: C:\WINDOWS\TEMP\
On Win11:
C:\test>psexec -s C:\test\main.exe
<...>
Temporary directory: C:\Windows\SystemTemp\
Matthias Krüger [Thu, 9 Dec 2021 04:08:30 +0000 (05:08 +0100)]
Rollup merge of #87599 - Smittyvb:concat_bytes, r=Mark-Simulacrum
Implement concat_bytes!
This implements the unstable `concat_bytes!` macro, which has tracking issue #87555. It can be used like:
```rust
#![feature(concat_bytes)]
fn main() {
assert_eq!(concat_bytes!(), &[]);
assert_eq!(concat_bytes!(b'A', b"BC", [68, b'E', 70]), b"ABCDEF");
}
```
If strings or characters are used where byte strings or byte characters are required, it suggests adding a `b` prefix. If a number is used outside of an array it suggests arrayifying it. If a boolean is used it suggests replacing it with the numeric value of that number. Doubly nested arrays of bytes are disallowed.
Notice that `return` and `yield` already follow the same approach as this PR of printing the space *before* each additional piece following the keyword, rather than *after* each thing.
Matthias Krüger [Thu, 9 Dec 2021 04:02:19 +0000 (05:02 +0100)]
Rollup merge of #91042 - Kobzol:vec-extend-cleanup, r=nagisa
Use Vec extend instead of repeated pushes on several places
Inspired by https://github.com/rust-lang/rust/pull/90813, I tried to use a simple regex (`for .*in.*\{\n.*push\(.*\);\n\s+}`) to search for more places that would use `Vec::push` in a loop and replace them with `Vec::extend`.
These probably won't have as much perf. impact as the original PR (if any), but it would probably be better to do a perf run to see if there are not any regressions.
bors [Thu, 9 Dec 2021 00:55:49 +0000 (00:55 +0000)]
Auto merge of #91677 - matthiaskrgr:rollup-yiczced, r=matthiaskrgr
Rollup of 5 pull requests
Successful merges:
- #91245 (suggest casting between i/u32 and char)
- #91337 (Add a suggestion if `macro_rules` is misspelled)
- #91534 (Make rustdoc headings black, and markdown blue)
- #91637 (Add test for packed drops in generators)
- #91667 (Fix indent of itemTypes in search.js)
Failed merges:
- #91568 (Pretty print break and continue without redundant space)
Aaron Hill [Thu, 9 Dec 2021 00:35:53 +0000 (18:35 -0600)]
Install llvm tools to sysroot when assembling local toolchain
Some projects (e.g. the `bootimage` crate) may require the
user to install the `llvm-tools-preview` rustup component.
However, this cannot be easily done with a locally built toolchain.
To allow a local toolchain to be used a drop-in replacement for
a normal rustup toolchain in more cases, this PR copies the built
LLVM tools to the sysoot. From the perspective a tool looking
at the sysroot, this is equivalent to installing `llvm-tools-preview`.
(Note: we may want to make rustdoc headings and markdown headings the same color -- #90245 -- but we would want to do that intentionally; this is fixing up a change that did so accidentally)
Matthias Krüger [Wed, 8 Dec 2021 22:18:03 +0000 (23:18 +0100)]
Rollup merge of #91245 - cameron1024:suggest-i32-u32-char-cast, r=nagisa
suggest casting between i/u32 and char
As discussed in https://github.com/rust-lang/rust/issues/91063 , this adds a suggestion for converting between i32/u32 <-> char with `as`, and a short explanation for why this is safe
bors [Wed, 8 Dec 2021 18:45:03 +0000 (18:45 +0000)]
Auto merge of #91665 - matthiaskrgr:rollup-o3wnkam, r=matthiaskrgr
Rollup of 7 pull requests
Successful merges:
- #90709 (Only shown relevant type params in E0283 label)
- #91551 (Allow for failure of subst_normalize_erasing_regions in const_eval)
- #91570 (Evaluate inline const pat early and report error if too generic)
- #91571 (Remove unneeded access to pretty printer's `s` field in favor of deref)
- #91610 (Link to rustdoc_json_types docs instead of rustdoc-json RFC)
- #91619 (Update cargo)
- #91630 (Add missing whitespace before disabled HTML attribute)
Matthias Krüger [Wed, 8 Dec 2021 15:08:11 +0000 (16:08 +0100)]
Rollup merge of #91619 - ehuss:update-cargo, r=ehuss
Update cargo
8 commits in 294967c53f0c70d598fc54ca189313c86c576ea7..40dc281755137ee804bc9b3b08e782773b726e44
2021-11-29 19:04:22 +0000 to 2021-12-06 21:54:44 +0000
- Unify the description of quiet flag (rust-lang/cargo#10168)
- Stabilize future-incompat-report (rust-lang/cargo#10165)
- Support abbreviating `--release` as `-r` (rust-lang/cargo#10133)
- doc: nudge towards simple version requirements (rust-lang/cargo#10158)
- Upgrade clap to 2.34.0 (rust-lang/cargo#10164)
- Treat EOPNOTSUPP the same as ENOTSUP when ignoring failed flock calls. (rust-lang/cargo#10157)
- Add note about RUSTFLAGS removal from build scripts. (rust-lang/cargo#10141)
- Make clippy happy (rust-lang/cargo#10139)
Matthias Krüger [Wed, 8 Dec 2021 15:08:10 +0000 (16:08 +0100)]
Rollup merge of #91610 - aDotInTheVoid:patch-2, r=GuillaumeGomez
Link to rustdoc_json_types docs instead of rustdoc-json RFC
The JSON format has had [many changes](https://github.com/rust-lang/rust/commits/master/src/rustdoc-json-types) since the RFC, so the rustdoc output is the only up to date reference
Matthias Krüger [Wed, 8 Dec 2021 15:08:09 +0000 (16:08 +0100)]
Rollup merge of #91571 - dtolnay:printerderef, r=Mark-Simulacrum
Remove unneeded access to pretty printer's `s` field in favor of deref
I found it taxing in some of my recent PRs touching the pretty printer to maintain consistency with the surrounding code, since the current code is all over the place about whether it uses `self.s.…()` or `self.…()` for invoking methods of `rustc_ast_pretty::pp::Printer`.
This PR standardizes on `self.…()` — relying on the `Deref` and `DerefMut` impls introduced by [#62532](https://github.com/rust-lang/rust/pull/62532/commits/cab453250a3ceae5cf0cf7eac836c03b37e4ca8e).
Using associated types that cannot be normalized previously resulted in an ICE. We now allow for normalization failure and return a "TooGeneric" error in that case.
Matthias Krüger [Wed, 8 Dec 2021 15:08:06 +0000 (16:08 +0100)]
Rollup merge of #90709 - estebank:erase-known-type-params, r=nagisa
Only shown relevant type params in E0283 label
When we point at a binding to suggest giving it a type, erase all the
type for ADTs that have been resolved, leaving only the ones that could
not be inferred. For small shallow types this is not a problem, but for
big nested types with lots of params, this can otherwise cause a lot of
unnecessary visual output.
bors [Wed, 8 Dec 2021 14:58:48 +0000 (14:58 +0000)]
Auto merge of #91604 - nikic:section-flags, r=nagisa
Use object crate for .rustc metadata generation
We already use the object crate for generating uncompressed .rmeta
metadata object files. This switches the generation of compressed
.rustc object files to use the object crate as well. These have
slightly different requirements in that .rmeta should be completely
excluded from any final compilation artifacts, while .rustc should
be part of shared objects, but not loaded into memory.
The primary motivation for this change is #90326: In LLVM 14, the
current way of setting section flags (and in particular, preventing
the setting of SHF_ALLOC) will no longer work. There are other ways
we could work around this, but switching to the object crate seems
like the most elegant, as we already use it for .rmeta, and as it
makes this independent of the codegen backend. In particular, we
don't need separate handling in codegen_llvm and codegen_gcc.
codegen_cranelift should be able to reuse the implementation as
well, though I have omitted that here, as it is not based on
codegen_ssa.
This change mostly extracts the existing code for .rmeta handling
to allow using it for .rustc as well, and adjusts the codegen
infrastructure to handle the metadata object file separately: We
no longer create a backend-specific module for it, and directly
produce the compiled module instead.
This does not `fix` #90326 by itself yet, as .llvmbc will need to be
handled separately.
bors [Wed, 8 Dec 2021 11:22:02 +0000 (11:22 +0000)]
Auto merge of #91656 - matthiaskrgr:rollup-lk96y6d, r=matthiaskrgr
Rollup of 7 pull requests
Successful merges:
- #83744 (Deprecate crate_type and crate_name nested inside #![cfg_attr])
- #90550 (Update certificates in some Ubuntu 16 images.)
- #91272 (Print a suggestion when comparing references to primitive types in `const fn`)
- #91467 (Emphasise that an OsStr[ing] is not necessarily a platform string)
- #91531 (Do not add `;` to expected tokens list when it's wrong)
- #91577 (Address some FIXMEs left over from #91475)
- #91638 (Remove `in_band_lifetimes` from `rustc_mir_transform`)
Matthias Krüger [Wed, 8 Dec 2021 10:09:01 +0000 (11:09 +0100)]
Rollup merge of #91638 - scottmcm:less-inband-2-of-28, r=petrochenkov
Remove `in_band_lifetimes` from `rustc_mir_transform`
Like #91580, this was inspired by the conversation in #44524 about possibly removing the feature from the compiler. This crate is a heavy `'tcx` user, so is a nice case study.
r? ``@petrochenkov``
Three interesting ones:
This one had the `'tcx` declared on the function, despite the trait taking a `'tcx`:
```diff
-impl Visitor<'_> for UsedLocals {
+impl<'tcx> Visitor<'tcx> for UsedLocals {
fn visit_statement(&mut self, statement: &Statement<'tcx>, location: Location) {
```
This one use in-band for one, and underscore for the other:
```diff
-pub fn remove_dead_blocks(tcx: TyCtxt<'tcx>, body: &mut Body<'_>) {
+pub fn remove_dead_blocks<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
```
Matthias Krüger [Wed, 8 Dec 2021 10:09:00 +0000 (11:09 +0100)]
Rollup merge of #91577 - ecstatic-morse:mir-pass-manager-cleanup, r=oli-obk
Address some FIXMEs left over from #91475
This shouldn't change behavior, only clarify what we're currently doing. I filed #91576 to see if the treatment of generator drop shims is intentional.
Matthias Krüger [Wed, 8 Dec 2021 10:08:58 +0000 (11:08 +0100)]
Rollup merge of #91467 - ChrisDenton:confusing-os-string, r=Mark-Simulacrum
Emphasise that an OsStr[ing] is not necessarily a platform string
Fixes #53261
Since that issue was filed, #56141 added a further clarification to the `OsString` docs. However the ffi docs may still leave the impression that an `OsStr` is in the platform native form. This PR aims to further emphasise that an `OsStr` is not necessarily a platform string.
Matthias Krüger [Wed, 8 Dec 2021 10:08:56 +0000 (11:08 +0100)]
Rollup merge of #90550 - ehuss:update-ca, r=Mark-Simulacrum
Update certificates in some Ubuntu 16 images.
These images use crosstool-ng, which needs to download various things off the internet. The certificate for `www.kernel.org` no longer works with the ca-certificates in Ubuntu 16. This resolves the issue by grabbing from a newer image a certificate bundle from https://curl.se/ca/cacert.pem, which is usually somewhat up to date.
Matthias Krüger [Wed, 8 Dec 2021 10:08:55 +0000 (11:08 +0100)]
Rollup merge of #83744 - bjorn3:deprecate_cfg_attr_crate_type_name, r=Mark-Simulacrum
Deprecate crate_type and crate_name nested inside #![cfg_attr]
This implements the proposal in https://github.com/rust-lang/rust/pull/83676#issuecomment-811213956, with a future compatibility lint imposed on usage of crate_type/crate_name inside cfg's.
This is a compromise between removing `#![crate_type]` and `#![crate_name]` completely and keeping them as a whole, which requires somewhat of a hack in rustc and is impossible to support by gcc-rust. By only removing `#![crate_type]` and `#![crate_name]` nested inside `#![cfg_attr]` it becomes possible to parse them before a big chunk of the compiler has started.
```rust
#![crate_type = "lib"] // remains working
#![cfg_attr(foo, crate_type = "bin")] // will stop working
```
# Rationale
As it currently is it is possible to try to access the stable crate id before it is actually set, which will panic. The fact that the Session contains mutable state beyond debugging things also doesn't completely sit well with me. Especially once parallel rustc becomes the default.
I think there is currently also a cyclic dependency where you need to set the stable crate id to be able to load crates, but you need to load crates to expand proc macro attributes that may define #![crate_name] or #![crate_type]. Currently crate level proc macro attributes are unstable or completely unsupported (can't remember which), so this is not a problem, but it may become an issue in the future.
Finally if we want to add incremental compilation to macro expansion or even parsing, we need the StableCrateId to be created together with the Session or even earlier as incremental compilation determines the incremental compilation session dir based on the StableCrateId.
Scott McMurray [Mon, 6 Dec 2021 08:48:37 +0000 (00:48 -0800)]
Remove `in_band_lifetimes` from `rustc_mir_transform`
This one is a heavy `'tcx` user.
Two interesting ones:
This one had the `'tcx` declared on the function, despite the trait taking a `'tcx`:
```diff
-impl Visitor<'_> for UsedLocals {
+impl<'tcx> Visitor<'tcx> for UsedLocals {
fn visit_statement(&mut self, statement: &Statement<'tcx>, location: Location) {
```
This one use in-band for one, and underscore for the other:
```diff
-pub fn remove_dead_blocks(tcx: TyCtxt<'tcx>, body: &mut Body<'_>) {
+pub fn remove_dead_blocks<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
```
bors [Wed, 8 Dec 2021 04:46:39 +0000 (04:46 +0000)]
Auto merge of #91500 - ehuss:update-mdbook, r=Mark-Simulacrum
Update mdbook
This just includes a few minor fixes:
* https://github.com/rust-lang/mdBook/blob/master/CHANGELOG.md#mdbook-0413
* https://github.com/rust-lang/mdBook/blob/master/CHANGELOG.md#mdbook-0414
bors [Wed, 8 Dec 2021 01:37:59 +0000 (01:37 +0000)]
Auto merge of #91484 - workingjubilee:simd-remove-autosplats, r=Mark-Simulacrum
Sync portable-simd to remove autosplats
This PR syncs portable-simd in up to https://github.com/rust-lang/portable-simd/commit/a8385522ade6f67853edac730b5bf164ddb298fd in order to address the type inference breakages documented on nightly in https://github.com/rust-lang/rust/issues/90904 by removing the vector + scalar binary operations (called "autosplats", "broadcasting", or "rank promotion", depending on who you ask) that allow `{scalar} + &'_ {scalar}` to fail in some cases, because it becomes possible the programmer may have meant `{scalar} + &'_ {vector}`.
A few quality-of-life improvements make their way in as well:
- Lane counts can now go to 64, as LLVM seems to have fixed their miscompilation for those.
- `{i,u}8x64` to `__m512i` is now available.
- a bunch of `#[must_use]` notes appear throughout the module.
- Some implementations, mostly instances of `impl core::ops::{Op}<Simd> for Simd` that aren't `{vector} + {vector}` (e.g. `{vector} + &'_ {vector}`), leverage some generics and `where` bounds now to make them easier to understand by reducing a dozen implementations into one (and make it possible for people to open the docs on less burly devices).
- And some internal-only improvements.
None of these changes should affect a beta backport, only actual users of `core::simd` (and most aren't even visible in the programmatic sense), though I can extract an even more minimal changeset for beta if necessary. It seemed simpler to just keep moving forward.
bors [Tue, 7 Dec 2021 21:50:46 +0000 (21:50 +0000)]
Auto merge of #91407 - the8472:deserialize-unchecked-utf8, r=michaelwoerister
Avoid string validation in rustc_serialize, check a marker byte instead
Since the serialization format isn't self-describing we need a way to detect when encoder and decoder don't match up. But for strings it doesn't have to be utf8 validation, which currently does cost a few percent of performance.
Instead we can use a marker byte at the end to be reasonably sure that we're dealing with a string and it wasn't overwritten in some way.
2 commits in c0f222da23568477155991d391c9ce918e381351..954f3d441ad880737a13e241108f791a4d2a38cd
2021-11-22 10:30:57 -0800 to 2021-11-29 11:11:30 -0800
- Say that bare trait objects are rejected in the 2021 edition (rust-lang/reference#1111)
- Update 'Subtyping and Variance' example to use `dyn Trait` syntax (rust-lang/reference#1110)
## book
5 commits in a5e0c5b2c5f9054be3b961aea2c7edfeea591de8..5f9358faeb1f46e19b8a23a21e79fd7fe150491e
2021-11-19 17:06:19 -0500 to 2021-12-05 21:33:16 -0500
- 1.57
- Update to 1.56
- Snapshot of ch 11 for nostarch
- Clarify how to check for an error in tests returning Result
- Update book repo links for default branch rename
10 commits in a2fc9635029c04e692474965a6606f8e286d539a..a374e7d8bb6b79de45b92295d06b4ac0ef35bc09
2021-11-18 13:31:13 -0500 to 2021-12-03 09:26:47 -0800
- Update LLVM coverage mapping format version supported by rustc (rust-lang/rustc-dev-guide#1267)
- Improve 'Running tests manually' section
- Fix some links
- Update for review comments.
- Document rustfix-only-machine-applicable
- Apply suggestions from pierwill
- Document more compiletest headers.
- make it compile with 1.56.0 no warning
- make it compile with 1.56.0
- make it compile with 1.56.0
the `print_capture_clause` and `word_nbsp`/`word_space` calls already put a space after the `async` and `move` keywords being printed. The extra `self.s.space()` call removed by this PR resulted in the redundant double space.
Matthias Krüger [Tue, 7 Dec 2021 10:05:04 +0000 (11:05 +0100)]
Rollup merge of #91547 - TennyZhuang:suggest_try_reserve, r=scottmcm
Suggest try_reserve in try_reserve_exact
During developing #91529 , I found that `try_reserve_exact` suggests `reserve` for further insertions. I think it's a mistake by copy&paste, `try_reserve` is better here.
Matthias Krüger [Tue, 7 Dec 2021 10:04:59 +0000 (11:04 +0100)]
Rollup merge of #91341 - scottmcm:array-iter-frp, r=kennytm
Add `array::IntoIter::{empty, from_raw_parts}`
`array::IntoIter` has a bunch of really handy logic for dealing with partial arrays, but it's currently hamstrung by only being creatable from a fully-initialized array.
This PR adds two new constructors:
- a safe & const `empty`, since `[].into_iter()` can only give `IntoIter<T, 0>`, not `IntoIter<T, N>`.
- an unsafe `from_raw_parts`, to allow experimentation with new uses.
(Slice & vec iterators don't need `from_raw_parts` because you `from_raw_parts` the slice or vec instead, but there's no useful way to made a `<[T; N]>::from_raw_parts`, so I think this is a reasonable place to have one.)
Matthias Krüger [Tue, 7 Dec 2021 10:04:58 +0000 (11:04 +0100)]
Rollup merge of #91312 - terrarier2111:anon-const-ice, r=jackh726
Fix AnonConst ICE
I am not sure if this is even the correct place to fix this issue, but i went down the path where the generic args came from and i wasn't able to find a clear cause for this down there. But if anybody has a suggestion what i should do, just tell me.
This fixes: https://github.com/rust-lang/rust/issues/91267
Matthias Krüger [Tue, 7 Dec 2021 10:04:57 +0000 (11:04 +0100)]
Rollup merge of #91065 - wesleywiser:add_incr_test, r=jackh726
Add test for evaluate_obligation: Ok(EvaluatedToOkModuloRegions) ICE
Adds the minimial repro test case from #85360. The fix for #85360 was
supposed to be #85868 however the repro was resolved in the 2021-07-05
nightly while #85868 didn't land until 2021-09-03. The reason for that
is d34a3a401b4e44f289a4d5bf53da83367cbb6aa7 **also** resolves that
issue.
Nikita Popov [Thu, 2 Dec 2021 11:24:25 +0000 (12:24 +0100)]
Use object crate for .rustc metadata generation
We already use the object crate for generating uncompressed .rmeta
metadata object files. This switches the generation of compressed
.rustc object files to use the object crate as well. These have
slightly different requirements in that .rmeta should be completely
excluded from any final compilation artifacts, while .rustc should
be part of shared objects, but not loaded into memory.
The primary motivation for this change is #90326: In LLVM 14, the
current way of setting section flags (and in particular, preventing
the setting of SHF_ALLOC) will no longer work. There are other ways
we could work around this, but switching to the object crate seems
like the most elegant, as we already use it for .rmeta, and as it
makes this independent of the codegen backend. In particular, we
don't need separate handling in codegen_llvm and codegen_gcc.
codegen_cranelift should be able to reuse the implementation as
well, though I have omitted that here, as it is not based on
codegen_ssa.
This change mostly extracts the existing code for .rmeta handling
to allow using it for .rustc as well, and adjust the codegen
infrastructure to handle the metadata object file separately: We
no longer create a backend-specific module for it, and directly
produce the compiled module instead.
This does not fix #90326 by itself yet, as .llvmbc will need to be
handled separately.
bors [Tue, 7 Dec 2021 08:12:47 +0000 (08:12 +0000)]
Auto merge of #85013 - Mark-Simulacrum:dominators-bitset, r=pnkfelix
Replace dominators algorithm with simple Lengauer-Tarjan
This PR replaces our dominators implementation with that of the simple Lengauer-Tarjan algorithm, which is (to my knowledge and research) the currently accepted 'best' algorithm. The more complex variant has higher constant time overheads, and Semi-NCA (which is arguably a variant of Lengauer-Tarjan too) is not the preferred variant by the first paper cited in the documentation comments: simple Lengauer-Tarjan "is less sensitive to pathological instances, we think it should be preferred where performance guarantees are important" - which they are for us.
This work originally arose from noting that the keccak benchmark spent a considerable portion of its time (both instructions and cycles) in the dominator computations, which sparked an interest in potentially optimizing that code. The current algorithm largely proves slow on long "parallel" chains where the nearest common ancestor lookup (i.e., the intersect function) does not quickly identify a root; it is also inherently a pointer-chasing algorithm so is relatively slow on modern CPUs due to needing to hit memory - though usually in cache - in a tight loop, which still costs several cycles.
This was replaced with a bitset-based algorithm, previously studied in literature but implemented directly from dataflow equations in our case, which proved to be a significant speed up on the keccak benchmark: 20% instruction count wins, as can be seen in [this performance report](https://perf.rust-lang.org/compare.html?start=377d1a984cd2a53327092b90aa1d8b7e22d1e347&end=542da47ff78aa462384062229dad0675792f2638). This algorithm is also relatively simple in comparison to other algorithms and is easy to understand. However, these performance results showed a regression on a number of other benchmarks, and I was unable to get the bitsets to perform well enough that those regressions could be fully mitigated. The implementation "attempt" is seen here in the first commit, and is intended to be kept primarily so that future optimizers do not repeat that path (or can easily refer to the attempt).
The final version of this PR chooses the simple Lengauer-Tarjan algorithm, and implements it along with a number of optimizations found in literature. The current implementation is a slight improvement for many benchmarks, with keccak still being an outlier at ~20%. The implementation in this PR first implements the most basic variant of the algorithm directly from the pseudocode on page 16, physical, or 28 in the PDF of the first paper ("Linear-Time Algorithms for Dominators and Related Problems"). This is then followed by a number of commits which update the implementation to apply various performance improvements, as suggested by the paper. Finally, the last commit annotates the implementation with a number of comments, mostly drawn from the paper, which intend to help readers understand what is going on - these are incomplete without the paper, but writing them certainly helped my understanding. They may be helpful if future optimization attempts are attempted, so I chose to add them in.
Esteban Kuber [Mon, 8 Nov 2021 19:15:54 +0000 (19:15 +0000)]
Only shown relevant type params in E0283 label
When we point at a binding to suggest giving it a type, erase all the
type for ADTs that have been resolved, leaving only the ones that could
not be inferred. For small shallow types this is not a problem, but for
big nested types with lots of params, this can otherwise cause a lot of
unnecessary visual output.
Mark Rousskov [Sat, 8 May 2021 23:13:11 +0000 (19:13 -0400)]
Use preorder indices for data structures
This largely avoids remapping from and to the 'real' indices, with the exception
of predecessor lookup and the final merge back, and is conceptually better.