Rollup merge of #95320 - JakobDegen:mir-docs, r=oli-obk
Document the current MIR semantics that are clear from existing code
This PR adds documentation to places, operands, rvalues, statementkinds, and terminatorkinds that describes their existing semantics and requirements. In many places the semantics depend on the Rust memory model or other T-Lang decisions - when this is the case, it is just noted as such with links to UCG issues where possible. I'm hopeful that none of the documentation added here can be used to justify optimizations that depend on the memory model. The documentation for places and operands probably comes closest to running afoul of this - if people think that it cannot be merged as is, it can definitely also be taken out.
The goal here is to only document parts of MIR that seem to be decided already, or are at least depended on by existing code. That leaves quite a number of open questions - those are marked as "needs clarification." I'm not sure what to do with those in this PR - we obviously can't decide all these questions here. Should I just leave them in as is? Take them out? Keep them in but as `//` instead of `///` comments?
If this is too big to review at once, I can split this up.
Auto merge of #95893 - luqmana:no-prepopulate-passes-tweaks, r=nikic
Respect -Z verify-llvm-ir and other flags that add extra passes when combined with -C no-prepopulate-passes in the new LLVM Pass Manager.
As part of the switch to the new LLVM Pass Manager the behaviour of flags such as `-Z verify-llvm-ir` (e.g. sanitizer, instrumentation) was modified when combined with `-C no-prepopulate-passes`. With the old PM, rustc was the one manually constructing the pipeline and respected those flags but in the new pass manager, those flags are used to build a list of callbacks that get invoked at certain extension points in the pipeline. Unfortunately, `-C no-prepopulate-passes` would skip building the pipeline altogether meaning we'd never add the corresponding passes. The fix here is to just manually invoke those callbacks as needed.
Fixes #95874
Demonstrating the current vs fixed behaviour using the bug in #95864
```console
$ rustc +nightly asm-miscompile.rs --edition 2021 --emit=llvm-ir -C no-prepopulate-passes -Z verify-llvm-ir
$ echo $?
0
$ rustc +stage1 asm-miscompile.rs --edition 2021 --emit=llvm-ir -C no-prepopulate-passes -Z verify-llvm-ir
Basic Block in function '_ZN14asm_miscompile3foo28_$u7b$$u7b$closure$u7d$$u7d$17h360e2f7eee1275c5E' does not have terminator!
label %bb1
LLVM ERROR: Broken module found, compilation aborted!
```
Auto merge of #95944 - Dylan-DPC:rollup-idggkrh, r=Dylan-DPC
Rollup of 7 pull requests
Successful merges:
- #95008 ([`let_chains`] Forbid `let` inside parentheses)
- #95801 (Replace RwLock by a futex based one on Linux)
- #95864 (Fix miscompilation of inline assembly with outputs in cases where we emit an invoke instead of call instruction.)
- #95894 (Fix formatting error in pin.rs docs)
- #95895 (Clarify str::from_utf8_unchecked's invariants)
- #95901 (Remove duplicate aliases for `check codegen_{cranelift,gcc}` and fix `build codegen_gcc`)
- #95927 (CI: do not compile libcore twice when performing LLVM PGO)
Auto merge of #95796 - bzEq:bzEq/curl-redirect, r=Dylan-DPC
[bootstrap.py] Instruct curl to follow redirect
Some mirror RUSTUP_DIST_SERVER (like https://mirrors.sjtug.sjtu.edu.cn/rust-static) perform redirection when downloading
stage0 compiler. Curl should be able to follow that.
Rollup merge of #95901 - jyn514:remove-duplicate-aliases, r=Mark-Simulacrum
Remove duplicate aliases for `check codegen_{cranelift,gcc}` and fix `build codegen_gcc`
* Remove duplicate aliases
Bootstrap already allows selecting these in `PathSet::has`, which allows
any string that matches the end of a full path.
I found these by adding `assert!(path.exists())` in `StepDescription::paths`.
I think ideally we wouldn't have any aliases that aren't paths, but I've held
off on enforcing that here since it may be controversial, I'll open a separate PR.
* Add `build compiler/rustc_codegen_gcc` as an alias for `CodegenBackend`
These paths (`_cranelift` and `_gcc`) are somewhat misleading, since they
actually tell bootstrap to build *all* codegen backends. But this seems like
a useful improvement in the meantime.
Rollup merge of #95895 - CAD97:patch-2, r=Dylan-DPC
Clarify str::from_utf8_unchecked's invariants
Specifically, make it clear that it is immediately UB to pass ill-formed UTF-8 into the function. The previous wording left space to interpret that the UB only occurred when calling another function, which "assumes that `&str`s are valid UTF-8."
This does not change whether str being UTF-8 is a safety or a validity invariant. (As per previous discussion, it is a safety invariant, not a validity invariant.) It just makes it clear that valid UTF-8 is a precondition of str::from_utf8_unchecked, and that emitting an Abstract Machine fault (e.g. UB or a sanitizer error) on invalid UTF-8 is a valid thing to do.
If user code wants to create an unsafe `&str` pointing to ill-formed UTF-8, it must be done via transmutes. Also, just, don't.
Rollup merge of #95864 - luqmana:inline-asm-unwind-store-miscompile, r=Amanieu
Fix miscompilation of inline assembly with outputs in cases where we emit an invoke instead of call instruction.
We ran into this bug where rustc would segfault while trying to compile certain uses of inline assembly.
Here is a simple repro that demonstrates the issue:
```rust
#![feature(asm_unwind)]
fn main() {
let _x = String::from("string here just cause we need something with a non-trivial drop");
let foo: u64;
unsafe {
std::arch::asm!(
"mov {}, 1",
out(reg) foo,
options(may_unwind)
);
}
println!("{}", foo);
}
```
([playground link](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=7d6641e83370d2536a07234aca2498ff))
But crucially `feature(asm_unwind)` is not actually needed and this can be triggered on stable as a result of the way async functions/generators are handled in the compiler. e.g.:
Rollup merge of #95801 - m-ou-se:futex-rwlock, r=Amanieu
Replace RwLock by a futex based one on Linux
This replaces the pthread-based RwLock on Linux by a futex based one.
This implementation is similar to [the algorithm](https://gist.github.com/kprotty/3042436aa55620d8ebcddf2bf25668bc) suggested by `@kprotty,` but modified to prefer writers and spin before sleeping. It uses two futexes: One for the readers to wait on, and one for the writers to wait on. The readers futex contains the state of the RwLock: The number of readers, a bit indicating whether writers are waiting, and a bit indicating whether readers are waiting. The writers futex is used as a simple condition variable and its contents are meaningless; it just needs to be changed on every notification.
Using two futexes rather than one has the obvious advantage of allowing a separate queue for readers and writers, but it also means we avoid the problem a single-futex RwLock would have of making it hard for a writer to go to sleep while the number of readers is rapidly changing up and down, as the writers futex is only changed when we actually want to wake up a writer.
It always prefers writers, as we decided [here](https://github.com/rust-lang/rust/issues/93740#issuecomment-1070696128).
To be able to prefer writers, it relies on futex_wake to return the number of awoken threads to be able to handle write-unlocking while both the readers-waiting and writers-waiting bits are set. Instead of waking both and letting them race, it first wakes writers and only continues to wake the readers too if futex_wake reported there were no writers to wake up.
Rollup merge of #95008 - c410-f3r:let-chains-paren, r=wesleywiser
[`let_chains`] Forbid `let` inside parentheses
Parenthesizes are mostly a no-op in let chains, in other words, they are mostly ignored.
```rust
let opt = Some(Some(1i32));
if (let Some(a) = opt && (let Some(b) = a)) && b == 1 {
println!("`b` is declared inside but used outside");
}
```
As seen above, such behavior can lead to confusion.
A proper fix or nested encapsulation would probably require research, time and a modified MIR graph so in this PR I simply denied any `let` inside parentheses. Non-let stuff are still allowed.
```rust
fn main() {
let fun = || true;
if let true = (true && fun()) && (true) {
println!("Allowed");
}
}
```
It is worth noting that `let ...` is not an expression and the RFC did not mention this specific situation.
Auto merge of #95125 - JakobDegen:uninit-variant-rvalue, r=oli-obk
Add new `Deinit` statement
This rvalue replaces `SetDiscriminant` for ADTs. This PR is an alternative to #94590 , which only specifies that the behavior of `SetDiscriminant` is the same as what this rvalue would do. The motivation for this change are discussed in that PR and [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/SetDiscriminant.20and.20aggregate.20initialization.20.2394590)
Auto merge of #95931 - matthiaskrgr:rollup-1c5zhit, r=matthiaskrgr
Rollup of 7 pull requests
Successful merges:
- #95743 (Update binary_search example to instead redirect to partition_point)
- #95771 (Update linker-plugin-lto.md to 1.60)
- #95861 (Note that CI tests Windows 10)
- #95875 (bootstrap: show available paths help text for aliased subcommands)
- #95876 (Add a note for unsatisfied `~const Drop` bounds)
- #95907 (address fixme for diagnostic variable name)
- #95917 (thin_box test: import from std, not alloc)
Rollup merge of #95917 - RalfJung:thin-box-test, r=dtolnay
thin_box test: import from std, not alloc
Importing from `alloc` makes [Miri fail](https://github.com/rust-lang/miri-test-libstd/runs/5964922742?check_suite_focus=true), probably due to the hack that we used to resolve https://github.com/rust-lang/miri-test-libstd/issues/4. There might be better ways around this, but for now this is the easiest thing to do -- no other alloc integration test is importing from `alloc::`.
Rollup merge of #95875 - aswild:pr/alias-cmd-paths, r=Mark-Simulacrum
bootstrap: show available paths help text for aliased subcommands
Running `./x.py build -h -v` shows a list of available build targets,
but the short alias `./x.py b -h -v` does not. Fix so that the aliases
behave the same as their spelled out counterparts.
Rollup merge of #95743 - yaahc:binary-search-clarification, r=Mark-Simulacrum
Update binary_search example to instead redirect to partition_point
Inspired by discussion in the tracking issue for `Result::into_ok_or_err`: https://github.com/rust-lang/rust/issues/82223#issuecomment-1067098167
People are surprised by us not providing a `Result<T, T> -> T` conversion, and the main culprit for this confusion seems to be the `binary_search` API. We should instead redirect people to the equivalent API that implicitly does that `Result<T, T> -> T` conversion internally which should obviate the need for the `into_ok_or_err` function and give us time to work towards a more general solution that applies to all enums rather than just `Result` such as making or_patterns usable for situations like this via postfix `match`.
I choose to duplicate the example rather than simply moving it from `binary_search` to partition point because most of the confusion seems to arise when people are looking at `binary_search`. It makes sense to me to have the example presented immediately rather than requiring people to click through to even realize there is an example. If I had to put it in only one place I'd leave it in `binary_search` and remove it from `partition_point` but it seems pretty obviously relevant to `partition_point` so I figured the best option would be to duplicate it.
Auto merge of #95758 - compiler-errors:issue-54771, r=estebank
Only suggest removing semicolon when expression is compatible with `impl Trait`
https://github.com/rust-lang/rust/issues/54771#issuecomment-476423690
> It still needs checking that the last statement's expr can actually conform to the trait, but the naïve behavior is there.
Only suggest removing a semicolon when the type behind the semicolon actually implements the trait in an RPIT `-> impl Trait`. Also upgrade the label that suggests removing the semicolon to a suggestion (should it be verbose?).
Auto merge of #95754 - compiler-errors:binder-assoc-ty, r=nagisa
Better error for `for<...>` on associated type bound
With GATs just around the corner, we'll probably see more people trying out `Trait<for<'a> Assoc<'a> = ..>`.
This PR improves the syntax error slightly, and also makes it slightly easier to make this into real syntax in the future.
Feel free to push back if the reviewer thinks this should have a suggestion on how to fix it (i.e. push the `for<'a>` outside of the angle brackets), but that can also be handled in a follow-up PR.
Auto merge of #94243 - compiler-errors:compiler-flags-typo, r=Mark-Simulacrum
`s/compiler-flags/compile-flags` in compiletest
Also make compiletest panic so this doesn't happen in the future! I literally always forget which it's called, so I wanted to make my life easier in the future.
Joshua Nelson [Sun, 10 Apr 2022 21:41:21 +0000 (16:41 -0500)]
Add `build compiler/rustc_codegen_gcc` as an alias for `CodegenBackend`
These paths (`_cranelift` and `_gcc`) are somewhat misleading, since they
actually tell bootstrap to build *all* codegen backends. But this seems like
a useful improvement in the meantime.
Joshua Nelson [Sun, 10 Apr 2022 21:37:24 +0000 (16:37 -0500)]
Remove duplicate aliases for `codegen_{cranelift,gcc}`
Bootstrap already allows selecting these in `PathSet::has`, which allows
any string that matches the end of a full path.
I found these by adding `assert!(path.exists())` in `StepDescription::paths`.
I think ideally we wouldn't have any aliases that aren't paths, but I've held
off on enforcing that here since it may be controversial, I'll open a separate PR.
Auto merge of #95889 - Dylan-DPC:rollup-1cmywu4, r=Dylan-DPC
Rollup of 7 pull requests
Successful merges:
- #95566 (Avoid duplication of doc comments in `std::char` constants and functions)
- #95784 (Suggest replacing `typeof(...)` with an actual type)
- #95807 (Suggest adding a local for vector to fix borrowck errors)
- #95849 (Check for git submodules in non-git source tree.)
- #95852 (Fix missing space in lossy provenance cast lint)
- #95857 (Allow multiple derefs to be splitted in deref_separator)
- #95868 (rustdoc: Reduce allocations in a `html::markdown` function)
Specifically, make it clear that it is immediately UB to pass ill-formed UTF-8 into the function. The previous wording left space to interpret that the UB only occurred when calling another function, which "assumes that `&str`s are valid UTF-8."
This does not change whether str being UTF-8 is a safety or a validity invariant. (As per previous discussion, it is a safety invariant, not a validity invariant.) It just makes it clear that valid UTF-8 is a precondition of str::from_utf8_unchecked, and that emitting an Abstract Machine fault (e.g. UB or a sanitizer error) on invalid UTF-8 is a valid thing to do.
If user code wants to create an unsafe `&str` pointing to ill-formed UTF-8, it must be done via transmutes. Also, just, don't.
Rollup merge of #95849 - ehuss:check-submodules, r=Mark-Simulacrum
Check for git submodules in non-git source tree.
People occasionally download the source from https://github.com/rust-lang/rust/releases, but those source distributions will not work because they are missing the submodules. They will get a confusing `failed to load manifest for workspace member` error.
Unfortunately AFAIK there is no way to disable the GitHub source links. This change tries to detect this scenario and provide an error message that guides them toward a solution.
Rollup merge of #95784 - WaffleLapkin:typeof_cool_suggestion, r=compiler-errors
Suggest replacing `typeof(...)` with an actual type
This PR adds suggestion to replace `typeof(...)` with an actual type of `...`, for example in case of `typeof(1)` we suggest replacing it with `i32`.
If the expression
1. Is not const (`{ let a = 1; let _: typeof(a); }`)
2. Can't be found (`let _: typeof(this_variable_does_not_exist)`)
3. Or has non-suggestable type (closure, generator, error, etc)
we don't suggest anything.
The 1 one is sad, but it's not clear how to support non-consts expressions for `typeof`.
Rollup merge of #95566 - eduardosm:std_char_consts_and_methods, r=Mark-Simulacrum
Avoid duplication of doc comments in `std::char` constants and functions
For those consts and functions, only the summary is kept and a reference to the `char` associated const/method is included.
Additionaly, re-exported functions have been converted to function definitions that call the previously re-exported function. This makes it easier to add a deprecated attribute to these functions in the future.
Auto merge of #95621 - saethlin:remove-mpsc-transmute, r=RalfJung
Remove ptr-int transmute in std::sync::mpsc
Since https://github.com/rust-lang/rust/pull/95340 landed, Miri with `-Zmiri-check-number-validity` produces an error on the test suites of some crates which implement concurrency tools<sup>*</sup>, because it seems like such crates tend to use `std::sync::mpsc` in their tests. This fixes the problem by storing pointer bytes in a pointer.
<sup>*</sup> I have so far seen errors in the test suites of `once_cell`, `parking_lot`, and `crossbeam-utils`.
(just updating the list for fun, idk)
Also `threadpool`, `async-lock`, `futures-timer`, `fragile`, `scoped_threadpool`, `procfs`, `slog-async`, `scheduled-thread-pool`, `tokio-threadpool`, `mac`, `futures-cpupool`, `ntest`, `actix`, `zbus`, `jsonrpc-client-transports`, `fail`, `libp2p-gossipsub`, `parity-send-wrapper`, `async-broadcast,` `libp2p-relay`, `http-client`, `mockito`, `simple-mutex`, `surf`, `pollster`, and `pulse`. Then I turned the bot off.
Allen Wild [Sun, 10 Apr 2022 07:10:04 +0000 (03:10 -0400)]
bootstrap: show available paths help text for aliased subcommands
Running `./x.py build -h -v` shows a list of available build targets,
but the short alias `./x.py b -h -v` does not. Fix so that the aliases
behave the same as their spelled out counterparts.
Auto merge of #95502 - jyn514:doc-rustc, r=Mark-Simulacrum
Fix `x doc compiler/rustc`
This also has a few cleanups to `doc.rs`. The last two commits I don't care about, but the first commit I'd like to keep - it will be very useful for https://github.com/rust-lang/rust/issues/44293.
--- stderr
thread 'main' panicked at 'assertion failed: rustc.is_absolute()', src\bootstrap\build.rs:22:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
error: build failed
```
The problem was that the `dir.join` check only works with `rustc.exe`, not `rustc`.
Thanks `@Walther` for the help testing the fix!
Helps with https://github.com/rust-lang/rust/issues/94829.
Auto merge of #95435 - cjgillot:one-name, r=oli-obk
Make def names and HIR names consistent.
The name in the `DefKey` is interned to create the `DefId`, so it does not
require any query to access. This can be leveraged to avoid a few useless
HIR accesses for names.
~In order to achieve that, generic parameters created from universal
impl-trait are given the pretty-printed ast as a name, instead of
`{{opaque}}`.~
~Drive-by: the `TyCtxt::opt_item_name` used a dummy span for non-local
definitions. We have access to `def_ident_span`, so we use it.~
Switch to the 'normal' basic block for writing asm outputs if needed.
We may sometimes emit an `invoke` instead of a `call` for inline
assembly during the MIR -> LLVM IR lowering. But we failed to update
the IR builder's current basic block before writing the results to the
outputs. This would result in invalid IR because the basic block would
end in a `store` instruction, which isn't a valid terminator.
Rollup merge of #95831 - redzic:xor-uppercase, r=workingjubilee
Use bitwise XOR in to_ascii_uppercase
This saves an instruction compared to the previous approach, which
was to unset the fifth bit with bitwise OR.
Comparison of generated assembly on x86: https://godbolt.org/z/GdfvdGs39
This can also affect autovectorization, saving SIMD instructions as well: https://godbolt.org/z/cnPcz75T9
Not sure if `u8::to_ascii_lowercase` should also be changed, since using bitwise OR for that function does not require an extra bitwise negate since the code is setting a bit rather than unsetting a bit. `char::to_ascii_uppercase` already uses XOR, so no change seems to be required there.
It was used for deduplicating some errors for legacy code which are mostly deduplicated even without that, but at cost of global mutable state, which is not a good tradeoff.
cc https://github.com/rust-lang/rust/pull/95747#issuecomment-1091619403
r? ``@nnethercote``
Rollup merge of #95805 - c410-f3r:meta-vars, r=petrochenkov
Left overs of #95761
These are just nits. Feel free to close this PR if all modifications are not worth merging.
* `#![feature(decl_macro)]` is not needed anymore in `rustc_expand`
* `tuple_impls` does not require `$Tuple:ident`. I guess it is there to enhance readability?
Rollup merge of #95369 - jyn514:test-rustdoc, r=Mark-Simulacrum
Fix `x test src/librustdoc` with `download-rustc` enabled
The problem was two-fold:
- Bootstrap was hard-coding that unit tests should always run with stage1, not stage2, and
- It hard-coded the sysroot layout in stage1, which puts libLLVM.so in `lib/rustlib/` instead of just `lib/`.
This also takes the liberty of fixing `test src/librustdoc --no-doc`, which has been broken since it was first added. It would be nice at some point to unify this logic with other tests; I opened a Zulip thread: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Inconsistency.20in.20.60x.20test.60
Rollup merge of #95361 - scottmcm:valid-align, r=Mark-Simulacrum
Make non-power-of-two alignments a validity error in `Layout`
Inspired by the zulip conversation about how `Layout` should better enforce `size <= isize::MAX as usize`, this uses an N-variant enum on N-bit platforms to require at the validity level that the existing invariant of "must be a power of two" is upheld.
This was MIRI can catch it, and means there's a more-specific type for `Layout` to store than just `NonZeroUsize`.
It's left as `pub(crate)` here; a future PR could consider giving it a tracking issue for non-internal usage.
Rollup merge of #94794 - mlodato517:mlodato517-clarify-string-indexing-docs, r=Mark-Simulacrum
Clarify indexing into Strings
**This Commit**
Adds some clarity around indexing into Strings.
**Why?**
I was reading through the `Range` documentation and saw an
implementation for `SliceIndex<str>`. I was surprised to see this and
went to read the [`String`][0] documentation and, to me, it seemed to
say (at least) three things:
1. you cannot index into a `String`
2. indexing into a `String` could not be constant-time
3. indexing into a `String` does not have an obvious return type
I absolutely agree with the last point but the first two seemed
contradictory to the documentation around [`SliceIndex<str>`][1]
which mention:
1. you can do substring slicing (which is probably different than
"indexing" but, because the method is called `index` and I associate
anything with square brackets with "indexing" it was enough to
confuse me)
2. substring slicing is constant-time (this may be algorithmic ignorance
on my part but if `&s[i..i+1]` is O(1) then it seems confusing that
`&s[i]` _could not possibly_ be O(1))
So I was hoping to clarify a couple things and, hopefully, in this PR
review learn a little more about the nuances here that confused me in
the first place.
Mark Lodato [Thu, 10 Mar 2022 02:43:04 +0000 (21:43 -0500)]
Rework String UTF-8 Documentation
**This Commit**
Adds some clarity around indexing into Strings and the constraints
driving various decisions there.
**Why?**
The [`String` documentation][0] mentions how `String`s can't be indexed
but `Range` has an implementation for `SliceIndex<str>`. This can be
confusing. There are also several statements to explain the lack of
`String` indexing:
- the inability to index into a `String` is an implication of UTF-8
encoding
- indexing into a `String` could not be constant-time with UTF-8
encoding
- indexing into a `String` does not have an obvious return type
This last statement made sense but the first two seemed contradictory to
the documentation around [`SliceIndex<str>`][1] which mention:
- one can index into a `String` with a `Range` (also called substring
slicing but it uses the same syntax and the method name is `index`)
- `Range` indexing into a `String` is constant-time
To resolve this seeming contradiction the documentation is reworked to
more clearly explain what factors drive the decision to disallow
indexing into a `String` with a single number.
Auto merge of #95697 - klensy:no-strings, r=petrochenkov
refactor: simplify few string related interactions
Few small optimizations:
check_doc_keyword: don't alloc string for emptiness check
check_doc_alias_value: get argument as Symbol to prevent needless string convertions
check_doc_attrs: don't alloc vec, iterate over slice.
replace as_str() check with symbol check
get_single_str_from_tts: don't prealloc string
trivial string to str replace
LifetimeScopeForPath::NonElided use Vec<Symbol> instead of Vec<String>
AssertModuleSource use FxHashSet<Symbol> instead of BTreeSet<String>
CrateInfo.crate_name replace FxHashMap<CrateNum, String> with FxHashMap<CrateNum, Symbol>
It was used for deduplicating some errors for legacy code which are mostly deduplicated even without that, but at cost of global mutable state, which is not a good tradeoff.