bors [Fri, 9 Dec 2016 15:45:41 +0000 (15:45 +0000)]
Auto merge of #38197 - mneumann:dragonfly-fixes-2016-12-06, r=alexcrichton
Fix current_exe() on DragonFly (again)
This is a follow up on [this pull request][1].
Since DragonFly 4.6.1 ([this commit][2]), the ```kern.proc.pathname```
sysctl works correctly, i.e. it does not return paths including a ```:```
(see [here][3]). Use it and don't try to fix old versions of DragonFly!
There are not many DragonFly installations out there that we can't
control and no one is compiling Rust from source. If someone wants to
run Rust on a pre-4.6.1 DragonFly system, the ports system should
provide a patch.
bors [Fri, 9 Dec 2016 10:00:25 +0000 (10:00 +0000)]
Auto merge of #38192 - stjepang:faster-sort-algorithm, r=bluss
Implement a faster sort algorithm
Hi everyone, this is my first PR.
I've made some changes to the standard sort algorithm, starting out with a few tweaks here and there, but in the end this endeavour became a complete rewrite of it.
#### Summary
Changes:
* Improved performance, especially on partially sorted inputs.
* Performs less comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.
First, let me just do a quick brain dump to discuss what I learned along the way.
The official documentation says that the standard sort in Rust is a stable sort. This constraint is thus set in stone and immediately rules out many popular sorting algorithms. Essentially, the only algorithms we might even take into consideration are:
1. [Merge sort](https://en.wikipedia.org/wiki/Merge_sort)
2. [Block sort](https://en.wikipedia.org/wiki/Block_sort) (famous implementations are [WikiSort](https://github.com/BonzaiThePenguin/WikiSort) and [GrailSort](https://github.com/Mrrl/GrailSort))
3. [TimSort](https://en.wikipedia.org/wiki/Timsort)
Actually, all of those are just merge sort flavors. :) The current standard sort in Rust is a simple iterative merge sort. It has three problems. First, it's slow on partially sorted inputs (even though #29675 helped quite a bit). Second, it always makes around `log(n)` iterations copying the entire array between buffers, no matter what. Third, it allocates huge amounts of temporary memory (a buffer of size `2*n`, where `n` is the size of input).
The problem of auxilliary memory allocation is a tough one. Ideally, it would be best for our sort to allocate `O(1)` additional memory. This is what block sort (and it's variants) does. However, it's often very complicated (look at [this](https://github.com/BonzaiThePenguin/WikiSort/blob/master/WikiSort.cpp)) and even then performs rather poorly. The author of WikiSort claims good performance, but that must be taken with a grain of salt. It performs well in comparison to `std::stable_sort` in C++. It can even beat `std::sort` on partially sorted inputs, but on random inputs it's always far worse. My rule of thumb is: high performance, low memory overhead, stability - choose two.
TimSort is another option. It allocates a buffer of size `n/2`, which is not great, but acceptable. Performs extremelly well on partially sorted inputs. However, it seems pretty much all implementations suck on random inputs. I benchmarked implementations in [Rust](https://github.com/notriddle/rust-timsort), [C++](https://github.com/gfx/cpp-TimSort), and [D](https://github.com/dlang/phobos/blob/fd518eb310a9494cccf28c54892542b052c49669/std/algorithm/sorting.d#L2062). The results were a bit disappointing. It seems bad performance is due to complex galloping procedures in hot loops. Galloping noticeably improves performance on partially sorted inputs, but worsens it on random ones.
#### The new algorithm
Choosing the best algorithm is not easy. Plain merge sort is bad on partially sorted inputs. TimSort is bad on random inputs and block sort is even worse. However, if we take the main ideas from TimSort (intelligent merging strategy of sorted runs) and drop galloping, then we'll have great performance on random inputs and it won't be bad on partially sorted inputs either.
That is exactly what this new algorithm does. I can't call it TimSort, since it steals just a few of it's ideas. Complete TimSort would be a much more complex and elaborate implementation. In case we in the future figure out how to incorporate more of it's ideas into this implementation without crippling performance on random inputs, it's going to be very easy to extend. I also did several other minor improvements, like reworked insertion sort to make it faster.
There are also new, more thorough benchmarks and panic safety tests.
The final code is not terribly complex and has less unsafe code than I anticipated, but there's still plenty of it that should be carefully reviewed. I did my best at documenting non-obvious code.
I'd like to notify several people of this PR, since they might be interested and have useful insights:
1. @huonw because he wrote the [original merge sort](https://github.com/rust-lang/rust/pull/11064).
2. @alexcrichton because he was involved in multiple discussions of it.
3. @veddan because he wrote [introsort](https://github.com/veddan/rust-introsort) in Rust.
4. @notriddle because he wrote [TimSort](https://github.com/notriddle/rust-timsort) in Rust.
5. @bluss because he had an attempt at writing WikiSort in Rust.
6. @gnzlbg, @rkruppe, and @mark-i-m because they were involved in discussion #36318.
**P.S.** [quickersort](https://github.com/notriddle/quickersort) describes itself as being universally [faster](https://github.com/notriddle/quickersort/blob/master/perf.txt) than the standard sort, which is true. However, if this PR gets merged, things might [change](https://gist.github.com/stjepang/b9f0c3eaa0e1f1280b61b963dae19a30) a bit. ;)
bors [Fri, 9 Dec 2016 07:08:29 +0000 (07:08 +0000)]
Auto merge of #38256 - alexcrichton:distcheck, r=brson
rustbuild: Implement distcheck
This commit implements the `distcheck` target for rustbuild which is only ever
run on our nightly bots. This essentially just creates a tarball, un-tars it,
and then runs a full build, validating that the release tarballs do indeed have
everything they need to build Rust.
Alex Crichton [Fri, 9 Dec 2016 01:13:55 +0000 (17:13 -0800)]
rustbuild: Implement distcheck
This commit implements the `distcheck` target for rustbuild which is only ever
run on our nightly bots. This essentially just creates a tarball, un-tars it,
and then runs a full build, validating that the release tarballs do indeed have
everything they need to build Rust.
Stjepan Glavina [Thu, 8 Dec 2016 21:37:36 +0000 (22:37 +0100)]
Inline nested fn collapse
Since merge_sort is generic and collapse isn't, that means calls to
collapse won't be inlined. inlined. Therefore, we must stick an
`#[inline]` above `fn collapse`.
bors [Thu, 8 Dec 2016 07:05:19 +0000 (07:05 +0000)]
Auto merge of #38146 - kali:master, r=alexcrichton
fix objc ABI in std::env::args
iOS use different calling convention for `objc_msgSend` depending on the platform. armv7 expect good old variadic arguments, but aarch64 wants "normal" convention: `objc_msgSend` has to be called mimicking the actual callee prototype.
This currently breaks std::env:args() on aarch64 iOS devices. As far as I can tell, in the standard library, this is the only occurrence of ObjectiveC dispatching.
bors [Thu, 8 Dec 2016 03:03:51 +0000 (03:03 +0000)]
Auto merge of #38076 - alexcrichton:another-rustbuild-bug, r=japaric
rustbuild: Use src/rustc for assembled compilers
The `src/rustc` path is intended for assembling a compiler (e.g. the bare bones)
not actually compiling the whole compiler itself. This path was accidentally
getting hijacked to represent the whole compiler being compiled, so let's
redirect that elsewhere for that particular cargo project.
bors [Wed, 7 Dec 2016 23:06:10 +0000 (23:06 +0000)]
Auto merge of #38191 - oli-obk:clippy_is_sad, r=eddyb
annotate stricter lifetimes on LateLintPass methods to allow them to forward to a Visitor
this unblocks clippy (rustup blocked after #37918)
clippy has lots of lints that internally call an `intravisit::Visitor`, but the current lifetimes on `LateLintPass` methods conflicted with the required lifetimes (there was no connection between the HIR elements and the `TyCtxt`)
Stjepan Glavina [Tue, 6 Dec 2016 11:05:16 +0000 (12:05 +0100)]
Implement a faster sort algorithm
This is a complete rewrite of the standard sort algorithm. The new algorithm
is a simplified variant of TimSort. In summary, the changes are:
* Improved performance, especially on partially sorted inputs.
* Performs less comparisons on both random and partially sorted inputs.
* Decreased the size of temporary memory: the new sort allocates 4x less.
Guillaume Gomez [Wed, 7 Dec 2016 18:42:52 +0000 (10:42 -0800)]
Rollup merge of #38225 - Cobrand:patch-1, r=GuillaumeGomez
Update book/ffi to use catch_unwind
r? @GuillaumeGomez
The doc mentioned to spawn a new thread instead of using catch_unwind, which has been the recommended way to catch panics for foreign function interfaces for a few releases now.
Guillaume Gomez [Wed, 7 Dec 2016 18:42:52 +0000 (10:42 -0800)]
Rollup merge of #38163 - durka:patch-33, r=bluss
reference: fix definition of :tt
The reference says that $x:tt matches "either side of the `=>` in macro_rules` which is technically true but completely uninformative. This changes that bullet point to what the book says (a single token or sequence of token trees inside brackets).
Cobrand [Wed, 7 Dec 2016 17:43:07 +0000 (18:43 +0100)]
Update book/ffi to use catch_unwind
r? @GuillaumeGomez
The doc mentioned to spawn a new thread instead of using catch_unwind, which has been the recommended way to catch panics for foreign function interfaces for a few releases now.
bors [Wed, 7 Dec 2016 12:09:11 +0000 (12:09 +0000)]
Auto merge of #37817 - alexcrichton:rustbuild-default, r=brson
mk: Switch rustbuild to the default build system
This commit switches the default build system for Rust from the makefiles to
rustbuild. The rustbuild build system has been in development for almost a year
now and has become quite mature over time. This commit is an implementation of
the proposal on [internals] which slates deletion of the makefiles on
2017-02-02.
This commit also updates various documentation in `README.md`,
`CONTRIBUTING.md`, `src/bootstrap/README.md`, and throughout the source code of
rustbuild itself.
Alex Crichton [Wed, 16 Nov 2016 20:31:19 +0000 (12:31 -0800)]
mk: Switch rustbuild to the default build system
This commit switches the default build system for Rust from the makefiles to
rustbuild. The rustbuild build system has been in development for almost a year
now and has become quite mature over time. This commit is an implementation of
the proposal on [internals] which slates deletion of the makefiles on
2016-01-02.
This commit also updates various documentation in `README.md`,
`CONTRIBUTING.md`, `src/bootstrap/README.md`, and throughout the source code of
rustbuild itself.
bors [Wed, 7 Dec 2016 07:15:31 +0000 (07:15 +0000)]
Auto merge of #38149 - bluss:is-empty, r=alexcrichton
Forward more ExactSizeIterator methods and `is_empty` edits
- Forward ExactSizeIterator methods in more places, like `&mut I` and `Box<I>` iterator impls.
- Improve `VecDeque::is_empty` itself (see commit 4)
- All the collections iterators now have `len` or `is_empty` forwarded if doing so is a benefit. In the remaining cases, they already use a simple size hint (using something like a stored `usize` value), which is sufficient for the default implementation of len and is_empty.
bors [Wed, 7 Dec 2016 00:30:25 +0000 (00:30 +0000)]
Auto merge of #38134 - bluss:iter-nth, r=aturon
Remove Self: Sized from Iterator::nth
It is an unnecessary restriction; nth neither needs self to be sized
nor needs to be exempted from the trait object.
It increases the utility of the nth method, because type specific
implementations are available through `&mut I` or through an iterator
trait object.
It is a backwards compatible change due to the special cases of the
`where Self: Sized` bound; it was already optional to include this bound
in `Iterator` implementations.
bors [Tue, 6 Dec 2016 21:05:31 +0000 (21:05 +0000)]
Auto merge of #38017 - arthurprs:hm-extend, r=bluss
Smarter HashMap/HashSet pre-allocation for extend/from_iter
HashMap/HashSet from_iter and extend are making totally different assumptions.
A more balanced decision may allocate half the lower hint (rounding up). For "well defined" iterators this effectively limits the worst case to two resizes (the initial reserve + one resize).
Michael Neumann [Tue, 6 Dec 2016 18:31:48 +0000 (19:31 +0100)]
Fix current_exe() on DragonFly (again)
This is a follow up on [this pull request][1].
Since DragonFly 4.6.1 ([this commit][2]), the "kern.proc.pathname"
sysctl works correctly, i.e. it does not return paths including a ":"
(see [here][3]). Use it and don't try to fix old versions of DragonFly!
There are not many DragonFly installations out there that we can't
control and no one is compiling Rust from source. If someone wants to
run Rust on a pre-4.6.1 DragonFly system, the ports system should
provide a patch.
bors [Tue, 6 Dec 2016 17:38:26 +0000 (17:38 +0000)]
Auto merge of #38036 - Mark-Simulacrum:polish-2, r=nagisa,eddyb
Simplify calling find_implied_output_region.
@nnethercote added the optimization that find_implied_output_region
takes a closure as an optimization in #37014, but passing an iterator is
simpler, and more ergonomic for callers.
bors [Tue, 6 Dec 2016 14:16:49 +0000 (14:16 +0000)]
Auto merge of #37994 - upsuper:msvc-link-opt, r=alexcrichton
Don't apply msvc link opts for non-opt build
`/OPT:REF,ICF` sometimes takes lots of time. It makes no sense to apply them when doing debug build. MSVC's linker by default disables these optimizations when `/DEBUG` is specified, unless they are explicitly passed.
Mark-Simulacrum [Sun, 27 Nov 2016 16:52:44 +0000 (09:52 -0700)]
Simplify calling find_implied_output_region.
@nnethercote added the optimization that find_implied_output_region
takes a closure as an optimization in #37014, but passing an iterator is
simpler, and more ergonomic for callers.
bors [Tue, 6 Dec 2016 07:35:21 +0000 (07:35 +0000)]
Auto merge of #38097 - Mark-Simulacrum:fn-sig-slice, r=eddyb
Refactor ty::FnSig to contain a &'tcx Slice<Ty<'tcx>>
We refactor this in order to achieve the following wins:
- Decrease the size of `FnSig` (`Vec` + `bool`: 32, `&Slice` + `bool`: 24).
- Potentially decrease total allocated memory due to arena-allocating `FnSig` inputs/output; since they are allocated in the type list arena, other users of type lists can reuse the same allocation for an equivalent type list.
- Remove the last part of the type system which needs drop glue (#37965 removed the other remaining part). This makes arenas containing `FnSig` faster to drop (since we don't need to drop a Vec for each one), and makes reusing them without clearing/dropping potentially possible.
bors [Tue, 6 Dec 2016 00:17:24 +0000 (00:17 +0000)]
Auto merge of #38121 - jonathandturner:better_e0061, r=nikomatsakis
Point arg num mismatch errors back to their definition
This PR updates the arg num errors (like E0061) to point back at the function definition where they were defined.
Before:
```
error[E0061]: this function takes 2 parameters but 1 parameter was supplied
--> E0061.rs:18:7
|
18 | f(0);
| ^
|
= note: the following parameter types were expected:
= note: u16, &str
```
Now:
```
error[E0061]: this function takes 2 parameters but 1 parameter was supplied
--> E0061.rs:18:7
|
11 | fn f(a: u16, b: &str) {}
| ------------------------ defined here
...
18 | f(0);
| ^ expected 2 parameters
```
This is an incremental improvement. We probably want to underline only the function name and also have support for functions defined in crates outside of the current crate.
This adds a CommandExt trait for Windows along with an implementation of it
for std::process::Command with methods to set the process creation flags that
are passed to CreateProcess.
bors [Sun, 4 Dec 2016 23:36:50 +0000 (23:36 +0000)]
Auto merge of #38092 - pnkfelix:mir-stats, r=nikomatsakis
Adds `-Z mir-stats`, which is similar to `-Z hir-stats`.
Adds `-Z mir-stats`, which is similar to `-Z hir-stats`.
Some notes:
* This code attempts to present the breakdown of each variant for
every enum in the MIR. This is meant to guide decisions about how to
revise representations e.g. when to box payloads for rare variants
to shrink the size of the enum overall.
* I left out the "Total:" line that hir-stats presents, because this
implementation uses the MIR Visitor infrastructure, and the memory
usage of structures directly embedded in other structures (e.g. the
`func: Operand` in a `TerminatorKind:Call`) is not distinguished
from similar structures allocated in a `Vec` (e.g. the `args:
Vec<Operand>` in a `TerminatorKind::Call`). This means that a naive
summation of all the accumulated sizes is misleading, because it
will double-count the contribution of the `Operand` of the `func` as
well as the size of the whole `TerminatorKind`.
* I did consider abandoning the MIR Visitor and instead hand-coding
a traversal that distinguished embedded storage from indirect
storage. But such code would be fragile; better to just require
people to take care when interpreting the presented results.
* This traverses the `mir.promoted` rvalues to capture stats for MIR
stored there, even though the MIR visitor super_mir method does not
do so. (I did not observe any promoted mir being newly traversed when
compiling the rustc crate, however.)
* It might be nice to try to unify this code with hir-stats. Then
again, the reporting portion is the only common code (I think), and
it is small compared to the visitors in hir-stats and mir-stats.
Alex Burka [Sun, 4 Dec 2016 20:56:51 +0000 (15:56 -0500)]
reference: fix definition of :tt
The reference says that $x:tt matches "either side of the `=>` in macro_rules` which is technically true but completely uninformative. This changes that bullet point to what the book says (a single token or sequence of token trees inside brackets).
Ulrik Sverdrup [Sat, 3 Dec 2016 21:11:06 +0000 (22:11 +0100)]
collections: Simplify VecDeque::is_empty
Improve is_empty on the VecDeque and its iterators by just comparing
tail and head; this saves a few instructions (to be able to remove the
`& (size - 1)` computation, it would have to know that size is a power of two).
bors [Sun, 4 Dec 2016 06:38:38 +0000 (06:38 +0000)]
Auto merge of #37920 - nikomatsakis:compile-time-regression-37864, r=mw
in region, treat current (and future) item-likes alike
The `visit_fn` code mutates its surrounding context. Between *items*,
this was saved/restored, but between impl items it was not. This meant
that we wound up with `CallSiteScope` entries with two parents (or
more!). As far as I can tell, this is harmless in actual type-checking,
since the regions you interact with are always from at most one of those
branches. But it can slow things down.
Before, the effect was limited, since it only applied to impl items
within an impl. After #37660, impl items are visisted all together at
the end, and hence this could create a very messed up
hierarchy. Isolating impl item properly solves both issues.
I cannot come up with a way to unit-test this; for posterity, however,
you can observe the messed up hierarchies with a test as simple as the
following, which would create a callsite scope with two parents both
before and after
```
struct Foo {
}
impl Foo {
fn bar(&self) -> usize {
22
}
fn baz(&self) -> usize {
22
}
}
fn main() { }
```
Fixes #37864.
r? @michaelwoerister
cc @pnkfelix -- can you think of a way to make a regr test?
Corey Farwell [Sat, 3 Dec 2016 20:39:53 +0000 (15:39 -0500)]
Rollup merge of #38113 - nikomatsakis:incremental-dump-hash, r=michaelwoerister
add a `-Z incremental-dump-hash` flag
This causes us to dump a bunch of has information to stdout that can be
useful in tracking down incremental compilation invalidations,
particularly across crates.