auto merge of #15336 : jakub-/rust/diagnostics, r=brson
This is a continuation of @brson's work from https://github.com/rust-lang/rust/pull/12144.
This implements the minimal scaffolding that allows mapping diagnostic messages to alpha-numeric codes, which could improve the searchability of errors. In addition, there's a new compiler option, `--explain {code}` which takes an error code and prints out a somewhat detailed explanation of the error. Example:
```shell
[~/rust]$ ./build/x86_64-apple-darwin/stage2/bin/rustc ./diagnostics.rs --crate-type dylib
diagnostics.rs:5:3: 5:13 error: unreachable pattern [E0001] (pass `--explain E0001` to see a detailed explanation)
diagnostics.rs:5 Some(true) => ()
^~~~~~~~~~
error: aborting due to previous error
[~/rust]$ ./build/x86_64-apple-darwin/stage2/bin/rustc --explain E0001
This error suggests that the expression arm corresponding to the noted pattern
will never be reached as for all possible values of the expression being matched,
one of the preceeding patterns will match.
This means that perhaps some of the preceeding patterns are too general, this
one is too specific or the ordering is incorrect.
```
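For illustration, here is a minimal sketch of the kind of code that produces an unreachable-pattern diagnostic. The original `diagnostics.rs` is not shown, so the function `classify` and its arms are hypothetical. Present-day compilers report this situation as a warning from the `unreachable_patterns` lint rather than as a hard error, which is why the sketch silences it explicitly.

```rust
/// Classify an optional flag. The second arm can never be taken.
#[allow(unreachable_patterns)] // silence the lint so the sketch builds cleanly
fn classify(flag: Option<bool>) -> &'static str {
    match flag {
        // This arm is too general: it matches every `Some` value.
        Some(_) => "some",
        // Unreachable: `Some(_)` above already covers `Some(true)`.
        Some(true) => "some(true)",
        None => "none",
    }
}

fn main() {
    println!("{}", classify(Some(true)));
    println!("{}", classify(None));
}
```

Moving the `Some(true)` arm above `Some(_)` makes it reachable, which matches the ordering hint in the explanation text.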
I've refrained from migrating many errors to actually use the new macros as it can be done in an incremental fashion but if we're happy with the approach, it'd be good to do all of them sooner rather than later.
Originally, I was going to make libdiagnostics a separate crate but that's posing some interesting challenges with semi-circular dependencies. In particular, librustc would have a plugin-phase dependency on libdiagnostics, which itself depends on librustc. Per my conversation with @alexcrichton, it seems like the snapshotting process would also have to change. So for now the relevant modules from libdiagnostics are included using `#[path = ...] mod`.
auto merge of #15353 : aturon/rust/env-hashmap, r=alexcrichton
This commit adds `env_insert` and `env_remove` methods to the `Command`
builder, easing updates to the environment variables for the child
process. The existing method, `env`, is still available for overriding
the entire environment in one shot (after which the `env_insert` and
`env_remove` methods can be used to make further adjustments).
To support these new methods, the internal `env` representation for
`Command` has been changed to an optional `HashMap` holding owned
`CString`s (to support non-utf8 data). The `HashMap` is only
materialized if the environment is updated. The implementation does not
try hard to avoid allocation, since the cost of launching a process will
dwarf any allocation cost.
This patch also adds `PartialOrd`, `Eq`, and `Hash` implementations for
`CString`.
This commit changes the `io::process::Command` API to provide
fine-grained control over the environment:
* The `env` method now inserts/updates a key/value pair.
* The `env_remove` method removes a key from the environment.
* The old `env` method, which sets the entire environment in one shot,
is renamed to `env_set_all`. It can be used in conjunction with the
finer-grained methods. This renaming is a breaking change.
To support these new methods, the internal `env` representation for
`Command` has been changed to an optional `HashMap` holding owned
`CString`s (to support non-utf8 data). The `HashMap` is only
materialized if the environment is updated. The implementation does not
try hard to avoid allocation, since the cost of launching a process will
dwarf any allocation cost.
This patch also adds `PartialOrd`, `Eq`, and `Hash` implementations for
`CString`.
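A rough sketch of the fine-grained methods, written against today's `std::process::Command` rather than the `io::process::Command` of this commit. It assumes the present-day API, where `env` and `env_remove` behave as described above and `get_envs` exposes the recorded changes; `env_set_all` is not part of the current API. The program name `printenv` and the helper names are placeholders, and nothing is spawned.

```rust
use std::process::Command;

/// Build a command with one variable inserted and one removed.
fn build_command() -> Command {
    let mut cmd = Command::new("printenv"); // placeholder child program
    cmd.env("GREETING", "hello") // insert or update a key/value pair
        .env_remove("HOME"); // remove a key from the child's environment
    cmd
}

/// Collect the explicit changes recorded on the builder.
/// `Some(value)` means "set", `None` means "remove".
fn env_changes(cmd: &Command) -> Vec<(String, Option<String>)> {
    cmd.get_envs()
        .map(|(key, value)| {
            (
                key.to_string_lossy().into_owned(),
                value.map(|v| v.to_string_lossy().into_owned()),
            )
        })
        .collect()
}

fn main() {
    let cmd = build_command();
    for (key, value) in env_changes(&cmd) {
        println!("{} => {:?}", key, value);
    }
}
```

The builder only records the differences from the inherited environment, which is in the same spirit as materializing the map lazily.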
auto merge of #15561 : huonw/rust/must-use-iterators, r=alexcrichton
Similar to the stability attributes, a type annotated with `#[must_use =
"informative snippet"]` will print the normal warning message along with
"informative snippet". This allows the type author to provide some
guidance about why the type should be used.
---
It can be a little unintuitive that something like `v.iter().map(|x|
println!("{}", x));` does nothing: the majority of the iterator adaptors
are lazy and do not execute anything until something calls `next`, e.g.
a `for` loop, `collect`, `fold`, etc.
The majority of such errors can be seen by someone writing something
like the above, i.e. just calling an iterator adaptor and doing nothing
with it (and doing this is certainly useless), so we can co-opt the
`must_use` lint, using the message functionality to give a hint to the
reason why.
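A small sketch of both points, in current Rust syntax. The `Job` type, `make_job`, and `closure_calls` are hypothetical names used only for illustration. The first part attaches a message to `#[must_use]`; the second counts closure invocations to show that `map` runs nothing until the adaptor is consumed.

```rust
use std::cell::Cell;

// The message is printed alongside the usual "unused" warning.
#[must_use = "a `Job` does nothing until `run` is called"]
struct Job {
    input: i32,
}

impl Job {
    fn run(self) -> i32 {
        self.input * 2
    }
}

fn make_job(input: i32) -> Job {
    Job { input }
}

/// Return how many times the closure ran before and after consuming the adaptor.
fn closure_calls(values: &[i32]) -> (usize, usize) {
    let calls = Cell::new(0);
    // Building the adaptor is lazy: the closure has not run yet.
    let adaptor = values.iter().map(|x| {
        calls.set(calls.get() + 1);
        *x
    });
    let before = calls.get();
    // Consuming the adaptor (here with `sum`) drives the closure.
    let _total: i32 = adaptor.sum();
    (before, calls.get())
}

fn main() {
    // Writing `make_job(21);` and discarding the result would trigger the
    // `unused_must_use` warning together with the message above.
    println!("{}", make_job(21).run());
    println!("{:?}", closure_calls(&[1, 2, 3]));
}
```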
auto merge of #15550 : alexcrichton/rust/install-script, r=brson
This adds detection of the relevant LD_LIBRARY_PATH-like environment variable
and appropriately sets it when testing whether binaries can run or not.
Additionally, the installation prints a recommended value if one is necessary.
Huon Wilson [Wed, 9 Jul 2014 12:13:14 +0000 (22:13 +1000)]
core: add `#[must_use]` attributes to iterator adaptor structs.
It can be a little unintuitive that something like `v.iter().map(|x|
println!("{}", x));` does nothing: the majority of the iterator adaptors
are lazy and do not execute anything until something calls `next`, e.g.
a `for` loop, `collect`, `fold`, etc.
The majority of such errors can be seen by someone writing something
like the above, i.e. just calling an iterator adaptor and doing nothing
with it (and doing this is certainly useless), so we can co-opt the
`must_use` lint, using the message functionality to give a hint to the
reason why.
Huon Wilson [Wed, 9 Jul 2014 12:02:19 +0000 (22:02 +1000)]
lint: extend `#[must_use]` to handle a message.
Similar to the stability attributes, a type annotated with `#[must_use =
"informative snippet"]` will print the normal warning message along with
"informative snippet". This allows the type author to provide some
guidance about why the type should be used.
auto merge of #15471 : erickt/rust/push_all, r=acrichto
LLVM is currently not able to convert `Vec::extend` into a memcpy for `Copy` types, which results in methods like `Vec::push_all` running twice as slow as they should. This patch reuses the unsafe `Vec::clone` optimization to speed up all the operations that clone a slice into a `Vec`.
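For readers on current Rust: `Vec::push_all` was later superseded by `Vec::extend_from_slice`, which serves the same purpose of appending a slice to a vector. The sketch below only shows the call shape and says nothing about how the standard library implements it; `append_bytes` is a hypothetical helper.

```rust
/// Append a slice of `Copy` values to an existing vector.
fn append_bytes(dst: &mut Vec<u8>, src: &[u8]) {
    // Reserves once and copies the whole slice, instead of pushing
    // elements one at a time.
    dst.extend_from_slice(src);
}

fn main() {
    let mut buffer = vec![1u8, 2];
    append_bytes(&mut buffer, &[3, 4, 5]);
    println!("{:?}", buffer);
}
```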
auto merge of #15283 : kwantam/rust/master, r=alexcrichton
Add libunicode; move unicode functions from core
- created new crate, libunicode, below libstd
- split `Char` trait into `Char` (libcore) and `UnicodeChar` (libunicode)
- Unicode-aware functions now live in libunicode
- `is_alphabetic`, `is_XID_start`, `is_XID_continue`, `is_lowercase`,
`is_uppercase`, `is_whitespace`, `is_alphanumeric`, `is_control`, `is_digit`,
`to_uppercase`, `to_lowercase`
- added `width` method in UnicodeChar trait
- determines printed width of character in columns, or None if it is a non-NULL control character
- takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise)
- split `StrSlice` into `StrSlice` (libcore) and `UnicodeStrSlice` (libunicode)
- functionality formerly in `StrSlice` that relied upon Unicode functionality from `Char` is now in `UnicodeStrSlice`
- `words`, `is_whitespace`, `is_alphanumeric`, `trim`, `trim_left`, `trim_right`
- also moved `Words` type alias into libunicode because `words` method is in `UnicodeStrSlice`
- unified Unicode tables from libcollections, libcore, and libregex into libunicode
- updated `unicode.py` in `src/etc` to generate aforementioned tables
- generated new tables based on latest Unicode data
- added `UnicodeChar` and `UnicodeStrSlice` traits to prelude
- libunicode is now the collection point for the `std::char` module, combining the libunicode functionality with the `Char` functionality from libcore
- thus, moved doc comment for `char` from `core::char` to `unicode::char`
- libcollections remains the collection point for `std::str`
The Unicode-aware functions that previously lived in the `Char` and `StrSlice` traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and `use` the `UnicodeChar` and/or `UnicodeStrSlice` traits:
    extern crate unicode;
    use unicode::UnicodeChar;
    use unicode::UnicodeStrSlice;
    use unicode::Words; // if you want to use the words() method
NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude.
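As a point of comparison, here is a sketch of the same Unicode-aware operations in current Rust, where they are inherent methods on `char` and `str` and need no trait import. It assumes the present-day names: `words` is now spelled `split_whitespace`, and the `width` method is not part of the standard library. The helper names are hypothetical.

```rust
/// Split on Unicode whitespace and uppercase every word.
fn shout_words(text: &str) -> Vec<String> {
    text.split_whitespace()
        // `to_uppercase` yields an iterator because one character can
        // map to several characters.
        .map(|word| word.chars().flat_map(|c| c.to_uppercase()).collect())
        .collect()
}

/// Count characters with the Unicode `Alphabetic` property.
fn count_alphabetic(text: &str) -> usize {
    text.chars().filter(|c| c.is_alphabetic()).count()
}

fn main() {
    println!("{:?}", shout_words("  héllo   wörld "));
    println!("{}", count_alphabetic("a1 b2"));
}
```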
auto merge of #15220 : vhbit/rust/treemap-str-equiv, r=alexcrichton
- allows lookup using any str-equiv object, making TreeMaps finally usable (for example, it is much easier to work with JSON when the lookup values are static strs)
- provides a fairly flexible solution that could be extended to other equivalent types (although it might not be very performant)
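To show the kind of lookup this enables, here is a sketch in current Rust. It uses `BTreeMap` as a stand-in for `TreeMap`, and relies on the `Borrow` trait, which is how today's standard library lets a map with `String` keys be queried with a `&str`. It is not the `Equiv`-based mechanism of this patch, and the helper names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Look up an owned-string key using a borrowed string slice.
fn lookup(map: &BTreeMap<String, i32>, key: &str) -> Option<i32> {
    // No `String` is allocated for the query: `String: Borrow<str>`.
    map.get(key).copied()
}

fn sample_map() -> BTreeMap<String, i32> {
    let mut map = BTreeMap::new();
    map.insert("width".to_string(), 80);
    map.insert("height".to_string(), 24);
    map
}

fn main() {
    let map = sample_map();
    println!("{:?}", lookup(&map, "width"));
    println!("{:?}", lookup(&map, "depth"));
}
```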
Alex Crichton [Wed, 9 Jul 2014 14:44:49 +0000 (07:44 -0700)]
etc: Fix install script for rpath removal
This adds detection of the relevant LD_LIBRARY_PATH-like environment variable
and appropriately sets it when testing whether binaries can run or not.
Additionally, the installation prints a recommended value if one is necessary.
- unicode tests live in coretest crate
- libcollections str tests need UnicodeChar trait.
- libregex perlw tests were checking a char in the Alphabetic category,
\x2161. Confirmed perl 5.18 considers this a \w character. Changed to
\x2961, which is not \w as the test expects.
auto merge of #15540 : Gankro/rust/master, r=huonw
Remove recursion from the TreeMap implementation, because we don't have TCO. There is no need to add `O(log n)` extra stack frames to search a tree.
I find it curious that `find_mut` and `find` basically duplicated the same logic, but in different ways (iterative vs recursive), possibly to maneuver around mutability rules, but that's a more fundamental issue to deal with elsewhere.
Thanks to acrichto for the magic trick to appease borrowck (another issue to deal with elsewhere).
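The idea can be sketched on a plain binary search tree. This is not the TreeMap code from the patch: `Node`, `Link`, `find`, and the sample tree are hypothetical, and only the immutable lookup is shown. The loop walks down one link at a time and uses a constant amount of stack.

```rust
use std::cmp::Ordering;

type Link<K, V> = Option<Box<Node<K, V>>>;

struct Node<K, V> {
    key: K,
    value: V,
    left: Link<K, V>,
    right: Link<K, V>,
}

/// Iterative search: no recursion, so no extra stack frames per level.
fn find<'a, K: Ord, V>(mut link: &'a Link<K, V>, key: &K) -> Option<&'a V> {
    while let Some(node) = link {
        match key.cmp(&node.key) {
            Ordering::Less => link = &node.left,
            Ordering::Greater => link = &node.right,
            Ordering::Equal => return Some(&node.value),
        }
    }
    None
}

fn leaf(key: i32, value: &'static str) -> Link<i32, &'static str> {
    Some(Box::new(Node { key, value, left: None, right: None }))
}

/// A three-node tree: 5 at the root, 3 on the left, 8 on the right.
fn sample_tree() -> Link<i32, &'static str> {
    Some(Box::new(Node {
        key: 5,
        value: "five",
        left: leaf(3, "three"),
        right: leaf(8, "eight"),
    }))
}

fn main() {
    let tree = sample_tree();
    println!("{:?}", find(&tree, &8));
    println!("{:?}", find(&tree, &4));
}
```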
auto merge of #15530 : adrientetar/rust/proper-fonts, r=alexcrichton
- Treat WOFF as binary files so that git does not perform newline normalization.
- Replace the corrupt Heuristica files with Source Serif Pro. Its italics are [almost in production](https://github.com/adobe/source-serif-pro/issues/2), so I left Heuristica Italic in place, which pairs well with SSP. Overall, I think Source Serif Pro is a better fit for rustdoc (cc @TheHydroImpulse). This ought to fix #15527.
- Store Source Code Pro locally in order to make offline docs freestanding. Fixes #14778.
lexer: lex WS/COMMENT/SHEBANG rather than skipping
Now, the lexer will categorize every byte in its input according to the
grammar. The parser skips over these while parsing, thus avoiding their
presence in the input to syntax extensions.
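A toy sketch of the approach, unrelated to the real libsyntax types: `Kind`, `Token`, `lex`, and `significant` are hypothetical. The lexer assigns every byte of the input to some token, including whitespace and `//` comments, and a separate step drops those tokens before anything downstream sees them. Words are split on whitespace only, which is far cruder than a real grammar.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Kind {
    Whitespace,
    Comment,
    Word,
}

#[derive(Debug, PartialEq)]
struct Token<'a> {
    kind: Kind,
    text: &'a str,
}

/// Advance from `start` while `pred` holds and return the end offset.
fn scan(bytes: &[u8], start: usize, pred: impl Fn(u8) -> bool) -> usize {
    let mut end = start;
    while end < bytes.len() && pred(bytes[end]) {
        end += 1;
    }
    end
}

/// Categorize every byte of `src`; concatenating the tokens gives `src` back.
fn lex(src: &str) -> Vec<Token<'_>> {
    let bytes = src.as_bytes();
    let mut tokens = Vec::new();
    let mut start = 0;
    while start < bytes.len() {
        let (kind, end) = if bytes[start].is_ascii_whitespace() {
            (Kind::Whitespace, scan(bytes, start, |b| b.is_ascii_whitespace()))
        } else if src[start..].starts_with("//") {
            // A line comment runs up to, but not including, the newline.
            (Kind::Comment, scan(bytes, start, |b| b != b'\n'))
        } else {
            (Kind::Word, scan(bytes, start, |b| !b.is_ascii_whitespace()))
        };
        tokens.push(Token { kind, text: &src[start..end] });
        start = end;
    }
    tokens
}

/// What a parser would see: whitespace and comments skipped.
fn significant<'a>(tokens: &[Token<'a>]) -> Vec<&'a str> {
    tokens
        .iter()
        .filter(|t| t.kind == Kind::Word)
        .map(|t| t.text)
        .collect()
}

fn main() {
    let tokens = lex("fn main // hi\n{}");
    println!("{:?}", tokens);
    println!("{:?}", significant(&tokens));
}
```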
Corey Richardson [Wed, 18 Jun 2014 17:44:20 +0000 (10:44 -0700)]
syntax: don't parse numeric literals in the lexer
This removes a bunch of token types. Tokens now store the original, unaltered
numeric literal (that is still checked for correctness), which is parsed into
an actual number later, as needed, when creating the AST.
This can change how syntax extensions work, but otherwise poses no visible
changes.
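A sketch of the "keep the raw text, parse later" idea. The `IntToken` type and `int_value` function are hypothetical and handle only decimal and `0x` literals with `_` separators; the real lexer and AST construction are considerably more involved.

```rust
/// A token that keeps the literal exactly as written in the source.
struct IntToken<'a> {
    raw: &'a str,
}

/// Convert the raw text into a number only when a value is needed.
fn int_value(token: &IntToken<'_>) -> Option<u64> {
    // Digit separators carry no meaning, so drop them first.
    let digits: String = token.raw.chars().filter(|c| *c != '_').collect();
    match digits.strip_prefix("0x") {
        Some(hex) => u64::from_str_radix(hex, 16).ok(),
        None => digits.parse().ok(),
    }
}

fn main() {
    for raw in ["1_000", "0xff", "12abc"] {
        let token = IntToken { raw };
        println!("{} => {:?}", token.raw, int_value(&token));
    }
}
```

A syntax extension working at the token level would see `1_000` unchanged and could decide for itself how to interpret it.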
syntax: don't process string/char/byte/binary lits
This shuffles things around a bit so that LIT_CHAR and co store an Ident
which is the original, unaltered literal in the source. When creating the AST,
unescape and postprocess them.
This changes how syntax extensions can work, slightly, but otherwise poses no
visible changes. To get a useful value out of one of these tokens, call
`parse::{char_lit, byte_lit, bin_lit, str_lit}`
Corey Richardson [Tue, 17 Jun 2014 06:00:49 +0000 (23:00 -0700)]
syntax: use a better Show impl for Ident
Rather than just dumping the id in the interner, which is useless, actually
print the interned string. Adjust the lexer logging to use Show instead of
Poly.
auto merge of #15537 : jbclements/rust/hygiene-for-methods, r=pcwalton
This patch adds hygiene for methods. This one was more difficult than the others, due principally to issues surrounding `self`. Specifically, there were a whole bunch of places in the code that assumed that a `self` identifier could be discarded and then made up again later, causing the discard of contexts and hygiene breakage.
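For readers unfamiliar with the term, here is a general illustration of hygiene with `macro_rules!` in current Rust. It does not exercise the method or `self` handling this patch is about; `add_ten` and `demo` are hypothetical. The `x` introduced inside the macro and the `x` written by the caller are kept distinct, so neither captures the other.

```rust
macro_rules! add_ten {
    ($e:expr) => {{
        // This `x` belongs to the macro's own syntax context.
        let x = 10;
        $e + x
    }};
}

fn demo() -> i32 {
    let x = 1;
    // The `x` passed in still refers to the caller's binding, so the
    // expansion computes 1 + 10 rather than 10 + 10.
    add_ten!(x)
}

fn main() {
    println!("{}", demo());
}
```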
auto merge of #15374 : steveklabnik/rust/comments, r=brson
I'm leaving off `rustdoc` usage because it won't work unless this is a `pub fn`, and I want to talk about public/private in the context of modules. I'm also not mentioning `//!` because it is exclusively used to provide the overview of a module.
John Clements [Sun, 6 Jul 2014 22:10:57 +0000 (15:10 -0700)]
carry self ident forward through re-parsing
Formerly, the self identifier was being discarded during parsing, which
stymies hygiene. The best fix here seems to be to attach a self identifier
to ExplicitSelf_, a change that rippled through the rest of the compiler,
but without any obvious damage.
John Clements [Mon, 7 Jul 2014 16:54:08 +0000 (09:54 -0700)]
introducing let-syntax
The let-syntax expander is different in that it doesn't apply
a mark to its token trees before expansion. This is used
for macro_rules, and it's because macro_rules is essentially
MTWT's let-syntax. You don't want to mark before expand sees
let-syntax, because there's no "after" syntax to mark again.
In some sense, the cleaner approach might be to introduce a new
AST node that macro_rules expands into; this would make it clearer
that the expansion of a macro is distinct from the addition of a
new macro binding.
auto merge of #14832 : alexcrichton/rust/no-rpath, r=brson
This commit disables rustc's emission of rpath attributes into dynamic libraries
and executables by default. The functionality is still preserved, but it must
now be manually enabled via a `-C rpath` flag.
This involved a few changes to the local build system:
* --disable-rpath is now the default configure option
* Makefiles now prefer our own LD_LIBRARY_PATH over the user's LD_LIBRARY_PATH
in order to support building rust with rust already installed.
* The compiletest program was taught to correctly pass through the aux dir as a
component of LD_LIBRARY_PATH in more situations.
The major impact of this change is that neither rustdoc nor rustc will work
out-of-the-box in all situations because they are dynamically linked. It must be
arranged to ensure that the libraries of a rust installation are part of the
LD_LIBRARY_PATH. The default installation paths for all platforms ensure this,
but if an installation is in a nonstandard location, then configuration may be
necessary.
Additionally, for all developers of rustc, it will no longer be possible to run
$target/stageN/bin/rustc out-of-the-box. The old behavior can be regained
through the `--enable-rpath` option to the configure script.
This change brings linux/mac installations in line with windows installations
where rpath is not possible.
auto merge of #15493 : brson/rust/tostr, r=pcwalton
This updates https://github.com/rust-lang/rust/pull/15075.
Rename `ToStr::to_str` to `ToString::to_string`. The naive renaming ends up with two `to_string` functions defined on strings in the prelude (the other defined via `collections::str::StrAllocating`). To remedy this I removed `StrAllocating::to_string`, making all conversions from `&str` to `String` go through `Show`. This has a measurable impact on the speed of this conversion, but the sense I get from others is that it's best to go ahead and unify `to_string` and address performance for all `to_string` conversions in `core::fmt`. `String::from_str(...)` still works as a manual fast-path.
Note that the patch was done with a script, and ended up renaming a number of other `*_to_str` functions, particularly inside of rustc. All the ones I saw looked correct, and I didn't notice any additional API breakage.
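A sketch of the resulting shape in current Rust, where the formatting trait is named `Display` rather than `Show` and the direct conversion is spelled `String::from`. The `Point` type and helper functions are hypothetical. Any type that implements the formatting trait gets `to_string` for free, while `String::from` remains a direct route for string slices.

```rust
use std::fmt;

struct Point {
    x: i32,
    y: i32,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

/// `to_string` comes from the blanket `ToString` impl over `Display`.
fn point_text(x: i32, y: i32) -> String {
    Point { x, y }.to_string()
}

/// Two ways to turn a `&str` into a `String`.
fn both_conversions(s: &str) -> (String, String) {
    (s.to_string(), String::from(s))
}

fn main() {
    println!("{}", point_text(1, 2));
    println!("{:?}", both_conversions("rust"));
}
```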