// Copyright 2012-2014 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.
//! Finds crate binaries and loads their metadata
//!
//! Might I be the first to welcome you to a world of platform differences,
//! version requirements, dependency graphs, conflicting desires, and fun! This
//! is the major guts (along with metadata::creader) of the compiler for loading
//! crates and resolving dependencies. Let's take a tour!
//!
//! Each invocation of the compiler is immediately concerned with one primary
//! problem: connecting a set of crates to resolved crates on the filesystem.
//! Concretely speaking, the compiler follows roughly these steps to get here:
//!
//! 1. Discover a set of `extern crate` statements.
//! 2. Transform these directives into crate names. If the directive does not
//!    have an explicit name, then the identifier is the name.
//! 3. For each of these crate names, find a corresponding crate on the
//!    filesystem.
//!
//! Sounds easy, right? Let's walk into some of the nuances.
//! ## Transitive Dependencies
//!
//! Let's say we've got three crates: A, B, and C. A depends on B, and B depends
//! on C. When we're compiling A, we primarily need to find and locate B, but we
//! also end up needing to find and locate C.
//!
//! The reason for this is that any of B's types could be composed of C's types,
//! any function in B could return a type from C, etc. To be able to guarantee
//! that we can always typecheck/translate any function, we have to have
//! complete knowledge of the whole ecosystem, not just our immediate
//! dependencies.
//!
//! So now as part of the "find a corresponding crate on the filesystem" step
//! above, this involves also finding all crates for *all upstream
//! dependencies*. This includes all dependencies transitively.
//! ## Rlibs and Dylibs
//!
//! The compiler has two forms of intermediate dependencies. These are dubbed
//! rlibs and dylibs for the static and dynamic variants, respectively. An rlib
//! is a rustc-defined file format (currently just an ar archive) while a dylib
//! is a platform-defined dynamic library. Each library has metadata somewhere
//!
//! When translating a crate name to a crate on the filesystem, we all of a
//! sudden need to take into account both rlibs and dylibs! Linkage later on may
//! use either one of these files, as each has its pros and cons. The job of
//! crate loading is to discover what's possible by finding all candidates.
//!
//! Most parts of this loading system keep the dylib/rlib as just separate
//! We can't exactly scan your whole hard drive when looking for dependencies,
//! so we need places to look. Currently the compiler will implicitly add the
//! target lib search path ($prefix/lib/rustlib/$target/lib) to any compilation,
//! and otherwise all -L flags are added to the search paths.
//! ## What criterion to select on?
//!
//! This is a pretty tricky area of loading crates. Given a file, how do we know
//! whether it's the right crate? Currently, the rules look along these lines:
//!
//! 1. Does the filename match an rlib/dylib pattern? That is to say, does the
//!    filename have the right prefix/suffix?
//! 2. Does the filename have the right prefix for the crate name being queried?
//!    This is filtering for files like `libfoo*.rlib` and such.
//! 3. Is the file an actual rust library? This is done by loading the metadata
//!    from the library and making sure it's actually there.
//! 4. Does the name in the metadata agree with the name of the library?
//! 5. Does the target in the metadata agree with the current target?
//! 6. Does the SVH match? (more on this later)
//!
//! If the file answers `yes` to all these questions, then the file is
//! considered a *candidate* for being accepted. It is illegal to have
//! more than two candidates as the compiler has no method by which to resolve
//! this conflict. Additionally, rlib/dylib candidates are considered
//!
//! After all this has happened, we have one or two files as candidates. These
//! represent the rlib/dylib file found for a library, and they're returned as
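/*!
As a rough illustration of the filename checks in steps 1 and 2 above, here is
a minimal sketch, not the loader's actual code. The helper name
`rlib_hash_part` is hypothetical, and the sketch assumes present-day standard
library string methods. It only inspects the filename; the metadata checks in
steps 3 to 6 would still have to run afterwards.

```rust
/// Hypothetical helper: if `file` looks like `lib<crate_name><middle>.rlib`,
/// returns the `<middle>` part (typically a dash followed by a hash).
///
/// This is a pure prefix/suffix match, so `libfoobar-1.rlib` also passes the
/// check for the crate name `foo`. Rejecting such a file is left to the later
/// metadata checks.
fn rlib_hash_part<'a>(crate_name: &str, file: &'a str) -> Option<&'a str> {
    let prefix = format!("lib{}", crate_name);
    file.strip_prefix(prefix.as_str())?.strip_suffix(".rlib")
}
```

For example, `rlib_hash_part("foo", "libfoo-1a2b3c.rlib")` yields
`Some("-1a2b3c")`, while a file belonging to another crate, or a file without
the `.rlib` suffix, yields `None`.
*/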
//! ### What about versions?
//!
//! A lot of effort has been put forth to remove versioning from the compiler.
//! There have been forays in the past to have versioning baked in, but it was
//! largely always deemed insufficient to the point that it was recognized that
//! it's probably something the compiler shouldn't do anyway due to its
//! complicated nature and the state of the half-baked solutions.
//!
//! With a departure from versioning, the primary criterion for loading crates
//! is just the name of a crate. If we stopped here, it would imply that you
//! could never link two crates of the same name from different sources
//! together, which is clearly a bad state to be in.
//!
//! To resolve this problem, we come to the next section!
//! A number of flags have been added to the compiler to solve the "version
//! problem" in the previous section, as well as generally enabling more
//! powerful usage of the crate loading system of the compiler. The goal of
//! these flags and options is to enable third-party tools to drive the
//! compiler with prior knowledge about how the world should look.
//! ## The `--extern` flag
//!
//! The compiler accepts a flag of this form a number of times:
//!
//! --extern crate-name=path/to/the/crate.rlib
//!
//! This flag is basically the following letter to the compiler:
//!
//! > When you are attempting to load the immediate dependency `crate-name`, I
//! > would like you to assume that the library is located at
//! > `path/to/the/crate.rlib`, and look nowhere else. Also, please do not
//! > assume that the path I specified has the name `crate-name`.
//!
//! This flag basically overrides most matching logic except for validating that
//! the file is indeed a rust library. The same `crate-name` can be specified
//! twice to specify the rlib/dylib pair.
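/*!
To make the shape of this flag concrete, the following sketch groups
`name=path` values by crate name, so that repeating a name collects the
rlib/dylib pair. It is an illustration only: `parse_externs` is a hypothetical
name, the code uses present-day standard library APIs, and the compiler's real
option parsing may differ.

```rust
use std::collections::HashMap;

/// Hypothetical parser for the values passed to `--extern`.
/// Each value must look like `name=path`. Paths given for the same name are
/// collected in the order they appear.
fn parse_externs(values: &[&str]) -> Result<HashMap<String, Vec<String>>, String> {
    let mut externs: HashMap<String, Vec<String>> = HashMap::new();
    for value in values {
        // Split at the first `=` only, so the path itself may contain `=`.
        let (name, path) = value
            .split_once('=')
            .ok_or_else(|| format!("expected name=path, got: {}", value))?;
        externs
            .entry(name.to_string())
            .or_default()
            .push(path.to_string());
    }
    Ok(externs)
}
```

Passing `b=path/to/libb.rlib` and `b=path/to/libb.so` would leave two paths
under the key `b`, one per library flavor.
*/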
//! ## Enabling "multiple versions"
//!
//! This basically boils down to the ability to specify arbitrary packages to
//! the compiler. For example, if crate A wanted to use Bv1 and Bv2, then it
//! would look something like:
//!
//! and the compiler would be invoked as:
//!
//! rustc a.rs --extern b1=path/to/libb1.rlib --extern b2=path/to/libb2.rlib
//!
//! In this scenario there are two crates named `b` and the compiler must be
//! manually driven to be informed where each crate is.
//! ## Frobbing symbols
//!
//! One of the immediate problems with linking the same library together twice
//! in the same program is dealing with duplicate symbols. The primary way to
//! deal with this in rustc is to add hashes to the end of each symbol.
//!
//! In order to force hashes to change between versions of a library, if
//! desired, the compiler exposes an option `-C metadata=foo`, which is used to
//! initially seed each symbol hash. The string `foo` is prepended to each
//! string-to-hash to ensure that symbols change over time.
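/*!
The seeding idea can be sketched as follows. This is not rustc's symbol
mangling: `frob_symbol` is a hypothetical name, and the standard library's
`DefaultHasher` stands in for whatever hash the compiler really uses. The only
point shown is that feeding the metadata string in before the symbol name
makes the resulting suffix depend on both.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical mangler: appends a hash seeded with `metadata` to `symbol`.
fn frob_symbol(metadata: &str, symbol: &str) -> String {
    let mut hasher = DefaultHasher::new();
    // The seed goes in first, mirroring "prepended to each string-to-hash".
    metadata.hash(&mut hasher);
    symbol.hash(&mut hasher);
    format!("{}_{:016x}", symbol, hasher.finish())
}
```

With this sketch, the same symbol built with two different metadata strings
gets two different suffixes, so both copies could coexist in one link.
*/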
//! ## Loading transitive dependencies
//!
//! Dealing with same-named-but-distinct crates is not just a local problem, but
//! one that also needs to be dealt with for transitive dependencies. Note that
//! in the letter above `--extern` flags only apply to the *local* set of
//! dependencies, not the upstream transitive dependencies. Consider this
//! dependency graph:
//!
//! In this scenario, when we compile `D`, we need to be able to distinctly
//! resolve `A.1` and `A.2`, but an `--extern` flag cannot apply to these
//! transitive dependencies.
//!
//! Note that the key idea here is that `B` and `C` are both *already compiled*.
//! That is, they have already resolved their dependencies. Due to unrelated
//! technical reasons, when a library is compiled, it is only compatible with
//! the *exact same* version of the upstream libraries it was compiled against.
//! We use the "Strict Version Hash" to identify the exact copy of an upstream
//!
//! With this knowledge, we know that `B` and `C` will depend on `A` with
//! different SVH values, so we crawl the normal `-L` paths looking for
//! `liba*.rlib` and filter based on the contained SVH.
//!
//! In the end, this ends up not needing `--extern` to specify upstream
//! transitive dependencies.
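/*!
The SVH filtering step can be pictured with the sketch below. The `Candidate`
type and `filter_by_svh` function are hypothetical stand-ins: the real loader
reads the hash out of each library's metadata, whereas here it is simply a
string field.

```rust
/// Hypothetical candidate: a library path plus the SVH found in its metadata.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    path: String,
    svh: String,
}

/// Keeps only the candidates whose SVH equals the one a downstream crate
/// recorded when it was compiled.
fn filter_by_svh(candidates: &[Candidate], wanted: &str) -> Vec<Candidate> {
    candidates
        .iter()
        .filter(|c| c.svh == wanted)
        .cloned()
        .collect()
}
```

If two files for `A` are found on the `-L` paths, filtering with the SVH that
`B` recorded selects one of them and filtering with the SVH that `C` recorded
selects the other, which is why no `--extern` flag is needed here.
*/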
//! That's the general overview of loading crates in the compiler, but it's by
//! no means all of the necessary details. Take a look at the rest of
//! metadata::loader or metadata::creader for all the juicy details!
use back::archive::{METADATA_FILENAME};
use session::Session;
use session::search_paths::PathKind;
use llvm::{False, ObjectFile, mk_section_iter};
use llvm::archive_ro::ArchiveRO;
use metadata::cstore::{MetadataBlob, MetadataVec, MetadataArchive};
use metadata::decoder;
use metadata::encoder;
use metadata::filesearch::{FileSearch, FileMatches, FileDoesntMatch};
use syntax::codemap::Span;
use syntax::diagnostic::SpanHandler;
use rustc_back::target::Target;
use std::ffi::CString;
use std::collections::HashMap;
use std::old_io::fs::PathExtensions;
use std::time::Duration;
pub struct CrateMismatch {

pub struct Context<'a> {
    pub sess: &'a Session,
    pub crate_name: &'a str,
    pub hash: Option<&'a Svh>,
    // points to either self.sess.target.target or self.sess.host, must match triple
    pub target: &'a Target,
    pub filesearch: FileSearch<'a>,
    pub root: &'a Option<CratePaths>,
    pub rejected_via_hash: Vec<CrateMismatch>,
    pub rejected_via_triple: Vec<CrateMismatch>,
    pub should_match_name: bool,
    pub dylib: Option<(Path, PathKind)>,
    pub rlib: Option<(Path, PathKind)>,
    pub metadata: MetadataBlob,

pub struct ArchiveMetadata {
    // points into self._archive

pub struct CratePaths {
    pub dylib: Option<Path>,
    pub rlib: Option<Path>
    fn paths(&self) -> Vec<Path> {
        match (&self.dylib, &self.rlib) {
            (&None, &None) => vec!(),
            (&Some(ref p), &None) |
            (&None, &Some(ref p)) => vec!(p.clone()),
            (&Some(ref p1), &Some(ref p2)) => vec!(p1.clone(), p2.clone()),
impl<'a> Context<'a> {
    pub fn maybe_load_library_crate(&mut self) -> Option<Library> {
        self.find_library_crate()

    pub fn load_library_crate(&mut self) -> Library {
        match self.find_library_crate() {
                self.report_load_errs();

    pub fn report_load_errs(&mut self) {
        let message = if self.rejected_via_hash.len() > 0 {
            format!("found possibly newer version of crate `{}`",
        } else if self.rejected_via_triple.len() > 0 {
            format!("couldn't find crate `{}` with expected target triple {}",
                    self.ident, self.triple)
            format!("can't find crate for `{}`", self.ident)
        let message = match self.root {
            &Some(ref r) => format!("{} which `{}` depends on",
        self.sess.span_err(self.span, &message[]);

        if self.rejected_via_triple.len() > 0 {
            let mismatches = self.rejected_via_triple.iter();
            for (i, &CrateMismatch{ ref path, ref got }) in mismatches.enumerate() {
                self.sess.fileline_note(self.span,
                    &format!("crate `{}`, path #{}, triple {}: {}",
                            self.ident, i+1, got, path.display())[]);
        if self.rejected_via_hash.len() > 0 {
            self.sess.span_note(self.span, "perhaps this crate needs \
            let mismatches = self.rejected_via_hash.iter();
            for (i, &CrateMismatch{ ref path, .. }) in mismatches.enumerate() {
                self.sess.fileline_note(self.span,
                    &format!("crate `{}` path {}{}: {}",
                            self.ident, "#", i+1, path.display())[]);
                    for (i, path) in r.paths().iter().enumerate() {
                        self.sess.fileline_note(self.span,
                            &format!("crate `{}` path #{}: {}",
                                    r.ident, i+1, path.display())[]);
        self.sess.abort_if_errors();
    fn find_library_crate(&mut self) -> Option<Library> {
        // If an SVH is specified, then this is a transitive dependency that
        // must be loaded via -L plus some filtering.
        if self.hash.is_none() {
            self.should_match_name = false;
            match self.find_commandline_library() {
                Some(l) => return Some(l),
            self.should_match_name = true;

        let dypair = self.dylibname();

        // want: crate_name.dir_part() + prefix + crate_name.file_part + "-"
        let dylib_prefix = format!("{}{}", dypair.0, self.crate_name);
        let rlib_prefix = format!("lib{}", self.crate_name);

        let mut candidates = HashMap::new();

        // First, find all possible candidate rlibs and dylibs purely based on
        // the name of the files themselves. We're trying to match against an
        // exact crate name and possibly an exact hash.
        //
        // During this step, we can filter all found libraries based on the
        // name and id found in the crate id (we ignore the path portion for
        // filename matching), as well as the exact hash (if specified). If we
        // end up having many candidates, we must look at the metadata to
        // perform exact matches against hashes/crate ids. Note that opening up
        // the metadata is where we do an exact match against the full contents
        // of the crate id (path/name/id).
        //
        // The goal of this step is to look at as little metadata as possible.
        self.filesearch.search(|path, kind| {
            let file = match path.filename_str() {
                None => return FileDoesntMatch,
            let (hash, rlib) = if file.starts_with(&rlib_prefix[]) &&
                    file.ends_with(".rlib") {
                (&file[(rlib_prefix.len()) .. (file.len() - ".rlib".len())],
            } else if file.starts_with(dylib_prefix.as_slice()) &&
                    file.ends_with(dypair.1.as_slice()) {
                (&file[(dylib_prefix.len()) .. (file.len() - dypair.1.len())],
                return FileDoesntMatch
            info!("lib candidate: {}", path.display());

            let hash_str = hash.to_string();
            let slot = candidates.entry(hash_str).get().unwrap_or_else(
                |vacant_entry| vacant_entry.insert((HashMap::new(), HashMap::new())));
            let (ref mut rlibs, ref mut dylibs) = *slot;
                rlibs.insert(fs::realpath(path).unwrap(), kind);
                dylibs.insert(fs::realpath(path).unwrap(), kind);

        // We have now collected all known libraries into a set of candidates
        // keyed off the filename hash listed. For each filename, we also have a
        // list of rlibs/dylibs that apply. Here, we map each of these lists
        // (per hash), to a Library candidate for returning.
        //
        // A Library candidate is created if the metadata for the set of
        // libraries corresponds to the crate id and hash criteria that this
        // search is being performed for.
        let mut libraries = Vec::new();
        for (_hash, (rlibs, dylibs)) in candidates {
            let mut metadata = None;
            let rlib = self.extract_one(rlibs, "rlib", &mut metadata);
            let dylib = self.extract_one(dylibs, "dylib", &mut metadata);
                libraries.push(Library {

        // Having now translated all relevant found hashes into libraries, see
        // what we've got and figure out if we found multiple candidates for
        match libraries.len() {
            1 => Some(libraries.into_iter().next().unwrap()),
                self.sess.span_err(self.span,
                    &format!("multiple matching crates for `{}`",
                self.sess.note("candidates:");
                for lib in &libraries {
                        Some((ref p, _)) => {
                            self.sess.note(&format!("path: {}",
                        Some((ref p, _)) => {
                            self.sess.note(&format!("path: {}",
                    let data = lib.metadata.as_slice();
                    let name = decoder::get_crate_name(data);
                    note_crate_name(self.sess.diagnostic(), &name[]);
    // Attempts to extract *one* library from the set `m`. If the set has no
    // elements, `None` is returned. If the set has more than one element, then
    // errors and notes are emitted about the set of libraries.
    //
    // With only one library in the set, this function will extract it, and then
    // read the metadata from it if `*slot` is `None`. If the metadata couldn't
    // be read, it is assumed that the file isn't a valid rust library (no
    // errors are emitted).
    fn extract_one(&mut self, m: HashMap<Path, PathKind>, flavor: &str,
                   slot: &mut Option<MetadataBlob>) -> Option<(Path, PathKind)> {
        let mut ret = None::<(Path, PathKind)>;
            // FIXME(#10786): for an optimization, we only read one of the
            //                library's metadata sections. In theory we should
            //                read both, but reading dylib metadata is quite
        } else if m.len() == 1 {
            return Some(m.into_iter().next().unwrap())
        for (lib, kind) in m {
            info!("{} reading metadata from: {}", flavor, lib.display());
            let metadata = match get_metadata_section(self.target.options.is_like_osx,
                    if self.crate_matches(blob.as_slice(), &lib) {
                        info!("metadata mismatch");
                    info!("no metadata found");
                self.sess.span_err(self.span,
                    &format!("multiple {} candidates for `{}` \
                self.sess.span_note(self.span,
                    &format!(r"candidate #1: {}",
                            ret.as_ref().unwrap().0
                self.sess.span_note(self.span,
                    &format!(r"candidate #{}: {}", error,
            *slot = Some(metadata);
            ret = Some((lib, kind));
        return if error > 0 {None} else {ret}
    fn crate_matches(&mut self, crate_data: &[u8], libpath: &Path) -> bool {
        if self.should_match_name {
            match decoder::maybe_get_crate_name(crate_data) {
                Some(ref name) if self.crate_name == *name => {}
                _ => { info!("Rejecting via crate name"); return false }
        let hash = match decoder::maybe_get_crate_hash(crate_data) {
            Some(hash) => hash, None => {
                info!("Rejecting via lack of crate hash");
        let triple = match decoder::get_crate_triple(crate_data) {
            None => { debug!("triple not present"); return false }
        if triple != self.triple {
            info!("Rejecting via crate triple: expected {} got {}", self.triple, triple);
            self.rejected_via_triple.push(CrateMismatch {
                path: libpath.clone(),
                got: triple.to_string()
                info!("Rejecting via hash: expected {} got {}", *myhash, hash);
                self.rejected_via_hash.push(CrateMismatch {
                    path: libpath.clone(),
                    got: myhash.as_str().to_string()

    // Returns the corresponding (prefix, suffix) that files need to have for
    fn dylibname(&self) -> (String, String) {
        let t = &self.target;
        (t.options.dll_prefix.clone(), t.options.dll_suffix.clone())
    fn find_commandline_library(&mut self) -> Option<Library> {
        let locs = match self.sess.opts.externs.get(self.crate_name) {

        // First, filter out all libraries that look suspicious. We only accept
        // files which actually exist that have the correct naming scheme for
        let sess = self.sess;
        let dylibname = self.dylibname();
        let mut rlibs = HashMap::new();
        let mut dylibs = HashMap::new();
            let mut locs = locs.iter().map(|l| Path::new(&l[])).filter(|loc| {
                    sess.err(&format!("extern location for {} does not exist: {}",
                                     self.crate_name, loc.display())[]);
                let file = match loc.filename_str() {
                        sess.err(&format!("extern location for {} is not a file: {}",
                                         self.crate_name, loc.display())[]);
                if file.starts_with("lib") && file.ends_with(".rlib") {
                    let (ref prefix, ref suffix) = dylibname;
                    if file.starts_with(&prefix[]) &&
                       file.ends_with(&suffix[]) {
                sess.err(&format!("extern location for {} is of an unknown type: {}",
                                 self.crate_name, loc.display())[]);

            // Now that we have an iterator of good candidates, make sure
            // there's at most one rlib and at most one dylib.
                if loc.filename_str().unwrap().ends_with(".rlib") {
                    rlibs.insert(fs::realpath(&loc).unwrap(),
                                 PathKind::ExternFlag);
                    dylibs.insert(fs::realpath(&loc).unwrap(),
                                  PathKind::ExternFlag);

        // Extract the rlib/dylib pair.
        let mut metadata = None;
        let rlib = self.extract_one(rlibs, "rlib", &mut metadata);
        let dylib = self.extract_one(dylibs, "dylib", &mut metadata);

        if rlib.is_none() && dylib.is_none() { return None }
            Some(metadata) => Some(Library {
pub fn note_crate_name(diag: &SpanHandler, name: &str) {
    diag.handler().note(&format!("crate name: {}", name)[]);

impl ArchiveMetadata {
    fn new(ar: ArchiveRO) -> Option<ArchiveMetadata> {
        let data = match ar.read(METADATA_FILENAME) {
            Some(data) => data as *const [u8],
                debug!("didn't find '{}' in the archive", METADATA_FILENAME);
        Some(ArchiveMetadata {

    pub fn as_slice<'a>(&'a self) -> &'a [u8] { unsafe { &*self.data } }
// Just a small wrapper to time how long reading metadata takes.
fn get_metadata_section(is_osx: bool, filename: &Path) -> Result<MetadataBlob, String> {
    let dur = Duration::span(|| {
        ret = Some(get_metadata_section_imp(is_osx, filename));
    info!("reading {} => {}ms", filename.filename_display(),
          dur.num_milliseconds());
    return ret.unwrap();
fn get_metadata_section_imp(is_osx: bool, filename: &Path) -> Result<MetadataBlob, String> {
    if !filename.exists() {
        return Err(format!("no such file: '{}'", filename.display()));
    if filename.filename_str().unwrap().ends_with(".rlib") {
        // Use ArchiveRO for speed here, it's backed by LLVM and uses mmap
        // internally to read the file. We also avoid even using a memcpy by
        // just keeping the archive along while the metadata is in use.
        let archive = match ArchiveRO::open(filename) {
                debug!("llvm didn't like `{}`", filename.display());
                return Err(format!("failed to read rlib metadata: '{}'",
                                   filename.display()));
        return match ArchiveMetadata::new(archive).map(|ar| MetadataArchive(ar)) {
                return Err((format!("failed to read rlib metadata: '{}'",
                                    filename.display())))
            Some(blob) => return Ok(blob)
        let buf = CString::from_slice(filename.as_vec());
        let mb = llvm::LLVMRustCreateMemoryBufferWithContentsOfFile(buf.as_ptr());
            return Err(format!("error reading library: '{}'",
        let of = match ObjectFile::new(mb) {
                return Err((format!("provided path not an object file: '{}'",
                                    filename.display())))
        let si = mk_section_iter(of.llof);
        while llvm::LLVMIsSectionIteratorAtEnd(of.llof, si.llsi) == False {
            let mut name_buf = ptr::null();
            let name_len = llvm::LLVMRustGetSectionName(si.llsi, &mut name_buf);
            let name = slice::from_raw_buf(&(name_buf as *const u8),
                                           name_len as uint).to_vec();
            let name = String::from_utf8(name).unwrap();
            debug!("get_metadata_section: name {}", name);
            if read_meta_section_name(is_osx) == name {
                let cbuf = llvm::LLVMGetSectionContents(si.llsi);
                let csz = llvm::LLVMGetSectionSize(si.llsi) as uint;
                let cvbuf: *const u8 = cbuf as *const u8;
                let vlen = encoder::metadata_encoding_version.len();
                debug!("checking {} bytes of metadata-version stamp",
                let minsz = cmp::min(vlen, csz);
                let buf0 = slice::from_raw_buf(&cvbuf, minsz);
                let version_ok = buf0 == encoder::metadata_encoding_version;
                    return Err((format!("incompatible metadata version found: '{}'",
                                        filename.display())));
                let cvbuf1 = cvbuf.offset(vlen as int);
                debug!("inflating {} bytes of compressed metadata",
                let bytes = slice::from_raw_buf(&cvbuf1, csz-vlen);
                match flate::inflate_bytes(bytes) {
                    Some(inflated) => return Ok(MetadataVec(inflated)),
            llvm::LLVMMoveToNextSection(si.llsi);
        return Err(format!("metadata not found: '{}'", filename.display()));
pub fn meta_section_name(is_osx: bool) -> &'static str {
        "__DATA,__note.rustc"

pub fn read_meta_section_name(is_osx: bool) -> &'static str {

// A diagnostic function for dumping crate metadata to an output stream
pub fn list_file_metadata(is_osx: bool, path: &Path,
                          out: &mut old_io::Writer) -> old_io::IoResult<()> {
    match get_metadata_section(is_osx, path) {
        Ok(bytes) => decoder::list_crate_metadata(bytes.as_slice(), out),
            write!(out, "{}\n", msg)