// Copyright 2012-2014 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

//! Finds crate binaries and loads their metadata
//!
//! Might I be the first to welcome you to a world of platform differences,
//! version requirements, dependency graphs, conflicting desires, and fun! This
//! is the major guts (along with metadata::creader) of the compiler for loading
//! crates and resolving dependencies. Let's take a tour!
//!
//! Each invocation of the compiler is immediately concerned with one primary
//! problem: connecting a set of crates to resolved crates on the filesystem.
//! Concretely speaking, the compiler follows roughly these steps to get here:
//!
//! 1. Discover a set of `extern crate` statements.
//! 2. Transform these directives into crate names. If the directive does not
//!    have an explicit name, then the identifier is the name.
//! 3. For each of these crate names, find a corresponding crate on the
//!    filesystem.
//!
//! Sounds easy, right? Let's walk into some of the nuances.
//!
//! ## Transitive Dependencies
//!
//! Let's say we've got three crates: A, B, and C. A depends on B, and B depends
//! on C. When we're compiling A, we primarily need to find and locate B, but we
//! also end up needing to find and locate C as well.
//!
//! The reason for this is that any of B's types could be composed of C's types,
//! any function in B could return a type from C, etc. To be able to guarantee
//! that we can always typecheck/translate any function, we have to have
//! complete knowledge of the whole ecosystem, not just our immediate
//! dependencies.
//!
//! So the "find a corresponding crate on the filesystem" step above also
//! involves finding crates for *all upstream dependencies*, transitively.
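/*!
As a rough illustration of the point above, here is a minimal sketch of
collecting every upstream crate for a root crate. The `transitive_deps`
function and the string-keyed graph are hypothetical stand-ins, not the
loader's actual data structures, and the graph is assumed to be acyclic.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns every crate reachable from `root` (hypothetical helper).
/// `graph` maps a crate name to the names of its direct dependencies.
pub fn transitive_deps<'a>(
    graph: &BTreeMap<&'a str, Vec<&'a str>>,
    root: &'a str,
) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(krate) = stack.pop() {
        if let Some(deps) = graph.get(krate) {
            for &dep in deps {
                // Only walk into a dependency the first time we see it.
                if seen.insert(dep) {
                    stack.push(dep);
                }
            }
        }
    }
    seen
}
```

With the graph `A -> B -> C`, asking for the dependencies of `A` yields both
`B` and `C`, which is why compiling A requires locating C as well.
*/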
//!
//! ## Rlibs and Dylibs
//!
//! The compiler has two forms of intermediate dependencies. These are dubbed
//! rlibs and dylibs for the static and dynamic variants, respectively. An rlib
//! is a rustc-defined file format (currently just an ar archive) while a dylib
//! is a platform-defined dynamic library. Each library has a metadata somewhere
//!
//! When translating a crate name to a crate on the filesystem, we all of a
//! sudden need to take into account both rlibs and dylibs! Linkage later on may
//! use either one of these files, as each has its pros/cons. The job of crate
//! loading is to discover what's possible by finding all candidates.
//!
//! Most parts of this loading system keep the dylib/rlib as just separate
//!
//! We can't exactly scan your whole hard drive when looking for dependencies,
//! so we need places to look. Currently the compiler will implicitly add the
//! target lib search path ($prefix/lib/rustlib/$target/lib) to any compilation,
//! and otherwise all -L flags are added to the search paths.
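/*!
To make the search path rule above concrete, here is a small sketch that
assembles the list of directories to scan. The `search_paths` helper and its
parameters are hypothetical, and placing the `-L` directories before the
implicit target lib directory is an assumption of this sketch rather than
something stated above.

```rust
use std::path::{Path, PathBuf};

/// Builds the directories to scan (hypothetical helper).
/// `extra` stands in for the directories passed via `-L`.
pub fn search_paths(prefix: &Path, target: &str, extra: &[PathBuf]) -> Vec<PathBuf> {
    // Assumed ordering: explicit -L directories first.
    let mut paths: Vec<PathBuf> = extra.to_vec();
    // The implicit $prefix/lib/rustlib/$target/lib directory.
    paths.push(prefix.join("lib").join("rustlib").join(target).join("lib"));
    paths
}
```
*/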
//!
//! ## What criterion to select on?
//!
//! This is a pretty tricky area of loading crates. Given a file, how do we know
//! whether it's the right crate? Currently, the rules look along these lines:
//!
//! 1. Does the filename match an rlib/dylib pattern? That is to say, does the
//!    filename have the right prefix/suffix?
//! 2. Does the filename have the right prefix for the crate name being queried?
//!    This is filtering for files like `libfoo*.rlib` and such.
//! 3. Is the file an actual rust library? This is done by loading the metadata
//!    from the library and making sure it's actually there.
//! 4. Does the name in the metadata agree with the name of the library?
//! 5. Does the target in the metadata agree with the current target?
//! 6. Does the SVH match? (more on this later)
//!
//! If the file answers `yes` to all these questions, then the file is
//! considered a *candidate* for being accepted. It is illegal to have
//! more than two candidates as the compiler has no method by which to resolve
//! this conflict. Additionally, rlib/dylib candidates are considered
//!
//! After all this has happened, we have one or two files as candidates. These
//! represent the rlib/dylib file found for a library, and they're returned as
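/*!
The first two filename checks in the list above can be sketched as a pure
string match. This is a simplified, standalone approximation; the `Kind` enum
and `classify` function are hypothetical names, and the dylib prefix/suffix
are passed in because they depend on the target platform.

```rust
/// Which flavor of library a filename looks like (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Kind {
    Rlib,
    Dylib,
}

/// Returns the library kind plus whatever sits between the crate-name
/// prefix and the file suffix, or `None` if the name does not match.
pub fn classify<'a>(
    file: &'a str,
    crate_name: &str,
    dll_prefix: &str,
    dll_suffix: &str,
) -> Option<(Kind, &'a str)> {
    let rlib_prefix = format!("lib{}", crate_name);
    let dylib_prefix = format!("{}{}", dll_prefix, crate_name);
    if let Some(rest) = file.strip_prefix(rlib_prefix.as_str()) {
        if let Some(extra) = rest.strip_suffix(".rlib") {
            return Some((Kind::Rlib, extra));
        }
    }
    if let Some(rest) = file.strip_prefix(dylib_prefix.as_str()) {
        if let Some(extra) = rest.strip_suffix(dll_suffix) {
            return Some((Kind::Dylib, extra));
        }
    }
    None
}
```

Note that a purely textual prefix match also accepts a file such as
`libfoobar.rlib` when looking for `foo`, which is one reason the later checks
against the metadata are still needed.
*/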
//!
//! ### What about versions?
//!
//! A lot of effort has been put forth to remove versioning from the compiler.
//! There have been forays in the past to have versioning baked in, but it was
//! largely always deemed insufficient to the point that it was recognized that
//! it's probably something the compiler shouldn't do anyway due to its
//! complicated nature and the state of the half-baked solutions.
//!
//! With a departure from versioning, the primary criterion for loading crates
//! is just the name of a crate. If we stopped here, it would imply that you
//! could never link two crates of the same name from different sources
//! together, which is clearly a bad state to be in.
//!
//! To resolve this problem, we come to the next section!
//!
//! A number of flags have been added to the compiler to solve the "version
//! problem" in the previous section, as well as generally enabling more
//! powerful usage of the crate loading system of the compiler. The goal of
//! these flags and options is to enable third-party tools to drive the
//! compiler with prior knowledge about how the world should look.
//!
//! ## The `--extern` flag
//!
//! The compiler accepts a flag of this form a number of times:
//!
//!     --extern crate-name=path/to/the/crate.rlib
//!
//! This flag is basically the following letter to the compiler:
//!
//! > When you are attempting to load the immediate dependency `crate-name`, I
//! > would like you to assume that the library is located at
//! > `path/to/the/crate.rlib`, and look nowhere else. Also, please do not
//! > assume that the path I specified has the name `crate-name`.
//!
//! This flag basically overrides most matching logic except for validating that
//! the file is indeed a rust library. The same `crate-name` can be specified
//! twice to specify the rlib/dylib pair.
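/*!
One way to picture how repeated `--extern` values could be collected is the
sketch below. The `parse_externs` function is a hypothetical helper written
for illustration and is not the compiler's option parser; it only assumes the
`name=path` shape shown above and groups paths under their crate name so that
an rlib/dylib pair ends up under one key.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups `name=path` values by crate name (hypothetical helper).
pub fn parse_externs<'a, I>(values: I) -> Result<BTreeMap<String, BTreeSet<String>>, String>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut externs: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for value in values {
        // Split on the first `=` only, so paths may contain `=` themselves.
        let (name, path) = value
            .split_once('=')
            .ok_or_else(|| format!("expected `name=path`, got `{}`", value))?;
        externs
            .entry(name.to_string())
            .or_default()
            .insert(path.to_string());
    }
    Ok(externs)
}
```

Passing the same crate name twice, once with an rlib path and once with a
dylib path, leaves two entries under that name.
*/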
//!
//! ## Enabling "multiple versions"
//!
//! This basically boils down to the ability to specify arbitrary packages to
//! the compiler. For example, if crate A wanted to use Bv1 and Bv2, then it
//! would look something like:
//!
//! and the compiler would be invoked as:
//!
//!     rustc a.rs --extern b1=path/to/libb1.rlib --extern b2=path/to/libb2.rlib
//!
//! In this scenario there are two crates named `b` and the compiler must be
//! manually driven to be informed where each crate is.
//!
//! ## Frobbing symbols
//!
//! One of the immediate problems with linking the same library together twice
//! in the same program is dealing with duplicate symbols. The primary way to
//! deal with this in rustc is to add hashes to the end of each symbol.
//!
//! In order to force hashes to change between versions of a library, if
//! desired, the compiler exposes an option `-C metadata=foo`, which is used to
//! initially seed each symbol hash. The string `foo` is prepended to each
//! string-to-hash to ensure that symbols change over time.
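/*!
The effect of seeding symbol hashes can be shown with a toy example. This is
not rustc's actual mangling scheme: the `mangle` function, the choice of
`DefaultHasher`, and the output format are all assumptions made purely for
illustration of the "prepend the metadata string, then hash" idea.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Appends a hash seeded with `metadata` to `symbol` (toy scheme).
pub fn mangle(metadata: &str, symbol: &str) -> String {
    let mut hasher = DefaultHasher::new();
    // Feed the metadata string first so it acts as the seed.
    metadata.hash(&mut hasher);
    symbol.hash(&mut hasher);
    format!("{}_{:016x}", symbol, hasher.finish())
}
```

Two builds of the same library with different metadata strings then produce
different names for the same symbol, so both copies can be linked together.
*/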
//!
//! ## Loading transitive dependencies
//!
//! Dealing with same-named-but-distinct crates is not just a local problem, but
//! one that also needs to be dealt with for transitive dependencies. Note that
//! in the letter above `--extern` flags only apply to the *local* set of
//! dependencies, not the upstream transitive dependencies. Consider this
//! dependency graph:
//!
//! In this scenario, when we compile `D`, we need to be able to distinctly
//! resolve `A.1` and `A.2`, but an `--extern` flag cannot apply to these
//! transitive dependencies.
//!
//! Note that the key idea here is that `B` and `C` are both *already compiled*.
//! That is, they have already resolved their dependencies. Due to unrelated
//! technical reasons, when a library is compiled, it is only compatible with
//! the *exact same* version of the upstream libraries it was compiled against.
//! We use the "Strict Version Hash" to identify the exact copy of an upstream
//!
//! With this knowledge, we know that `B` and `C` will depend on `A` with
//! different SVH values, so we crawl the normal `-L` paths looking for
//! `liba*.rlib` and filter based on the contained SVH.
//!
//! In the end, `--extern` is not needed to specify upstream transitive
//! dependencies.
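/*!
The SVH filtering step described above can be sketched as a simple partition
of the files found on the `-L` paths. The `Candidate` struct and
`filter_by_svh` function are hypothetical, and the SVH is modeled as a plain
string here rather than the real hash type.

```rust
/// A library found on the search path, with the SVH read from its
/// metadata (hypothetical type; the SVH is modeled as a string).
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub path: String,
    pub svh: String,
}

/// Splits candidates into those whose SVH matches `wanted` and those
/// rejected because of a hash mismatch.
pub fn filter_by_svh<'a>(
    candidates: &'a [Candidate],
    wanted: &str,
) -> (Vec<&'a Candidate>, Vec<&'a Candidate>) {
    candidates.iter().partition(|c| c.svh == wanted)
}
```

Keeping the rejected list around mirrors the idea of reporting hash
mismatches later if no matching candidate is found.
*/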
//!
//! That's the general overview of loading crates in the compiler, but it's by
//! no means all of the necessary details. Take a look at the rest of
//! metadata::loader or metadata::creader for all the juicy details!

use back::archive::{METADATA_FILENAME};
use session::Session;
use llvm::{False, ObjectFile, mk_section_iter};
use llvm::archive_ro::ArchiveRO;
use metadata::cstore::{MetadataBlob, MetadataVec, MetadataArchive};
use metadata::decoder;
use metadata::encoder;
use metadata::filesearch::{FileSearch, FileMatches, FileDoesntMatch};
use syntax::codemap::Span;
use syntax::diagnostic::SpanHandler;
use std::ffi::CString;
use std::collections::{HashMap, HashSet};
use std::io::fs::PathExtensions;
use std::time::Duration;

pub struct CrateMismatch {

pub struct Context<'a> {
    pub sess: &'a Session,
    pub crate_name: &'a str,
    pub hash: Option<&'a Svh>,
    pub filesearch: FileSearch<'a>,
    pub root: &'a Option<CratePaths>,
    pub rejected_via_hash: Vec<CrateMismatch>,
    pub rejected_via_triple: Vec<CrateMismatch>,
    pub should_match_name: bool,

    pub dylib: Option<Path>,
    pub rlib: Option<Path>,
    pub metadata: MetadataBlob,

pub struct ArchiveMetadata {
    // points into self._archive

pub struct CratePaths {
    pub dylib: Option<Path>,
    pub rlib: Option<Path>

    fn paths(&self) -> Vec<Path> {
        match (&self.dylib, &self.rlib) {
            (&None, &None) => vec!(),
            (&Some(ref p), &None) |
            (&None, &Some(ref p)) => vec!(p.clone()),
            (&Some(ref p1), &Some(ref p2)) => vec!(p1.clone(), p2.clone()),

impl<'a> Context<'a> {
    pub fn maybe_load_library_crate(&mut self) -> Option<Library> {
        self.find_library_crate()

    pub fn load_library_crate(&mut self) -> Library {
        match self.find_library_crate() {
                self.report_load_errs();

    pub fn report_load_errs(&mut self) {
        let message = if self.rejected_via_hash.len() > 0 {
            format!("found possibly newer version of crate `{}`",
        } else if self.rejected_via_triple.len() > 0 {
            format!("couldn't find crate `{}` with expected target triple {}",
                    self.ident, self.triple)
            format!("can't find crate for `{}`", self.ident)
        let message = match self.root {
            &Some(ref r) => format!("{} which `{}` depends on",
        self.sess.span_err(self.span, message[]);

        if self.rejected_via_triple.len() > 0 {
            let mismatches = self.rejected_via_triple.iter();
            for (i, &CrateMismatch{ ref path, ref got }) in mismatches.enumerate() {
                self.sess.fileline_note(self.span,
                    format!("crate `{}`, path #{}, triple {}: {}",
                            self.ident, i+1, got, path.display())[]);
        if self.rejected_via_hash.len() > 0 {
            self.sess.span_note(self.span, "perhaps this crate needs \
            let mismatches = self.rejected_via_hash.iter();
            for (i, &CrateMismatch{ ref path, .. }) in mismatches.enumerate() {
                self.sess.fileline_note(self.span,
                    format!("crate `{}` path {}{}: {}",
                            self.ident, "#", i+1, path.display())[]);
                    for (i, path) in r.paths().iter().enumerate() {
                        self.sess.fileline_note(self.span,
                            format!("crate `{}` path #{}: {}",
                                    r.ident, i+1, path.display())[]);
        self.sess.abort_if_errors();

    fn find_library_crate(&mut self) -> Option<Library> {
        // If an SVH is specified, then this is a transitive dependency that
        // must be loaded via -L plus some filtering.
        if self.hash.is_none() {
            self.should_match_name = false;
            match self.find_commandline_library() {
                Some(l) => return Some(l),
            self.should_match_name = true;

        let dypair = self.dylibname();

        // want: crate_name.dir_part() + prefix + crate_name.file_part + "-"
        let dylib_prefix = format!("{}{}", dypair.0, self.crate_name);
        let rlib_prefix = format!("lib{}", self.crate_name);

        let mut candidates = HashMap::new();

        // First, find all possible candidate rlibs and dylibs purely based on
        // the name of the files themselves. We're trying to match against an
        // exact crate name and possibly an exact hash.
        //
        // During this step, we can filter all found libraries based on the
        // name and id found in the crate id (we ignore the path portion for
        // filename matching), as well as the exact hash (if specified). If we
        // end up having many candidates, we must look at the metadata to
        // perform exact matches against hashes/crate ids. Note that opening up
        // the metadata is where we do an exact match against the full contents
        // of the crate id (path/name/id).
        //
        // The goal of this step is to look at as little metadata as possible.
        self.filesearch.search(|path| {
            let file = match path.filename_str() {
                None => return FileDoesntMatch,
            let (hash, rlib) = if file.starts_with(rlib_prefix[]) &&
                    file.ends_with(".rlib") {
                (file.slice(rlib_prefix.len(), file.len() - ".rlib".len()),
            } else if file.starts_with(dylib_prefix.as_slice()) &&
                      file.ends_with(dypair.1.as_slice()) {
                (file.slice(dylib_prefix.len(), file.len() - dypair.1.len()),
                return FileDoesntMatch
            info!("lib candidate: {}", path.display());

            let hash_str = hash.to_string();
            let slot = candidates.entry(&hash_str).get().unwrap_or_else(
                |vacant_entry| vacant_entry.insert((HashSet::new(), HashSet::new())));
            let (ref mut rlibs, ref mut dylibs) = *slot;
                rlibs.insert(fs::realpath(path).unwrap());
                dylibs.insert(fs::realpath(path).unwrap());

        // We have now collected all known libraries into a set of candidates
        // keyed off the filename hash listed. For each filename, we also have a
        // list of rlibs/dylibs that apply. Here, we map each of these lists
        // (per hash), to a Library candidate for returning.
        //
        // A Library candidate is created if the metadata for the set of
        // libraries corresponds to the crate id and hash criteria that this
        // search is being performed for.
        let mut libraries = Vec::new();
        for (_hash, (rlibs, dylibs)) in candidates.into_iter() {
            let mut metadata = None;
            let rlib = self.extract_one(rlibs, "rlib", &mut metadata);
            let dylib = self.extract_one(dylibs, "dylib", &mut metadata);
                    libraries.push(Library {

        // Having now translated all relevant found hashes into libraries, see
        // what we've got and figure out if we found multiple candidates for
        match libraries.len() {
            1 => Some(libraries.into_iter().next().unwrap()),
                self.sess.span_err(self.span,
                    format!("multiple matching crates for `{}`",
                self.sess.note("candidates:");
                for lib in libraries.iter() {
                            self.sess.note(format!("path: {}",
                            self.sess.note(format!("path: {}",
                    let data = lib.metadata.as_slice();
                    let name = decoder::get_crate_name(data);
                    note_crate_name(self.sess.diagnostic(), name[]);

    // Attempts to extract *one* library from the set `m`. If the set has no
    // elements, `None` is returned. If the set has more than one element, then
    // errors and notes are emitted about the set of libraries.
    //
    // With only one library in the set, this function will extract it, and then
    // read the metadata from it if `*slot` is `None`. If the metadata couldn't
    // be read, it is assumed that the file isn't a valid rust library (no
    // errors are emitted).
    fn extract_one(&mut self, m: HashSet<Path>, flavor: &str,
                   slot: &mut Option<MetadataBlob>) -> Option<Path> {
        let mut ret = None::<Path>;
            // FIXME(#10786): for an optimization, we only read one of the
            //                library's metadata sections. In theory we should
            //                read both, but reading dylib metadata is quite
        } else if m.len() == 1 {
            return Some(m.into_iter().next().unwrap())

        for lib in m.into_iter() {
            info!("{} reading metadata from: {}", flavor, lib.display());
            let metadata = match get_metadata_section(self.sess.target.target.options.is_like_osx,
                    if self.crate_matches(blob.as_slice(), &lib) {
                        info!("metadata mismatch");
                    info!("no metadata found");
                self.sess.span_err(self.span,
                                   format!("multiple {} candidates for `{}` \
                self.sess.span_note(self.span,
                                    format!(r"candidate #1: {}",
                                            ret.as_ref().unwrap()
                self.sess.span_note(self.span,
                                    format!(r"candidate #{}: {}", error,
            *slot = Some(metadata);
        return if error > 0 {None} else {ret}

    fn crate_matches(&mut self, crate_data: &[u8], libpath: &Path) -> bool {
        if self.should_match_name {
            match decoder::maybe_get_crate_name(crate_data) {
                Some(ref name) if self.crate_name == *name => {}
                _ => { info!("Rejecting via crate name"); return false }
        let hash = match decoder::maybe_get_crate_hash(crate_data) {
            Some(hash) => hash, None => {
                info!("Rejecting via lack of crate hash");
        let triple = match decoder::get_crate_triple(crate_data) {
            None => { debug!("triple not present"); return false }
        if triple != self.triple {
            info!("Rejecting via crate triple: expected {} got {}", self.triple, triple);
            self.rejected_via_triple.push(CrateMismatch {
                path: libpath.clone(),
                got: triple.to_string()
                    info!("Rejecting via hash: expected {} got {}", *myhash, hash);
                    self.rejected_via_hash.push(CrateMismatch {
                        path: libpath.clone(),
                        got: myhash.as_str().to_string()

    // Returns the corresponding (prefix, suffix) that files need to have for
    fn dylibname(&self) -> (String, String) {
        let t = &self.sess.target.target;
        (t.options.dll_prefix.clone(), t.options.dll_suffix.clone())

    fn find_commandline_library(&mut self) -> Option<Library> {
        let locs = match self.sess.opts.externs.get(self.crate_name) {

        // First, filter out all libraries that look suspicious. We only accept
        // files which actually exist that have the correct naming scheme for
        let sess = self.sess;
        let dylibname = self.dylibname();
        let mut rlibs = HashSet::new();
        let mut dylibs = HashSet::new();
            let mut locs = locs.iter().map(|l| Path::new(l[])).filter(|loc| {
                    sess.err(format!("extern location for {} does not exist: {}",
                                     self.crate_name, loc.display())[]);
                let file = match loc.filename_str() {
                        sess.err(format!("extern location for {} is not a file: {}",
                                         self.crate_name, loc.display())[]);
                if file.starts_with("lib") && file.ends_with(".rlib") {
                    let (ref prefix, ref suffix) = dylibname;
                    if file.starts_with(prefix[]) && file.ends_with(suffix[]) {
                sess.err(format!("extern location for {} is of an unknown type: {}",
                                 self.crate_name, loc.display())[]);

            // Now that we have an iterator of good candidates, make sure there's at
            // most one rlib and at most one dylib.
                if loc.filename_str().unwrap().ends_with(".rlib") {
                    rlibs.insert(fs::realpath(&loc).unwrap());
                    dylibs.insert(fs::realpath(&loc).unwrap());

        // Extract the rlib/dylib pair.
        let mut metadata = None;
        let rlib = self.extract_one(rlibs, "rlib", &mut metadata);
        let dylib = self.extract_one(dylibs, "dylib", &mut metadata);

        if rlib.is_none() && dylib.is_none() { return None }
            Some(metadata) => Some(Library {

pub fn note_crate_name(diag: &SpanHandler, name: &str) {
    diag.handler().note(format!("crate name: {}", name)[]);

impl ArchiveMetadata {
    fn new(ar: ArchiveRO) -> Option<ArchiveMetadata> {
        let data = match ar.read(METADATA_FILENAME) {
            Some(data) => data as *const [u8],
                debug!("didn't find '{}' in the archive", METADATA_FILENAME);
        Some(ArchiveMetadata {

    pub fn as_slice<'a>(&'a self) -> &'a [u8] { unsafe { &*self.data } }

// Just a small wrapper to time how long reading metadata takes.
fn get_metadata_section(is_osx: bool, filename: &Path) -> Result<MetadataBlob, String> {
    let dur = Duration::span(|| {
        ret = Some(get_metadata_section_imp(is_osx, filename));
    info!("reading {} => {}ms", filename.filename_display(),
          dur.num_milliseconds());
    return ret.unwrap();

fn get_metadata_section_imp(is_osx: bool, filename: &Path) -> Result<MetadataBlob, String> {
    if !filename.exists() {
        return Err(format!("no such file: '{}'", filename.display()));
    if filename.filename_str().unwrap().ends_with(".rlib") {
        // Use ArchiveRO for speed here, it's backed by LLVM and uses mmap
        // internally to read the file. We also avoid even using a memcpy by
        // just keeping the archive along while the metadata is in use.
        let archive = match ArchiveRO::open(filename) {
                debug!("llvm didn't like `{}`", filename.display());
                return Err(format!("failed to read rlib metadata: '{}'",
                                   filename.display()));
        return match ArchiveMetadata::new(archive).map(|ar| MetadataArchive(ar)) {
                return Err(format!("failed to read rlib metadata: '{}'",
                                   filename.display()))
            Some(blob) => return Ok(blob)
        let buf = CString::from_slice(filename.as_vec());
        let mb = llvm::LLVMRustCreateMemoryBufferWithContentsOfFile(buf.as_ptr());
            return Err(format!("error reading library: '{}'",
        let of = match ObjectFile::new(mb) {
                return Err(format!("provided path not an object file: '{}'",
                                   filename.display()))
        let si = mk_section_iter(of.llof);
        while llvm::LLVMIsSectionIteratorAtEnd(of.llof, si.llsi) == False {
            let mut name_buf = ptr::null();
            let name_len = llvm::LLVMRustGetSectionName(si.llsi, &mut name_buf);
            let name = slice::from_raw_buf(&(name_buf as *const u8),
                                           name_len as uint).to_vec();
            let name = String::from_utf8(name).unwrap();
            debug!("get_metadata_section: name {}", name);
            if read_meta_section_name(is_osx) == name {
                let cbuf = llvm::LLVMGetSectionContents(si.llsi);
                let csz = llvm::LLVMGetSectionSize(si.llsi) as uint;
                let cvbuf: *const u8 = cbuf as *const u8;
                let vlen = encoder::metadata_encoding_version.len();
                debug!("checking {} bytes of metadata-version stamp",
                let minsz = cmp::min(vlen, csz);
                let buf0 = slice::from_raw_buf(&cvbuf, minsz);
                let version_ok = buf0 == encoder::metadata_encoding_version;
                    return Err(format!("incompatible metadata version found: '{}'",
                                       filename.display()));
                let cvbuf1 = cvbuf.offset(vlen as int);
                debug!("inflating {} bytes of compressed metadata",
                let bytes = slice::from_raw_buf(&cvbuf1, csz-vlen);
                match flate::inflate_bytes(bytes) {
                    Some(inflated) => return Ok(MetadataVec(inflated)),
            llvm::LLVMMoveToNextSection(si.llsi);
        return Err(format!("metadata not found: '{}'", filename.display()));

pub fn meta_section_name(is_osx: bool) -> &'static str {
        "__DATA,__note.rustc"

pub fn read_meta_section_name(is_osx: bool) -> &'static str {

// A diagnostic function for dumping crate metadata to an output stream
pub fn list_file_metadata(is_osx: bool, path: &Path,
                          out: &mut io::Writer) -> io::IoResult<()> {
    match get_metadata_section(is_osx, path) {
        Ok(bytes) => decoder::list_crate_metadata(bytes.as_slice(), out),
            write!(out, "{}\n", msg)