// Copyright 2012-2014 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.
//! Finds crate binaries and loads their metadata
//!
//! Might I be the first to welcome you to a world of platform differences,
//! version requirements, dependency graphs, conflicting desires, and fun! This
//! is the major guts (along with metadata::creader) of the compiler for loading
//! crates and resolving dependencies. Let's take a tour!
//!
//! # The problem
//!
//! Each invocation of the compiler is immediately concerned with one primary
//! problem: connecting a set of crates to resolved crates on the filesystem.
//! Concretely speaking, the compiler follows roughly these steps to get here:
//!
//! 1. Discover a set of `extern crate` statements.
//! 2. Transform these directives into crate names. If the directive does not
//!    have an explicit name, then the identifier is the name.
//! 3. For each of these crate names, find a corresponding crate on the
//!    filesystem.
//!
//! Sounds easy, right? Let's walk into some of the nuances.
//!
//! ## Transitive Dependencies
//!
//! Let's say we've got three crates: A, B, and C. A depends on B, and B depends
//! on C. When we're compiling A, we primarily need to find and locate B, but we
//! also end up needing to find and locate C.
//!
//! The reason for this is that any of B's types could be composed of C's types,
//! any function in B could return a type from C, etc. To be able to guarantee
//! that we can always typecheck/translate any function, we have to have
//! complete knowledge of the whole ecosystem, not just our immediate
//! dependencies.
//!
//! So as part of the "find a corresponding crate on the filesystem" step
//! above, we also have to find all crates for *all upstream dependencies*,
//! including every transitive dependency.
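/*!
The walk described above is a transitive closure over the dependency graph. A
minimal sketch in present-day Rust (not the dialect used in this file),
assuming a hypothetical in-memory map from crate name to direct dependencies;
the loader itself discovers these edges by reading crate metadata. The name
`transitive_deps` is illustrative and does not exist in the compiler.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns every crate reachable from `root` through one or more dependency
/// edges. `deps` is a hypothetical map from a crate name to its direct
/// dependencies.
fn transitive_deps(deps: &BTreeMap<&str, Vec<&str>>, root: &str) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root.to_string()];
    while let Some(krate) = stack.pop() {
        for dep in deps.get(krate.as_str()).into_iter().flatten() {
            // Only walk into a crate the first time we see it.
            if seen.insert(dep.to_string()) {
                stack.push(dep.to_string());
            }
        }
    }
    seen
}
```

With `A -> [B]` and `B -> [C]`, asking for the dependencies of `A` yields both
`B` and `C`, which is why compiling `A` has to locate `C` as well.
*/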
//!
//! ## Rlibs and Dylibs
//!
//! The compiler has two forms of intermediate dependencies. These are dubbed
//! rlibs and dylibs for the static and dynamic variants, respectively. An rlib
//! is a rustc-defined file format (currently just an ar archive) while a dylib
//! is a platform-defined dynamic library. Each library has metadata somewhere
//! inside of it.
//!
//! When translating a crate name to a crate on the filesystem, we all of a
//! sudden need to take into account both rlibs and dylibs! Linkage later on may
//! use either one of these files, as each has its pros and cons. The job of
//! crate loading is to discover what's possible by finding all candidates.
//!
//! Most parts of this loading system keep the dylib and rlib as just separate
//! variables.
//!
//! ## Where to look?
//!
//! We can't exactly scan your whole hard drive when looking for dependencies,
//! so we need places to look. Currently the compiler will implicitly add the
//! target lib search path ($prefix/lib/rustlib/$target/lib) to any compilation,
//! and otherwise all -L flags are added to the search paths.
//!
//! ## What criterion to select on?
//!
//! This is a pretty tricky area of loading crates. Given a file, how do we know
//! whether it's the right crate? Currently, the rules look along these lines:
//!
//! 1. Does the filename match an rlib/dylib pattern? That is to say, does the
//!    filename have the right prefix/suffix?
//! 2. Does the filename have the right prefix for the crate name being queried?
//!    This is filtering for files like `libfoo*.rlib` and such.
//! 3. Is the file an actual rust library? This is done by loading the metadata
//!    from the library and making sure it's actually there.
//! 4. Does the name in the metadata agree with the name of the library?
//! 5. Does the target in the metadata agree with the current target?
//! 6. Does the SVH match? (more on this later)
//!
//! If the file answers `yes` to all these questions, then the file is
//! considered a *candidate* for being accepted. It is illegal to have more
//! than two candidates, as the compiler has no method by which to resolve
//! this conflict. Additionally, rlib/dylib candidates are considered
//! independently.
//!
//! After all this has happened, we have one or two files as candidates. These
//! represent the rlib/dylib file found for a library, and they're returned as
//! being found.
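/*!
Steps 1 and 2 of the list above only look at the filename. A minimal sketch of
that filter in present-day Rust, assuming the platform's dynamic library prefix
and suffix are passed in as plain strings (for example `lib` and `.so`). The
names `Flavor` and `match_filename` are hypothetical; steps 3 to 6 need the
library's metadata and are not shown.

```rust
#[derive(Debug, PartialEq)]
enum Flavor {
    Rlib,
    Dylib,
}

/// Checks `file` against the `lib<name>*.rlib` and `<prefix><name>*<suffix>`
/// patterns. On a match, returns the part the `*` stands for (whatever sits
/// between the crate name and the suffix) together with the library flavor.
fn match_filename<'a>(
    file: &'a str,
    crate_name: &str,
    dll_prefix: &str,
    dll_suffix: &str,
) -> Option<(&'a str, Flavor)> {
    let rlib_prefix = format!("lib{}", crate_name);
    let dylib_prefix = format!("{}{}", dll_prefix, crate_name);
    if let Some(rest) = file.strip_prefix(rlib_prefix.as_str()) {
        if let Some(hash) = rest.strip_suffix(".rlib") {
            return Some((hash, Flavor::Rlib));
        }
    }
    if let Some(rest) = file.strip_prefix(dylib_prefix.as_str()) {
        if let Some(hash) = rest.strip_suffix(dll_suffix) {
            return Some((hash, Flavor::Dylib));
        }
    }
    None
}
```

Files that share the same middle part can then be grouped as the rlib/dylib
pair of one candidate before any metadata is opened.
*/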
//!
//! ### What about versions?
//!
//! A lot of effort has been put forth to remove versioning from the compiler.
//! There have been forays in the past to have versioning baked in, but it was
//! always deemed insufficient, to the point that it was recognized as
//! something the compiler probably shouldn't do anyway, given its complicated
//! nature and the state of the half-baked solutions.
//!
//! With a departure from versioning, the primary criterion for loading crates
//! is just the name of a crate. If we stopped here, it would imply that you
//! could never link two crates of the same name from different sources
//! together, which is clearly a bad state to be in.
//!
//! To resolve this problem, we come to the next section!
//!
//! # Expert Mode
//!
//! A number of flags have been added to the compiler to solve the "version
//! problem" in the previous section, as well as generally enabling more
//! powerful usage of the crate loading system of the compiler. The goal of
//! these flags and options is to enable third-party tools to drive the
//! compiler with prior knowledge about how the world should look.
//!
//! ## The `--extern` flag
//!
//! The compiler accepts a flag of this form a number of times:
//!
//!     --extern crate-name=path/to/the/crate.rlib
//!
//! This flag is basically the following letter to the compiler:
//!
//! > Dear rustc,
//! >
//! > When you are attempting to load the immediate dependency `crate-name`, I
//! > would like you to assume that the library is located at
//! > `path/to/the/crate.rlib`, and look nowhere else. Also, please do not
//! > assume that the path I specified has the name `crate-name`.
//!
//! This flag basically overrides most matching logic except for validating that
//! the file is indeed a rust library. The same `crate-name` can be specified
//! twice to specify the rlib/dylib pair.
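/*!
Because the same `crate-name` may appear twice, the flag values are naturally
grouped per name. A minimal sketch in present-day Rust of such a grouping step;
`parse_externs` and its error message are hypothetical, and the way rustc
itself parses its command line is not shown in this file.

```rust
use std::collections::BTreeMap;

/// Groups the values of repeated `--extern name=path` flags by crate name.
/// Returns an error message for a value that lacks a name or a path.
fn parse_externs(values: &[&str]) -> Result<BTreeMap<String, Vec<String>>, String> {
    let mut externs: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for value in values {
        match value.split_once('=') {
            Some((name, path)) if !name.is_empty() && !path.is_empty() => {
                // A second flag with the same name adds to the same entry.
                externs.entry(name.to_string()).or_default().push(path.to_string());
            }
            _ => return Err(format!("malformed --extern value: `{}`", value)),
        }
    }
    Ok(externs)
}
```

Passing both an `.rlib` and a dylib path under the name `b` leaves one entry
with two locations, matching the rlib/dylib pair described above.
*/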
//!
//! ## Enabling "multiple versions"
//!
//! This basically boils down to the ability to specify arbitrary packages to
//! the compiler. For example, if crate A wanted to use Bv1 and Bv2, then it
//! would look something like:
//!
//!     extern crate b1;
//!     extern crate b2;
//!
//!     fn main() {}
//!
//! and the compiler would be invoked as:
//!
//!     rustc a.rs --extern b1=path/to/libb1.rlib --extern b2=path/to/libb2.rlib
//!
//! In this scenario there are two crates named `b` and the compiler must be
//! manually driven to be informed where each crate is.
//!
//! ## Frobbing symbols
//!
//! One of the immediate problems with linking the same library together twice
//! in the same program is dealing with duplicate symbols. The primary way to
//! deal with this in rustc is to add hashes to the end of each symbol.
//!
//! In order to force hashes to change between versions of a library, if
//! desired, the compiler exposes an option `-C metadata=foo`, which is used to
//! initially seed each symbol hash. The string `foo` is prepended to each
//! string-to-hash to ensure that symbols change over time.
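/*!
The seeding idea can be illustrated with the standard library's hasher. This is
only a sketch in present-day Rust under stated assumptions: `frob_symbol` is a
hypothetical name, and the hash function and symbol format that rustc actually
uses are different and not shown here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Appends a hash suffix to `symbol`. The hash is seeded with `metadata`
/// (standing in for the value of `-C metadata=...`) by feeding it to the
/// hasher before the symbol itself.
fn frob_symbol(metadata: &str, symbol: &str) -> String {
    let mut hasher = DefaultHasher::new();
    metadata.hash(&mut hasher); // the seed goes in first
    symbol.hash(&mut hasher);
    format!("{}_{:016x}", symbol, hasher.finish())
}
```

Two builds of the same library with different metadata strings then produce
different names for the same item, so both copies can be linked together.
*/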
//!
//! ## Loading transitive dependencies
//!
//! Dealing with same-named-but-distinct crates is not just a local problem, but
//! one that also needs to be dealt with for transitive dependencies. Note that,
//! per the letter above, `--extern` flags only apply to the *local* set of
//! dependencies, not the upstream transitive dependencies. Consider this
//! dependency graph:
//!
//!     A.1   A.2
//!     |     |
//!     |     |
//!     B     C
//!      \   /
//!       \ /
//!        D
//!
//! In this scenario, when we compile `D`, we need to be able to distinctly
//! resolve `A.1` and `A.2`, but an `--extern` flag cannot apply to these
//! transitive dependencies.
//!
//! Note that the key idea here is that `B` and `C` are both *already compiled*.
//! That is, they have already resolved their dependencies. Due to unrelated
//! technical reasons, when a library is compiled, it is only compatible with
//! the *exact same* version of the upstream libraries it was compiled against.
//! We use the "Strict Version Hash" (SVH) to identify the exact copy of an
//! upstream library.
//!
//! With this knowledge, we know that `B` and `C` will depend on `A` with
//! different SVH values, so we crawl the normal `-L` paths looking for
//! `liba*.rlib` and filter based on the contained SVH.
//!
//! In the end, this means `--extern` is not needed to specify upstream
//! transitive dependencies.
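/*!
The SVH filter itself is a plain equality check once the metadata has been
read. A minimal sketch in present-day Rust, assuming a hypothetical `Candidate`
record that already holds the decoded SVH as a string; in this module the hash
comes out of the crate metadata instead.

```rust
/// A hypothetical, already-decoded view of one `liba*.rlib` candidate.
struct Candidate {
    path: String,
    /// SVH recorded in the candidate's own metadata.
    svh: String,
}

/// Keeps only the candidates whose SVH equals the one a downstream crate
/// recorded when it was compiled.
fn filter_by_svh<'a>(candidates: &'a [Candidate], wanted: &str) -> Vec<&'a str> {
    candidates
        .iter()
        .filter(|c| c.svh == wanted)
        .map(|c| c.path.as_str())
        .collect()
}
```

When resolving on behalf of `B`, `wanted` would be the SVH of `A.1`, and on
behalf of `C` the SVH of `A.2`, so each lookup selects a different file.
*/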
//!
//! # Wrapping up
//!
//! That's the general overview of loading crates in the compiler, but it's by
//! no means all of the necessary details. Take a look at the rest of
//! metadata::loader or metadata::creader for all the juicy details!

use back::archive::{METADATA_FILENAME};
use back::svh::Svh;
use session::Session;
use llvm;
use llvm::{False, ObjectFile, mk_section_iter};
use llvm::archive_ro::ArchiveRO;
use metadata::cstore::{MetadataBlob, MetadataVec, MetadataArchive};
use metadata::decoder;
use metadata::encoder;
use metadata::filesearch::{FileSearch, FileMatches, FileDoesntMatch};
use syntax::codemap::Span;
use syntax::diagnostic::SpanHandler;
use util::fs;

use std::c_str::ToCStr;
use std::cmp;
use std::collections::hash_map::Entry::{Occupied, Vacant};
use std::collections::{HashMap, HashSet};
use std::io::fs::PathExtensions;
use std::io;
use std::ptr;
use std::slice;
use std::time::Duration;

use flate;

pub struct CrateMismatch {
    path: Path,
    got: String,
}

pub struct Context<'a> {
    pub sess: &'a Session,
    pub span: Span,
    pub ident: &'a str,
    pub crate_name: &'a str,
    pub hash: Option<&'a Svh>,
    pub triple: &'a str,
    pub filesearch: FileSearch<'a>,
    pub root: &'a Option<CratePaths>,
    pub rejected_via_hash: Vec<CrateMismatch>,
    pub rejected_via_triple: Vec<CrateMismatch>,
    pub should_match_name: bool,
}

pub struct Library {
    pub dylib: Option<Path>,
    pub rlib: Option<Path>,
    pub metadata: MetadataBlob,
}

pub struct ArchiveMetadata {
    _archive: ArchiveRO,
    // points into self._archive
    data: *const [u8],
}

pub struct CratePaths {
    pub ident: String,
    pub dylib: Option<Path>,
    pub rlib: Option<Path>
}

impl CratePaths {
    fn paths(&self) -> Vec<Path> {
        match (&self.dylib, &self.rlib) {
            (&None, &None) => vec!(),
            (&Some(ref p), &None) |
            (&None, &Some(ref p)) => vec!(p.clone()),
            (&Some(ref p1), &Some(ref p2)) => vec!(p1.clone(), p2.clone()),
        }
    }
}

impl<'a> Context<'a> {
    pub fn maybe_load_library_crate(&mut self) -> Option<Library> {
        self.find_library_crate()
    }

    pub fn load_library_crate(&mut self) -> Library {
        match self.find_library_crate() {
            Some(t) => t,
            None => {
                // report_load_errs aborts compilation, so it never returns.
                self.report_load_errs();
                unreachable!()
            }
        }
    }

    pub fn report_load_errs(&mut self) {
        let message = if self.rejected_via_hash.len() > 0 {
            format!("found possibly newer version of crate `{}`",
                    self.ident)
        } else if self.rejected_via_triple.len() > 0 {
            format!("couldn't find crate `{}` with expected target triple {}",
                    self.ident, self.triple)
        } else {
            format!("can't find crate for `{}`", self.ident)
        };
        let message = match self.root {
            &None => message,
            &Some(ref r) => format!("{} which `{}` depends on",
                                    message, r.ident)
        };
        self.sess.span_err(self.span, message[]);

        if self.rejected_via_triple.len() > 0 {
            let mismatches = self.rejected_via_triple.iter();
            for (i, &CrateMismatch{ ref path, ref got }) in mismatches.enumerate() {
                self.sess.fileline_note(self.span,
                    format!("crate `{}`, path #{}, triple {}: {}",
                            self.ident, i+1, got, path.display())[]);
            }
        }
        if self.rejected_via_hash.len() > 0 {
            self.sess.span_note(self.span, "perhaps this crate needs \
                                            to be recompiled?");
            let mismatches = self.rejected_via_hash.iter();
            for (i, &CrateMismatch{ ref path, .. }) in mismatches.enumerate() {
                self.sess.fileline_note(self.span,
                    format!("crate `{}` path {}{}: {}",
                            self.ident, "#", i+1, path.display())[]);
            }
            match self.root {
                &None => {}
                &Some(ref r) => {
                    for (i, path) in r.paths().iter().enumerate() {
                        self.sess.fileline_note(self.span,
                            format!("crate `{}` path #{}: {}",
                                    r.ident, i+1, path.display())[]);
                    }
                }
            }
        }
        self.sess.abort_if_errors();
    }

    fn find_library_crate(&mut self) -> Option<Library> {
        // If an SVH is specified, then this is a transitive dependency that
        // must be loaded via -L plus some filtering.
        if self.hash.is_none() {
            self.should_match_name = false;
            match self.find_commandline_library() {
                Some(l) => return Some(l),
                None => {}
            }
            self.should_match_name = true;
        }

        let dypair = self.dylibname();

        // want: crate_name.dir_part() + prefix + crate_name.file_part + "-"
        let dylib_prefix = format!("{}{}", dypair.0, self.crate_name);
        let rlib_prefix = format!("lib{}", self.crate_name);

        let mut candidates = HashMap::new();

        // First, find all possible candidate rlibs and dylibs purely based on
        // the name of the files themselves. We're trying to match against an
        // exact crate name and possibly an exact hash.
        //
        // During this step, we can filter all found libraries based on the
        // name and id found in the crate id (we ignore the path portion for
        // filename matching), as well as the exact hash (if specified). If we
        // end up having many candidates, we must look at the metadata to
        // perform exact matches against hashes/crate ids. Note that opening up
        // the metadata is where we do an exact match against the full contents
        // of the crate id (path/name/id).
        //
        // The goal of this step is to look at as little metadata as possible.
        self.filesearch.search(|path| {
            let file = match path.filename_str() {
                None => return FileDoesntMatch,
                Some(file) => file,
            };
            let (hash, rlib) = if file.starts_with(rlib_prefix[]) &&
                    file.ends_with(".rlib") {
                (file.slice(rlib_prefix.len(), file.len() - ".rlib".len()),
                 true)
            } else if file.starts_with(dylib_prefix.as_slice()) &&
                      file.ends_with(dypair.1.as_slice()) {
                (file.slice(dylib_prefix.len(), file.len() - dypair.1.len()),
                 false)
            } else {
                return FileDoesntMatch
            };
            info!("lib candidate: {}", path.display());

            let slot = match candidates.entry(hash.to_string()) {
                Occupied(entry) => entry.into_mut(),
                Vacant(entry) => entry.set((HashSet::new(), HashSet::new())),
            };
            let (ref mut rlibs, ref mut dylibs) = *slot;
            if rlib {
                rlibs.insert(fs::realpath(path).unwrap());
            } else {
                dylibs.insert(fs::realpath(path).unwrap());
            }

            FileMatches
        });

        // We have now collected all known libraries into a set of candidates
        // keyed off of the filename hash listed. For each filename, we also
        // have a list of rlibs/dylibs that apply. Here, we map each of these
        // lists (per hash), to a Library candidate for returning.
        //
        // A Library candidate is created if the metadata for the set of
        // libraries corresponds to the crate id and hash criteria that this
        // search is being performed for.
        let mut libraries = Vec::new();
        for (_hash, (rlibs, dylibs)) in candidates.into_iter() {
            let mut metadata = None;
            let rlib = self.extract_one(rlibs, "rlib", &mut metadata);
            let dylib = self.extract_one(dylibs, "dylib", &mut metadata);
            match metadata {
                Some(metadata) => {
                    libraries.push(Library {
                        dylib: dylib,
                        rlib: rlib,
                        metadata: metadata,
                    })
                }
                None => {}
            }
        }

        // Having now translated all relevant found hashes into libraries, see
        // what we've got and figure out if we found multiple candidates for
        // libraries or not.
        match libraries.len() {
            0 => None,
            1 => Some(libraries.into_iter().next().unwrap()),
            _ => {
                self.sess.span_err(self.span,
                    format!("multiple matching crates for `{}`",
                            self.crate_name)[]);
                self.sess.note("candidates:");
                for lib in libraries.iter() {
                    match lib.dylib {
                        Some(ref p) => {
                            self.sess.note(format!("path: {}",
                                                   p.display())[]);
                        }
                        None => {}
                    }
                    match lib.rlib {
                        Some(ref p) => {
                            self.sess.note(format!("path: {}",
                                                   p.display())[]);
                        }
                        None => {}
                    }
                    let data = lib.metadata.as_slice();
                    let name = decoder::get_crate_name(data);
                    note_crate_name(self.sess.diagnostic(), name[]);
                }
                None
            }
        }
    }

    // Attempts to extract *one* library from the set `m`. If the set has no
    // elements, `None` is returned. If the set has more than one element, then
    // errors and notes are emitted about the set of libraries.
    //
    // With only one library in the set, this function will extract it, and then
    // read the metadata from it if `*slot` is `None`. If the metadata couldn't
    // be read, it is assumed that the file isn't a valid rust library (no
    // errors are emitted).
    fn extract_one(&mut self, m: HashSet<Path>, flavor: &str,
                   slot: &mut Option<MetadataBlob>) -> Option<Path> {
        let mut ret = None::<Path>;
        let mut error = 0u;

        if slot.is_some() {
            // FIXME(#10786): for an optimization, we only read one of the
            //                library's metadata sections. In theory we should
            //                read both, but reading dylib metadata is quite
            //                slow.
            if m.len() == 0 {
                return None
            } else if m.len() == 1 {
                return Some(m.into_iter().next().unwrap())
            }
        }

        for lib in m.into_iter() {
            info!("{} reading metadata from: {}", flavor, lib.display());
            let metadata = match get_metadata_section(self.sess.target.target.options.is_like_osx,
                                                      &lib) {
                Ok(blob) => {
                    if self.crate_matches(blob.as_slice(), &lib) {
                        blob
                    } else {
                        info!("metadata mismatch");
                        continue
                    }
                }
                Err(_) => {
                    info!("no metadata found");
                    continue
                }
            };
            if ret.is_some() {
                self.sess.span_err(self.span,
                                   format!("multiple {} candidates for `{}` \
                                            found",
                                           flavor,
                                           self.crate_name)[]);
                self.sess.span_note(self.span,
                                    format!(r"candidate #1: {}",
                                            ret.as_ref().unwrap()
                                               .display())[]);
                error = 1;
                ret = None;
            }
            if error > 0 {
                error += 1;
                self.sess.span_note(self.span,
                                    format!(r"candidate #{}: {}", error,
                                            lib.display())[]);
                continue
            }
            *slot = Some(metadata);
            ret = Some(lib);
        }
        return if error > 0 {None} else {ret}
    }

    fn crate_matches(&mut self, crate_data: &[u8], libpath: &Path) -> bool {
        if self.should_match_name {
            match decoder::maybe_get_crate_name(crate_data) {
                Some(ref name) if self.crate_name == *name => {}
                _ => { info!("Rejecting via crate name"); return false }
            }
        }
        let hash = match decoder::maybe_get_crate_hash(crate_data) {
            Some(hash) => hash, None => {
                info!("Rejecting via lack of crate hash");
                return false;
            }
        };

        let triple = match decoder::get_crate_triple(crate_data) {
            None => { debug!("triple not present"); return false }
            Some(t) => t,
        };
        if triple != self.triple {
            info!("Rejecting via crate triple: expected {} got {}", self.triple, triple);
            self.rejected_via_triple.push(CrateMismatch {
                path: libpath.clone(),
                got: triple.to_string()
            });
            return false;
        }

        match self.hash {
            None => true,
            Some(myhash) => {
                if *myhash != hash {
                    info!("Rejecting via hash: expected {} got {}", *myhash, hash);
                    self.rejected_via_hash.push(CrateMismatch {
                        path: libpath.clone(),
                        // Record the hash that was found, not the expected one.
                        got: hash.as_str().to_string()
                    });
                    false
                } else {
                    true
                }
            }
        }
    }

    // Returns the corresponding (prefix, suffix) that files need to have for
    // dynamic libraries
    fn dylibname(&self) -> (String, String) {
        let t = &self.sess.target.target;
        (t.options.dll_prefix.clone(), t.options.dll_suffix.clone())
    }

    fn find_commandline_library(&mut self) -> Option<Library> {
        let locs = match self.sess.opts.externs.get(self.crate_name) {
            Some(s) => s,
            None => return None,
        };

        // First, filter out all libraries that look suspicious. We only accept
        // files which actually exist that have the correct naming scheme for
        // rlibs/dylibs.
        let sess = self.sess;
        let dylibname = self.dylibname();
        let mut rlibs = HashSet::new();
        let mut dylibs = HashSet::new();
        {
            let mut locs = locs.iter().map(|l| Path::new(l[])).filter(|loc| {
                if !loc.exists() {
                    sess.err(format!("extern location for {} does not exist: {}",
                                     self.crate_name, loc.display())[]);
                    return false;
                }
                let file = match loc.filename_str() {
                    Some(file) => file,
                    None => {
                        sess.err(format!("extern location for {} is not a file: {}",
                                         self.crate_name, loc.display())[]);
                        return false;
                    }
                };
                if file.starts_with("lib") && file.ends_with(".rlib") {
                    return true
                } else {
                    let (ref prefix, ref suffix) = dylibname;
                    if file.starts_with(prefix[]) && file.ends_with(suffix[]) {
                        return true
                    }
                }
                sess.err(format!("extern location for {} is of an unknown type: {}",
                                 self.crate_name, loc.display())[]);
                false
            });

            // Now that we have an iterator of good candidates, make sure there's at
            // most one rlib and at most one dylib.
            for loc in locs {
                if loc.filename_str().unwrap().ends_with(".rlib") {
                    rlibs.insert(fs::realpath(&loc).unwrap());
                } else {
                    dylibs.insert(fs::realpath(&loc).unwrap());
                }
            }
        };

        // Extract the rlib/dylib pair.
        let mut metadata = None;
        let rlib = self.extract_one(rlibs, "rlib", &mut metadata);
        let dylib = self.extract_one(dylibs, "dylib", &mut metadata);

        if rlib.is_none() && dylib.is_none() { return None }
        match metadata {
            Some(metadata) => Some(Library {
                dylib: dylib,
                rlib: rlib,
                metadata: metadata,
            }),
            None => None,
        }
    }
}

pub fn note_crate_name(diag: &SpanHandler, name: &str) {
    diag.handler().note(format!("crate name: {}", name)[]);
}

impl ArchiveMetadata {
    fn new(ar: ArchiveRO) -> Option<ArchiveMetadata> {
        let data = match ar.read(METADATA_FILENAME) {
            Some(data) => data as *const [u8],
            None => {
                debug!("didn't find '{}' in the archive", METADATA_FILENAME);
                return None;
            }
        };

        Some(ArchiveMetadata {
            _archive: ar,
            data: data,
        })
    }

    pub fn as_slice<'a>(&'a self) -> &'a [u8] { unsafe { &*self.data } }
}

// Just a small wrapper to time how long reading metadata takes.
fn get_metadata_section(is_osx: bool, filename: &Path) -> Result<MetadataBlob, String> {
    let mut ret = None;
    let dur = Duration::span(|| {
        ret = Some(get_metadata_section_imp(is_osx, filename));
    });
    info!("reading {} => {}ms", filename.filename_display(),
          dur.num_milliseconds());
    return ret.unwrap();
}

fn get_metadata_section_imp(is_osx: bool, filename: &Path) -> Result<MetadataBlob, String> {
    if !filename.exists() {
        return Err(format!("no such file: '{}'", filename.display()));
    }
    if filename.filename_str().unwrap().ends_with(".rlib") {
        // Use ArchiveRO for speed here, it's backed by LLVM and uses mmap
        // internally to read the file. We also avoid even using a memcpy by
        // just keeping the archive along while the metadata is in use.
        let archive = match ArchiveRO::open(filename) {
            Some(ar) => ar,
            None => {
                debug!("llvm didn't like `{}`", filename.display());
                return Err(format!("failed to read rlib metadata: '{}'",
                                   filename.display()));
            }
        };
        return match ArchiveMetadata::new(archive).map(|ar| MetadataArchive(ar)) {
            None => {
                return Err((format!("failed to read rlib metadata: '{}'",
                                    filename.display())))
            }
            Some(blob) => return Ok(blob)
        }
    }
    unsafe {
        let mb = filename.with_c_str(|buf| {
            llvm::LLVMRustCreateMemoryBufferWithContentsOfFile(buf)
        });
        if mb as int == 0 {
            return Err(format!("error reading library: '{}'",
                               filename.display()))
        }
        let of = match ObjectFile::new(mb) {
            Some(of) => of,
            _ => {
                return Err((format!("provided path not an object file: '{}'",
                                    filename.display())))
            }
        };
        let si = mk_section_iter(of.llof);
        while llvm::LLVMIsSectionIteratorAtEnd(of.llof, si.llsi) == False {
            let mut name_buf = ptr::null();
            let name_len = llvm::LLVMRustGetSectionName(si.llsi, &mut name_buf);
            let name = String::from_raw_buf_len(name_buf as *const u8,
                                                name_len as uint);
            debug!("get_metadata_section: name {}", name);
            if read_meta_section_name(is_osx) == name {
                let cbuf = llvm::LLVMGetSectionContents(si.llsi);
                let csz = llvm::LLVMGetSectionSize(si.llsi) as uint;
                let cvbuf: *const u8 = cbuf as *const u8;
                let vlen = encoder::metadata_encoding_version.len();
                debug!("checking {} bytes of metadata-version stamp",
                       vlen);
                let minsz = cmp::min(vlen, csz);
                let buf0 = slice::from_raw_buf(&cvbuf, minsz);
                let version_ok = buf0 == encoder::metadata_encoding_version;
                if !version_ok {
                    return Err((format!("incompatible metadata version found: '{}'",
                                        filename.display())));
                }

                let cvbuf1 = cvbuf.offset(vlen as int);
                debug!("inflating {} bytes of compressed metadata",
                       csz - vlen);
                let bytes = slice::from_raw_buf(&cvbuf1, csz-vlen);
                match flate::inflate_bytes(bytes) {
                    Some(inflated) => return Ok(MetadataVec(inflated)),
                    None => {}
                }
            }
            llvm::LLVMMoveToNextSection(si.llsi);
        }
        return Err(format!("metadata not found: '{}'", filename.display()));
    }
}

pub fn meta_section_name(is_osx: bool) -> &'static str {
    if is_osx {
        "__DATA,__note.rustc"
    } else {
        ".note.rustc"
    }
}

pub fn read_meta_section_name(is_osx: bool) -> &'static str {
    if is_osx {
        "__note.rustc"
    } else {
        ".note.rustc"
    }
}

// A diagnostic function for dumping crate metadata to an output stream
pub fn list_file_metadata(is_osx: bool, path: &Path,
                          out: &mut io::Writer) -> io::IoResult<()> {
    match get_metadata_section(is_osx, path) {
        Ok(bytes) => decoder::list_crate_metadata(bytes.as_slice(), out),
        Err(msg) => {
            write!(out, "{}\n", msg)
        }
    }
}