1 \input texinfo @c -*-texinfo-*-
4 @settitle Rust Documentation
11 This manual is for the ``Rust'' programming language.
14 @uref{http://www.rust-lang.org}
19 Copyright 2006-2010 Graydon Hoare
21 Copyright 2009-2011 Mozilla Foundation
23 See accompanying LICENSE.txt for terms.
27 @dircategory Programming
29 * rust: (rust). Rust programming language
34 @subtitle A safe, concurrent, practical language.
36 @author Mozilla Foundation
39 @vskip 0pt plus 1filll
42 @uref{http://rust-lang.org}
48 Copyright @copyright{} 2006-2010 Graydon Hoare
50 Copyright @copyright{} 2009-2011 Mozilla Foundation
52 See accompanying LICENSE.txt for terms.
56 @everyfooting @| @emph{-- Draft @today --} @|
67 * Disclaimer:: Notes on a work in progress.
68 * Introduction:: Background, intentions, lineage.
69 * Tutorial:: Gentle introduction to reading Rust code.
70 * Reference:: Systematic reference of language elements.
75 Complete table of contents
80 @c ############################################################
82 @c ############################################################
89 Rust is a work in progress. The language continues to evolve as the design
90 shifts and is fleshed out in working code. Certain parts work, certain parts
91 do not, certain parts will be removed or changed.
93 This manual is a snapshot written in the present tense. Some features
94 described do not yet exist in working code. Some may be temporary. It
95 is a @emph{draft}, and we ask that you not take anything you read here
96 as either definitive or final. The manual is to help you get a sense
97 of the language and its organization, not to serve as a complete
98 specification. At least not yet.
100 If you have suggestions to make, please try to focus them on @emph{reductions}
101 to the language: possible features that can be combined or omitted. At this
102 point, every ``additive'' feature we're likely to support is already on the
103 table. The task ahead involves combining, trimming, and implementing.
106 @c ############################################################
108 @c ############################################################
111 @chapter Introduction
114 We have to fight chaos, and the most effective way of doing that is
115 to prevent its emergence.
122 Rust is a curly-brace, block-structured expression language. It visually
123 resembles the C language family, but differs significantly in syntactic and
124 semantic details. Its design is oriented toward concerns of ``programming in
125 the large'', that is, of creating and maintaining @emph{boundaries} -- both
126 abstract and operational -- that preserve large-system @emph{integrity},
127 @emph{availability} and @emph{concurrency}.
129 It supports a mixture of imperative procedural, concurrent actor,
130 object-oriented and pure functional styles. Rust also supports generic
131 programming and metaprogramming, in both static and dynamic styles.
134 * Goals:: Intentions, motivations.
135 * Sales Pitch:: A summary for the impatient.
136 * Influences:: Relationship to past languages.
143 The language design pursues the following goals:
147 @item Compile-time error detection and prevention.
148 @item Run-time fault tolerance and containment.
149 @item System building, analysis and maintenance affordances.
150 @item Clarity and precision of expression.
151 @item Implementation simplicity.
152 @item Run-time efficiency.
153 @item High concurrency.
157 Note that most of these goals are @emph{engineering} goals, not showcases for
158 sophisticated language technology. Most of the technology in Rust is
159 @emph{old} and has been seen decades earlier in other languages.
161 All new languages are developed in a technological context. Rust's goals arise
162 from the context of writing large programs that interact with the internet --
163 both servers and clients -- and are thus much more concerned with
164 @emph{safety} and @emph{concurrency} than older generations of program. Our
165 experience is that these two forces do not conflict; rather they drive system
166 design decisions toward extensive use of @emph{partitioning} and
167 @emph{statelessness}. Rust aims to make these a more natural part of writing
168 programs, within the niche of lower-level, practical, resource-conscious programming.
176 The following comprises a brief ``sales pitch'' overview of the salient
177 features of Rust, relative to other languages.
182 @item No @code{null} pointers
184 The initialization state of every slot is statically computed as part of the
185 typestate system (see below), and requires that all slots are initialized
186 before use. There is no @code{null} value; uninitialized slots are
187 uninitialized and can only be written to, not read.
189 The common use for @code{null} in other languages -- as a sentinel value -- is
190 subsumed into the more general facility of disjoint union types. A program
191 must explicitly model its use of such types.
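As a rough illustration, the sentinel role of @code{null} can be modelled with a disjoint union. The sketch below uses present-day Rust syntax, which differs from the syntax this draft describes; the names @code{find_first_even} and @code{describe} are hypothetical.

```rust
/// Returns the first even number in `xs`, or `None` when there is none.
/// `Option<i32>` is a disjoint union of `Some(value)` and `None`.
fn find_first_even(xs: &[i32]) -> Option<i32> {
    for &x in xs {
        if x % 2 == 0 {
            return Some(x);
        }
    }
    None
}

/// The caller must handle both cases explicitly; there is no null to forget.
fn describe(result: Option<i32>) -> String {
    match result {
        Some(n) => format!("found {}", n),
        None => String::from("nothing found"),
    }
}
```

Because the absence of a value is part of the return type, a caller cannot read the result without first deciding what to do in the empty case.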
194 @item Lightweight tasks with no shared values
196 Like many @emph{actor} languages, Rust provides an isolation (and concurrency)
197 model based on lightweight tasks scheduled by the language runtime. These
198 tasks are very inexpensive and statically unable to manipulate one another's
199 local memory. Breaking the rule of task isolation is possible only by calling
200 external (C/C++) code.
202 Inter-task communication is typed, asynchronous, and simplex, based on passing
203 messages over channels to ports.
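The message-passing style can be sketched with the channel type of the present-day standard library. This is only an approximation of the model described here: it assumes @code{std::sync::mpsc} and operating-system threads rather than runtime-scheduled lightweight tasks, and the function name @code{squares_via_channel} is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns a worker that sends the squares of `inputs` back over a channel.
/// The channel is typed (it carries `u64`), asynchronous, and one-directional.
fn squares_via_channel(inputs: Vec<u64>) -> Vec<u64> {
    let (sender, receiver) = mpsc::channel::<u64>();

    let worker = thread::spawn(move || {
        for n in inputs {
            // `send` does not wait for the receiver to be ready.
            sender.send(n * n).expect("receiver hung up");
        }
        // `sender` is dropped here, which closes the channel.
    });

    // Iterating the receiver ends once every sender has been dropped.
    let results: Vec<u64> = receiver.iter().collect();
    worker.join().expect("worker panicked");
    results
}
```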
206 @item Predictable native code, simple runtime
208 The meaning and cost of every operation within a Rust program is intended to
209 be easy to model for the reader. The code should not ``surprise'' the
210 programmer once it has been compiled.
212 Rust compiles to native code. Rust compilation units are large and the
213 compilation model is designed around multi-file, whole-library or
214 whole-program optimization. The compiled units are standard loadable objects
215 (ELF, PE, Mach-O) containing standard debug information (DWARF) and are
216 compatible with existing, standard low-level tools (disassemblers, debuggers,
217 profilers, dynamic loaders). The compiled units include custom metadata that
218 carries full type and version information.
220 The Rust runtime library is a small collection of support code for scheduling,
221 memory management, inter-task communication, reflection and runtime
222 linkage. This library is written in standard C++ and is quite
223 straightforward. It presents a simple interface to embeddings. No
224 research-level virtual machine, JIT or garbage collection technology is
225 required. It should be relatively easy to adapt a Rust front-end onto many
226 existing native toolchains.
229 @item Integrated system-construction facility
231 The units of compilation of Rust are multi-file amalgamations called
232 @emph{crates}. A crate is described by a separate, declarative type of source
233 file that guides the compilation of the crate, its packaging, its versioning,
234 and its external dependencies. Crates are also the units of distribution and
235 loading. Significantly, the dependency graph of crates is @emph{acyclic} and
236 @emph{anonymous}: there is no global namespace for crates, and module-level
237 recursion cannot cross crate barriers.
239 Unlike many languages, individual modules do @emph{not} carry all the
240 mechanisms or restrictions of crates. Modules and crates serve different roles.
244 @item Static control over memory allocation, packing and aliasing.
246 Many values in Rust are allocated @emph{within} their containing stack-frame
247 or parent structure. Numbers, records, tuples and tags are all allocated this
248 way. To allocate such values in the heap, they must be explicitly
249 @emph{boxed}. A @dfn{box} is a pointer to a heap allocation that holds another
250 value, its @emph{content}. Boxes may be either shared or unique, depending
251 on which sort of storage management is desired.
253 Boxing and unboxing in Rust is explicit, though in some cases (such as
254 name-component dereferencing) Rust will automatically dereference a
255 box to access its content. Box values can be passed and assigned
256 independently, like pointers in C; the difference is that in Rust they always
257 point to live contents, and are not subject to pointer arithmetic.
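A minimal sketch of explicit boxing, assuming the @code{Box} type of present-day Rust (a uniquely owned heap allocation) in place of the box syntax of this draft. The type @code{Point} and the function names are hypothetical.

```rust
/// A record allocated inline, within its containing frame or parent value.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Moves a `Point` to the heap explicitly.
fn boxed(p: Point) -> Box<Point> {
    Box::new(p)
}

/// Field access dereferences the box automatically; `b.x` needs no `*`.
fn sum_of_fields(b: &Box<Point>) -> i32 {
    b.x + b.y
}

/// Explicit unboxing moves the content back out of the heap allocation.
fn unboxed(b: Box<Point>) -> Point {
    *b
}
```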
259 In addition to boxes, Rust supports a kind of pass-by-pointer slot called a
260 reference. Forming or releasing a reference does not perform reference-count
261 operations; references can only be formed on values that will provably outlive
262 the reference. References are not ``general values'', in the sense that they
263 cannot be independently manipulated. They are a lot like C++'s references,
264 except that they are safe: the compiler ensures that they always point to live contents.
267 In addition, every slot (stack-local allocation or reference) has a static
268 initialization state that is calculated by the typestate system. This permits
269 late initialization of slots in functions with complex control-flow, while
270 still guaranteeing that every use of a slot occurs after it has been initialized.
274 @item Immutable data by default
276 All types in Rust are immutable by default. A field within a type must be
277 declared as @code{mutable} in order to be modified.
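The default can be illustrated as follows. Note the assumption: in present-day Rust, mutability is requested on a binding or reference with @code{mut} rather than on a field as this draft describes, so the sketch shows the principle, not the draft syntax. The function names are hypothetical.

```rust
/// Adds `delta` to a counter through a mutable reference and returns the total.
fn bump(counter: &mut u32, delta: u32) -> u32 {
    *counter += delta;
    *counter
}

fn demo_mutability() -> (u32, u32) {
    let fixed = 10; // immutable: `fixed += 1` would be rejected at compile time
    let mut total = 0; // mutation must be requested with `mut`
    bump(&mut total, fixed);
    bump(&mut total, 5);
    (fixed, total)
}
```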
280 @item Move semantics and unique pointers
282 Rust differentiates copying values from moving them, and permits moving and
283 swapping values explicitly rather than copying. Moving can be more efficient and,
284 crucially, represents an indivisible transfer of ownership of a value from its
285 source to its destination.
287 In addition, pointer types in Rust come in several varieties. One important
288 type of pointer related to move semantics is the @emph{unique} pointer,
289 denoted @code{~}, which is statically guaranteed to be the only pointer
290 pointing to its referent at any given time.
292 Combining move-semantics and unique pointers, Rust permits a very lightweight
293 form of inter-task communication: values are sent between tasks by moving, and
294 only types composed of unique pointers can be sent. This statically ensures
295 there can never be sharing of data between tasks, while keeping the costs of
296 transferring data between tasks as cheap as moving a pointer.
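A sketch of transfer by move, assuming present-day Rust, where @code{Box} plays the role of the unique pointer written @code{~} in this draft and a thread stands in for a task. The function name @code{sum_on_other_thread} is hypothetical.

```rust
use std::thread;

/// Sends a uniquely owned buffer to another thread by moving it.
/// Only the pointer is transferred; the heap contents are not copied.
fn sum_on_other_thread(data: Box<[u32]>) -> u32 {
    let handle = thread::spawn(move || {
        // `data` now belongs to this thread alone.
        data.iter().sum::<u32>()
    });
    // Using `data` here would be a compile-time error: it has been moved.
    handle.join().expect("worker panicked")
}
```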
299 @item Stack-based iterators
301 Rust provides a type of function-like multiple-invocation iterator that is
302 very efficient: the iterator state lives only on the stack and is tightly
303 coupled to the loop that invoked it.
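The closest present-day analogue is a value implementing the @code{Iterator} trait, whose state sits in the frame of the loop that drives it. This is an assumption about how the idea maps onto current Rust, not the iterator form of this draft; @code{Countdown} and @code{sum_countdown} are hypothetical names.

```rust
/// Counts down from a starting value to 1. The whole iterator state is
/// this one field, which lives in the frame of the loop that drives it.
struct Countdown {
    remaining: u32,
}

impl Countdown {
    fn starting_at(start: u32) -> Countdown {
        Countdown { remaining: start }
    }
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None;
        }
        let current = self.remaining;
        self.remaining -= 1;
        Some(current)
    }
}

/// Drives the iterator with a `for` loop; no heap allocation is involved.
fn sum_countdown(start: u32) -> u32 {
    let mut total = 0;
    for n in Countdown::starting_at(start) {
        total += n;
    }
    total
}
```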
306 @item Direct interface to C code
308 Rust can load and call many C library functions simply by declaring
309 them. Calling a C function is an ``unsafe'' action, and can only be taken
310 within a block marked with the @code{unsafe} keyword. Every unsafe block
311 in a Rust compilation unit must be explicitly authorized in the crate file.
314 @item Structural algebraic data types
316 The Rust type system is primarily structural, and contains the standard
317 assortment of useful ``algebraic'' type constructors from functional
318 languages, such as function types, tuples, record types, vectors, and
319 nominally-tagged disjoint unions. Such values may be @emph{pattern-matched} in
320 an @code{alt} expression.
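A small sketch of a tagged disjoint union and pattern matching. It assumes present-day Rust, where the matching expression is spelled @code{match} rather than @code{alt}; the type @code{Shape} and function @code{area} are hypothetical.

```rust
/// A nominally tagged disjoint union with three variants.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
    Empty,
}

/// Pattern matching takes the value apart by variant. The match must be
/// exhaustive, so forgetting a variant is a compile-time error.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
        Shape::Empty => 0.0,
    }
}
```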
325 Rust supports a simple form of parametric polymorphism: functions, iterators,
326 types and objects can be parametrized by other types.
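A sketch of a function and a type parametrized by another type. It assumes present-day Rust, which additionally requires trait bounds such as @code{PartialOrd} that this draft does not describe; @code{largest} and @code{Pair} are hypothetical names.

```rust
/// Returns the largest element of `items`, or `None` for an empty slice.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut best = *items.first()?;
    for &item in items {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

/// A type parametrized by the type of its two fields.
struct Pair<T> {
    first: T,
    second: T,
}

impl<T> Pair<T> {
    /// Exchanges the two fields, whatever their type.
    fn swap(self) -> Pair<T> {
        Pair { first: self.second, second: self.first }
    }
}
```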
329 @item Argument binding
331 Rust provides a mechanism of partially binding arguments to functions,
332 producing new functions that accept the remaining un-bound arguments. This
333 mechanism combines some of the features of lexical closures with some of the
334 features of currying, in a smaller and simpler package.
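The effect of binding an argument can be approximated with a closure. This is an assumption made for illustration: present-day Rust has no dedicated binding form, and the helper @code{bind_first} below is hypothetical, not a library function.

```rust
/// Binds the first argument of a two-argument function, producing a new
/// function that accepts the remaining argument.
fn bind_first<A, B, R>(f: impl Fn(A, B) -> R, a: A) -> impl Fn(B) -> R
where
    A: Clone,
{
    // The bound argument is stored in the closure and cloned on each call.
    move |b| f(a.clone(), b)
}

fn add(x: i32, y: i32) -> i32 {
    x + y
}
```

For instance, @code{bind_first(add, 10)} yields a one-argument function that adds ten to its input.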
337 @item Local type inference
339 To save some quantity of programmer key-pressing, Rust supports local type
340 inference: signatures of functions, objects and iterators always require type
341 annotation, but within the body of a function or iterator many slots can be
342 declared without a type, and Rust will infer the slot's type from its uses.
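A short sketch in present-day Rust syntax (the function name @code{mean} is hypothetical): the signature carries full annotations, while the locals in the body carry none.

```rust
/// The signature is fully annotated; the locals inside are not.
fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut total = 0.0; // inferred as f64 from the addition below
    for &v in values {
        total += v;
    }
    let count = values.len(); // inferred as usize from `len`
    Some(total / count as f64)
}
```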
345 @item Structural object system
347 Rust has a lightweight object system based on structural object types: there
348 is no ``class hierarchy'' nor any concept of inheritance. Method overriding
349 and object restriction are performed explicitly on object values, which are
350 little more than order-insensitive records of methods sharing a common private value.
354 @item Static metaprogramming (syntactic extension)
356 Rust supports a system for syntactic extensions that can be loaded into the
357 compiler, to implement user-defined notations, macros, program-generators and
358 the like. These notations are @emph{marked} using a special form of
359 bracketing, such that a reader unfamiliar with the extension can still parse
360 the surrounding text by skipping over the bracketed ``extension text''.
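As a loose analogue, present-day Rust marks macro invocations with a trailing @code{!} and a bracketed argument list, so a reader can skip the bracketed text. The sketch assumes @code{macro_rules!}, which is not the extension mechanism of this draft; @code{max_of} is a hypothetical macro.

```rust
// Expands to the largest of one or more comma-separated expressions.
macro_rules! max_of {
    ($x:expr) => { $x };
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

/// The invocation is visibly marked: `max_of!( ... )`.
fn largest_of_three(a: i32, b: i32, c: i32) -> i32 {
    max_of!(a, b, c)
}
```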
363 @item Idempotent failure
365 If a task fails due to a signal, or if it evaluates the special @code{fail}
366 expression, it enters the @emph{failing} state. A failing task unwinds its
367 control stack, frees all of its owned resources (executing destructors) and
368 enters the @emph{dead} state. Failure is idempotent and non-recoverable.
371 @item Supervision hierarchy
373 Rust has a system for propagating task-failures, either directly to a
374 supervisor task, or indirectly by sending a message into a channel.
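Failure containment and reporting to a supervisor can be sketched as below. The sketch assumes present-day Rust, where a panic in a thread approximates task failure and the spawning thread learns of it through @code{join}; the function name @code{run_isolated} is hypothetical.

```rust
use std::thread;

/// Runs `job` on its own thread and reports whether it failed.
/// A panic unwinds only that thread's stack; the caller observes an `Err`.
fn run_isolated<F>(job: F) -> Result<i32, String>
where
    F: FnOnce() -> i32 + Send + 'static,
{
    thread::spawn(job)
        .join()
        .map_err(|_| String::from("task failed"))
}
```

The failed thread cannot resume; the supervising code only decides what to do next.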
377 @item Resource types with deterministic destruction
379 Rust includes a type constructor for @emph{resource} types, which have an
380 associated destructor and cannot be moved in memory. Resource types belong to
381 the kind of @emph{pinned} types, and any value that directly contains a
382 resource is implicitly pinned as well.
384 Resources can only contain types from the pinned or unique kinds of type,
385 which means that unlike finalizers, there is always a deterministic, top-down
386 order to run the destructors of a resource and its sub-resources.
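The deterministic, top-down order can be observed with destructors that record when they run. The sketch assumes the @code{Drop} trait of present-day Rust, whose values may be moved, unlike the pinned resources described here; @code{Resource}, @code{Parent} and @code{destruction_order} are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A value with a destructor that records when it runs.
struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Owns two sub-resources. Its own destructor runs first, then its
/// fields are destroyed in declaration order.
struct Parent {
    log: Rc<RefCell<Vec<&'static str>>>,
    first: Resource,
    second: Resource,
}

impl Drop for Parent {
    fn drop(&mut self) {
        self.log.borrow_mut().push("parent");
    }
}

/// Builds a parent, lets it go out of scope, and returns the recorded order.
fn destruction_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _parent = Parent {
            log: Rc::clone(&log),
            first: Resource { name: "first", log: Rc::clone(&log) },
            second: Resource { name: "second", log: Rc::clone(&log) },
        };
        // `_parent` is destroyed at the end of this block.
    }
    let order = log.borrow().clone();
    order
}
```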
389 @item Typestate system
391 Every storage slot in a Rust frame participates in not only a conventional
392 structural static type system, describing the interpretation of memory in the
393 slot, but also a @emph{typestate} system. The static typestates of a program
394 describe the set of @emph{pure, dynamic predicates} that provably hold over
395 some set of slots, at each point in the program's control-flow graph within
396 each frame. The static calculation of the typestates of a program is a
397 function-local dataflow problem, and handles user-defined predicates in a
398 similar fashion to the way the type system permits user-defined types.
400 A short way of thinking of this is: types statically model values,
401 typestates statically model @emph{assertions that hold} before and
402 after statements and expressions.
413 The essential problem that must be solved in making a fault-tolerant
414 software system is therefore that of fault-isolation. Different programmers
415 will write different modules, some modules will be correct, others will have
416 errors. We do not want the errors in one module to adversely affect the
417 behaviour of a module which does not have any errors.
426 In our approach, all data is private to some process, and processes can
427 only communicate through communications channels. @emph{Security}, as used
428 in this paper, is the property which guarantees that processes in a system
429 cannot affect each other except by explicit communication.
431 When security is absent, nothing which can be proven about a single module
432 in isolation can be guaranteed to hold when that module is embedded in a
435 - Robert Strom and Shaula Yemini
441 Concurrent and applicative programming complement each other. The
442 ability to send messages on channels provides I/O without side effects,
443 while the avoidance of shared data helps keep concurrent processes from
452 Rust is not a particularly original language. It may however appear unusual by
453 contemporary standards, as its design elements are drawn from a number of
454 ``historical'' languages that have, with a few exceptions, fallen out of
455 favour. Five prominent lineages contribute the most:
460 The NIL (1981) and Hermes (1990) family. These languages were developed by
461 Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM
462 Watson Research Center (Yorktown Heights, NY, USA).
466 The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes
467 Wikstr@"om, Mike Williams and others in their group at the Ericsson Computer
468 Science Laboratory (@"Alvsj@"o, Stockholm, Sweden).
472 The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim,
473 Heinz Schmidt and others in their group at The International Computer Science
474 Institute of the University of California, Berkeley (Berkeley, CA, USA).
478 The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These languages
479 were developed by Rob Pike, Phil Winterbottom, Sean Dorward and others in
480 their group at Bell Labs Computing Sciences Research Center (Murray Hill, NJ, USA).
485 The Napier (1985) and Napier88 (1988) family. These languages were developed
486 by Malcolm Atkinson, Ron Morrison and others in their group at the University
487 of St. Andrews (St. Andrews, Fife, UK).
491 Additional specific influences can be seen from the following languages:
493 @item The structural algebraic types and compilation manager of SML.
494 @item The deterministic destructor system of C++.
497 @c ############################################################
499 @c ############################################################
506 @c ############################################################
508 @c ############################################################
514 * Ref.Lex:: Lexical structure.
515 * Ref.Path:: References to items.
516 * Ref.Gram:: Grammar.
517 * Ref.Comp:: Compilation and component model.
518 * Ref.Mem:: Semantic model of memory.
519 * Ref.Task:: Semantic model of tasks.
520 * Ref.Item:: The components of a module.
521 * Ref.Type:: The types of values held in memory.
522 * Ref.Typestate:: Predicates that hold at points in time.
523 * Ref.Stmt:: Components of an executable block.
524 * Ref.Expr:: Units of execution and evaluation.
525 * Ref.Run:: Organization of runtime services.
530 @c * Ref.Lex:: Lexical structure.
531 @cindex Lexical structure
534 The lexical structure of a Rust source file or crate file is defined in terms
535 of Unicode character codes and character properties.
537 Groups of Unicode character codes and characters are organized into
538 @emph{tokens}. A token is the longest contiguous sequence of
539 characters belonging to a single token type (identifier, keyword, literal, symbol),
540 ending where the token type changes or where ignored characters intervene.
542 Most tokens in Rust follow rules similar to the C family.
544 Most tokens (including whitespace, keywords, operators and structural symbols)
545 are drawn from the ASCII-compatible range of Unicode. Identifiers are drawn
546 from Unicode characters specified by the @code{XID_Start} and
547 @code{XID_Continue} rules given by UAX #31@footnote{Unicode Standard Annex
548 #31: Unicode Identifier and Pattern Syntax}. String and character literals may
549 include the full range of Unicode characters.
551 @emph{TODO: formalize this section much more}.
554 * Ref.Lex.Ignore:: Ignored characters.
555 * Ref.Lex.Ident:: Identifier tokens.
556 * Ref.Lex.Key:: Keyword tokens.
557 * Ref.Lex.Res:: Reserved tokens.
558 * Ref.Lex.Num:: Numeric tokens.
559 * Ref.Lex.Text:: String and character tokens.
560 * Ref.Lex.Syntax:: Syntactic extension tokens.
561 * Ref.Lex.Sym:: Special symbol tokens.
565 @subsection Ref.Lex.Ignore
566 @c * Ref.Lex.Ignore:: Ignored tokens.
568 Characters considered to be @emph{whitespace} or @emph{comment} are ignored,
569 and are not considered as tokens. They serve only to delimit tokens. Rust is
570 otherwise a free-form language.
572 @dfn{Whitespace} is any of the following Unicode characters: U+0020 (space),
573 U+0009 (tab, @code{'\t'}), U+000A (LF, @code{'\n'}), U+000D (CR, @code{'\r'}).
575 @dfn{Comments} are @emph{single-line comments} or @emph{multi-line comments}.
577 A @dfn{single-line comment} is any sequence of Unicode characters beginning
578 with U+002F U+002F (@code{"//"}) and extending to the next U+000A character,
579 @emph{excluding} cases in which such a sequence occurs within a string literal token.
582 A @dfn{multi-line comment} is any sequence of Unicode characters beginning
583 with U+002F U+002A (@code{"/*"}) and ending with U+002A U+002F (@code{"*/"}),
584 @emph{excluding} cases in which such a sequence occurs within a string literal
585 token. Multi-line comments may be nested.
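The comment rules above can be sketched as a small scanner that tracks nesting depth. This is an illustrative simplification, not the lexer of any Rust implementation: it handles string literals but not character literals, and the function name @code{strip_comments} is hypothetical.

```rust
/// Removes single-line and (nested) multi-line comments from `src`,
/// leaving comment-like text inside string literals untouched.
fn strip_comments(src: &str) -> String {
    let chars: Vec<char> = src.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    let mut depth = 0usize; // nesting depth of multi-line comments
    let mut in_string = false; // currently inside a "..." literal

    while i < chars.len() {
        let c = chars[i];
        let next = chars.get(i + 1).copied();
        if depth > 0 {
            if c == '/' && next == Some('*') {
                depth += 1;
                i += 2;
            } else if c == '*' && next == Some('/') {
                depth -= 1;
                i += 2;
            } else {
                i += 1;
            }
        } else if in_string {
            out.push(c);
            if c == '\\' {
                // Keep the escaped character so an escaped quote does not end the string.
                if let Some(n) = next {
                    out.push(n);
                    i += 1;
                }
            } else if c == '"' {
                in_string = false;
            }
            i += 1;
        } else if c == '"' {
            in_string = true;
            out.push(c);
            i += 1;
        } else if c == '/' && next == Some('/') {
            // Skip to the next LF, which is kept.
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
        } else if c == '/' && next == Some('*') {
            depth = 1;
            i += 2;
        } else {
            out.push(c);
            i += 1;
        }
    }
    out
}
```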
588 @subsection Ref.Lex.Ident
589 @c * Ref.Lex.Ident:: Identifier tokens.
590 @cindex Identifier token
592 Identifiers follow the rules given by Unicode Standard Annex #31, in the form
593 closed under NFKC normalization, @emph{excluding} those tokens that are
594 otherwise defined as keywords or reserved
595 tokens. @xref{Ref.Lex.Key}. @xref{Ref.Lex.Res}.
597 That is: an identifier starts with any character having derived property
598 @code{XID_Start} and continues with zero or more characters having derived
599 property @code{XID_Continue}; and such an identifier is NFKC-normalized during
600 lexing, such that all subsequent comparison of identifiers is performed on the
601 NFKC-normalized forms.
603 @emph{TODO: define relationship between Unicode and Rust versions}.
605 @footnote{This identifier syntax is a superset of the identifier syntaxes of C
606 and Java, and is modeled on Python PEP #3131, which formed the definition of
607 identifiers in Python 3.0 and later.}
610 @subsection Ref.Lex.Key
611 @c * Ref.Lex.Key:: Keyword tokens.
618 @multitable @columnfractions .15 .15 .15 .15 .15
623 @tab @code{unchecked}
685 @subsection Ref.Lex.Res
686 @c * Ref.Lex.Res:: Reserved tokens.
688 The reserved tokens are:
693 @multitable @columnfractions .15 .15 .15 .15 .15
705 At present these tokens have no defined meaning in the Rust language.
707 These tokens may correspond, in some current or future implementation,
708 to additional built-in types for decimal floating-point, extended
709 binary and interchange floating-point formats, as defined in the IEEE
710 754-1985 and IEEE 754-2008 specifications.
714 @subsection Ref.Lex.Num
715 @c * Ref.Lex.Num:: Numeric tokens.
718 @cindex Decimal token
720 @cindex Floating-point token
722 @c FIXME: This discussion isn't quite right since 'f' and 'i' can be used as
725 A @dfn{number literal} is either an @emph{integer literal} or a
726 @emph{floating-point literal}.
729 An @dfn{integer literal} has one of three forms:
731 @item A @dfn{decimal literal} starts with a @emph{decimal digit} and continues
732 with any mixture of @emph{decimal digits} and @emph{underscores}.
734 @item A @dfn{hex literal} starts with the character sequence U+0030
735 U+0078 (@code{"0x"}) and continues as any mixture of @emph{hex digits}
736 and @emph{underscores}.
738 @item A @dfn{binary literal} starts with the character sequence U+0030
739 U+0062 (@code{"0b"}) and continues as any mixture of @emph{binary digits}
740 and @emph{underscores}.
744 By default, an integer literal is of type @code{int}. An integer literal may
745 be followed (immediately, without any spaces) by an @dfn{integer suffix}, which
746 changes the type of the literal. There are three kinds of integer literal suffix:
750 @item The @code{u} suffix gives the literal type @code{uint}.
751 @item The @code{g} suffix gives the literal type @code{big}.
752 @item Each of the signed and unsigned machine types @code{u8}, @code{i8},
753 @code{u16}, @code{i16}, @code{u32}, @code{i32}, @code{u64} and @code{i64}
754 gives the literal the corresponding machine type.
758 A @dfn{floating-point literal} has one of two forms:
760 @item Two @emph{decimal literals} separated by a period
761 character U+002E ('.'), with an optional @emph{exponent} trailing after the
762 second @emph{decimal literal}.
763 @item A single @emph{decimal literal} followed by an @emph{exponent}.
766 By default, a floating-point literal is of type @code{float}. A floating-point
767 literal may be followed (immediately, without any spaces) by a
768 @dfn{floating-point suffix}, which changes the type of the literal. There are
769 only two floating-point suffixes: @code{f32} and @code{f64}. Each of these
770 gives the floating point literal the associated type, rather than @code{float}.
773 A set of suffixes is also reserved to accommodate literal support for
774 types corresponding to reserved tokens. The reserved suffixes are @code{f16},
775 @code{f80}, @code{f128}, @code{m}, @code{m32}, @code{m64} and @code{m128}.
778 A @dfn{hex digit} is either a @emph{decimal digit} or else a character in the
779 ranges U+0061-U+0066 and U+0041-U+0046 (@code{'a'}-@code{'f'},
780 @code{'A'}-@code{'F'}).
782 A @dfn{binary digit} is either the character U+0030 or U+0031 (@code{'0'} or @code{'1'}).
785 An @dfn{exponent} begins with either of the characters U+0065 or U+0045
786 (@code{'e'} or @code{'E'}), followed by an optional @emph{sign character},
787 followed by a trailing @emph{decimal literal}.
789 A @dfn{sign character} is either U+002B or U+002D (@code{'+'} or @code{'-'}).
792 Examples of integer literals of various forms:
799 0b1111_1111_1001_0000_i32; // type i32
800 0xffff_ffff_ffff_ffff_ffff_ffffg; // type big
804 Examples of floating-point literals of various forms:
809 12E+99_f64; // type f64
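For comparison, a sketch of the literal forms that present-day Rust accepts. It assumes current syntax, which keeps the machine-type suffixes and underscores but does not have the @code{u} or @code{g} suffixes described above; the function names are hypothetical.

```rust
/// Integer literals in decimal, hex and binary form, with and without suffixes.
fn integer_literals() -> (i32, u8, i32, u64) {
    (
        123,                       // decimal; typed i32 by the return type
        0xff_u8,                   // hex with a machine-type suffix
        0b1111_1111_1001_0000_i32, // binary with underscores
        1_000_000u64,              // underscores as digit separators
    )
}

/// Floating-point literals with a fraction, a suffix, and an exponent.
fn float_literals() -> (f64, f32, f64) {
    (123.0, 0.1f32, 12E+99_f64)
}
```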
814 @subsection Ref.Lex.Text
815 @c * Ref.Lex.Text:: String and character tokens.
817 @cindex Character token
818 @cindex Escape sequence
821 A @dfn{character literal} is a single Unicode character enclosed within two
822 U+0027 (single-quote) characters, with the exception of U+0027 itself, which
823 must be @emph{escaped} by a preceding U+005C character ('\').
825 A @dfn{string literal} is a sequence of any Unicode characters enclosed
826 within two U+0022 (double-quote) characters, with the exception of U+0022
827 itself, which must be @emph{escaped} by a preceding U+005C character ('\').
830 Some additional @emph{escapes} are available in either character or string
831 literals. An escape starts with a U+005C ('\') and continues with one
832 of the following forms:
834 @item An @dfn{8-bit codepoint escape} starts with U+0078 ('x') and is
835 followed by exactly two @dfn{hex digits}. It denotes the Unicode codepoint
836 equal to the provided hex value.
837 @item A @dfn{16-bit codepoint escape} starts with U+0075 ('u') and is followed
838 by exactly four @dfn{hex digits}. It denotes the Unicode codepoint equal to
839 the provided hex value.
840 @item A @dfn{32-bit codepoint escape} starts with U+0055 ('U') and is followed
841 by exactly eight @dfn{hex digits}. It denotes the Unicode codepoint equal to
842 the provided hex value.
843 @item A @dfn{whitespace escape} is one of the characters U+006E, U+0072, or
844 U+0074, denoting the Unicode values U+000A (LF), U+000D (CR) or U+0009 (HT) respectively.
846 @item The @dfn{backslash escape} is the character U+005C ('\') which must be
847 escaped in order to denote @emph{itself}.
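The escape forms listed above can be sketched as a decoder over the body of a literal. This follows the rules of this draft (present-day Rust spells Unicode escapes differently) and is an illustration only; @code{unescape} and @code{hex_codepoint} are hypothetical names.

```rust
/// Decodes the escapes described above in the body of a literal.
/// Returns `None` for a malformed or unknown escape.
fn unescape(body: &str) -> Option<String> {
    let mut out = String::new();
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        let decoded = match chars.next()? {
            'x' => hex_codepoint(&mut chars, 2)?, // 8-bit codepoint escape
            'u' => hex_codepoint(&mut chars, 4)?, // 16-bit codepoint escape
            'U' => hex_codepoint(&mut chars, 8)?, // 32-bit codepoint escape
            'n' => '\n',
            'r' => '\r',
            't' => '\t',
            '\\' => '\\',
            '\'' => '\'',
            '"' => '"',
            _ => return None,
        };
        out.push(decoded);
    }
    Some(out)
}

/// Reads exactly `digits` hex digits and converts them to a character.
fn hex_codepoint(chars: &mut std::str::Chars<'_>, digits: usize) -> Option<char> {
    let mut value: u32 = 0;
    for _ in 0..digits {
        let digit = chars.next()?.to_digit(16)?;
        value = value * 16 + digit;
    }
    // Rejects surrogates and values beyond the Unicode range.
    char::from_u32(value)
}
```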
851 @subsection Ref.Lex.Syntax
852 @c * Ref.Lex.Syntax:: Syntactic extension tokens.
854 Syntactic extensions are marked with the @emph{pound} sigil U+0023 (@code{#}),
855 followed by an identifier, one of @code{fmt}, @code{env},
856 @code{concat_idents}, @code{ident_to_str}, @code{log_syntax}, @code{macro}, or
857 the name of a user-defined macro. This is followed by a vector literal. (Its
858 value will be interpreted syntactically; in particular, it need not be well-typed.)
861 @emph{TODO: formalize those terms more}.
864 @subsection Ref.Lex.Sym
865 @c * Ref.Lex.Sym:: Special symbol tokens.
870 The special symbols are:
874 @multitable @columnfractions .1 .1 .1 .1 .1 .1
933 @c * Ref.Path:: References to items.
934 @cindex Names of items or slots
936 @cindex Type parameters
938 A @dfn{path} is a sequence of one or more path components separated by a
939 namespace qualifier (@code{::}). If a path consists of only one component, it
940 may refer to either an item or a slot in a local control
941 scope. @xref{Ref.Mem.Slot}. @xref{Ref.Item}. If a path has multiple
942 components, it refers to an item.
944 Every item has a @emph{canonical path} within its crate, but the path naming
945 an item is only meaningful within a given crate. There is no global namespace
946 across crates; an item's canonical path merely identifies it within the
947 crate. @xref{Ref.Comp.Crate}.
949 Path components are usually identifiers. @xref{Ref.Lex.Ident}. The last
950 component of a path may also have trailing explicit type arguments.
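A brief sketch of paths as they appear in present-day Rust, using only standard-library items. It assumes current syntax, in which explicit type arguments on the last component are written after an extra @code{::}; the function name @code{path_examples} is hypothetical.

```rust
fn path_examples() -> (i32, usize, Option<u8>) {
    // A multi-component path naming a function item.
    let larger = std::cmp::max(3, 7);
    // The last component carries an explicit type argument.
    let width = std::mem::size_of::<u32>();
    // The type argument selects the element type of an empty iterator.
    let nothing = std::iter::empty::<u8>().next();
    (larger, width, nothing)
}
```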
952 Two examples of simple paths consisting of only identifier components:
958 In most contexts, the Rust grammar accepts a general @emph{path}, but
959 subsequent passes may restrict paths occurring in various contexts to refer to
960 slots or items, depending on the semantics of the occurrence. In other words:
961 in some contexts a slot is required (for example, on the left hand side of the
962 copy operator, @pxref{Ref.Expr.Copy}) and in other contexts an item is
963 required (for example, as a type parameter, @pxref{Ref.Item}). In no case is
964 the grammar made ambiguous by accepting a general path and interpreting the
965 reference in later passes. @xref{Ref.Gram}.
967 An example of a path with type parameters:
975 @c * Ref.Gram:: Grammar.
977 @emph{TODO: mostly LL(1), it reads like C++, Alef and bits of Napier;
983 @c * Ref.Comp:: Compilation and component model.
984 @cindex Compilation model
986 Rust is a @emph{compiled} language. Its semantics are divided along a
987 @emph{phase distinction} between compile-time and run-time. Those semantic
988 rules that have a @emph{static interpretation} govern the success or failure
989 of compilation. A program that fails to compile due to violation of a
990 compile-time rule has no defined semantics at run-time; the compiler should
991 halt with an error report, and produce no executable artifact.
993 The compilation model centres on artifacts called @emph{crates}. Each
994 compilation is directed towards a single crate in source form, and if
995 successful produces a single crate in executable form.
998 * Ref.Comp.Crate:: Units of compilation and linking.
999 * Ref.Comp.Attr:: Attributes of crates, modules and items.
1000 * Ref.Comp.Syntax:: Syntax extensions.
1003 @node Ref.Comp.Crate
1004 @subsection Ref.Comp.Crate
1005 @c * Ref.Comp.Crate:: Units of compilation and linking.
1008 A @dfn{crate} is a unit of compilation and linking, as well as versioning,
1009 distribution and runtime loading. Crates are defined by @emph{crate source
1010 files}, which are a type of source file written in a special declarative
1011 language: @emph{crate language}.@footnote{A crate is somewhat analogous to an
1012 @emph{assembly} in the ECMA-335 CLI model, a @emph{library} in the SML/NJ
1013 Compilation Manager, a @emph{unit} in the Owens and Flatt module system, or a
1014 @emph{configuration} in Mesa.} A crate source file describes:
1017 @item Metadata about the crate, such as author, name, version, and copyright.
1018 @item The source-file and directory modules that make up the crate.
1019 @item Any external crates or native modules that the crate imports to its top level.
1020 @item The organization of the crate's internal namespace.
1021 @item The set of names exported from the crate.
1024 A single crate source file may describe the compilation of a large number of
1025 Rust source files; it is compiled in its entirety, as a single indivisible
1026 unit. The compilation phase attempts to transform a single crate source file,
1027 and its referenced contents, into a single compiled crate. Crate source files
1028 and compiled crates have a 1:1 relationship.
1030 The syntactic form of a crate is a sequence of @emph{directives}, some of
1031 which have nested sub-directives.
1033 A crate defines an implicit top-level anonymous module: within this module,
1034 all members of the crate have canonical path names. @xref{Ref.Path}. The
1035 @code{mod} directives within a crate file specify sub-modules to include in
1036 the crate: these are either directory modules, corresponding to directories in
1037 the filesystem of the compilation environment, or file modules, corresponding
1038 to Rust source files. The names given to such modules in @code{mod} directives
1039 become prefixes of the paths of items defined within any included Rust source
1042 The @code{use} directives within the crate specify @emph{other crates} to scan
1043 for, locate, import into the crate's module namespace during compilation, and
1044 link against at runtime. Use directives may also occur independently in Rust
1045 source files. These directives may specify loose or tight ``matching
1046 criteria'' for imported crates, depending on the preferences of the crate
1047 developer. In the simplest case, a @code{use} directive may specify only a
1048 symbolic name and leave the task of locating and binding an appropriate crate
1049 to a compile-time heuristic. In a more controlled case, a @code{use} directive
1050 may specify any metadata as matching criteria, such as a URI, an author name
1051 or version number, a checksum or even a cryptographic signature, in order to
1052 select an appropriate imported crate. @xref{Ref.Comp.Attr}.
1054 The compiled form of a crate is a loadable and executable object file full of
1055 machine code, in a standard loadable operating-system format such as ELF, PE
1056 or Mach-O. The loadable object contains metadata, describing:
1058 @item Metadata required for type reflection.
1059 @item The publicly exported module structure of the crate.
1060 @item Any metadata about the crate, defined by attributes.
1061 @item The crates to dynamically link with at run-time, with matching criteria
1062 derived from the same @code{use} directives that guided compile-time imports.
1065 @c This might come along sometime in the future.
1067 @c The @code{syntax} directives of a crate are similar to the @code{use}
1068 @c directives, except they govern the syntax extension namespace (accessed
1069 @c through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
1070 @c available only at compile time. A @code{syntax} directive also makes its
1071 @c extension available to all subsequent directives in the crate file.
1073 An example of a crate:
1076 // Linkage attributes
1077 #[ link(name = "projx"
1079 uuid = "9cccc5d5-aceb-4af5-8285-811211826b82") ];
1081 // Additional metadata attributes
1082 #[ desc = "Project X",
1084 author = "Jane Doe" ];
1087 use std (ver = "1.0");
1089 // Define some modules.
1092 mod quux = "quux.rs";
1097 @subsection Ref.Comp.Attr
1100 Static entities in Rust -- crates, modules and items -- may have attributes
1101 applied to them.@footnote{Attributes in Rust are modeled on attributes in
1102 ECMA-335 and C#.} An attribute is a general, free-form piece of metadata that is
1103 interpreted according to name, convention, and language and compiler version.
1104 Attributes may appear as any of:
1106 @item A single identifier, the attribute name
1107 @item An identifier followed by the equals sign @code{=} and a literal, providing a key/value pair
1108 @item An identifier followed by a parenthesized list of sub-attribute arguments
1111 Attributes are applied to an entity by placing them within a hash-list
1112 (@code{#[...]}) as either a prefix to the entity or as a semicolon-delimited
1113 declaration within the entity body.
1115 An example of attributes:
1118 // A function marked as a unit test
1124 // General metadata applied to the enclosing module or crate.
1127 // A conditionally-compiled module
1128 #[cfg(target_os="linux")]
1135 In future versions of Rust, user-provided extensions to the compiler will be able
1136 to interpret attributes. When this facility is provided, a distinction will be
1137 made between language-reserved and user-available attributes.
1139 At present, only the Rust compiler interprets attributes, so all attribute
1140 names are effectively reserved. Some significant attributes include:
1143 @item The @code{cfg} attribute, for conditional compilation by build configuration.
1144 @item The @code{link} attribute, describing linkage metadata for a crate.
1145 @item The @code{test} attribute, for marking functions as unit tests.
1148 Other attributes may be added or removed during development of the language.
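As a rough illustration of the @code{cfg} attribute, the sketch below uses the syntax of later, stabilized Rust releases rather than the draft syntax of this manual. The function name @code{platform_name} is hypothetical; only the @code{cfg} attribute and its @code{target_os} key come from the text above.

```rust
// Hypothetical example: exactly one of these two definitions is compiled,
// depending on the build configuration of the target.

// Compiled only when the target operating system is Linux.
#[cfg(target_os = "linux")]
pub fn platform_name() -> &'static str {
    "linux"
}

// Compiled on every other target.
#[cfg(not(target_os = "linux"))]
pub fn platform_name() -> &'static str {
    "other"
}
```

Because the two definitions are mutually exclusive, callers can use @code{platform_name} without knowing which one was selected.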
1150 @node Ref.Comp.Syntax
1151 @subsection Ref.Comp.Syntax
1152 @c * Ref.Comp.Syntax:: Syntax extension.
1153 @cindex Syntax extension
1155 Rust provides a notation for @dfn{syntax extension}. The notation for invoking
1156 a syntax extension is a marked syntactic form that can appear as an expression
1157 in the body of a Rust program. @xref{Ref.Lex.Syntax}.
1159 After parsing, a syntax-extension invocation is expanded into a Rust
1160 expression. The name of the extension determines the translation performed. In
1161 future versions of Rust, user-provided syntax extensions aside from macros
1162 will be provided via external crates.
1164 At present, only a set of built-in syntax extensions, as well as macros
1165 introduced inline in source code using the @code{macro} extension, may be
1166 used. The current built-in syntax extensions are:
1169 @item @code{fmt} expands into code to produce a formatted string, similar to
1170 @code{printf} from C.
1171 @item @code{env} expands into a string literal containing the value of that
1172 environment variable at compile-time.
1173 @item @code{concat_idents} expands into an identifier which is the
1174 concatenation of its arguments.
1175 @item @code{ident_to_str} expands into a string literal containing the name of
1176 its argument (which must be a literal).
1177 @item @code{log_syntax} causes the compiler to pretty-print its arguments.
1180 Finally, @code{macro} is used to define a new macro. A macro can abstract over
1181 second-class Rust concepts that are present in syntax. The arguments to
1182 @code{macro} are a bracketed list of pairs (two-element lists). The pairs
1183 consist of an invocation and the syntax to expand into. An example:
1186 #macro[[#apply[fn, [args, ...]], fn(args, ...)]];
1189 In this case, the invocation @code{#apply[sum, 5, 8, 6]} expands to
1190 @code{sum(5,8,6)}. If @code{...} follows an expression (which need not be as
1191 simple as a single identifier) in the input syntax, the matcher will expect an
1192 arbitrary number of occurrences of the thing preceding it, and bind syntax to
1193 the identifiers it contains. If it follows an expression in the output syntax,
1194 it will transcribe that expression repeatedly, according to the identifiers
1195 (bound to syntax) that it contains.
1197 The behavior of @code{...} is known as Macro By Example. It allows you to
1198 write a macro with arbitrary repetition by specifying only one case of that
1199 repetition, and following it by @code{...}, both where the repeated input is
1200 matched, and where the repeated output must be transcribed. A more
1201 sophisticated example:
1204 #macro[#zip_literals[[x, ...], [y, ...]],
1206 #macro[#unzip_literals[[x, y], ...],
1207 [[x, ...], [y, ...]]];
1210 In this case, @code{#zip_literals[[1,2,3], [1,2,3]]} expands to
1211 @code{[[1,1],[2,2],[3,3]]}, and @code{#unzip_literals[[1,1], [2,2], [3,3]]}
1212 expands to @code{[[1,2,3],[1,2,3]]}.
1214 Macro expansion takes place outside-in: that is,
1215 @code{#unzip_literals[#zip_literals[[1,2,3],[1,2,3]]]} will fail because
1216 @code{unzip_literals} expects a list, not a macro invocation, as an
1220 The macro system currently has some limitations. It's not possible to
1221 destructure anything other than vector literals (therefore, the arguments to
1222 complicated macros will tend to be an ocean of square brackets). Macro
1223 invocations and @code{...} can only appear in expression positions. Finally,
1224 macro expansion is currently unhygienic. That is, name collisions between
1225 macro-generated and user-written code can cause unintentional capture.
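For comparison, the repetition mechanism described in this section can be sketched with the @code{macro_rules!} form of later, stabilized Rust releases. This is an assumption-laden illustration, not the @code{#macro} syntax of this draft: the fragment specifiers, the @code{$(...),*} repetition notation, and the helper function @code{sum} are not part of this manual.

```rust
// Sketch only: later-Rust `macro_rules!` analogues of the macros above.

// One repeated case `$arg` is matched and then transcribed repeatedly.
macro_rules! apply {
    ($f:expr, [$($arg:expr),*]) => { $f($($arg),*) };
}

// Two lists are matched separately and transcribed pairwise.
macro_rules! zip_literals {
    ([$($x:expr),*], [$($y:expr),*]) => { [$([$x, $y]),*] };
}

// A repeated pair is matched and transcribed into two separate lists.
macro_rules! unzip_literals {
    ($([$x:expr, $y:expr]),*) => { ([$($x),*], [$($y),*]) };
}

// Hypothetical helper used as the function argument of `apply!`.
pub fn sum(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}

pub fn apply_demo() -> i32 {
    // Expands to `sum(5, 8, 6)`.
    apply!(sum, [5, 8, 6])
}
```

In this sketch, @code{zip_literals!([1, 2, 3], [4, 5, 6])} produces an array of pairs, and @code{unzip_literals!} reverses the transformation, producing a tuple of two arrays.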
1231 @c * Ref.Mem:: Semantic model of memory.
1232 @cindex Memory model
1236 A Rust program's memory consists of a static set of @emph{items}, a set of tasks
1237 each with its own @emph{stack}, and a @emph{heap}. Immutable portions of the
1238 heap may be shared between tasks; mutable portions may not.
1240 Allocations in the stack consist of @emph{slots}, and allocations in the heap
1241 consist of @emph{boxes}.
1244 * Ref.Mem.Alloc:: Memory allocation model.
1245 * Ref.Mem.Own:: Memory ownership model.
1246 * Ref.Mem.Slot:: Stack memory model.
1247 * Ref.Mem.Box:: Heap memory model.
1251 @subsection Ref.Mem.Alloc
1252 @c * Ref.Mem.Alloc:: Memory allocation model.
1257 @cindex Task-local box
1259 The @dfn{items} of a program are those functions, iterators, objects, modules
1260 and types that have their value calculated at compile-time and stored uniquely
1261 in the memory image of the Rust process. Items are neither dynamically
1262 allocated nor freed.
1264 A task's @dfn{stack} consists of activation frames automatically allocated on
1265 entry to each function as the task executes. A stack allocation is reclaimed
1266 when control leaves the frame containing it.
1268 The @dfn{heap} is a general term that describes two separate sets of boxes:
1269 shared boxes -- which may be subject to garbage collection -- and unique
1270 boxes. The lifetime of an allocation in the heap depends on the lifetime of
1271 the box values pointing to it. Since box values may themselves be passed in
1272 and out of frames, or stored in the heap, heap allocations may outlive the
1273 frame they are allocated within.
1277 @subsection Ref.Mem.Own
1278 @c * Ref.Mem.Own:: Memory ownership model.
1281 A task owns all memory it can @emph{safely} reach through local variables,
1282 shared or unique boxes, and/or references. Sharing memory between tasks can
1283 only be accomplished using @emph{unsafe} constructs, such as raw pointer
1284 operations or calling C code.
1286 When a task sends a value of @emph{unique} kind over a channel, it loses
1287 ownership of the value sent and can no longer refer to it. This is statically
1288 guaranteed by the combined use of ``move semantics'' and unique kinds, within
1289 the communication system.
1291 When a stack frame is exited, its local allocations are all released, and its
1292 references to boxes (both shared and unique) are dropped.
1294 A shared box may (in the case of a recursive, mutable shared type) be cyclic;
1295 in this case the release of memory inside the shared structure may be deferred
1296 until task-local garbage collection can reclaim it. Code can ensure no such
1297 delayed deallocation occurs by restricting itself to unique boxes and similar
1298 unshared kinds of data.
1300 When a task finishes, its stack is necessarily empty and it therefore has no
1301 references to any boxes; the remainder of its heap is immediately freed.
1304 @subsection Ref.Mem.Slot
1305 @c * Ref.Mem.Slot:: Stack memory model.
1309 @cindex Reference slot
1311 A task's stack contains slots.
1313 A @dfn{slot} is a component of a stack frame. A slot is either @emph{local} or
1316 A @dfn{local} slot (or @emph{stack-local} allocation) holds a value directly,
1317 allocated within the stack's memory. The value is a part of the stack frame.
1319 A @dfn{reference} references a value outside the frame. It may refer to a
1320 value allocated in another frame @emph{or} a boxed value in the heap. The
1321 reference-formation rules ensure that the referent will outlive the reference.
1323 Local slots are always implicitly mutable.
1325 Local slots are not initialized when allocated; the entire frame worth of
1326 local slots are allocated at once, on frame-entry, in an uninitialized
1327 state. Subsequent statements within a function may or may not initialize the
1328 local slots. Local slots can be used only after they have been initialized;
1329 this condition is guaranteed by the typestate system.
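The initialize-before-use rule can be sketched as follows, using the syntax of later, stabilized Rust releases rather than the draft syntax of this manual. In those releases the check is a definite-initialization analysis rather than the typestate system described here, and the function name @code{classify} is hypothetical.

```rust
// Sketch only: a local is declared without a value and assigned on every
// control path before it is read.
pub fn classify(n: i32) -> &'static str {
    let label; // allocated with the frame, not yet initialized
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    // Reading `label` is accepted because both branches initialized it.
    label
}
```

Removing the assignment from either branch would cause the compiler to reject the final read of @code{label}.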
1331 References are created for function arguments. If the compiler cannot prove
1332 that the referred-to value will outlive the reference, it will try to set
1333 aside a copy of that value to refer to. If this is not semantically safe (for
1334 example, if the referred-to value contains mutable fields), it will reject the
1335 program. If the compiler deems copying the value expensive, it will warn.
1337 A function can be declared to take an argument by mutable reference. This
1338 allows the function to write to the slot that the reference refers to.
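A minimal sketch of passing by mutable reference, written in the syntax of later, stabilized Rust releases (where the mutable reference is spelled @code{&mut} at both the declaration and the call site) rather than the draft syntax of this manual. The names @code{incr} and @code{incr_demo} are hypothetical.

```rust
// Sketch only: the callee writes through the reference into the caller's slot.
pub fn incr(i: &mut i32) {
    *i += 1;
}

pub fn incr_demo() -> i32 {
    let mut x = 41;
    incr(&mut x); // `x` itself is modified, not a copy
    x
}
```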
1340 An example function that accepts a value by mutable reference:
1348 @subsection Ref.Mem.Box
1349 @c * Ref.Mem.Box:: Heap memory model.
1351 @cindex Dereference operator
1353 A @dfn{box} is a reference to a heap allocation holding another value. There
1354 are two kinds of boxes: @emph{shared boxes} and @emph{unique boxes}.
1356 A @dfn{shared box} type or value is constructed by the prefix @emph{at} sigil @code{@@}.
1358 A @dfn{unique box} type or value is constructed by the prefix @emph{tilde} sigil @code{~}.
1360 Multiple shared box values can point to the same heap allocation; copying a
1361 shared box value makes a shallow copy of the pointer (optionally incrementing
1362 a reference count, if the shared box is implemented through
1363 reference-counting).
1365 Unique box values exist in 1:1 correspondence with their heap allocation;
1366 copying a unique box value makes a deep copy of the heap allocation and
1367 produces a pointer to the new allocation.
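The difference between the two copying behaviours can be sketched with library types from later, stabilized Rust releases. Treating @code{std::rc::Rc} as an analogue of a shared box and @code{Box} as an analogue of a unique box is an assumption made for illustration; neither the types nor the function names below appear in this manual.

```rust
use std::rc::Rc;

// Sketch only: copying a shared box copies the pointer, so both values
// refer to the same heap allocation and the reference count rises.
pub fn shared_copy_aliases() -> bool {
    let a = Rc::new(10);
    let b = Rc::clone(&a);
    Rc::ptr_eq(&a, &b) && Rc::strong_count(&a) == 2
}

// Sketch only: copying a unique box makes a new heap allocation holding
// an equal value, so the two addresses differ.
pub fn unique_copy_is_deep() -> bool {
    let a = Box::new(10);
    let b = a.clone();
    let pa: *const i32 = &*a;
    let pb: *const i32 = &*b;
    pa != pb && *a == *b
}
```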
1369 An example of constructing one shared box type and value, and one unique box type and value:
1371 let x: @@int = @@10;
1375 Some operations implicitly dereference boxes. Examples of such @dfn{implicit
1376 dereference} operations are:
1378 @item arithmetic operators (@code{x + y - z})
1379 @item field selection (@code{x.y.z})
1382 An example of an implicit-dereference operation performed on box values:
1384 let x: @@int = @@10;
1385 let y: @@int = @@12;
1386 assert (x + y == 22);
1389 Other operations act on box values as single-word-sized address values. For
1390 these operations, to access the value held in the box requires an explicit
1391 dereference of the box value. Explicitly dereferencing a box is indicated with
1392 the unary @emph{star} operator @code{*}. Examples of such @dfn{explicit
1393 dereference} operations are:
1395 @item copying box values (@code{x = y})
1396 @item passing box values to functions (@code{f(x,y)})
1399 An example of an explicit-dereference operation performed on box values:
1401 fn takes_boxed(b: @@int) @{
1404 fn takes_unboxed(b: int) @{
1408 let x: @@int = @@10;
1418 @c * Ref.Task:: Semantic model of tasks.
1422 An executing Rust program consists of a tree of tasks. A Rust @dfn{task}
1423 consists of an entry function, a stack, a set of outgoing communication
1424 channels and incoming communication ports, and ownership of some portion of
1425 the heap of a single operating-system process.
1427 Multiple Rust tasks may coexist in a single operating-system
1428 process. Execution of multiple Rust tasks in a single operating-system process
1429 may be either truly concurrent or interleaved by the runtime scheduler. Rust
1430 tasks are lightweight: each consumes less memory than an operating-system
1431 process, and switching between Rust tasks is faster than switching between
1432 operating-system processes.
1435 * Ref.Task.Comm:: Inter-task communication.
1436 * Ref.Task.Life:: Task lifecycle and state transitions.
1437 * Ref.Task.Sched:: Task scheduling model.
1438 * Ref.Task.Spawn:: Library interface for making new tasks.
1439 * Ref.Task.Send:: Library interface for sending messages.
1440 * Ref.Task.Recv:: Library interface for receiving messages.
1444 @subsection Ref.Task.Comm
1445 @c * Ref.Task.Comm:: Inter-task communication.
1447 @cindex Communication
1450 @cindex Message passing
1451 @cindex Send expression
1452 @cindex Receive expression
1454 With the exception of @emph{unsafe} blocks, Rust tasks are isolated from
1455 interfering with one another's memory directly. Instead of manipulating shared
1456 storage, Rust tasks communicate with one another using a typed, asynchronous,
1457 simplex message-passing system.
1459 A @dfn{port} is a communication endpoint that can @emph{receive}
1460 messages. Ports receive messages from channels.
1462 A @dfn{channel} is a communication endpoint that can @emph{send}
1463 messages. Channels send messages to ports.
1465 Each port is implicitly boxed and mutable; as such a port has a unique
1466 per-task identity and cannot be replicated or transmitted. If a port value is
1467 copied, both copies refer to the @emph{same} port. New ports can be
1468 constructed dynamically and stored in data structures.
1470 Each channel is bound to a port when the channel is constructed, so the
1471 destination port for a channel must exist before the channel itself. A channel
1472 cannot be rebound to a different port from the one it was constructed with.
1474 Channels are weak: a channel does not keep the port it is bound to
1475 alive. Ports are owned by their allocating task and cannot be sent over
1476 channels; if a task dies its ports die with it, and all channels bound to
1477 those ports no longer function. Messages sent to a channel connected to a dead
1478 port will be dropped.
1480 Channels are immutable types with meaning known to the runtime; channels can
1481 be sent over channels.
1483 Many channels can be bound to the same port, but each channel is bound to a
1484 single port. In other words, channels and ports exist in an N:1 relationship,
1485 N channels to 1 port. @footnote{It may help to remember nautical terminology
1486 when differentiating channels from ports. Many different waterways --
1487 channels -- may lead to the same port.}
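The N:1 relationship can be sketched with the @code{std::sync::mpsc} module of later, stabilized Rust releases, under the assumption that its @code{Sender} plays the role of a channel and its @code{Receiver} the role of a port. Those names, operating-system threads standing in for tasks, and the function @code{fan_in} are not part of this manual.

```rust
use std::sync::mpsc;
use std::thread;

// Sketch only: `n` sending endpoints are bound to one receiving endpoint.
pub fn fan_in(n: u8) -> Vec<u8> {
    let (tx, rx) = mpsc::channel::<u8>();

    // Each spawned thread gets its own sending endpoint for the same receiver.
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let tx = tx.clone();
            thread::spawn(move || tx.send(i).unwrap())
        })
        .collect();

    // Drop the original sender so the receiver sees the stream end
    // once every clone has also been dropped.
    drop(tx);
    for h in handles {
        h.join().unwrap();
    }

    // Arrival order is not deterministic, so sort before returning.
    let mut got: Vec<u8> = rx.iter().collect();
    got.sort();
    got
}
```

The message type @code{u8} is fixed as a parameter of both endpoints, matching the rule that each port and channel carries only one type of message.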
1489 Each port and channel can carry only one type of message. The message type is
1490 encoded as a parameter of the channel or port type. The message type of a
1491 channel is equal to the message type of the port it is bound to. The types of
1492 messages must be of @emph{unique} kind.
1494 Messages are generally sent asynchronously, with optional rate-limiting on the
1495 transmit side. A channel contains a message queue and asynchronously sending a
1496 message merely inserts it into the sending channel's queue; message receipt is
1497 the responsibility of the receiving task.
1499 Messages are sent on channels and received on ports using standard library
1503 @subsection Ref.Task.Life
1504 @c * Ref.Task.Life:: Task lifecycle and state transitions.
1506 @cindex Lifecycle of task
1508 @cindex Running, task state
1509 @cindex Blocked, task state
1510 @cindex Failing, task state
1511 @cindex Dead, task state
1512 @cindex Soft failure
1513 @cindex Hard failure
1515 The @dfn{lifecycle} of a task consists of a finite set of states and events
1516 that cause transitions between the states. The lifecycle states of a task are:
1525 A task begins its lifecycle -- once it has been spawned -- in the
1526 @emph{running} state. In this state it executes the statements of its entry
1527 function, and any functions called by the entry function.
1529 A task may transition from the @emph{running} state to the @emph{blocked}
1530 state any time it evaluates a communication expression on a port or channel that
1531 cannot be immediately completed. When the communication expression can be
1532 completed -- when a message arrives at a port, or a queue drains
1533 sufficiently to complete a semi-synchronous send -- then the blocked task will
1534 unblock and transition back to @emph{running}.
1536 A task may transition to the @emph{failing} state at any time, due to an
1537 un-trapped signal or the evaluation of a @code{fail} expression. Once
1538 @emph{failing}, a task unwinds its stack and transitions to the @emph{dead}
1539 state. Unwinding the stack of a task is done by the task itself, on its own
1540 control stack. If a value with a destructor is freed during unwinding, the
1541 code for the destructor is run, also on the task's control
1542 stack. Running the destructor code causes a temporary transition to a
1543 @emph{running} state, and allows the destructor code to cause any
1544 subsequent state transitions. The original task of unwinding and
1545 failing thereby may suspend temporarily, and may involve (recursive)
1546 unwinding of the stack of a failed destructor. Nonetheless, the
1547 outermost unwinding activity will continue until the stack is unwound
1548 and the task transitions to the @emph{dead} state. There is no way to
1549 ``recover'' from task failure. Once a task has temporarily suspended
1550 its unwinding in the @emph{failing} state, failure occurring from
1551 within this destructor results in @emph{hard} failure. The unwinding
1552 procedure of hard failure frees resources but does not execute
1553 destructors. The original (soft) failure is still resumed at the
1554 point where it was temporarily suspended.
1556 A task in the @emph{dead} state cannot transition to other states; it exists
1557 only to have its termination status inspected by other tasks, and/or to await
1558 reclamation when the last reference to it drops.
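The failing-to-dead transition can be sketched with threads from later, stabilized Rust releases, assuming that a thread stands in for a task, @code{panic!} for @code{fail}, and the @code{Drop} trait for a destructor. All names below are hypothetical, and the sketch assumes the default unwinding behaviour on failure.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical value with a destructor that records that it ran.
struct Guard(Arc<AtomicBool>);

impl Drop for Guard {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

// Returns (task failed, destructor ran during unwinding).
pub fn failing_task_runs_destructors() -> (bool, bool) {
    let flag = Arc::new(AtomicBool::new(false));
    let flag_in_task = Arc::clone(&flag);

    let outcome = thread::spawn(move || {
        let _guard = Guard(flag_in_task);
        // The task fails here; its stack is unwound and `_guard` is dropped.
        panic!("task failure");
    })
    .join(); // inspect the termination status of the dead task

    (outcome.is_err(), flag.load(Ordering::SeqCst))
}
```

The spawning thread cannot resume the failed one; it can only observe the failure through the result of @code{join}.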
1560 @node Ref.Task.Sched
1561 @subsection Ref.Task.Sched
1562 @c * Ref.Task.Sched:: Task scheduling model.
1566 @cindex Yielding control
1568 The currently scheduled task is given a finite @emph{time slice} in which to
1569 execute, after which it is @emph{descheduled} at a loop-edge or similar
1570 preemption point, and another task is scheduled, pseudo-randomly.
1572 An executing task can @code{yield} control at any time, which deschedules it
1573 immediately. Entering any other non-executing state (blocked, dead) similarly
1574 deschedules the task.
1578 @node Ref.Task.Spawn
1579 @subsection Ref.Task.Spawn
1580 @c * Ref.Task.Spawn:: Calls for creating new tasks.
1581 @cindex Spawn expression
1583 A call to @code{std::task::spawn}, passing a 0-argument function as its single
1584 argument, causes the runtime to construct a new task executing the passed
1585 function. The passed function is referred to as the @dfn{entry function} for
1586 the spawned task, and any captured environment it carries is moved from the
1587 spawning task to the spawned task before the spawned task begins execution.
1589 The result of a @code{spawn} call is a @code{std::task::task} value.
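A comparable spawn can be sketched in later, stabilized Rust releases, assuming @code{std::thread::spawn} in place of @code{std::task::spawn} and a @code{move} closure in place of @code{bind}. These substitutions, and the names @code{helper} and @code{spawn_and_receive}, are illustrative assumptions rather than the interface described in this manual.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Entry-point helper: sends one byte back to the spawning thread.
fn helper(c: Sender<u8>) {
    c.send(10).unwrap();
}

pub fn spawn_and_receive() -> u8 {
    let (c, p) = channel();

    // The captured sending endpoint `c` is moved into the new thread
    // before it begins executing.
    let task = thread::spawn(move || helper(c));

    // Let the spawned thread run; block until its message arrives.
    let result = p.recv().unwrap();
    task.join().unwrap();
    result
}
```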
1591 An example of a @code{spawn} call:
1593 import std::task::*;
1594 import std::comm::*;
1596 fn helper(c: chan<u8>) @{
1604 spawn(bind helper(chan(p)));
1605 // let task run, do other things.
1607 let result = recv(p);
1612 @subsection Ref.Task.Send
1613 @c * Ref.Task.Send:: Calls for sending a value into a channel.
1616 @cindex Communication
1618 Sending a value into a channel is done by a library call to
1619 @code{std::comm::send}, which takes a channel and a value to send, and moves
1620 the value into the channel's outgoing buffer.
1622 An example of a send:
1624 import std::comm::*;
1625 let c: chan<str> = @dots{};
1626 send(c, "hello, world");
1630 @subsection Ref.Task.Recv
1631 @c * Ref.Task.Recv:: Calls for receiving a value from a channel.
1632 @cindex Receive call
1634 @cindex Communication
1636 Receiving a value is done by a library call to @code{std::comm::recv}, which
1637 takes a port of type @code{std::comm::port}. This call causes the receiving task to enter the
1638 @emph{blocked} state until a task is sending a value to the port, at
1639 which point the runtime pseudo-randomly selects a sending task and moves a
1640 value from the head of one of the task queues to the call's return value, and
1641 un-blocks the receiving task. @xref{Ref.Run.Comm}.
1643 An example of a @emph{receive}:
1645 import std::comm::*;
1646 let p: port<str> = @dots{};
1647 let s: str = recv(p);
1655 @c * Ref.Item:: The components of a module.
1658 @cindex Type parameters
1661 An @dfn{item} is a component of a module. Items are entirely determined at
1662 compile-time, remain constant during execution, and may reside in read-only
1665 There are five primary kinds of item: modules, functions, iterators, objects and
1668 All items form an implicit scope for the declaration of sub-items. In other
1669 words, within a function, object or iterator, declarations of items can (in
1670 many cases) be mixed with the statements, control blocks, and similar
1671 artifacts that otherwise compose the item body. The meaning of these scoped
1672 items is the same as if the item were declared outside the scope, except that
1673 the item's @emph{path name} within the module namespace is qualified by the
1674 name of the enclosing item. The exact locations in which sub-items may be
1675 declared are given by the grammar. @xref{Ref.Gram}.
1677 Functions, iterators, objects and type definitions may be @emph{parametrized}
1678 by type. Type parameters are given as a comma-separated list of identifiers
1679 enclosed in angle brackets (@code{<>}), after the name of the item and before
1680 its definition. The type parameters of an item are part of the name, not the
1681 type of the item; in order to refer to the type-parametrized item, a
1682 referencing name must in general provide type arguments as a list of
1683 comma-separated types enclosed within angle brackets. In practice, the
1684 type-inference system can usually infer such argument types from
1685 context. There are no general parametric types.
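Type parameters and their inference can be sketched as follows, in the syntax of later, stabilized Rust releases rather than the draft syntax of this manual. The @code{Clone} bound, the @code{::<>} form for explicit type arguments, and the function names are assumptions belonging to those later releases.

```rust
// Sketch only: `T` is a type parameter written after the item name.
pub fn first_or<T: Clone>(items: &[T], fallback: T) -> T {
    items.first().cloned().unwrap_or(fallback)
}

pub fn type_param_demo() -> (i32, String) {
    // The type argument is inferred from context as `i32`.
    let a = first_or(&[1, 2, 3], 0);
    // The type argument is supplied explicitly.
    let b = first_or::<String>(&[], String::from("none"));
    (a, b)
}
```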
1688 * Ref.Item.Mod:: Items defining modules.
1689 * Ref.Item.Fn:: Items defining functions.
1690 * Ref.Item.Pred:: Items defining predicates for typestates.
1691 * Ref.Item.Iter:: Items defining iterators.
1692 * Ref.Item.Obj:: Items defining objects.
1693 * Ref.Item.Type:: Items defining the types of values and slots.
1694 * Ref.Item.Tag:: Items defining the constructors of a tag type.
1698 @subsection Ref.Item.Mod
1699 @c * Ref.Item.Mod:: Items defining sub-modules.
1702 @cindex Importing names
1703 @cindex Exporting names
1704 @cindex Visibility control
1706 A @dfn{module item} contains declarations of other @emph{items}. The items
1707 within a module may be functions, modules, objects or types. These
1708 declarations have both static and dynamic interpretation. The purpose of a
1709 module is to organize @emph{names} and control @emph{visibility}. Modules are
1710 declared with the keyword @code{mod}.
1712 An example of a module:
1715 type complex = (f64,f64);
1716 fn sin(f64) -> f64 @{
1719 fn cos(f64) -> f64 @{
1722 fn tan(f64) -> f64 @{
1729 Modules may also include any number of @dfn{import and export
1730 declarations}. These declarations must precede any module item declarations
1731 within the module, and control the visibility of names both within the module
1735 * Ref.Item.Mod.Import:: Declarations for module-local synonyms.
1736 * Ref.Item.Mod.Export:: Declarations for restricting visibility.
1739 @node Ref.Item.Mod.Import
1740 @subsubsection Ref.Item.Mod.Import
1741 @c * Ref.Item.Mod.Import:: Declarations for module-local synonyms.
1743 @cindex Importing names
1744 @cindex Visibility control
1746 An @dfn{import declaration} creates one or more local name bindings synonymous
1747 with some other name. Usually an import declaration is used to shorten the
1748 path required to refer to a module item.
1750 @emph{Note}: unlike many languages, Rust's @code{import} declarations do
1751 @emph{not} declare linkage-dependency with external crates. Linkage
1752 dependencies are independently declared with @code{use}
1753 declarations. @xref{Ref.Comp.Crate}.
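The path-shortening role of an import can be sketched in later, stabilized Rust releases, where the declaration that creates local synonyms is spelled @code{use} instead of @code{import}. That renaming, and the function names below, are assumptions outside this manual.

```rust
// Sketch only: bring two names into local scope as synonyms.
use std::cmp::{max, min};

// Uses the short, imported names.
pub fn clamp_short(x: i32, lo: i32, hi: i32) -> i32 {
    max(lo, min(x, hi))
}

// Equivalent body written with the full paths.
pub fn clamp_full(x: i32, lo: i32, hi: i32) -> i32 {
    std::cmp::max(lo, std::cmp::min(x, hi))
}
```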
1755 An example of imports:
1757 import std::math::sin;
1758 import std::option::*;
1759 import std::str::@{char_at, hash@};
1762 // Equivalent to 'log std::math::sin(1.0);'
1764 // Equivalent to 'log std::option::some(1.0);'
1766 // Equivalent to 'log std::str::hash(std::str::char_at("foo"));'
1767 log hash(char_at("foo"));
1771 @node Ref.Item.Mod.Export
1772 @subsubsection Ref.Item.Mod.Export
1773 @c * Ref.Item.Mod.Export:: Declarations for restricting visibility.
1775 @cindex Exporting names
1776 @cindex Visibility control
1778 An @dfn{export declaration} restricts the set of local declarations within a
1779 module that can be accessed from code outside the module. By default, all
1780 local declarations in a module are exported. If a module contains an export
1781 declaration, this declaration replaces the default export with the export
1784 An example of an export:
1794 fn helper(x: int, y: int) @{
1800 foo::primary(); // Will compile.
1801 foo::helper(2,3) // ERROR: will not compile.
1805 Multiple items may be exported from a single export declaration:
1809 export primary, secondary;
1820 fn helper(x: int, y: int) @{
1828 @subsection Ref.Item.Fn
1829 @c * Ref.Item.Fn:: Items defining functions.
1831 @cindex Slots, function input and output
1833 A @dfn{function item} defines a sequence of statements associated with a name
1834 and a set of parameters. Functions are declared with the keyword
1835 @code{fn}. Functions declare a set of @emph{input slots} as parameters,
1836 through which the caller passes arguments into the function, and an
1837 @emph{output slot} through which the function passes results back to the
1840 A function may also be copied into a first class @emph{value}, in which case
1841 the value has the corresponding @emph{function type}, and can be used
1842 otherwise exactly as a function item (with a minor additional cost of calling
1843 the function, as such a call is indirect). @xref{Ref.Type.Fn}.
1845 Every control path in a function ends with a @code{ret} or @code{be}
1846 expression or with a diverging expression (described later in this
1847 section). If a control path lacks a @code{ret} expression in source code, an
1848 implicit @code{ret} expression is appended to the end of the control path
1849 during compilation, returning the implicit @code{()} value.
1851 An example of a function:
1853 fn add(x: int, y: int) -> int @{
1858 A special kind of function can be declared with a @code{!} character where the
1859 output slot type would normally be. For example:
1861 fn my_err(s: str) -> ! @{
1867 We call such functions ``diverging'' because they never return a value to the
1868 caller. Every control path in a diverging function must end with a @code{fail}
1869 or a call to another diverging function. The @code{!}
1870 annotation does @emph{not} denote a type. Rather, the result type
1871 of a diverging function is a special type called @math{\bot} (``bottom'') that
1872 unifies with any type. Rust has no syntax for @math{\bot}.
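A diverging function can be sketched in later, stabilized Rust releases as shown below, assuming @code{panic!} in place of @code{fail}. Those releases differ from this draft in their treatment of @code{!}, and the sketch is meant only to show a diverging call standing where a value of any type is expected.

```rust
// Sketch only: `my_err` never returns control to its caller.
fn my_err(s: &str) -> ! {
    panic!("{}", s)
}

pub fn f(i: i32) -> i32 {
    if i == 42 {
        42
    } else {
        // Accepted in an `i32` position because the call diverges.
        my_err("Bad number!")
    }
}
```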
1874 It might be necessary to declare a diverging function because, as mentioned
1875 previously, the typechecker checks that every control path in a function ends
1876 with a @code{ret}, @code{be}, or diverging expression. So, if @code{my_err}
1877 were declared without the @code{!} annotation, the following code would not
1880 fn f(i: int) -> int @{
1885 my_err("Bad number!");
1890 The typechecker would complain that @code{f} doesn't return a value in the
1891 @code{else} branch. Adding the @code{!} annotation on @code{my_err} would
1892 express that @code{f} requires no explicit @code{ret}, as if it returns
1893 control to the caller, it returns a value (true because it never returns
1897 @subsection Ref.Item.Pred
1898 @c * Ref.Item.Pred:: Items defining predicates.
1901 Any pure boolean function is called a @emph{predicate}, and may be used
1902 as part of the static typestate system. @xref{Ref.Typestate.Constr}. A
1903 predicate declaration is identical to a function declaration, except that it
1904 is declared with the additional keyword @code{pure}. In addition,
1905 the typechecker checks the body of a predicate with a restricted set of
1906 typechecking rules. A predicate
1908 @item may not contain a @code{put}, @code{send}, @code{recv}, assignment, or
1909 self-call expression; and
1910 @item may only call other predicates, not general functions.
1913 An example of a predicate:
1915 pure fn lt_42(x: int) -> bool @{
1920 A non-boolean function may also be declared with @code{pure fn}. This allows
1921 predicates to call non-boolean functions as long as they are pure. For example:
1923 pure fn pure_length<@@T>(ls: list<T>) -> uint @{ /* ... */ @}
1925 pure fn nonempty_list<@@T>(ls: list<T>) -> bool @{ pure_length(ls) > 0u @}
1928 In this example, @code{nonempty_list} is a predicate---it can be used in a
1929 typestate constraint---but the auxiliary function @code{pure_length}@ is
1932 @emph{ToDo:} should actually define referential transparency.
1934 The effect checking rules previously enumerated are a restricted set of
1935 typechecking rules meant to approximate the universe of observably
1936 referentially transparent Rust procedures conservatively. Sometimes, these
1937 rules are @emph{too} restrictive. Rust allows programmers to violate these
1938 rules by writing predicates that the compiler cannot prove to be referentially
1939 transparent, using an escape-hatch feature called ``unchecked blocks''. When
1940 writing code that uses unchecked blocks, programmers should always be aware
1941 that they have an obligation to show that the code @emph{behaves} referentially
1942 transparently at all times, even if the compiler cannot @emph{prove}
1943 automatically that the code is referentially transparent. In the presence of
1944 unchecked blocks, the compiler provides no static guarantee that the code will
1945 behave as expected at runtime. Rather, the programmer has an independent
1946 obligation to verify the semantics of the predicates they write.
1948 @emph{ToDo:} last two sentences are vague.
1950 An example of a predicate that uses an unchecked block:
1952 fn pure_foldl<@@T, @@U>(ls: list<T>, u: U, f: block(&T, &U) -> U) -> U @{
1955 cons(hd, tl) @{ f(hd, pure_foldl(*tl, f(hd, u), f)) @}
1959 pure fn pure_length<@@T>(ls: list<T>) -> uint @{
1960 fn count<T>(_t: T, u: uint) -> uint @{ u + 1u @}
1962 pure_foldl(ls, 0u, count)
1967 Despite its name, @code{pure_foldl} is a @code{fn}, not a @code{pure fn},
1968 because there is no way in Rust to specify that the higher-order function
1969 argument @code{f} is a pure function. So, to use @code{pure_foldl} in a pure list
1970 length function that a predicate could then use, we must use an
1971 @code{unchecked} block wrapped around the call to @code{pure_foldl} in the
1972 definition of @code{pure_length}.
1975 @subsection Ref.Item.Iter
1976 @c * Ref.Item.Iter:: Items defining iterators.
1979 @cindex Put expression
1980 @cindex Put each expression
1981 @cindex Foreach expression
1983 Iterators are function-like items that can @code{put} multiple values during
1984 their execution before returning.
1986 Putting a value is similar to returning a value -- the argument to @code{put}
1987 is copied into the caller's frame and control transfers back to the caller --
1988 but the iterator frame is only @emph{suspended} during the put, and will be
1989 @emph{resumed} at the point after the @code{put}, on the next iteration of
1992 The output type of an iterator is the type of value that the function will
1993 @code{put}, before it eventually evaluates a @code{ret} or @code{be} expression
1994 of type @code{()} and completes its execution.
1996 An iterator can be called only in the loop header of a matching @code{for
1997 each} loop or as the argument in a @code{put each} expression.
1998 @xref{Ref.Expr.Foreach}.
2000 An example of an iterator:
2002 iter range(lo: int, hi: int) -> int @{
2011 for each x: int in range(0,100) @{
2018 @subsection Ref.Item.Obj
2019 @c * Ref.Item.Obj:: Items defining objects.
2021 @cindex Object constructors
2023 An @dfn{object item} defines the @emph{state} and @emph{methods} of a set of
2024 @emph{object values}. Object values have object types. @xref{Ref.Type.Obj}.
2026 An @emph{object item} declaration -- in addition to providing a scope for
2027 state and method declarations -- implicitly declares a static function called
2028 the @emph{object constructor}, as well as a named @emph{object type}. The name
2029 given to the object item is resolved to a type when used in type context, or a
2030 constructor function when used in value context (such as a call).
2032 Example of an object item:
2034 obj counter(state: @@mutable int) @{
2043 let c: counter = counter(@@mutable 1);
2047 assert c.get() == 3;
2050 Inside an object's methods, you can make @emph{self-calls} using the
2051 @code{self} keyword.
2064 assert o.foo() == 5;
2067 Rust objects are extendable with additional methods and fields using
2068 @emph{anonymous object} expressions. @xref{Ref.Expr.AnonObj}.
2071 @subsection Ref.Item.Type
2072 @c * Ref.Item.Type:: Items defining the types of values and slots.
2073 @cindex Type definitions
2075 A @dfn{type definition} defines a set of possible values in
2076 memory. @xref{Ref.Type}. Type definitions are declared with the keyword
2077 @code{type}. Every value has a single, specific type; the type-specified
2078 aspects of a value include:
2081 @item Whether the value is composed of sub-values or is indivisible.
2082 @item Whether the value represents textual or numerical information.
2083 @item Whether the value represents integral or floating-point information.
2084 @item The sequence of memory operations required to access the value.
2085 @item The @emph{kind} of the type (pinned, unique or shared).
2088 For example, the type @code{@{x: u8, y: u8@}} defines the set of immutable
2089 values that are composite records, each containing two unsigned 8-bit integers
2090 accessed through the components @code{x} and @code{y}, and laid out in memory
2091 with the @code{x} component preceding the @code{y} component. This type is of
2092 @emph{unique} kind, meaning that there is no shared substructure with other
2093 types, but it can be copied and moved freely.
2096 @subsection Ref.Item.Tag
2097 @c * Ref.Item.Tag:: Items defining the constructors of a tag type.
2100 A tag item simultaneously declares a new nominal tag type
2101 (@pxref{Ref.Type.Tag}) as well as a set of @emph{constructors} that can be
2102 used to create or pattern-match values of the corresponding tag type.
2104 The constructors of a @code{tag} type may be recursive: that is, each constructor
2105 may take an argument that refers, directly or indirectly, to the tag type the constructor
2106 is a member of. Such recursion has restrictions:
2108 @item Recursive types can be introduced only through @code{tag} constructors.
2109 @item A recursive @code{tag} item must have at least one non-recursive
2110 constructor (in order to give the recursion a base case).
2111 @item The recursive argument of recursive tag constructors must be @emph{box}
2112 values (in order to bound the in-memory size of the constructor).
2113 @item Recursive type definitions can cross module boundaries, but not module
2114 @emph{visibility} boundaries, nor crate boundaries (in order to simplify the
2118 An example of a @code{tag} item and its use:
2125 let a: animal = dog;
2129 An example of a @emph{recursive} @code{tag} item and its use:
2136 let a: list<int> = cons(7, cons(13, nil));
2145 Every slot and value in a Rust program has a type. The @dfn{type} of a
2146 @emph{value} defines the interpretation of the memory holding it. The type of
2147 a @emph{slot} may also include constraints. @xref{Ref.Type.Constr}.
2149 Built-in types and type-constructors are tightly integrated into the language,
2150 in nontrivial ways that are not possible to emulate in user-defined
2151 types. User-defined types have limited capabilities. In addition, every
2152 built-in type or type-constructor name is reserved as a @emph{keyword} in
2153 Rust; they cannot be used as user-defined identifiers in any context.
2156 * Ref.Type.Any:: An open union of every possible type.
2157 * Ref.Type.Mach:: Machine-level types.
2158 * Ref.Type.Int:: The machine-dependent integer types.
2159 * Ref.Type.Float:: The machine-dependent floating-point types.
2160 * Ref.Type.Prim:: Primitive types.
2161 * Ref.Type.Big:: The arbitrary-precision integer type.
2162 * Ref.Type.Text:: Strings and characters.
2163 * Ref.Type.Rec:: Labeled products of heterogeneous types.
2164 * Ref.Type.Tup:: Unlabeled products of heterogeneous types.
2165 * Ref.Type.Vec:: Open products of homogeneous types.
2166 * Ref.Type.Tag:: Disjoint unions of heterogeneous types.
2167 * Ref.Type.Fn:: Subroutine types.
2168 * Ref.Type.Iter:: Scoped coroutine types.
2169 * Ref.Type.Obj:: Abstract types.
2170 * Ref.Type.Constr:: Constrained types.
2171 * Ref.Type.Type:: Types describing types.
2175 @subsection Ref.Type.Any
2177 @cindex Dynamic type, see @i{Any type}
2178 @cindex Alt type expression
2180 The type @code{any} is the union of all possible Rust types. A value of type
2181 @code{any} is represented in memory as a pair consisting of a boxed value of
2182 some non-@code{any} type @var{T} and a reflection of the type @var{T}.
2184 Values of type @code{any} can be used in an @code{alt type} expression, in
2185 which the reflection is used to select a block corresponding to a particular
2186 type extraction. @xref{Ref.Expr.Alt}.
2189 @subsection Ref.Type.Mach
2190 @cindex Machine types
2191 @cindex Floating-point types
2192 @cindex Integer types
2195 The machine types are the following:
2199 The unsigned word types @code{u8}, @code{u16}, @code{u32} and @code{u64},
2200 with values drawn from the integer intervals
2202 @math{[0, 2^8 - 1]},
2203 @math{[0, 2^{16} - 1]},
2204 @math{[0, 2^{32} - 1]} and
2205 @math{[0, 2^{64} - 1]}
2209 [0, 2<sup>8</sup>-1],
2210 [0, 2<sup>16</sup>-1],
2211 [0, 2<sup>32</sup>-1] and
2212 [0, 2<sup>64</sup>-1]
2217 The signed two's complement word types @code{i8}, @code{i16}, @code{i32} and
2218 @code{i64}, with values drawn from the integer intervals
2220 @math{[-(2^7), 2^7 - 1]},
2221 @math{[-(2^{15}), 2^{15} - 1]},
2222 @math{[-(2^{31}), 2^{31} - 1]} and
2223 @math{[-(2^{63}), 2^{63} - 1]}
2227 [-(2<sup>7</sup>), 2<sup>7</sup>-1],
2228 [-(2<sup>15</sup>), 2<sup>15</sup>-1],
2229 [-(2<sup>31</sup>), 2<sup>31</sup>-1] and
2230 [-(2<sup>63</sup>), 2<sup>63</sup>-1]
2235 The IEEE 754-2008 @code{binary32} and @code{binary64} floating-point types:
2236 @code{f32} and @code{f64}, respectively.
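The intervals above can be cross-checked with a short sketch. It is written in present-day Rust, which is an assumption: that syntax differs from the dialect this manual describes, although the machine type names are the same. The helper names @code{unsigned_max} and @code{signed_range} are hypothetical.

```rust
/// Largest value of an unsigned word type with `bits` bits: 2^bits - 1.
/// Computed in u128 so that 2^64 does not overflow.
fn unsigned_max(bits: u32) -> u128 {
    (1u128 << bits) - 1
}

/// Inclusive range of a two's complement word type with `bits` bits:
/// [-(2^(bits-1)), 2^(bits-1) - 1].
fn signed_range(bits: u32) -> (i128, i128) {
    let half = 1i128 << (bits - 1);
    (-half, half - 1)
}
```

For example, @code{unsigned_max(8)} agrees with the largest @code{u8} value, and @code{signed_range(8)} agrees with the smallest and largest @code{i8} values.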
2240 @subsection Ref.Type.Int
2241 @cindex Machine-dependent types
2242 @cindex Integer types
2246 The Rust type @code{uint}@footnote{A Rust @code{uint} is analogous to a C99
2247 @code{uintptr_t}.} is an unsigned integer type with
2248 target-machine-dependent size. Its size, in bits, is equal to the number of
2249 bits required to hold any memory address on the target machine.
2251 The Rust type @code{int}@footnote{A Rust @code{int} is analogous to a C99
2252 @code{intptr_t}.} is a two's complement signed integer type with
2253 target-machine-dependent size. Its size, in bits, is equal to the size of the
2254 Rust type @code{uint} on the same target machine.
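The size relationship can be observed with a minimal sketch in present-day Rust. This is an assumption in that the pointer-sized integer types are spelled @code{usize} and @code{isize} there, rather than @code{uint} and @code{int} as in this manual. The function names are hypothetical.

```rust
/// Size in bits of the pointer-sized unsigned integer type.
fn uint_bits() -> usize {
    std::mem::size_of::<usize>() * 8
}

/// Size in bits of the pointer-sized signed integer type.
fn int_bits() -> usize {
    std::mem::size_of::<isize>() * 8
}

/// Size in bits of a plain data pointer on the target machine.
fn pointer_bits() -> usize {
    std::mem::size_of::<*const u8>() * 8
}
```

On any given target, all three functions should return the same number; the number itself varies from target to target.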
2256 @node Ref.Type.Float
2257 @subsection Ref.Type.Float
2258 @cindex Machine-dependent types
2259 @cindex Floating-point types
2261 The Rust type @code{float} is a machine-specific type equal to one of the
2262 supported Rust floating-point machine types (@code{f32} or @code{f64}). It is
2263 the largest floating-point type that is directly supported by hardware on the
2264 target machine, or if the target machine has no floating-point hardware
2265 support, the largest floating-point type supported by the software
2266 floating-point library used to support the other floating-point machine types.
2268 Note that due to the preference for hardware-supported floating-point, the
2269 type @code{float} may not be equal to the largest @emph{supported}
2270 floating-point type.
2274 @subsection Ref.Type.Prim
2275 @cindex Primitive types
2276 @cindex Integer types
2277 @cindex Floating-point types
2278 @cindex Character type
2279 @cindex Boolean type
2281 The primitive types are the following:
2285 The ``nil'' type @code{()}, having the single ``nil'' value
2286 @code{()}.@footnote{The ``nil'' value @code{()} is @emph{not} a sentinel
2287 ``null pointer'' value for reference slots; the ``nil'' type is the implicit
2288 return type from functions otherwise lacking a return type, and can be used in
2289 other contexts (such as message-sending or type-parametric code) as a
2292 The boolean type @code{bool} with values @code{true} and @code{false}.
2296 The machine-dependent integer and floating-point types.
2301 @subsection Ref.Type.Big
2302 @cindex Integer types
2303 @cindex Big integer type
2305 The Rust type @code{big}@footnote{A Rust @code{big} is analogous to a Lisp
2306 bignum or a Python long integer.} is an arbitrary precision integer type that
2307 fits in a machine word @emph{when possible} and transparently expands to a
2308 boxed ``big integer'' allocated in the run-time heap when it overflows or
2309 underflows outside of the range of a machine word.
2311 A Rust @code{big} grows to accommodate extra binary digits as they are needed,
2312 by taking extra memory from the memory budget available to each Rust task, and
2313 should only exhaust its range due to memory exhaustion.
2316 @subsection Ref.Type.Text
2319 @cindex Character type
2324 The types @code{char} and @code{str} hold textual data.
2326 A value of type @code{char} is a Unicode character, represented as a 32-bit
2327 unsigned word holding a UCS-4 codepoint.
2329 A value of type @code{str} is a Unicode string, represented as a vector of
2330 8-bit unsigned bytes holding a sequence of UTF-8 encoded codepoints.
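The difference between the two representations shows up when counting. The sketch below uses present-day Rust, which is an assumption: it shares the UTF-8 string representation and the 32-bit character size, but its syntax is not the dialect of this manual. The helper names are hypothetical.

```rust
/// Number of bytes a string occupies in its UTF-8 encoding.
fn utf8_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode codepoints in a string.
fn char_count(s: &str) -> usize {
    s.chars().count()
}
```

A string containing a non-ASCII character has more bytes than codepoints: @code{"h\u@{e9@}llo"} holds five codepoints in six bytes, because U+00E9 takes two bytes in UTF-8.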
2333 @subsection Ref.Type.Rec
2334 @cindex Record types
2335 @cindex Structure types, see @i{Record types}
2337 The record type-constructor forms a new heterogeneous product of
2338 values.@footnote{The record type-constructor is analogous to the @code{struct}
2339 type-constructor in the Algol/C family, the @emph{record} types of the ML
2340 family, or the @emph{structure} types of the Lisp family.} Fields of a record
2341 type are accessed by name and are arranged in memory in the order specified by
2344 An example of a record type and its use:
2346 type point = @{x: int, y: int@};
2347 let p: point = @{x: 10, y: 11@};
2352 @subsection Ref.Type.Tup
2355 The tuple type-constructor forms a new heterogeneous product of
2356 values similar to the record type-constructor. The differences are as follows:
2359 @item tuple elements cannot be mutable, unlike record fields
2360 @item tuple elements are not named and can be accessed only by pattern-matching
2363 Tuple types and values are denoted by listing the types or values of
2364 their elements, respectively, in a parenthesized, comma-separated
2365 list. Single-element tuples are not legal; all tuples have two or more values.
2367 The members of a tuple are laid out in memory contiguously, like a record, in
2368 the order specified by the tuple type.
2370 An example of a tuple type and its use:
2372 type pair = (int,str);
2373 let p: pair = (10,"hello");
2375 assert (b == "world");
2380 @subsection Ref.Type.Vec
2381 @cindex Vector types
2382 @cindex Array types, see @i{Vector types}
2384 The vector type-constructor represents a homogeneous array of values of a
2385 given type. A vector has a fixed size. The kind of a vector type depends on
2386 the kind of its member type, as with other simple structural types.
2388 An example of a vector type and its use:
2390 let v: [int] = [7, 5, 3];
2395 Vectors always @emph{allocate} a storage region sufficient to store the first
2396 power of two worth of elements greater than or equal to the size of the
2397 vector. This behaviour supports idiomatic in-place ``growth'' of a mutable
2398 slot holding a vector:
2401 let v: mutable [int] = [1, 2, 3];
2405 Normal vector concatenation causes the allocation of a fresh vector to hold
2406 the result; in this case, however, the slot holding the vector recycles the
2407 underlying storage in-place (since the reference-count of the underlying
2408 storage is equal to 1).
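The allocation rule can be sketched as plain arithmetic. The code below is present-day Rust and only models the rule as stated; it is not the runtime's implementation. The names @code{reserved_elements} and @code{fits_in_place} are hypothetical, and the behaviour for an empty vector is an assumption.

```rust
/// Number of element slots reserved for a vector of `len` elements under
/// the stated rule: the first power of two greater than or equal to `len`.
/// Assumption: an empty vector reserves one slot.
fn reserved_elements(len: usize) -> usize {
    len.next_power_of_two()
}

/// Whether appending `extra` elements fits in the storage already reserved
/// for a vector of `len` elements, so that growth can happen in place.
fn fits_in_place(len: usize, extra: usize) -> bool {
    len + extra <= reserved_elements(len)
}
```

Under this model a three-element vector reserves four slots, so appending one element fits in place while appending two does not.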
2410 All accessible elements of a vector are always initialized, and access to a
2411 vector is always bounds-checked.
2415 @subsection Ref.Type.Tag
2417 @cindex Union types, see @i{Tag types}
2419 A @emph{tag type} is a nominal, heterogeneous disjoint union
2420 type.@footnote{The @code{tag} type is analogous to a @code{data} constructor
2421 declaration in ML or a @emph{pick ADT} in Limbo.} A @code{tag} @emph{item}
2422 consists of a number of @emph{constructors}, each of which is independently
2423 named and takes an optional tuple of arguments.
2425 Tag types cannot be denoted @emph{structurally} as types, but must be denoted
2426 by named reference to a @emph{tag item} declaration. @xref{Ref.Item.Tag}.
2429 @subsection Ref.Type.Fn
2430 @cindex Function types
2432 The function type-constructor @code{fn} forms new function types. A function
2433 type consists of a sequence of input slots, an optional set of input
2434 constraints (@pxref{Ref.Typestate.Constr}) and an output
2435 slot. @xref{Ref.Item.Fn}.
2437 An example of a @code{fn} type:
2439 fn add(x: int, y: int) -> int @{
2443 let x: int = add(5,7);
2445 type binop = fn(int,int) -> int;
2446 let bo: binop = add;
2451 @subsection Ref.Type.Iter
2452 @cindex Iterator types
2454 The iterator type-constructor @code{iter} forms new iterator types. An
2455 iterator type consists of a sequence of input slots, an optional set of input
2456 constraints and an output slot. @xref{Ref.Item.Iter}.
2458 An example of an @code{iter} type:
2460 iter range(x: int, y: int) -> int @{
2467 for each i: int in range(5,7) @{
2473 @subsection Ref.Type.Obj
2474 @c * Ref.Type.Obj:: Object types.
2475 @cindex Object types
2477 An @dfn{object type} describes values of abstract type that carry some hidden
2478 @emph{fields} and are accessed through a set of un-ordered
2479 @emph{methods}. Every object item (@pxref{Ref.Item.Obj}) implicitly declares
2480 an object type carrying methods with types derived from all the methods of the
2483 Object types can also be declared in isolation, independent of any object item
2484 declaration. Such a ``plain'' object type can be used to describe an interface
2485 that a variety of particular objects may conform to, by supporting a superset
2488 The kind of an object type serves as a restriction on the kinds of fields that
2489 may be stored in it. Unique objects, for example, can only carry unique values
2492 An example of an object type with two separate object items supporting it, and
2493 a client function using both items via the object type:
2502 obj adder(x: @@mutable int) @{
2508 obj sender(c: chan<int>) @{
2510 std::comm::send(c, z);
2514 fn give_ints(t: taker) @{
2520 let p: port<int> = std::comm::mk_port();
2522 let t1: taker = adder(@@mutable 0);
2523 let t2: taker = sender(p.mk_chan());
2532 @node Ref.Type.Constr
2533 @subsection Ref.Type.Constr
2534 @c * Ref.Type.Constr:: Constrained types.
2535 @cindex Constrained types
2537 A @dfn{constrained type} is a type that carries a @emph{formal constraint}
2538 (@pxref{Ref.Typestate.Constr}), which is similar to a normal constraint except
2539 that the @emph{base name} of any slots mentioned in the constraint must be the
2540 special @emph{formal symbol} @emph{*}.
2542 When a constrained type is instantiated in a particular slot declaration, the
2543 formal symbol in the constraint is replaced with the name of the declared slot
2544 and the resulting constraint is checked immediately after the slot is
2545 declared. @xref{Ref.Expr.Check}.
2547 An example of a constrained type with two separate instantiations:
2549 type ordered_range = @{low: int, high: int@} : less_than(*.low, *.high);
2551 let rng1: ordered_range = @{low: 5, high: 7@};
2552 // implicit: 'check less_than(rng1.low, rng1.high);'
2554 let rng2: ordered_range = @{low: 15, high: 17@};
2555 // implicit: 'check less_than(rng2.low, rng2.high);'
2559 @subsection Ref.Type.Type
2560 @c * Ref.Type.Type:: Types describing types.
2568 @section Ref.Typestate
2569 @c * Ref.Typestate:: The static system of predicate analysis.
2570 @cindex Typestate system
2572 Rust programs have a static semantics that determine the types of values
2573 produced by each expression, as well as the @emph{predicates} that hold over
2574 slots in the environment at each point in time during execution.
2576 The latter semantics -- the dataflow analysis of predicates holding over slots
2577 -- is called the @emph{typestate} system.
2580 * Ref.Typestate.Point:: Discrete positions in execution.
2581 * Ref.Typestate.CFG:: The control-flow graph formed by points.
2582 * Ref.Typestate.Constr:: Predicates applied to slots.
2583 * Ref.Typestate.Cond:: Constraints required and implied by a point.
2584 * Ref.Typestate.State:: Constraints that hold at points.
2585 * Ref.Typestate.Check:: Relating dynamic state to static typestate.
2588 @node Ref.Typestate.Point
2589 @subsection Ref.Typestate.Point
2590 @c * Ref.Typestate.Point:: Discrete positions in execution.
2593 Control flows from statement to statement in a block, and through the
2594 evaluation of each expression, from one sub-expression to another. This
2595 sequential control flow is specified as a set of @dfn{points}, each of which
2596 has a set of points before and after it in the implied control flow.
2598 For example, this code:
2605 Consists of 2 statements, 3 expressions and 12 points:
2608 @item the point before the first statement
2609 @item the point before evaluating the static initializer @code{"hello, world"}
2610 @item the point after evaluating the static initializer @code{"hello, world"}
2611 @item the point after the first statement
2612 @item the point before the second statement
2613 @item the point before evaluating the function value @code{print}
2614 @item the point after evaluating the function value @code{print}
2615 @item the point before evaluating the arguments to @code{print}
2616 @item the point before evaluating the symbol @code{s}
2617 @item the point after evaluating the symbol @code{s}
2618 @item the point after evaluating the arguments to @code{print}
2619 @item the point after the second statement
2628 Consists of 1 statement, 7 expressions and 14 points:
2631 @item the point before the statement
2632 @item the point before evaluating the function value @code{print}
2633 @item the point after evaluating the function value @code{print}
2634 @item the point before evaluating the arguments to @code{print}
2635 @item the point before evaluating the arguments to @code{+}
2636 @item the point before evaluating the function value @code{x}
2637 @item the point after evaluating the function value @code{x}
2638 @item the point before evaluating the arguments to @code{x}
2639 @item the point after evaluating the arguments to @code{x}
2640 @item the point before evaluating the function value @code{y}
2641 @item the point after evaluating the function value @code{y}
2642 @item the point before evaluating the arguments to @code{y}
2643 @item the point after evaluating the arguments to @code{y}
2644 @item the point after evaluating the arguments to @code{+}
2645 @item the point after evaluating the arguments to @code{print}
2649 The typestate system reasons over points, rather than statements or
2650 expressions. This may seem counter-intuitive, but points are the more
2651 primitive concept. Another way of thinking about a point is as a set of
2652 @emph{instants in time} at which the state of a task is fixed. By contrast, a
2653 statement or expression represents a @emph{duration in time}, during which the
2654 state of the task changes. The typestate system is concerned with constraining
2655 the possible states of a task's memory at @emph{instants}; it is meaningless
2656 to speak of the state of a task's memory ``at'' a statement or expression, as
2657 each statement or expression is likely to change the contents of memory.
2659 @node Ref.Typestate.CFG
2660 @subsection Ref.Typestate.CFG
2661 @c * Ref.Typestate.CFG:: The control-flow graph formed by points.
2662 @cindex Control-flow graph
2664 Each @emph{point} can be considered a vertex in a directed @emph{graph}. Each
2665 kind of expression or statement implies a number of points @emph{and edges} in
2666 this graph. The edges connect the points within each statement or expression,
2667 as well as between those points and those of nearby statements and expressions
2668 in the program. The edges between points represent @emph{possible} indivisible
2669 control transfers that might occur during execution.
2671 This implicit graph is called the @dfn{control-flow graph}, or @dfn{CFG}.
2673 @node Ref.Typestate.Constr
2674 @subsection Ref.Typestate.Constr
2675 @c * Ref.Typestate.Constr:: Predicates applied to slots.
2679 A @dfn{predicate} is a pure boolean function declared with the keyword
2680 @code{pure}. @xref{Ref.Item.Pred}.
2682 A @dfn{constraint} is a predicate applied to specific slots.
2684 For example, consider the following code:
2687 pure fn is_less_than(a: int, b: int) -> bool @{
2694 check is_less_than(x,y);
2698 This example defines the predicate @code{is_less_than}, and applies it to the
2699 slots @code{x} and @code{y}. The constraint being checked on the third line of
2700 the function is @code{is_less_than(x,y)}.
2702 Predicates can only apply to slots holding immutable values. The slots a
2703 predicate applies to can themselves be mutable, but the types of values held
2704 in those slots must be immutable.
2706 @node Ref.Typestate.Cond
2707 @subsection Ref.Typestate.Cond
2708 @c * Ref.Typestate.Cond:: Constraints required and implied by a point.
2710 @cindex Precondition
2711 @cindex Postcondition
2713 A @dfn{condition} is a set of zero or more constraints.
2715 Each @emph{point} has an associated @emph{condition}:
2718 @item The @dfn{precondition} of a statement or expression is the condition
2719 required at the point before it.
2720 @item The @dfn{postcondition} of a statement or expression is the condition
2721 enforced at the point after it.
2724 Any constraint present in the precondition and @emph{absent} in the
2725 postcondition is considered to be @emph{dropped} by the statement or
2728 @node Ref.Typestate.State
2729 @subsection Ref.Typestate.State
2730 @c * Ref.Typestate.State:: Constraints that hold at points.
2735 The typestate checking system @emph{calculates} an additional condition for
2736 each point called its typestate. For a given statement or expression, we call
2737 the two typestates associated with its two points the prestate and a
2741 @item The @dfn{prestate} of a statement or expression is the typestate of the
2743 @item The @dfn{poststate} of a statement or expression is the typestate of the
2747 A @dfn{typestate} is a condition that has @emph{been determined by the
2748 typestate algorithm} to hold at a point. This is a subtle but important point
2749 to understand: preconditions and postconditions are @emph{inputs} to the
2750 typestate algorithm; prestates and poststates are @emph{outputs} from the
2751 typestate algorithm.
2753 The typestate algorithm analyses the preconditions and postconditions of every
2754 statement and expression in a block, and computes a condition for each
2755 typestate. Specifically:
2758 @item Initially, every typestate is empty.
2759 @item Each statement or expression's poststate is given the union of its
2760 prestate, precondition, and postcondition.
2761 @item Each statement or expression's poststate has the difference between its
2762 precondition and postcondition removed.
2763 @item Each statement or expression's prestate is given the intersection of the
2764 poststates of every predecessor point in the CFG.
2765 @item The previous three steps are repeated until no typestates in the
2769 The typestate algorithm is a very conventional dataflow calculation, and can
2770 be performed using bit-set operations, with one bit per predicate and one
2771 bit-set per condition.
2773 After the typestates of a block are computed, the typestate algorithm checks
2774 that every constraint in the precondition of a statement is satisfied by its
2775 prestate. If any preconditions are not satisfied, the mismatch is considered a
2776 static (compile-time) error.
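The steps above can be sketched as a small fixed-point computation over bit-sets. This is an illustrative sketch in present-day Rust, not the compiler's implementation. The names @code{Node}, @code{typestates} and @code{violations} are hypothetical. Two further assumptions: each node stands for a whole statement or expression, and a node with no predecessors starts with an empty prestate. One @code{u64} per condition limits the sketch to 64 constraints.

```rust
/// One statement or expression in the CFG. Each bit of a `u64` stands for
/// one constraint.
struct Node {
    precondition: u64,
    postcondition: u64,
    /// Indices of the predecessor nodes in the CFG.
    preds: Vec<usize>,
}

/// Repeats the update steps until no typestate changes, and returns the
/// (prestate, poststate) pair computed for each node.
fn typestates(nodes: &[Node]) -> Vec<(u64, u64)> {
    // Initially, every typestate is empty.
    let mut states = vec![(0u64, 0u64); nodes.len()];
    loop {
        let mut changed = false;
        for (i, n) in nodes.iter().enumerate() {
            // Prestate: intersection of the poststates of all predecessors.
            // Assumption: an entry node (no predecessors) has an empty prestate.
            let pre = if n.preds.is_empty() {
                0
            } else {
                n.preds.iter().fold(u64::MAX, |acc, &p| acc & states[p].1)
            };
            // Constraints in the precondition but absent from the
            // postcondition are dropped by this node.
            let dropped = n.precondition & !n.postcondition;
            // Poststate: union of prestate, precondition and postcondition,
            // with the dropped constraints removed.
            let post = (pre | n.precondition | n.postcondition) & !dropped;
            if states[i] != (pre, post) {
                states[i] = (pre, post);
                changed = true;
            }
        }
        if !changed {
            return states;
        }
    }
}

/// Indices of nodes whose precondition is not satisfied by their prestate;
/// each one corresponds to a static (compile-time) error.
fn violations(nodes: &[Node], states: &[(u64, u64)]) -> Vec<usize> {
    nodes
        .iter()
        .enumerate()
        .filter(|(i, n)| n.precondition & !states[*i].0 != 0)
        .map(|(i, _)| i)
        .collect()
}
```

With a straight-line sequence in which node 0 establishes constraint bit 0, node 1 requires bit 0, and node 2 requires bit 1, only node 2 is reported. When a constraint is established on only one of two branches that join, the intersection at the join removes it, and a node requiring it after the join is reported.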
2779 @node Ref.Typestate.Check
2780 @subsection Ref.Typestate.Check
2781 @c * Ref.Typestate.Check:: Relating dynamic state to static typestate.
2782 @cindex Check statement
2783 @cindex Assertions, see @i{Check statement}
2785 The key mechanism that connects run-time semantics and compile-time analysis
2786 of typestates is the use of @code{check} expressions. @xref{Ref.Expr.Check}. A
2787 @code{check} expression guarantees that @emph{if} control were to proceed past
2788 it, the predicate associated with the @code{check} would have succeeded, so
2789 the constraint being checked @emph{statically} holds in subsequent
2790 points.@footnote{A @code{check} expression is similar to an @code{assert}
2791 call in a C program, with the significant difference that the Rust compiler
2792 @emph{tracks} the constraint that each @code{check} expression
2793 enforces. Naturally, @code{check} expressions cannot be omitted from a
2794 ``production build'' of a Rust program the same way @code{asserts} are
2795 frequently disabled in deployed C programs.}
2797 It is important to understand that the typestate system has @emph{no insight}
2798 into the meaning of a particular predicate. Predicates and constraints are not
2799 evaluated in any way at compile time. Predicates are treated as specific (but
2800 unknown) functions applied to specific (also unknown) slots. All the typestate
2801 system does is track which of those predicates -- whatever they calculate --
2802 @emph{must have been checked already} in order for program control to reach a
2803 particular point in the CFG. The fundamental building block, therefore, is the
2804 @code{check} statement, which tells the typestate system ``if control passes
2805 this point, the checked predicate holds''.
2807 From this building block, constraints can be propagated to function signatures
2808 and constrained types, and the responsibility to @code{check} a constraint
2809 pushed further and further away from the site at which the program requires it
2810 to hold in order to execute properly.
2816 @c * Ref.Stmt:: Components of an executable block.
2819 A @dfn{statement} is a component of a block, which is in turn a component of
2820 an outer block-expression, a function or an iterator. When a function is
2821 spawned into a task, the task @emph{executes} statements in an order
2822 determined by the body of the enclosing structure. Each statement causes the
2823 task to perform certain actions.
2825 Rust has two kinds of statement: declarations and expressions.
2827 A declaration serves to introduce a @emph{name} that can be used in the block
2828 @emph{scope} enclosing the statement: all statements before and after the
2829 name, from the previous opening curly-brace (@code{@{}) up to the next closing
2830 curly-brace (@code{@}}).
2832 An expression serves the dual roles of causing side effects and producing a
2833 @emph{value}. Expressions are said to @emph{evaluate to} a value, and the side
2834 effects are caused during @emph{evaluation}. Many expressions contain
2835 sub-expressions as operands; the definition of each kind of expression
2836 dictates whether or not, and in which order, it will evaluate its
2837 sub-expressions, and how the expression's value derives from the value of its
2840 In this way, the structure of execution -- both the overall sequence of
2841 observable side effects and the final produced value -- is dictated by the
2842 structure of expressions. Blocks themselves are expressions, so the nesting
2843 sequence of block, statement, expression, and block can repeatedly nest to an
2847 * Ref.Stmt.Decl:: Statement declaring an item or slot.
2848 * Ref.Stmt.Expr:: Statement evaluating an expression.
2852 @subsection Ref.Stmt.Decl
2853 @c * Ref.Stmt.Decl:: Statement declaring an item or slot.
2854 @cindex Declaration statement
2856 A @dfn{declaration statement} is one that introduces a @emph{name} into the
2857 enclosing statement block. The declared name may denote a new slot or a new
2858 item. The scope of the name extends to the entire containing block, both
2859 before and after the declaration.
2862 * Ref.Stmt.Decl.Item:: Statement declaring an item.
2863 * Ref.Stmt.Decl.Slot:: Statement declaring a slot.
2866 @node Ref.Stmt.Decl.Item
2867 @subsubsection Ref.Stmt.Decl.Item
2868 @c * Ref.Stmt.Decl.Item:: Statement declaring an item.
2870 An @dfn{item declaration statement} has a syntactic form identical to an item
2871 declaration within a module. Declaring an item -- a function, iterator,
2872 object, type or module -- locally within a statement block is simply a way of
2873 restricting its scope to a narrow region containing all of its uses; it is
otherwise identical in meaning to declaring the item outside the statement
block.
2877 Note: there is no implicit capture of the function's dynamic environment when
2878 declaring a function-local item.
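As a rough illustration, the following sketch uses present-day Rust syntax,
which differs from the dialect described in this draft. The names
@code{sum_of_squares} and @code{square} are hypothetical. The local function
is visible only inside the enclosing body, and it receives everything it
needs as an explicit parameter because it cannot capture the enclosing
frame's slots.

```rust
/// Sums the squares of `values` using a helper declared inside the body.
fn sum_of_squares(values: &[i64]) -> i64 {
    // A function-local item: its name is in scope only within
    // `sum_of_squares`. It cannot refer to `values` or `total`;
    // its only input is its own parameter.
    fn square(n: i64) -> i64 {
        n * n
    }

    let mut total = 0;
    for &v in values {
        total += square(v);
    }
    total
}
```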
2880 @node Ref.Stmt.Decl.Slot
2881 @subsubsection Ref.Stmt.Decl.Slot
@c * Ref.Stmt.Decl.Slot:: Statement declaring a slot.
2884 @cindex Variable, see @i{Local slot}
2885 @cindex Type inference
A @dfn{slot declaration statement} has one of two forms:
2890 @item @code{let} @var{pattern} @var{optional-init};
2891 @item @code{let} @var{pattern} : @var{type} @var{optional-init};
2894 Where @var{type} is a type expression, @var{pattern} is an irrefutable pattern
2895 (often just the name of a single slot), and @var{optional-init} is an optional
2896 initializer. If present, the initializer consists of either an equals sign
2897 (@code{=}) or move operator (@code{<-}), followed by an expression.
2899 Both forms introduce a new slot into the containing block scope. The new slot
2900 is visible across the entire scope, but is initialized only at the point
2901 following the declaration statement.
2903 The former form, with no type annotation, causes the compiler to infer the
2904 static type of the slot through unification with the types of values assigned
2905 to the slot in the remaining code in the block scope. Inference only occurs on
2906 frame-local slots, not argument slots. Function, iterator and object
signatures must always declare types for all argument slots.
2908 @xref{Ref.Mem.Slot}.
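The two declaration forms, and inference from a later assignment, can be
sketched in present-day Rust, whose syntax and scoping rules differ from this
draft in several respects. The function name @code{slot_forms} is
hypothetical. Note that the argument-free signature still spells out its
result type in full, while the local slots do not all need annotations.

```rust
/// Demonstrates annotated, inferred, and late-initialized local slots.
fn slot_forms() -> (i32, u8, String) {
    // No annotation: the type is inferred (here from the return type).
    let a = 40 + 2;

    // Explicit annotation with an initializer.
    let b: u8 = 7;

    // No annotation and no initializer: the type is inferred by
    // unification with the value assigned later in the block.
    let c;
    c = String::from("late");

    (a, b, c)
}
```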
2911 @subsection Ref.Stmt.Expr
2912 @c * Ref.Stmt.Expr:: Statement evaluating an expression
2913 @cindex Expression statement
2915 An @dfn{expression statement} is one that evaluates an expression and drops
2916 its result. The purpose of an expression statement is often to cause the side
2917 effects of the expression's evaluation.
2922 @c * Ref.Expr:: Parsed and primitive expressions.
2927 * Ref.Expr.Copy:: Expression for copying a value.
2928 * Ref.Expr.Call:: Expression for calling a function.
2929 * Ref.Expr.Bind:: Expression for binding arguments to functions.
2930 * Ref.Expr.Ret:: Expression for stopping and producing a value.
2931 @c * Ref.Expr.Be:: Expression for stopping and executing a tail call.
2932 * Ref.Expr.Put:: Expression for pausing and producing a value.
2933 * Ref.Expr.Fail:: Expression for causing task failure.
2934 * Ref.Expr.Log:: Expression for logging values to diagnostic buffers.
2935 * Ref.Expr.Note:: Expression for logging values during failure.
2936 * Ref.Expr.While:: Expression for simple conditional looping.
2937 * Ref.Expr.Break:: Expression for terminating a loop.
2938 * Ref.Expr.Cont:: Expression for terminating a single loop iteration.
2939 * Ref.Expr.For:: Expression for looping over strings and vectors.
2940 * Ref.Expr.Foreach:: Expression for looping via an iterator.
2941 * Ref.Expr.If:: Expression for simple conditional branching.
2942 * Ref.Expr.Alt:: Expression for complex conditional branching.
2943 * Ref.Expr.Prove:: Expression for static assertion of typestate.
2944 * Ref.Expr.Check:: Expression for dynamic assertion of typestate.
2945 * Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate.
2946 * Ref.Expr.Assert:: Expression for halting the program if a boolean condition fails to hold.
2947 * Ref.Expr.IfCheck:: Expression for dynamic testing of typestate.
2948 * Ref.Expr.AnonObj:: Expression for extending objects with additional methods.
2953 @subsection Ref.Expr.Copy
2954 @c * Ref.Expr.Copy:: Expression for copying a value.
2955 @cindex Copy expression
2956 @cindex Assignment operator, see @i{Copy expression}
2958 A @dfn{copy expression} consists of an @emph{lval} followed by an equals-sign
2959 (@code{=}) and a primitive expression. @xref{Ref.Expr}.
2961 Executing a copy expression causes the value denoted by the expression --
2962 either a value or a primitive combination of values -- to be copied into the
2963 memory location denoted by the @emph{lval}.
2965 A copy may entail the adjustment of reference counts, execution of destructors,
2966 or similar adjustments in order to respect the path through the memory graph
2967 implied by the @code{lval}, as well as any existing value held in the memory
being written-to. All such adjustment is automatic and implied by the @code{=}
operator.
2971 An example of three different copy expressions:
2979 @subsection Ref.Expr.Call
2980 @c * Ref.Expr.Call:: Expression for calling a function.
2981 @cindex Call expression
2982 @cindex Function calls
A @dfn{call expression} invokes a function, providing a tuple of input slots
and a reference slot to serve as the function's output, bound to the @var{lval}
on the left-hand side of the call. If the function eventually returns, then
the expression completes.
2989 A call expression statically requires that the precondition declared in the
2990 callee's signature is satisfied by the expression prestate. In this way,
2991 typestates propagate through function boundaries. @xref{Ref.Typestate}.
2993 An example of a call expression:
2995 let x: int = add(1, 2);
2999 @subsection Ref.Expr.Bind
3000 @c * Ref.Expr.Bind:: Expression for binding arguments to functions.
3001 @cindex Bind expression
3005 A @dfn{bind expression} constructs a new function from an existing
3006 function.@footnote{The @code{bind} expression is analogous to the @code{bind}
3007 expression in the Sather language.} The new function has zero or more of its
3008 arguments @emph{bound} into a new, hidden boxed tuple that holds the
3009 bindings. For each concrete argument passed in the @code{bind} expression, the
3010 corresponding parameter in the existing function is @emph{omitted} as a
3011 parameter of the new function. For each argument passed the placeholder symbol
3012 @code{_} in the @code{bind} expression, the corresponding parameter of the
3013 existing function is @emph{retained} as a parameter of the new function.
3015 Any subsequent invocation of the new function with residual arguments causes
3016 invocation of the existing function with the combination of bound arguments
3017 and residual arguments that was specified during the binding.
3019 An example of a @code{bind} expression:
@example
fn add(x: int, y: int) -> int @{
    ret x + y;
@}
type single_param_fn = fn(int) -> int;

let add4: single_param_fn = bind add(4, _);

let add5: single_param_fn = bind add(_, 5);

assert (add(4,5) == add4(5));
assert (add(4,5) == add5(4));
@end example
3035 A @code{bind} expression generally stores a copy of the bound arguments in the
3036 hidden, boxed tuple, owned by the resulting first-class function. For each
3037 bound slot in the bound function's signature, space is allocated in the hidden
3038 tuple and populated with a copy of the bound value.
3040 The @code{bind} expression is a lightweight mechanism for simulating the more
3041 elaborate construct of @emph{lexical closures} that exist in other
3042 languages. Rust has no support for lexical closures, but many realistic uses
3043 of them can be achieved with @code{bind} expressions.
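For comparison only, the same partial application can be sketched in
present-day Rust, which postdates this draft and does provide closures. The
helper names @code{bind_first} and @code{bind_second} are hypothetical. The
@code{move} keyword copies the bound argument into the closure's hidden
environment, which plays the role of the boxed tuple described above.

```rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}

/// Binds the first argument of `add`; the second stays a parameter.
fn bind_first(x: i32) -> impl Fn(i32) -> i32 {
    // `move` stores a copy of `x` inside the returned function value.
    move |y| add(x, y)
}

/// Binds the second argument of `add`; the first stays a parameter.
fn bind_second(y: i32) -> impl Fn(i32) -> i32 {
    move |x| add(x, y)
}
```

Here @code{bind_first(4)} corresponds to @code{bind add(4, _)} and
@code{bind_second(5)} to @code{bind add(_, 5)}.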
3047 @subsection Ref.Expr.Ret
3048 @c * Ref.Expr.Ret:: Expression for stopping and producing a value.
3049 @cindex Return expression
3051 Executing a @code{ret} expression@footnote{A @code{ret} expression is analogous
3052 to a @code{return} expression in the C family.} copies a value into the output
3053 slot of the current function, destroys the current function activation frame,
3054 and transfers control to the caller frame.
3056 An example of a @code{ret} expression:
3058 fn max(a: int, b: int) -> int @{
3068 @subsection Ref.Expr.Be
3069 @c * Ref.Expr.Be:: Expression for stopping and executing a tail call.
3070 @cindex Be expression
Executing a @code{be} expression@footnote{A @code{be} expression is
analogous to a @code{become} expression in Newsqueak or Alef.} destroys the
3075 current function activation frame and replaces it with an activation frame for
3076 the called function. In other words, @code{be} executes a tail-call. The
3077 syntactic form of a @code{be} expression is therefore limited to @emph{tail
3078 position}: its argument must be a @emph{call expression}, and it must be the
3079 last expression in a block.
3081 An example of a @code{be} expression:
3083 fn print_loop(n: int) @{
The above example executes in constant space, replacing each frame with a new
copy of itself.
3099 @subsection Ref.Expr.Put
3100 @c * Ref.Expr.Put:: Expression for pausing and producing a value.
3101 @cindex Put expression
3104 Executing a @code{put} expression copies a value into the output slot of the
3105 current iterator, suspends execution of the current iterator, and transfers
3106 control to the current put-recipient frame.
3108 A @code{put} expression is only valid within an iterator. @footnote{A
@code{put} expression is analogous to a @code{yield} expression in the CLU and
Sather languages, or in more recent languages providing a ``generator''
facility, such as Python, JavaScript or C#. Like the generators of CLU and
3112 Sather but @emph{unlike} these later languages, Rust's iterators reside on the
3113 stack and obey a strict stack discipline.} The current put-recipient will
3114 eventually resume the suspended iterator containing the @code{put} expression,
3115 either continuing execution after the @code{put} expression, or terminating its
3116 execution and destroying the iterator frame.
3120 @subsection Ref.Expr.Fail
3121 @c * Ref.Expr.Fail:: Expression for causing task failure.
3122 @cindex Fail expression
3126 Executing a @code{fail} expression causes a task to enter the @emph{failing}
3127 state. In the @emph{failing} state, a task unwinds its stack, destroying all
3128 frames and freeing all resources until it reaches its entry frame, at which
3129 point it halts execution in the @emph{dead} state.
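The unwinding behaviour can be approximated in present-day Rust, where a
thread stands in for a task and @code{panic!} stands in for @code{fail}. This
is a sketch under those assumptions, not a description of the runtime in this
draft. The names @code{Resource}, @code{inner}, @code{outer} and
@code{run_failing_task} are hypothetical. Each frame owns a resource whose
destructor records its release, so the order of destruction during unwinding
becomes observable.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

type ReleaseLog = Arc<Mutex<Vec<&'static str>>>;

/// A resource that records its own release when its frame is destroyed.
struct Resource {
    name: &'static str,
    log: ReleaseLog,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.log.lock().unwrap().push(self.name);
    }
}

fn inner(log: &ReleaseLog) {
    let _held = Resource { name: "inner", log: Arc::clone(log) };
    // The task enters the failing state here.
    panic!("simulated task failure");
}

fn outer(log: &ReleaseLog) {
    let _held = Resource { name: "outer", log: Arc::clone(log) };
    inner(log);
}

/// Runs the failing task and reports (did it fail, release order).
fn run_failing_task() -> (bool, Vec<&'static str>) {
    let log: ReleaseLog = Arc::new(Mutex::new(Vec::new()));
    let for_task = Arc::clone(&log);
    // `join` returns `Err` when the spawned thread unwound to its entry frame.
    let outcome = thread::spawn(move || outer(&for_task)).join();
    let order = log.lock().unwrap().clone();
    (outcome.is_err(), order)
}
```

The innermost frame is destroyed first, so the release order is
@code{inner} followed by @code{outer}.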
3132 @subsection Ref.Expr.Log
3133 @c * Ref.Expr.Log:: Expression for logging values to diagnostic buffers.
3134 @cindex Log expression
3137 Executing a @code{log} expression may, depending on runtime configuration,
3138 cause a value to be appended to an internal diagnostic logging buffer provided
3139 by the runtime or emitted to a system console. Log expressions are enabled or
3140 disabled dynamically at run-time on a per-task and per-item
3141 basis. @xref{Ref.Run.Log}.
3147 @subsection Ref.Expr.Note
3148 @c * Ref.Expr.Note:: Expression for logging values during failure.
3149 @cindex Note expression
3154 A @code{note} expression has no effect during normal execution. The purpose of
3155 a @code{note} expression is to provide additional diagnostic information to the
3156 logging subsystem during task failure. @xref{Ref.Expr.Log}. Using @code{note}
3157 expressions, normal diagnostic logging can be kept relatively sparse, while
3158 still providing verbose diagnostic ``back-traces'' when a task fails.
3160 When a task is failing, control frames @emph{unwind} from the innermost frame
3161 to the outermost, and from the innermost lexical block within an unwinding
3162 frame to the outermost. When unwinding a lexical block, the runtime processes
3163 all the @code{note} expressions in the block sequentially, from the first
3164 expression of the block to the last. During processing, a @code{note}
3165 expression has equivalent meaning to a @code{log} expression: it causes the
runtime to append the argument of the @code{note} to the internal logging
buffer.
3169 An example of a @code{note} expression:
3171 fn read_file_lines(path: str) -> [str] @{
3174 let f: file = open_read(path);
3175 for each s: str in lines(f) @{
3182 In this example, if the task fails while attempting to open or read a file,
3183 the runtime will log the path name that was being read. If the function
3184 completes normally, the runtime will not log the path.
3186 A value that is marked by a @code{note} expression is @emph{not} copied aside
3187 when control passes through the @code{note}. In other words, if a @code{note}
3188 expression notes a particular @var{lval}, and code after the @code{note}
3189 mutates that slot, and then a subsequent failure occurs, the @emph{mutated}
3190 value will be logged during unwinding, @emph{not} the original value that was
denoted by the @var{lval} at the moment control passed through the @code{note}
expression.
3194 @node Ref.Expr.While
3195 @subsection Ref.Expr.While
3196 @c * Ref.Expr.While:: Expression for simple conditional looping.
3197 @cindex While expression
3199 @cindex Control-flow
3201 A @code{while} expression is a loop construct. A @code{while} loop may be
3202 either a simple @code{while} or a @code{do}-@code{while} loop.
3204 In the case of a simple @code{while}, the loop begins by evaluating the
3205 boolean loop conditional expression. If the loop conditional expression
3206 evaluates to @code{true}, the loop body block executes and control returns to
3207 the loop conditional expression. If the loop conditional expression evaluates
3208 to @code{false}, the @code{while} expression completes.
3210 In the case of a @code{do}-@code{while}, the loop begins with an execution of
3211 the loop body. After the loop body executes, it evaluates the loop conditional
3212 expression. If it evaluates to @code{true}, control returns to the beginning
3213 of the loop body. If it evaluates to @code{false}, control exits the loop.
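The difference between the two forms can be sketched in present-day Rust,
which keeps @code{while} but has no @code{do}-@code{while}; the second form
is written here with @code{loop} and a trailing test. The function names are
hypothetical. Each function returns the number of times its body ran.

```rust
/// Simple `while`: the condition is tested before every body execution.
fn simple_while(start: u32, limit: u32) -> u32 {
    let mut i = start;
    let mut runs = 0;
    while i < limit {
        runs += 1;
        i += 1;
    }
    runs
}

/// `do`-`while` shape: the body runs once before the first test.
fn do_while(start: u32, limit: u32) -> u32 {
    let mut i = start;
    let mut runs = 0;
    loop {
        runs += 1;
        i += 1;
        // Trailing condition: leave the loop once `i < limit` is false.
        if i >= limit {
            break;
        }
    }
    runs
}
```

When the condition is false on entry, the simple form runs its body zero
times and the @code{do}-@code{while} form runs it exactly once.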
3215 An example of a simple @code{while} expression:
3223 An example of a @code{do}-@code{while} expression:
3231 @node Ref.Expr.Break
3232 @subsection Ref.Expr.Break
3233 @c * Ref.Expr.Break:: Expression for terminating a loop.
3234 @cindex Break expression
3236 @cindex Control-flow
3238 Executing a @code{break} expression immediately terminates the innermost loop
3239 enclosing it. It is only permitted in the body of a loop.
3242 @subsection Ref.Expr.Cont
3243 @c * Ref.Expr.Cont:: Expression for terminating a single loop iteration.
3244 @cindex Continue expression
3246 @cindex Control-flow
3248 Executing a @code{cont} expression immediately terminates the current iteration
3249 of the innermost loop enclosing it, returning control to the loop
3250 @emph{head}. In the case of a @code{while} loop, the head is the conditional
3251 expression controlling the loop. In the case of a @code{for} or @code{for
each} loop, the head is the iterator or vector-element increment controlling the
loop.
3255 A @code{cont} expression is only permitted in the body of a loop.
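A small sketch of both expressions in present-day Rust, where @code{cont} is
spelled @code{continue}. The function name @code{sum_odds_until} and its
parameters are hypothetical.

```rust
/// Sums the odd values in `values`, stopping at the first `stop_at`.
fn sum_odds_until(values: &[u32], stop_at: u32) -> u32 {
    let mut total = 0;
    for &v in values {
        if v == stop_at {
            // Terminates the innermost enclosing loop.
            break;
        }
        if v % 2 == 0 {
            // Ends this iteration; control returns to the loop head.
            continue;
        }
        total += v;
    }
    total
}
```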
3259 @subsection Ref.Expr.For
3260 @c * Ref.Expr.For:: Expression for looping over strings and vectors.
3261 @cindex For expression
3263 @cindex Control-flow
3265 A @dfn{for loop} is controlled by a vector or string. The for loop
3266 bounds-checks the underlying sequence @emph{once} when initiating the loop,
3267 then repeatedly copies each value of the underlying sequence into the element
3268 variable, executing the loop body once per copy.
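In present-day Rust, which differs in detail from this draft, looping over a
vector or a string looks roughly as follows. The function names are
hypothetical, and the bounds-checking strategy of the present-day compiler is
not claimed to match the description above.

```rust
/// Adds up the elements of a slice, copying each one into `v`.
fn total(values: &[i32]) -> i32 {
    let mut sum = 0;
    for v in values.iter().copied() {
        sum += v;
    }
    sum
}

/// Counts lowercase ASCII vowels, visiting the string one character at a time.
fn count_vowels(text: &str) -> usize {
    let mut count = 0;
    for ch in text.chars() {
        if "aeiou".contains(ch) {
            count += 1;
        }
    }
    count
}
```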
3272 let v: [foo] = [a, b, c];
3279 @node Ref.Expr.Foreach
3280 @subsection Ref.Expr.Foreach
3281 @c * Ref.Expr.Foreach:: Expression for general conditional looping.
3282 @cindex Foreach expression
3284 @cindex Control-flow
A @dfn{foreach loop} is denoted by the @code{for each} keywords, and is
3287 controlled by an iterator. The loop executes once for each value @code{put} by
3288 the iterator. When the iterator returns or fails, the loop terminates.
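Present-day Rust has no @code{put} expression; the nearest counterpart is a
type implementing the @code{Iterator} trait, where returning @code{Some}
loosely corresponds to a @code{put} and returning @code{None} to the iterator
returning. The sketch below rests on that analogy, and the names
@code{Countdown} and @code{collect_countdown} are hypothetical.

```rust
/// Yields `remaining`, `remaining - 1`, ..., 1 and then stops.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            // The iterator is finished; the driving loop terminates.
            None
        } else {
            let value = self.remaining;
            self.remaining -= 1;
            // Hands one value to the loop body.
            Some(value)
        }
    }
}

/// Runs the loop body once per value produced by the iterator.
fn collect_countdown(from: u32) -> Vec<u32> {
    let mut seen = Vec::new();
    for n in (Countdown { remaining: from }) {
        seen.push(n);
    }
    seen
}
```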
3290 Example of a foreach loop:
3294 for each s: str in str::split(txt, "\n") @{
3295 vec::push(lines, s);
3301 @subsection Ref.Expr.If
3302 @c * Ref.Expr.If:: Expression for simple conditional branching.
3303 @cindex If expression
3304 @cindex Control-flow
3306 An @code{if} expression is a conditional branch in program control. The form of
3307 an @code{if} expression is a condition expression, followed by a consequent
3308 block, any number of @code{else if} conditions and blocks, and an optional
3309 trailing @code{else} block. The condition expressions must have type
3310 @code{bool}. If a condition expression evaluates to @code{true}, the
3311 consequent block is executed and any subsequent @code{else if} or @code{else}
3312 block is skipped. If a condition expression evaluates to @code{false}, the
3313 consequent block is skipped and any subsequent @code{else if} condition is
3314 evaluated. If all @code{if} and @code{else if} conditions evaluate to @code{false}
3315 then any @code{else} block is executed.
3318 @subsection Ref.Expr.Alt
3319 @c * Ref.Expr.Alt:: Expression for complex conditional branching.
3320 @cindex Alt expression
3321 @cindex Control-flow
3322 @cindex Switch expression, see @i{Alt expression}
3324 An @code{alt} expression is a multi-directional branch in program control.
3325 There are two kinds of @code{alt} expression: pattern @code{alt} expressions
3326 and @code{alt type} expressions.
3328 The form of each kind of @code{alt} is similar: an initial @emph{head} that
3329 describes the criteria for branching, followed by a sequence of zero or more
3330 @emph{arms}, each of which describes a @emph{case} and provides a @emph{block}
3331 of expressions associated with the case. When an @code{alt} is executed,
3332 control enters the head, determines which of the cases to branch to, branches
3333 to the block associated with the chosen case, and then proceeds to the
3334 expression following the @code{alt} when the case block completes.
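The pattern form of this construct corresponds loosely to @code{match} in
present-day Rust. The following sketch uses slice patterns rather than the
tag constructors of this draft, and the function name @code{describe} is
hypothetical. It shows a head expression, several arms, and the rule that the
first matching arm is chosen.

```rust
/// Describes a list by matching it against patterns in order.
fn describe(list: &[i32]) -> String {
    match list {
        // Binds the first two elements; matches any list of length >= 2.
        [a, b, ..] => format!("starts with {} then {}", a, b),
        // A literal in a pattern; only reached by one-element lists.
        [10, ..] => String::from("a lone 10"),
        [] => String::from("empty"),
        // Placeholder: matches whatever the earlier arms did not.
        _ => String::from("something else"),
    }
}
```

A list beginning with 10 and containing a second element takes the first
arm, not the second, because arms are tried in order.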
3337 * Ref.Expr.Alt.Pat:: Expression for branching on pattern matches.
3338 * Ref.Expr.Alt.Type:: Expression for branching on types.
3341 @node Ref.Expr.Alt.Pat
3342 @subsubsection Ref.Expr.Alt.Pat
3343 @c * Ref.Expr.Alt.Pat:: Expression for branching on pattern matches.
3344 @cindex Pattern alt expression
3345 @cindex Control-flow
3347 A pattern @code{alt} expression branches on a @emph{pattern}. The exact form of
3348 matching that occurs depends on the pattern. Patterns consist of some
3349 combination of literals, tag constructors, variable binding specifications and
3350 placeholders (@code{_}). A pattern @code{alt} has a @emph{head expression},
3351 which is the value to compare to the patterns. The type of the patterns must
3352 equal the type of the head expression.
3354 To execute a pattern @code{alt} expression, first the head expression is
3355 evaluated, then its value is sequentially compared to the patterns in the arms
3356 until a match is found. The first arm with a matching @code{case} pattern is
3357 chosen as the branch target of the @code{alt}, any variables bound by the
pattern are assigned to local slots in the arm's block, and control enters the
block.
3361 An example of a pattern @code{alt} expression:
3364 type list<X> = tag(nil, cons(X, @@list<X>));
3366 let x: list<int> = cons(10, cons(11, nil));
3369 case (cons(a, cons(b, _))) @{
3372 case (cons(v=10, _)) @{
3382 @node Ref.Expr.Alt.Type
3383 @subsubsection Ref.Expr.Alt.Type
3384 @c * Ref.Expr.Alt.Type:: Expression for branching on type.
3385 @cindex Type alt expression
3386 @cindex Control-flow
3388 An @code{alt type} expression is similar to a pattern @code{alt}, but branches
3389 on the @emph{type} of its head expression, rather than the value. The head
3390 expression of an @code{alt type} expression must be of type @code{any}, and the
3391 arms of the expression are slot patterns rather than value patterns. Control
3392 branches to the arm with a @code{case} that matches the @emph{actual type} of
3393 the value in the @code{any}.
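Present-day Rust has no @code{alt type}, but branching on the dynamic type of
a value can be approximated with the standard library's @code{std::any::Any}
trait and its @code{downcast_ref} method. The sketch below is such an
approximation, and the function name @code{describe_any} is hypothetical.

```rust
use std::any::Any;

/// Branches on the actual type of the value behind `value`.
fn describe_any(value: &dyn Any) -> String {
    if let Some(n) = value.downcast_ref::<i32>() {
        // Taken only when the stored value really is an `i32`.
        format!("int {}", n)
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("string {}", s)
    } else {
        String::from("unknown type")
    }
}
```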
3395 An example of an @code{alt type} expression:
3404 case (list<int> li) @{
3405 ret int_list_sum(li);
3407 case (list<X> lx) @{
3417 @node Ref.Expr.Prove
3418 @subsection Ref.Expr.Prove
3419 @c * Ref.Expr.Prove:: Expression for static assertion of typestate.
3420 @cindex Prove expression
3421 @cindex Typestate system
3423 A @code{prove} expression has no run-time effect. Its purpose is to statically
3424 check (and document) that its argument constraint holds at its expression entry
3425 point. If its argument typestate does not hold, under the typestate algorithm,
3426 the program containing it will fail to compile.
3428 @node Ref.Expr.Check
3429 @subsection Ref.Expr.Check
3430 @c * Ref.Expr.Check:: Expression for dynamic assertion of typestate.
3431 @cindex Check expression
3432 @cindex Typestate system
3434 A @code{check} expression connects dynamic assertions made at run-time to the
3435 static typestate system. A @code{check} expression takes a constraint to check
3436 at run-time. If the constraint holds at run-time, control passes through the
3437 @code{check} and on to the next expression in the enclosing block. If the
3438 condition fails to hold at run-time, the @code{check} expression behaves as a
3439 @code{fail} expression.
3441 The typestate algorithm is built around @code{check} expressions, and in
3442 particular the fact that control @emph{will not pass} a check expression with a
3443 condition that fails to hold. The typestate algorithm can therefore assume
3444 that the (static) postcondition of a @code{check} expression includes the
3445 checked constraint itself. From there, the typestate algorithm can perform
3446 dataflow calculations on subsequent expressions, propagating conditions forward
3447 and statically comparing implied states and their
3448 specifications. @xref{Ref.Typestate}.
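Present-day Rust has no typestate system. One way to approximate the
discipline described here is to let a type carry the checked fact, so that a
function requiring the constraint can only be called with a value that passed
the run-time check. This is a sketch of that substitute technique, not of the
typestate algorithm itself; the names @code{Even}, @code{is_even},
@code{check} and @code{half} are hypothetical.

```rust
/// An integer that has passed the `is_even` check.
struct Even(i64);

fn is_even(x: i64) -> bool {
    x % 2 == 0
}

impl Even {
    /// Plays the role of `check even(x)`: control does not pass this
    /// point unless the predicate holds.
    fn check(x: i64) -> Even {
        assert!(is_even(x), "predicate even({}) does not hold", x);
        Even(x)
    }
}

/// Plays the role of a function with the precondition `even(x)`:
/// the only way to obtain its argument is through `Even::check`.
fn half(x: Even) -> i64 {
    x.0 / 2
}
```

Passing a plain @code{i64} to @code{half} is rejected at compile time, which
mirrors the ``cannot call here'' situation before the @code{check}.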
3451 pure fn even(x: int) -> bool @{
3455 fn print_even(x: int) : even(x) @{
3462 // Cannot call print_even(y) here.
3466 // Can call print_even(y) here, since even(y) now holds.
3471 @node Ref.Expr.Claim
3472 @subsection Ref.Expr.Claim
3473 @c * Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate.
3474 @cindex Claim expression
3475 @cindex Typestate system
3477 A @code{claim} expression is an unsafe variant on a @code{check} expression
3478 that is not actually checked at runtime. Thus, using a @code{claim} implies a
proof obligation to ensure, without compiler assistance, that an assertion
always holds.
Setting a runtime flag can turn all @code{claim} expressions into @code{check}
expressions in a compiled Rust program, but the default is to not check the
assertion contained in a @code{claim}. The idea behind @code{claim} is that
performance profiling might identify a few bottlenecks in the code where
actually checking a given callee's predicate is too expensive; @code{claim}
allows the code to typecheck without removing the predicate check at every
other call site.
3489 @node Ref.Expr.IfCheck
3490 @subsection Ref.Expr.IfCheck
3491 @c * Ref.Expr.IfCheck:: Expression for dynamic testing of typestate.
3492 @cindex If check expression
3493 @cindex Typestate system
3494 @cindex Control-flow
An @code{if check} expression combines an @code{if} expression and a @code{check}
3497 expression in an indivisible unit that can be used to build more complex
3498 conditional control-flow than the @code{check} expression affords.
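As a loose analogy in present-day Rust, which has no typestate system, a
constructor returning @code{Option} behaves like @code{if check}: the caller
decides what to do on each branch, and the failing form can be rebuilt from
it. All names in this sketch (@code{Even}, @code{if_check}, @code{half},
@code{half_or_zero}, @code{half_or_fail}) are hypothetical.

```rust
/// An integer known to be even.
struct Even(i64);

impl Even {
    /// Tests the predicate and, on success, hands back proof of it.
    fn if_check(x: i64) -> Option<Even> {
        if x % 2 == 0 {
            Some(Even(x))
        } else {
            None
        }
    }
}

/// Requires the predicate to hold for its argument.
fn half(x: Even) -> i64 {
    x.0 / 2
}

/// Conditional control flow: the caller supplies the alternative branch.
fn half_or_zero(x: i64) -> i64 {
    if let Some(even) = Even::if_check(x) {
        half(even)
    } else {
        0
    }
}

/// The plain `check` behaviour, rebuilt from the more primitive test:
/// the alternative branch simply fails.
fn half_or_fail(x: i64) -> i64 {
    match Even::if_check(x) {
        Some(even) => half(even),
        None => panic!("predicate even({}) does not hold", x),
    }
}
```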
3500 In fact, @code{if check} is a ``more primitive'' expression than @code{check};
3501 instances of the latter can be rewritten as instances of the former. The
3502 following two examples are equivalent:
3505 Example using @code{check}:
3512 Equivalent example using @code{if check}:
3521 @node Ref.Expr.Assert
3522 @subsection Ref.Expr.Assert
3523 @c * Ref.Expr.Assert:: Expression that halts the program if a boolean condition fails to hold.
3526 An @code{assert} expression is similar to a @code{check} expression, except
3527 the condition may be any boolean-typed expression, and the compiler makes no
3528 use of the knowledge that the condition holds if the program continues to
3529 execute after the @code{assert}.
3531 @node Ref.Expr.AnonObj
3532 @subsection Ref.Expr.AnonObj
3533 @c * Ref.Expr.AnonObj:: Expression that extends an object with additional methods.
3534 @cindex Anonymous objects
3536 An @emph{anonymous object} expression extends an existing object with methods.
3541 @c * Ref.Run:: Organization of runtime services.
3542 @cindex Runtime library
3544 The Rust @dfn{runtime} is a relatively compact collection of C and Rust code
3545 that provides fundamental services and datatypes to all Rust tasks at
3546 run-time. It is smaller and simpler than many modern language runtimes. It is
3547 tightly integrated into the language's execution model of memory, tasks,
3548 communication, reflection, logging and signal handling.
3551 * Ref.Run.Mem:: Runtime memory management service.
3552 * Ref.Run.Type:: Runtime built-in type services.
3553 * Ref.Run.Comm:: Runtime communication service.
3554 * Ref.Run.Log:: Runtime logging system.
3555 * Ref.Run.Sig:: Runtime signal handler.
3559 @subsection Ref.Run.Mem
3560 @c * Ref.Run.Mem:: Runtime memory management service.
3561 @cindex Memory allocation
3563 The runtime memory-management system is based on a @emph{service-provider
3564 interface}, through which the runtime requests blocks of memory from its
3565 environment and releases them back to its environment when they are no longer
3566 in use. The default implementation of the service-provider interface consists
3567 of the C runtime functions @code{malloc} and @code{free}.
3569 The runtime memory-management system in turn supplies Rust tasks with
3570 facilities for allocating, extending and releasing stacks, as well as
3571 allocating and freeing boxed values.
3574 @subsection Ref.Run.Type
@c * Ref.Run.Type:: Runtime built-in type services.
3576 @cindex Built-in types
3578 The runtime provides C and Rust code to assist with various built-in types,
3579 such as vectors, strings, bignums, and the low level communication system
3580 (ports, channels, tasks).
3582 Support for other built-in types such as simple types, tuples, records, and
3583 tags is open-coded by the Rust compiler.
3586 @subsection Ref.Run.Comm
3587 @c * Ref.Run.Comm:: Runtime communication service.
3588 @cindex Communication
3592 The runtime provides code to manage inter-task communication. This includes
3593 the system of task-lifecycle state transitions depending on the contents of
3594 queues, as well as code to copy values between queues and their recipients and
3595 to serialize values for transmission over operating-system inter-process
3596 communication facilities.
3599 @subsection Ref.Run.Log
3600 @c * Ref.Run.Log:: Runtime logging system.
3603 The runtime contains a system for directing logging expressions to a logging
3604 console and/or internal logging buffers. @xref{Ref.Expr.Log}. Logging
3605 expressions can be enabled or disabled via a two-dimensional filtering process:
3613 Each @emph{item} (module, function, iterator, object, type) in Rust has a
3614 static path within its crate module, and can have logging enabled or
3615 disabled on a path-prefix basis.
3621 Each @emph{task} in a running Rust program has a unique ownership relation
3622 through the task ownership tree, and can have logging enabled or disabled on
3623 an ownership-ancestry basis.
3626 Logging is integrated into the language for efficiency reasons, as well as the
3627 need to filter logs based on these two built-in dimensions.
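The two filtering dimensions can be sketched as follows in present-day Rust.
This is an illustration only and not the runtime's implementation: the names
@code{LogFilter}, @code{path_has_prefix} and @code{should_log} are
hypothetical, tasks are identified by plain integers, and the sketch assumes
that a log expression is emitted only when @emph{both} dimensions enable it.

```rust
/// Hypothetical filter state: enabled item-path prefixes and enabled
/// task ancestors.
struct LogFilter {
    enabled_paths: Vec<String>,
    enabled_tasks: Vec<u32>,
}

/// True when `prefix` matches `path` on whole `::`-separated components.
fn path_has_prefix(path: &str, prefix: &str) -> bool {
    let mut components = path.split("::");
    prefix
        .split("::")
        .all(|segment| components.next() == Some(segment))
}

impl LogFilter {
    /// `ancestry` lists task ids from the root task down to the task
    /// executing the log expression.
    fn should_log(&self, item_path: &str, ancestry: &[u32]) -> bool {
        // Dimension 1: the item's static path, matched by prefix.
        let path_ok = self
            .enabled_paths
            .iter()
            .any(|prefix| path_has_prefix(item_path, prefix));
        // Dimension 2: the task's ownership ancestry.
        let task_ok = ancestry.iter().any(|id| self.enabled_tasks.contains(id));
        path_ok && task_ok
    }
}
```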
3630 @subsection Ref.Run.Sig
3631 @c * Ref.Run.Sig:: Runtime signal handler.
3634 The runtime signal-handling system is driven by a signal-dispatch table and a
3635 signal queue associated with each task. Sending a signal to a task inserts the
3636 signal into the task's signal queue and marks the task as having a pending
3637 signal. At the next scheduling opportunity, the runtime processes signals in
3638 the task's queue using its dispatch table. The signal queue memory is charged
3639 to the task; if the queue grows too big, the task will fail.
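The mechanism can be modelled with a short sketch in present-day Rust. It is
a toy model under stated assumptions rather than the runtime's code: the
signal set, the handler type, the @code{max_queue} limit and every name below
are hypothetical, and the ``next scheduling opportunity'' is represented by
an explicit call to @code{schedule}.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Signal {
    Interrupt,
    Terminate,
}

/// A handler records what it did in the task's trace.
type Handler = fn(&mut Vec<String>);

fn on_interrupt(trace: &mut Vec<String>) {
    trace.push(String::from("interrupt"));
}

fn on_terminate(trace: &mut Vec<String>) {
    trace.push(String::from("terminate"));
}

struct Task {
    queue: VecDeque<Signal>,
    pending: bool,
    failed: bool,
    max_queue: usize,
    dispatch: HashMap<Signal, Handler>,
    trace: Vec<String>,
}

impl Task {
    fn new(max_queue: usize) -> Task {
        let mut dispatch: HashMap<Signal, Handler> = HashMap::new();
        dispatch.insert(Signal::Interrupt, on_interrupt);
        dispatch.insert(Signal::Terminate, on_terminate);
        Task {
            queue: VecDeque::new(),
            pending: false,
            failed: false,
            max_queue,
            dispatch,
            trace: Vec::new(),
        }
    }

    /// Queues a signal and marks the task; an over-full queue fails the task.
    fn send(&mut self, signal: Signal) {
        if self.failed {
            return;
        }
        if self.queue.len() >= self.max_queue {
            self.failed = true;
            self.pending = false;
            self.queue.clear();
            return;
        }
        self.queue.push_back(signal);
        self.pending = true;
    }

    /// Drains the queue through the dispatch table, in arrival order.
    fn schedule(&mut self) {
        if !self.pending {
            return;
        }
        while let Some(signal) = self.queue.pop_front() {
            if let Some(&handler) = self.dispatch.get(&signal) {
                handler(&mut self.trace);
            }
        }
        self.pending = false;
    }
}
```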
3641 @c ############################################################
3642 @c end main body of nodes
3643 @c ############################################################
3656 @c indent-tabs-mode: nil
3657 @c buffer-file-coding-system: utf-8-unix
3658 @c compile-command: "make -C $RBUILD -k 2>&1 | sed -e 's/\\/x\\//x:\\//g'";