doc/rust.texi

   1 \input texinfo   @c -*-texinfo-*-
   2 @c %**start of header
   3 @setfilename rust.info
   4 @settitle Rust Documentation
   5 @setchapternewpage odd
   6 @c %**end of header
   7
   8 @include version.texi
   9
  10 @ifinfo
  11 This manual is for the ``Rust'' programming language.
  12
  13
  14 @uref{http://www.rust-lang.org}
  15
  16 Version: @gitversion
  17
  18
  19 Copyright 2006-2010 Graydon Hoare
  20
  21 Copyright 2009-2011 Mozilla Foundation
  22
  23 See accompanying LICENSE.txt for terms.
  24
  25 @end ifinfo
  26
  27 @dircategory Programming
  28 @direntry
  29 * rust: (rust).         Rust programming language
  30 @end direntry
  31
  32 @titlepage
  33 @title Rust
  34 @subtitle A safe, concurrent, practical language.
  35 @author Graydon Hoare
  36 @author Mozilla Foundation
  37
  38 @page
  39 @vskip 0pt plus 1filll
  40
  41
  42 @uref{http://rust-lang.org}
  43
  44 Version: @gitversion
  45
  46 @sp 2
  47
  48 Copyright @copyright{} 2006-2010 Graydon Hoare
  49
  50 Copyright @copyright{} 2009-2011 Mozilla Foundation
  51
  52 See accompanying LICENSE.txt for terms.
  53
  54 @end titlepage
  55
  56 @everyfooting @| @emph{-- Draft @today --} @|
  57
  58 @ifnottex
  59 @node Top
  60 @top Top
  61
  62 Rust Documentation
  63
  64 @end ifnottex
  65
  66 @menu
  67 * Disclaimer::                 Notes on a work in progress.
  68 * Introduction::               Background, intentions, lineage.
  69 * Tutorial::                   Gentle introduction to reading Rust code.
  70 * Reference::                  Systematic reference of language elements.
  71 * Index::                      Index
  72 @end menu
  73
  74 @ifnottex
  75 Complete table of contents
  76 @end ifnottex
  77
  78 @contents
  79
  80 @c ############################################################
  81 @c Disclaimer
  82 @c ############################################################
  83
  84 @node    Disclaimer
  85 @chapter Disclaimer
  86
  87 To the reader,
  88
  89 Rust is a work in progress. The language continues to evolve as the design
  90 shifts and is fleshed out in working code. Certain parts work, certain parts
  91 do not, certain parts will be removed or changed.
  92
  93 This manual is a snapshot written in the present tense. Some features
  94 described do not yet exist in working code. Some may be temporary. It
  95 is a @emph{draft}, and we ask that you not take anything you read here
  96 as either definitive or final. The manual is to help you get a sense
  97 of the language and its organization, not to serve as a complete
  98 specification. At least not yet.
  99
 100 If you have suggestions to make, please try to focus them on @emph{reductions}
 101 to the language: possible features that can be combined or omitted. At this
 102 point, every ``additive'' feature we're likely to support is already on the
 103 table. The task ahead involves combining, trimming, and implementing.
 104
 105
 106 @c ############################################################
 107 @c Introduction
 108 @c ############################################################
 109
 110 @node    Introduction
 111 @chapter Introduction
 112
 113 @quotation
 114   We have to fight chaos, and the most effective way of doing that is
 115   to prevent its emergence.
 116 @flushright
 117                                    - Edsger Dijkstra
 118 @end flushright
 119 @end quotation
 120 @sp 2
 121
 122 Rust is a curly-brace, block-structured expression language. It visually
 123 resembles the C language family, but differs significantly in syntactic and
 124 semantic details. Its design is oriented toward concerns of ``programming in
 125 the large'', that is, of creating and maintaining @emph{boundaries} -- both
 126 abstract and operational -- that preserve large-system @emph{integrity},
 127 @emph{availability} and @emph{concurrency}.
 128
 129 It supports a mixture of imperative procedural, concurrent actor,
 130 object-oriented and pure functional styles. Rust also supports generic
 131 programming and metaprogramming, in both static and dynamic styles.
 132
 133 @menu
 134 * Goals::                      Intentions, motivations.
 135 * Sales Pitch::                A summary for the impatient.
 136 * Influences::                 Relationship to past languages.
 137 @end menu
 138
 139
 140 @node    Goals
 141 @section Goals
 142
 143 The language design pursues the following goals:
 144
 145 @sp 1
 146 @itemize
 147 @item Compile-time error detection and prevention.
 148 @item Run-time fault tolerance and containment.
 149 @item System building, analysis and maintenance affordances.
 150 @item Clarity and precision of expression.
 151 @item Implementation simplicity.
 152 @item Run-time efficiency.
 153 @item High concurrency.
 154 @end itemize
 155 @sp 1
 156
 157 Note that most of these goals are @emph{engineering} goals, not showcases for
 158 sophisticated language technology. Most of the technology in Rust is
 159 @emph{old} and has been seen decades earlier in other languages.
 160
 161 All new languages are developed in a technological context. Rust's goals arise
 162 from the context of writing large programs that interact with the internet --
 163 both servers and clients -- and are thus much more concerned with
 164 @emph{safety} and @emph{concurrency} than older generations of program. Our
 165 experience is that these two forces do not conflict; rather they drive system
 166 design decisions toward extensive use of @emph{partitioning} and
 167 @emph{statelessness}. Rust aims to make these a more natural part of writing
 168 programs, within the niche of lower-level, practical, resource-conscious
 169 languages.
 170
 171
 172 @page
 173 @node    Sales Pitch
 174 @section Sales Pitch
 175
 176 The following comprises a brief ``sales pitch'' overview of the salient
 177 features of Rust, relative to other languages.
 178
 179 @itemize
 180
 181 @sp 1
 182 @item No @code{null} pointers
 183
 184 The initialization state of every slot is statically computed as part of the
 185 typestate system (see below), and requires that all slots are initialized
 186 before use. There is no @code{null} value; uninitialized slots are
 187 uninitialized and can only be written to, not read.
 188
 189 The common use for @code{null} in other languages -- as a sentinel value -- is
 190 subsumed into the more general facility of disjoint union types. A program
 191 must explicitly model its use of such types.
 192
 193 @sp 1
 194 @item Lightweight tasks with no shared values
 195
 196 Like many @emph{actor} languages, Rust provides an isolation (and concurrency)
 197 model based on lightweight tasks scheduled by the language runtime. These
 198 tasks are very inexpensive and statically unable to manipulate one another's
 199 local memory. Breaking the rule of task isolation is possible only by calling
 200 external (C/C++) code.
 201
 202 Inter-task communication is typed, asynchronous, and simplex, based on passing
 203 messages over channels to ports.
 204
 205 @sp 1
 206 @item Predictable native code, simple runtime
 207
 208 The meaning and cost of every operation within a Rust program is intended to
 209 be easy to model for the reader. The code should not ``surprise'' the
 210 programmer once it has been compiled.
 211
 212 Rust compiles to native code. Rust compilation units are large and the
 213 compilation model is designed around multi-file, whole-library or
 214 whole-program optimization. The compiled units are standard loadable objects
 215 (ELF, PE, Mach-O) containing standard debug information (DWARF) and are
 216 compatible with existing, standard low-level tools (disassemblers, debuggers,
 217 profilers, dynamic loaders). The compiled units include custom metadata that
 218 carries full type and version information.
 219
 220 The Rust runtime library is a small collection of support code for scheduling,
 221 memory management, inter-task communication, reflection and runtime
 222 linkage. This library is written in standard C++ and is quite
 223 straightforward. It presents a simple interface to embeddings. No
 224 research-level virtual machine, JIT or garbage collection technology is
 225 required. It should be relatively easy to adapt a Rust front-end on to many
 226 existing native toolchains.
 227
 228 @sp 1
 229 @item Integrated system-construction facility
 230
 231 The units of compilation of Rust are multi-file amalgamations called
 232 @emph{crates}. A crate is described by a separate, declarative type of source
 233 file that guides the compilation of the crate, its packaging, its versioning,
 234 and its external dependencies. Crates are also the units of distribution and
 235 loading. Significantly: the dependency graph of crates is @emph{acyclic} and
 236 @emph{anonymous}: there is no global namespace for crates, and module-level
 237 recursion cannot cross crate barriers.
 238
 239 Unlike many languages, individual modules do @emph{not} carry all the
 240 mechanisms or restrictions of crates. Modules and crates serve different
 241 roles.
 242
 243 @sp 1
 244 @item Static control over memory allocation, packing and aliasing.
 245
 246 Many values in Rust are allocated @emph{within} their containing stack-frame
 247 or parent structure. Numbers, records, tuples and tags are all allocated this
 248 way. To allocate such values in the heap, they must be explicitly
 249 @emph{boxed}. A @dfn{box} is a pointer to a heap allocation that holds another
 250 value, its @emph{content}. Boxes may be either shared or unique, depending
 251 on which sort of storage management is desired.
 252
 253 Boxing and unboxing in Rust is explicit, though in some cases (such as
 254 name-component dereferencing) Rust will automatically dereference a
 255 box to access its content. Box values can be passed and assigned
 256 independently, like pointers in C; the difference is that in Rust they always
 257 point to live contents, and are not subject to pointer arithmetic.
 258
 259 In addition to boxes, Rust supports a kind of pass-by-pointer slot called a
 260 reference. Forming or releasing a reference does not perform reference-count
 261 operations; references can only be formed on values that will provably outlive
 262 the reference. References are not ``general values'', in the sense that they
 263 cannot be independently manipulated. They are a lot like C++'s references,
 264 except that they are safe: the compiler ensures that they always point to live
 265 values.
 266
 267 In addition, every slot (stack-local allocation or reference) has a static
 268 initialization state that is calculated by the typestate system. This permits
 269 late initialization of slots in functions with complex control-flow, while
 270 still guaranteeing that every use of a slot occurs after it has been
 271 initialized.
 272
 273 @sp 1
 274 @item Immutable data by default
 275
 276 All types in Rust are immutable by default. A field within a type must be
 277 declared as @code{mutable} in order to be modified.
 278
 279 @sp 1
 280 @item Move semantics and unique pointers
 281
 282 Rust differentiates copying values from moving them, and permits moving and
 283 swapping values explicitly rather than copying. Moving can be more efficient and,
 284 crucially, represents an indivisible transfer of ownership of a value from its
 285 source to its destination.
 286
 287 In addition, pointer types in Rust come in several varieties. One important
 288 type of pointer related to move semantics is the @emph{unique} pointer,
 289 denoted @code{~}, which is statically guaranteed to be the only pointer
 290 pointing to its referent at any given time.
 291
 292 Combining move-semantics and unique pointers, Rust permits a very lightweight
 293 form of inter-task communication: values are sent between tasks by moving, and
 294 only types composed of unique pointers can be sent. This statically ensures
 295 there can never be sharing of data between tasks, while keeping the costs of
 296 transferring data between tasks as cheap as moving a pointer.
 297
 298 @sp 1
 299 @item Stack-based iterators
 300
 301 Rust provides a type of function-like multiple-invocation iterator that is
 302 very efficient: the iterator state lives only on the stack and is tightly
 303 coupled to the loop that invoked it.
 304
 305 @sp 1
 306 @item Direct interface to C code
 307
 308 Rust can load and call many C library functions simply by declaring
 309 them. Calling a C function is an ``unsafe'' action, and can only be taken
 310 within a block marked with the @code{unsafe} keyword. Every unsafe block
 311 in a Rust compilation unit must be explicitly authorized in the crate file.
 312
 313 @sp 1
 314 @item Structural algebraic data types
 315
 316 The Rust type system is primarily structural, and contains the standard
 317 assortment of useful ``algebraic'' type constructors from functional
 318 languages, such as function types, tuples, record types, vectors, and
 319 nominally-tagged disjoint unions. Such values may be @emph{pattern-matched} in
 320 an @code{alt} expression.
 321
 322 @sp 1
 323 @item Generic code
 324
 325 Rust supports a simple form of parametric polymorphism: functions, iterators,
 326 types and objects can be parametrized by other types.
 327
 328 @sp 1
 329 @item Argument binding
 330
 331 Rust provides a mechanism of partially binding arguments to functions,
 332 producing new functions that accept the remaining un-bound arguments. This
 333 mechanism combines some of the features of lexical closures with some of the
 334 features of currying, in a smaller and simpler package.
 335
 336 @sp 1
 337 @item Local type inference
 338
 339 To save some quantity of programmer key-pressing, Rust supports local type
 340 inference: signatures of functions, objects and iterators always require type
 341 annotation, but within the body of a function or iterator many slots can be
 342 declared without a type, and Rust will infer the slot's type from its uses.
 343
 344 @sp 1
 345 @item Structural object system
 346
 347 Rust has a lightweight object system based on structural object types: there
 348 is no ``class hierarchy'' nor any concept of inheritance. Method overriding
 349 and object restriction are performed explicitly on object values, which are
 350 little more than order-insensitive records of methods sharing a common private
 351 value.
 352
 353 @sp 1
 354 @item Static metaprogramming (syntactic extension)
 355
 356 Rust supports a system for syntactic extensions that can be loaded into the
 357 compiler, to implement user-defined notations, macros, program-generators and
 358 the like. These notations are @emph{marked} using a special form of
 359 bracketing, such that a reader unfamiliar with the extension can still parse
 360 the surrounding text by skipping over the bracketed ``extension text''.
 361
 362 @sp 1
 363 @item Idempotent failure
 364
 365 If a task fails due to a signal, or if it evaluates the special @code{fail}
 366 expression, it enters the @emph{failing} state. A failing task unwinds its
 367 control stack, frees all of its owned resources (executing destructors) and
 368 enters the @emph{dead} state. Failure is idempotent and non-recoverable.
 369
 370 @sp 1
 371 @item Supervision hierarchy
 372
 373 Rust has a system for propagating task-failures, either directly to a
 374 supervisor task, or indirectly by sending a message into a channel.
 375
 376 @sp 1
 377 @item Resource types with deterministic destruction
 378
 379 Rust includes a type constructor for @emph{resource} types, which have an
 380 associated destructor and cannot be moved in memory. Resources types belong to
 381 the kind of @emph{pinned} types, and any value that directly contains a
 382 resource is implicitly pinned as well.
 383
 384 Resources can only contain types from the pinned or unique kinds of type,
 385 which means that unlike finalizers, there is always a deterministic, top-down
 386 order to run the destructors of a resource and its sub-resources.
 387
 388 @sp 1
 389 @item Typestate system
 390
 391 Every storage slot in a Rust frame participates in not only a conventional
 392 structural static type system, describing the interpretation of memory in the
 393 slot, but also a @emph{typestate} system. The static typestates of a program
 394 describe the set of @emph{pure, dynamic predicates} that provably hold over
 395 some set of slots, at each point in the program's control-flow graph within
 396 each frame. The static calculation of the typestates of a program is a
 397 function-local dataflow problem, and handles user-defined predicates in a
 398 similar fashion to the way the type system permits user-defined types.
 399
 400 A short way of thinking of this is: types statically model values,
 401 typestates statically model @emph{assertions that hold} before and
 402 after statements and expressions.
 403
 404 @end itemize
 405
 406
 407 @page
 408 @node    Influences
 409 @section Influences
 410 @sp 2
 411
 412 @quotation
 413   The essential problem that must be solved in making a fault-tolerant
 414   software system is therefore that of fault-isolation. Different programmers
 415   will write different modules, some modules will be correct, others will have
 416   errors. We do not want the errors in one module to adversely affect the
 417   behaviour of a module which does not have any errors.
 418
 419 @flushright
 420                                    - Joe Armstrong
 421 @end flushright
 422 @end quotation
 423 @sp 2
 424
 425 @quotation
 426   In our approach, all data is private to some process, and processes can
 427   only communicate through communications channels. @emph{Security}, as used
 428   in this paper, is the property which guarantees that processes in a system
 429   cannot affect each other except by explicit communication.
 430
 431   When security is absent, nothing which can be proven about a single module
 432   in isolation can be guaranteed to hold when that module is embedded in a
 433   system [...]
 434 @flushright
 435                                    - Robert Strom and Shaula Yemini
 436 @end flushright
 437 @end quotation
 438 @sp 2
 439
 440 @quotation
 441   Concurrent and applicative programming complement each other. The
 442   ability to send messages on channels provides I/O without side effects,
 443   while the avoidance of shared data helps keep concurrent processes from
 444   colliding.
 445 @flushright
 446                                    - Rob Pike
 447 @end flushright
 448 @end quotation
 449 @sp 2
 450
 451 @page
 452 Rust is not a particularly original language. It may however appear unusual by
 453 contemporary standards, as its design elements are drawn from a number of
 454 ``historical'' languages that have, with a few exceptions, fallen out of
 455 favour. Five prominent lineages contribute the most:
 456
 457 @itemize
 458 @sp 1
 459 @item
 460 The NIL (1981) and Hermes (1990) family. These languages were developed by
 461 Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM
 462 Watson Research Center (Yorktown Heights, NY, USA).
 463
 464 @sp 1
 465 @item
 466 The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes
 467 Wikstr@"om, Mike Williams and others in their group at the Ericsson Computer
 468 Science Laboratory (@"Alvsj@"o, Stockholm, Sweden) .
 469
 470 @sp 1
 471 @item
 472 The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim,
 473 Heinz Schmidt and others in their group at The International Computer Science
 474 Institute of the University of California, Berkeley (Berkeley, CA, USA).
 475
 476 @sp 1
 477 @item
 478 The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These languages
 479 were developed by Rob Pike, Phil Winterbottom, Sean Dorward and others in
 480 their group at Bell labs Computing Sciences Reserch Center (Murray Hill, NJ,
 481 USA).
 482
 483 @sp 1
 484 @item
 485 The Napier (1985) and Napier88 (1988) family. These languages were developed
 486 by Malcolm Atkinson, Ron Morrison and others in their group at the University
 487 of St. Andrews (St. Andrews, Fife, UK).
 488 @end itemize
 489
 490 @sp 1
 491 Additional specific influences can be seen from the following languages:
 492 @itemize
 493 @item The structural algebraic types and compilation manager of SML.
 494 @item The deterministic destructor system of C++.
 495 @end itemize
 496
 497 @c ############################################################
 498 @c Tutorial
 499 @c ############################################################
 500
 501 @node    Tutorial
 502 @chapter Tutorial
 503
 504 @emph{TODO}.
 505
 506 @c ############################################################
 507 @c Reference
 508 @c ############################################################
 509
 510 @node    Reference
 511 @chapter Reference
 512
 513 @menu
 514 * Ref.Lex::                     Lexical structure.
 515 * Ref.Path::                    References to items.
 516 * Ref.Gram::                    Grammar.
 517 * Ref.Comp::                    Compilation and component model.
 518 * Ref.Mem::                     Semantic model of memory.
 519 * Ref.Task::                    Semantic model of tasks.
 520 * Ref.Item::                    The components of a module.
 521 * Ref.Type::                    The types of values held in memory.
 522 * Ref.Typestate::               Predicates that hold at points in time.
 523 * Ref.Stmt::                    Components of an executable block.
 524 * Ref.Expr::                    Units of execution and evaluation.
 525 * Ref.Run::                     Organization of runtime services.
 526 @end menu
 527
 528 @node    Ref.Lex
 529 @section Ref.Lex
 530 @c * Ref.Lex::                     Lexical structure.
 531 @cindex Lexical structure
 532 @cindex Token
 533
 534 The lexical structure of a Rust source file or crate file is defined in terms
 535 of Unicode character codes and character properties.
 536
 537 Groups of Unicode character codes and characters are organized into
 538 @emph{tokens}. Tokens are defined as the longest contiguous sequence of
 539 characters within the same token type (identifier, keyword, literal, symbol),
 540 or interrupted by ignored characters.
 541
 542 Most tokens in Rust follow rules similar to the C family.
 543
 544 Most tokens (including whitespace, keywords, operators and structural symbols)
 545 are drawn from the ASCII-compatible range of Unicode. Identifiers are drawn
 546 from Unicode characters specified by the @code{XID_start} and
 547 @code{XID_continue} rules given by UAX #31@footnote{Unicode Standard Annex
 548 #31: Unicode Identifier and Pattern Syntax}. String and character literals may
 549 include the full range of Unicode characters.
 550
 551 @emph{TODO: formalize this section much more}.
 552
 553 @menu
 554 * Ref.Lex.Ignore::       Ignored characters.
 555 * Ref.Lex.Ident::        Identifier tokens.
 556 * Ref.Lex.Key::          Keyword tokens.
 557 * Ref.Lex.Res::          Reserved tokens.
 558 * Ref.Lex.Num::          Numeric tokens.
 559 * Ref.Lex.Text::         String and character tokens.
 560 * Ref.Lex.Syntax::       Syntactic extension tokens.
 561 * Ref.Lex.Sym::          Special symbol tokens.
 562 @end menu
 563
 564 @node       Ref.Lex.Ignore
 565 @subsection Ref.Lex.Ignore
 566 @c * Ref.Lex.Ignore::            Ignored tokens.
 567
 568 Characters considered to be @emph{whitespace} or @emph{comment} are ignored,
 569 and are not considered as tokens. They serve only to delimit tokens. Rust is
 570 otherwise a free-form language.
 571
 572 @dfn{Whitespace} is any of the following Unicode characters: U+0020 (space),
 573 U+0009 (tab, @code{'\t'}), U+000A (LF, @code{'\n'}), U+000D (CR, @code{'\r'}).
 574
 575 @dfn{Comments} are @emph{single-line comments} or @emph{multi-line comments}.
 576
 577 A @dfn{single-line comment} is any sequence of Unicode characters beginning
 578 with U+002F U+002F (@code{"//"}) and extending to the next U+000A character,
 579 @emph{excluding} cases in which such a sequence occurs within a string literal
 580 token.
 581
 582 A @dfn{multi-line comments} is any sequence of Unicode characters beginning
 583 with U+002F U+002A (@code{"/*"}) and ending with U+002A U+002F (@code{"*/"}),
 584 @emph{excluding} cases in which such a sequence occurs within a string literal
 585 token.  Multi-line comments may be nested.
 586
 587 @node       Ref.Lex.Ident
 588 @subsection Ref.Lex.Ident
 589 @c * Ref.Lex.Ident::             Identifier tokens.
 590 @cindex Identifier token
 591
 592 Identifiers follow the rules given by Unicode Standard Annex #31, in the form
 593 closed under NFKC normalization, @emph{excluding} those tokens that are
 594 otherwise defined as keywords or reserved
 595 tokens. @xref{Ref.Lex.Key}. @xref{Ref.Lex.Res}.
 596
 597 That is: an identifier starts with any character having derived property
 598 @code{XID_Start} and continues with zero or more characters having derived
 599 property @code{XID_Continue}; and such an identifier is NFKC-normalized during
 600 lexing, such that all subsequent comparison of identifiers is performed on the
 601 NFKC-normalized forms.
 602
 603 @emph{TODO: define relationship between Unicode and Rust versions}.
 604
 605 @footnote{This identifier syntax is a superset of the identifier syntaxes of C
 606 and Java, and is modeled on Python PEP #3131, which formed the definition of
 607 identifiers in Python 3.0 and later.}
 608
 609 @node       Ref.Lex.Key
 610 @subsection Ref.Lex.Key
 611 @c * Ref.Lex.Key::                Keyword tokens.
 612
 613 The keywords are:
 614 @cindex Keywords
 615
 616 @sp 2
 617
 618 @multitable @columnfractions .15 .15 .15 .15 .15
 619 @item @code{use}
 620 @tab @code{syntax}
 621 @tab @code{mutable}
 622 @tab @code{native}
 623 @tab @code{unchecked}
 624 @item @code{mod}
 625 @tab @code{import}
 626 @tab @code{export}
 627 @tab @code{let}
 628 @tab @code{const}
 629 @item @code{auth}
 630 @tab @code{unsafe}
 631 @tab @code{as}
 632 @tab @code{self}
 633 @tab @code{log}
 634 @item @code{bind}
 635 @tab @code{type}
 636 @tab @code{true}
 637 @tab @code{false}
 638 @tab @code{any}
 639 @item @code{int}
 640 @tab @code{uint}
 641 @tab @code{float}
 642 @tab @code{char}
 643 @tab @code{bool}
 644 @item @code{u8}
 645 @tab @code{u16}
 646 @tab @code{u32}
 647 @tab @code{u64}
 648 @tab @code{f32}
 649 @item @code{i8}
 650 @tab @code{i16}
 651 @tab @code{i32}
 652 @tab @code{i64}
 653 @tab @code{f64}
 654 @item @code{tag}
 655 @tab @code{vec}
 656 @tab @code{str}
 657 @tab @code{with}
 658 @tab @code{fn}
 659 @item @code{iter}
 660 @tab @code{pure}
 661 @tab @code{obj}
 662 @tab @code{resource}
 663 @tab @code{if}
 664 @item @code{else}
 665 @tab @code{alt}
 666 @tab @code{in}
 667 @tab @code{do}
 668 @tab @code{while}
 669 @item @code{break}
 670 @tab @code{cont}
 671 @tab @code{note}
 672 @tab @code{assert}
 673 @tab @code{claim}
 674 @item @code{check}
 675 @tab @code{prove}
 676 @tab @code{fail}
 677 @tab @code{for}
 678 @tab @code{each}
 679 @item @code{ret}
 680 @tab @code{put}
 681 @tab @code{be}
 682 @end multitable
 683
 684 @node       Ref.Lex.Res
 685 @subsection Ref.Lex.Res
 686 @c * Ref.Lex.Res::                Reserved tokens.
 687
 688 The reserved tokens are:
 689 @cindex Reserved
 690
 691 @sp 2
 692
 693 @multitable @columnfractions .15 .15 .15 .15 .15
 694 @item @code{f16}
 695 @tab @code{f80}
 696 @tab @code{f128}
 697 @item @code{m32}
 698 @tab @code{m64}
 699 @tab @code{m128}
 700 @tab @code{dec}
 701 @end multitable
 702
 703 @sp 2
 704
 705 At present these tokens have no defined meaning in the Rust language.
 706
 707 These tokens may correspond, in some current or future implementation,
 708 to additional built-in types for decimal floating-point, extended
 709 binary and interchange floating-point formats, as defined in the IEEE
 710 754-1985 and IEEE 754-2008 specifications.
 711
 712
 713 @node       Ref.Lex.Num
 714 @subsection Ref.Lex.Num
 715 @c * Ref.Lex.Num::                 Numeric tokens.
 716 @cindex Number token
 717 @cindex Hex token
 718 @cindex Decimal token
 719 @cindex Binary token
 720 @cindex Floating-point token
 721
 722 @c FIXME: This discussion isn't quite right since 'f' and 'i' can be used as
 723 @c suffixes
 724
 725 A @dfn{number literal} is either an @emph{integer literal} or a
 726 @emph{floating-point literal}.
 727
 728 @sp 1
 729 An @dfn{integer literal} has one of three forms:
 730 @enumerate
 731 @item A @dfn{decimal literal} starts with a @emph{decimal digit} and continues
 732 with any mixture of @emph{decimal digits} and @emph{underscores}.
 733
 734 @item A @dfn{hex literal} starts with the character sequence U+0030
 735 U+0078 (@code{"0x"}) and continues as any mixture @emph{hex digits}
 736 and @emph{underscores}.
 737
 738 @item A @dfn{binary literal} starts with the character sequence U+0030
 739 U+0062 (@code{"0b"}) and continues as any mixture @emph{binary digits}
 740 and @emph{underscores}.
 741
 742 @end enumerate
 743
 744 By default, an integer literal is of type @code{int}. An integer literal may
 745 be followed (immediately, without any spaces) by a @dfn{integer suffix}, which
 746 changes the type of the literal. There are three kinds of integer literal
 747 suffix:
 748
 749 @enumerate
 750 @item The @code{u} suffix gives the literal type @code{uint}.
 751 @item The @code{g} suffix gives the literal type @code{big}.
 752 @item Each of the signed and unsigned machine types @code{u8}, @code{i8},
 753 @code{u16}, @code{i16}, @code{u32}, @code{i32}, @code{u64} and @code{i64}
 754 give the literal the corresponding machine type.
 755 @end enumerate
 756
 757 @sp 1
 758 A @dfn{floating-point literal} has one of two forms:
 759 @enumerate
 760 @item Two @emph{decimal literals} separated by a period
 761 character U+002E ('.'), with an optional @emph{exponent} trailing after the
 762 second @emph{decimal literal}.
 763 @item A single @emph{decimal literal} followed by an @emph{exponent}.
 764 @end enumerate
 765
 766 By default, a floating-point literal is of type @code{float}. A floating-point
 767 literal may be followed (immediately, without any spaces) by a
 768 @dfn{floating-point suffix}, which changes the type of the literal. There are
 769 only two floating-point suffixes: @code{f32} and @code{f64}. Each of these
 770 gives the floating point literal the associated type, rather than
 771 @code{float}.
 772
 773 A set of suffixes are also reserved to accommodate literal support for
 774 types corresponding to reserved tokens. The reserved suffixes are @code{f16},
 775 @code{f80}, @code{f128}, @code{m}, @code{m32}, @code{m64} and @code{m128}.
 776
 777 @sp 1
 778 A @dfn{hex digit} is either a @emph{decimal digit} or else a character in the
 779 ranges U+0061-U+0066 and U+0041-U+0046 (@code{'a'}-@code{'f'},
 780 @code{'A'}-@code{'F'}).
 781
 782 A @dfn{binary digit} is either the character U+0030 or U+0031 (@code{'0'} or
 783 @code{'1'}).
 784
 785 An @dfn{exponent} begins with either of the characters U+0065 or U+0045
 786 (@code{'e'} or @code{'E'}), followed by an optional @emph{sign character},
 787 followed by a trailing @emph{decimal literal}.
 788
 789 A @dfn{sign character} is either U+002B or U+002D (@code{'+'} or @code{'-'}).
 790
 791
 792 Examples of integer literals of various forms:
 793 @example
 794 123;                               // type int
 795 123u;                              // type uint
 796 123_u;                             // type uint
 797 0xff00;                            // type int
 798 0xffu8;                            // type u8
 799 0b1111_1111_1001_0000_i32;         // type i32
 800 0xffff_ffff_ffff_ffff_ffff_ffffg;  // type big
 801 @end example
 802
 803
 804 Examples of floating-point literals of various forms:
 805 @example
 806 123.0;                             // type float
 807 0.1;                               // type float
 808 0.1f32;                            // type f32
 809 12E+99_f64;                        // type f64
 810 @end example
 811
 812
 813 @node       Ref.Lex.Text
 814 @subsection Ref.Lex.Text
 815 @c * Ref.Lex.Key::                 String and character tokens.
 816 @cindex String token
 817 @cindex Character token
 818 @cindex Escape sequence
 819 @cindex Unicode
 820
 821 A @dfn{character literal} is a single Unicode character enclosed within two
 822 U+0027 (single-quote) characters, with the exception of U+0027 itself, which
 823 must be @emph{escaped} by a preceding U+005C character ('\').
 824
 825 A @dfn{string literal} is a sequence of any Unicode characters enclosed
 826 within two U+0022 (double-quote) characters, with the exception of U+0022
 827 itself, which must be @emph{escaped} by a preceding U+005C character
 828 ('\').
 829
 830 Some additional @emph{escapes} are available in either character or string
 831 literals.  An escape starts with a U+005C ('\') and continues with one
 832 of the following forms:
 833 @itemize
 834 @item An @dfn{8-bit codepoint escape} escape starts with U+0078 ('x') and is
 835 followed by exactly two @dfn{hex digits}. It denotes the Unicode codepoint
 836 equal to the provided hex value.
 837 @item A @dfn{16-bit codepoint escape} starts with U+0075 ('u') and is followed
 838  by exactly four @dfn{hex digits}. It denotes the Unicode codepoint equal to
 839 the provided hex value.
 840 @item A @dfn{32-bit codepoint escape} starts with U+0055 ('U') and is followed
 841  by exactly eight @dfn{hex digits}. It denotes the Unicode codepoint equal to
 842 the provided hex value.
 843 @item A @dfn{whitespace escape} is one of the characters U+006E, U+0072, or
 844 U+0074, denoting the unicode values U+000A (LF), U+000D (CR) or U+0009 (HT)
 845 respectively.
 846 @item The @dfn{backslash escape} is the character U+005C ('\') which must be
 847 escaped in order to denote @emph{itself}.
 848 @end itemize
 849
 850 @node       Ref.Lex.Syntax
 851 @subsection Ref.Lex.Syntax
 852 @c * Ref.Lex.Syntax::              Syntactic extension tokens.
 853
 854 Syntactic extensions are marked with the @emph{pound} sigil U+0023 (@code{#}),
 855 followed by an identifier, one of @code{fmt}, @code{env},
 856 @code{concat_idents}, @code{ident_to_str}, @code{log_syntax}, @code{macro}, or
 857 the name of a user-defined macro. This is followed by a vector literal. (Its
 858 value will be interpreted syntactically; in particular, it need not be
 859 well-typed.)
 860
 861 @emph{TODO: formalize those terms more}.
 862
 863 @node       Ref.Lex.Sym
 864 @subsection Ref.Lex.Sym
 865 @c * Ref.Lex.Sym::                 Special symbol tokens.
 866
 867 @cindex Symbol
 868 @cindex Operator
 869
 870 The special symbols are:
 871
 872 @sp 2
 873
 874 @multitable @columnfractions .1 .1 .1 .1 .1 .1
 875
 876 @item @code{@@}
 877 @tab @code{_}
 878 @item @code{#}
 879 @tab @code{:}
 880 @tab @code{.}
 881 @tab @code{;}
 882 @tab @code{,}
 883 @item @code{[}
 884 @tab @code{]}
 885 @tab @code{@{}
 886 @tab @code{@}}
 887 @tab @code{(}
 888 @tab @code{)}
 889 @item @code{=}
 890 @tab @code{<-}
 891 @tab @code{<->}
 892 @tab @code{->}
 893 @item @code{+}
 894 @tab @code{++}
 895 @tab @code{+=}
 896 @tab @code{-}
 897 @tab @code{--}
 898 @tab @code{-=}
 899 @item @code{*}
 900 @tab @code{/}
 901 @tab @code{%}
 902 @tab @code{*=}
 903 @tab @code{/=}
 904 @tab @code{%=}
 905 @item @code{&}
 906 @tab @code{|}
 907 @tab @code{!}
 908 @tab @code{~}
 909 @tab @code{^}
 910 @item @code{&=}
 911 @tab @code{|=}
 912 @tab @code{^=}
 913 @tab @code{!=}
 914 @item @code{>>}
 915 @tab @code{>>>}
 916 @tab @code{<<}
 917 @tab @code{<<=}
 918 @tab @code{>>=}
 919 @tab @code{>>>=}
 920 @item @code{<}
 921 @tab @code{<=}
 922 @tab @code{==}
 923 @tab @code{>=}
 924 @tab @code{>}
 925 @item @code{&&}
 926 @tab @code{||}
 927 @end multitable
 928
 929 @page
 930 @page
 931 @node    Ref.Path
 932 @section Ref.Path
 933 @c * Ref.Path::               References to items.
 934 @cindex Names of items or slots
 935 @cindex Path name
 936 @cindex Type parameters
 937
 938 A @dfn{path} is a sequence of one or more path components separated by a
 939 namespace qualifier (@code{::}). If a path consists of only one component, it
 940 may refer to either an item or a slot in a local control
 941 scope. @xref{Ref.Mem.Slot}. @xref{Ref.Item}. If a path has multiple
 942 components, it refers to an item.
 943
 944 Every item has a @emph{canonical path} within its crate, but the path naming
 945 an item is only meaningful within a given crate. There is no global namespace
 946 across crates; an item's canonical path merely identifies it within the
 947 crate. @xref{Ref.Comp.Crate}.
 948
 949 Path components are usually identifiers. @xref{Ref.Lex.Ident}. The last
 950 component of a path may also have trailing explicit type arguments.
 951
 952 Two examples of simple paths consisting of only identifier components:
 953 @example
 954 x;
 955 x::y::z;
 956 @end example
 957
 958 In most contexts, the Rust grammar accepts a general @emph{path}, but
 959 subsequent passes may restrict paths occurring in various contexts to refer to
 960 slots or items, depending on the semantics of the occurrence. In other words:
 961 in some contexts a slot is required (for example, on the left hand side of the
 962 copy operator, @pxref{Ref.Expr.Copy}) and in other contexts an item is
 963 required (for example, as a type parameter, @pxref{Ref.Item}). In no case is
 964 the grammar made ambiguous by accepting a general path and interpreting the
 965 reference in later passes. @xref{Ref.Gram}.
 966
 967 An example of a path with type parameters:
 968 @example
 969 m::map<int,str>;
 970 @end example
 971
 972 @page
 973 @node    Ref.Gram
 974 @section Ref.Gram
 975 @c * Ref.Gram::                    Grammar.
 976
 977 @emph{TODO: mostly LL(1), it reads like C++, Alef and bits of Napier;
 978 formalize here}.
 979
 980 @page
 981 @node    Ref.Comp
 982 @section Ref.Comp
 983 @c * Ref.Comp::                    Compilation and component model.
 984 @cindex Compilation model
 985
 986 Rust is a @emph{compiled} language. Its semantics are divided along a
 987 @emph{phase distinction} between compile-time and run-time. Those semantic
 988 rules that have a @emph{static interpretation} govern the success or failure
 989 of compilation. A program that fails to compile due to violation of a
 990 compile-time rule has no defined semantics at run-time; the compiler should
 991 halt with an error report, and produce no executable artifact.
 992
 993 The compilation model centres on artifacts called @emph{crates}. Each
 994 compilation is directed towards a single crate in source form, and if
 995 successful produces a single crate in executable form.
 996
 997 @menu
 998 * Ref.Comp.Crate::              Units of compilation and linking.
 999 * Ref.Comp.Attr::               Attributes of crates, modules and items.
1000 * Ref.Comp.Syntax::             Syntax extensions.
1001 @end menu
1002
1003 @node       Ref.Comp.Crate
1004 @subsection Ref.Comp.Crate
1005 @c * Ref.Comp.Crate::              Units of compilation and linking.
1006 @cindex Crate
1007
1008 A @dfn{crate} is a unit of compilation and linking, as well as versioning,
1009 distribution and runtime loading. Crates are defined by @emph{crate source
1010 files}, which are a type of source file written in a special declarative
1011 language: @emph{crate language}.@footnote{A crate is somewhat analogous to an
1012 @emph{assembly} in the ECMA-335 CLI model, a @emph{library} in the SML/NJ
1013 Compilation Manager, a @emph{unit} in the Owens and Flatt module system, or a
1014 @emph{configuration} in Mesa.} A crate source file describes:
1015
1016 @itemize
1017 @item Metadata about the crate, such as author, name, version, and copyright.
1018 @item The source-file and directory modules that make up the crate.
1019 @item Any external crates or native modules that the crate imports to its top level.
1020 @item The organization of the crate's internal namespace.
1021 @item The set of names exported from the crate.
1022 @end itemize
1023
1024 A single crate source file may describe the compilation of a large number of
1025 Rust source files; it is compiled in its entirety, as a single indivisible
1026 unit. The compilation phase attempts to transform a single crate source file,
1027 and its referenced contents, into a single compiled crate. Crate source files
1028 and compiled crates have a 1:1 relationship.
1029
1030 The syntactic form of a crate is a sequence of @emph{directives}, some of
1031 which have nested sub-directives.
1032
1033 A crate defines an implicit top-level anonymous module: within this module,
1034 all members of the crate have canonical path names. @xref{Ref.Path}. The
1035 @code{mod} directives within a crate file specify sub-modules to include in
1036 the crate: these are either directory modules, corresponding to directories in
1037 the filesystem of the compilation environment, or file modules, corresponding
1038 to Rust source files. The names given to such modules in @code{mod} directives
1039 become prefixes of the paths of items defined within any included Rust source
1040 files.
1041
1042 The @code{use} directives within the crate specify @emph{other crates} to scan
1043 for, locate, import into the crate's module namespace during compilation, and
1044 link against at runtime. Use directives may also occur independently in rust
1045 source files. These directives may specify loose or tight ``matching
1046 criteria'' for imported crates, depending on the preferences of the crate
1047 developer. In the simplest case, a @code{use} directive may only specify a
1048 symbolic name and leave the task of locating and binding an appropriate crate
1049 to a compile-time heuristic. In a more controlled case, a @code{use} directive
1050 may specify any metadata as matching criteria, such as a URI, an author name
1051 or version number, a checksum or even a cryptographic signature, in order to
1052 select an an appropriate imported crate. @xref{Ref.Comp.Attr}.
1053
1054 The compiled form of a crate is a loadable and executable object file full of
1055 machine code, in a standard loadable operating-system format such as ELF, PE
1056 or Mach-O. The loadable object contains metadata, describing:
1057 @itemize
1058 @item Metadata required for type reflection.
1059 @item The publicly exported module structure of the crate.
1060 @item Any metadata about the crate, defined by attributes.
1061 @item The crates to dynamically link with at run-time, with matching criteria
1062 derived from the same @code{use} directives that guided compile-time imports.
1063 @end itemize
1064
1065 @c This might come along sometime in the future.
1066
1067 @c The @code{syntax} directives of a crate are similar to the @code{use}
1068 @c directives, except they govern the syntax extension namespace (accessed
1069 @c through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
1070 @c available only at compile time. A @code{syntax} directive also makes its
1071 @c extension available to all subsequent directives in the crate file.
1072
1073 An example of a crate:
1074
1075 @example
1076 // Linkage attributes
1077 #[ link(name = "projx"
1078         vers = "2.5",
1079         uuid = "9cccc5d5-aceb-4af5-8285-811211826b82") ];
1080
1081 // Additional metadata attributes
1082 #[ desc = "Project X",
1083    license = "BSD" ];
1084    author = "Jane Doe" ];
1085
1086 // Import a module.
1087 use std (ver = "1.0");
1088
1089 // Define some modules.
1090 mod foo = "foo.rs";
1091 mod bar @{
1092     mod quux = "quux.rs";
1093 @}
1094 @end example
1095
1096 @node       Ref.Comp.Attr
1097 @subsection Ref.Comp.Attr
1098 @cindex Attributes
1099
1100 Static entities in Rust -- crates, modules and items -- may have attributes
1101 applied to them.@footnote{Attributes in Rust are modeled on Attributes in
1102 ECMA-335, C#} An attribute is a general, free-form piece of metadata that is
1103 interpreted according to name, convention, and language and compiler version.
1104 Attributes may appear as any of:
1105 @itemize
1106 @item A single identifier, the attribute name
1107 @item An identifier followed by the equals sign '=' and a literal, providing a key/value pair
1108 @item An identifier followed by a parenthesized list of sub-attribute arguments
1109 @end itemize
1110
1111 Attributes are applied to an entity by placing them within a hash-list
1112 (@code{#[...]}) as either a prefix to the entity or as a semicolon-delimited
1113 declaration within the entity body.
1114
1115 An example of attributes:
1116
1117 @example
1118 // A function marked as a unit test
1119 #[test]
1120 fn test_foo() @{
1121   ...
1122 @}
1123
1124 // General metadata applied to the enclosing module or crate.
1125 #[license = "BSD"];
1126
1127 // A conditionally-compiled module
1128 #[cfg(target_os="linux")]
1129 module bar @{
1130   ...
1131 @}
1132
1133 @end example
1134
1135 In future versions of Rust, user-provided extensions to the compiler will be able
1136 to interpret attributes. When this facility is provided, a distinction will be
1137 made between language-reserved and user-available attributes.
1138
1139 At present, only the Rust compiler interprets attributes, so all attribute
1140 names are effectively reserved. Some significant attributes include:
1141
1142 @itemize
1143 @item The @code{cfg} attribute, for conditional-compilation by build-configuration
1144 @item The @code{link} attribute, describing linkage metadata for a crate
1145 @item The @code{test} attribute, for marking functions as unit tests.
1146 @end itemize
1147
1148 Other attributes may be added or removed during development of the language.
1149
1150 @node          Ref.Comp.Syntax
1151 @subsection    Ref.Comp.Syntax
1152 @c * Ref.Comp.Syntax::        Syntax extension.
1153 @cindex Syntax extension
1154
1155 Rust provides a notation for @dfn{syntax extension}. The notation for invoking
1156 a syntax extension is a marked syntactic form that can appear as an expression
1157 in the body of a Rust program. @xref{Ref.Lex.Syntax}.
1158
1159 After parsing, a syntax-extension incovation is expanded into a Rust
1160 expression. The name of the extension determines the translation performed. In
1161 future versions of Rust, user-provided syntax extensions aside from macros
1162 will be provided via external crates.
1163
1164 At present, only a set of built-in syntax extensions, as well as macros
1165 introduced inline in source code using the @code{macro} extension, may be
1166 used. The current built-in syntax extensions are:
1167
1168 @itemize
1169 @item @code{fmt} expands into code to produce a formatted string, similar to
1170       @code{printf} from C.
1171 @item @code{env} expands into a string literal containing the value of that
1172       environment variable at compile-time.
1173 @item @code{concat_idents} expands into an identifier which is the
1174       concatenation of its arguments.
1175 @item @code{ident_to_str} expands into a string literal containing the name of
1176       its argument (which must be a literal).
1177 @item @code{log_syntax} causes the compiler to pretty-print its arguments.
1178 @end itemize
1179
1180 Finally, @code{macro} is used to define a new macro. A macro can abstract over
1181 second-class Rust concepts that are present in syntax. The arguments to
1182 @code{macro} are a bracketed list of pairs (two-element lists). The pairs
1183 consist of an invocation and the syntax to expand into. An example:
1184
1185 @example
1186 #macro[[#apply[fn, [args, ...]], fn(args, ...)]];
1187 @end example
1188
1189 In this case, the invocation @code{#apply[sum, 5, 8, 6]} expands to
1190 @code{sum(5,8,6)}. If @code{...} follows an expression (which need not be as
1191 simple as a single identifier) in the input syntax, the matcher will expect an
1192 arbitrary number of occurences of the thing preceeding it, and bind syntax to
1193 the identifiers it contains. If it follows an expression in the output syntax,
1194 it will transcribe that expression repeatedly, according to the identifiers
1195 (bound to syntax) that it contains.
1196
1197 The behavior of @code{...} is known as Macro By Example. It allows you to
1198 write a macro with arbitrary repetition by specifying only one case of that
1199 repetition, and following it by @code{...}, both where the repeated input is
1200 matched, and where the repeated output must be transcribed. A more
1201 sophisticated example:
1202
1203 @example
1204 #macro[#zip_literals[[x, ...], [y, ...]],
1205        [[x, y], ...]];
1206 #macro[#unzip_literals[[x, y], ...],
1207        [[x, ...], [y, ...]]];
1208 @end example
1209
1210 In this case, @code{#zip_literals[[1,2,3], [1,2,3]]} expands to
1211 @code{[[1,1],[2,2],[3,3]]}, and @code{#unzip_literals[[1,1], [2,2], [3,3]]}
1212 expands to @code{[[1,2,3],[1,2,3]]}.
1213
1214 Macro expansion takes place outside-in: that is,
1215 @code{#unzip_literals[#zip_literals[[1,2,3],[1,2,3]]]} will fail because
1216 @code{unzip_literals} expects a list, not a macro invocation, as an
1217 argument.
1218
1219 @c
1220 The macro system currently has some limitations. It's not possible to
1221 destructure anything other than vector literals (therefore, the arguments to
1222 complicated macros will tend to be an ocean of square brackets). Macro
1223 invocations and @code{...} can only appear in expression positions. Finally,
1224 macro expansion is currently unhygienic. That is, name collisions between
1225 macro-generated and user-written code can cause unintentional capture.
1226
1227
1228 @page
1229 @node    Ref.Mem
1230 @section Ref.Mem
1231 @c * Ref.Mem::                     Semantic model of memory.
1232 @cindex Memory model
1233 @cindex Box
1234 @cindex Slot
1235
1236 A Rust task's memory consists of a static set of @emph{items}, a set of tasks
1237 each with its own @emph{stack}, and a @emph{heap}. Immutable portions of the
1238 heap may be shared between tasks, mutable portions may not.
1239
1240 Allocations in the stack consist of @emph{slots}, and allocations in the heap
1241 consist of @emph{boxes}.
1242
1243 @menu
1244 * Ref.Mem.Alloc::               Memory allocation model.
1245 * Ref.Mem.Own::                 Memory ownership model.
1246 * Ref.Mem.Slot::                Stack memory model.
1247 * Ref.Mem.Box::                 Heap memory model.
1248 @end menu
1249
1250 @node       Ref.Mem.Alloc
1251 @subsection Ref.Mem.Alloc
1252 @c * Ref.Mem.Alloc::               Memory allocation model.
1253 @cindex Item
1254 @cindex Stack
1255 @cindex Heap
1256 @cindex Shared box
1257 @cindex Task-local box
1258
1259 The @dfn{items} of a program are those functions, iterators, objects, modules
1260 and types that have their value calculated at compile-time and stored uniquely
1261 in the memory image of the rust process. Items are neither dynamically
1262 allocated nor freed.
1263
1264 A task's @dfn{stack} consists of activation frames automatically allocated on
1265 entry to each function as the task executes. A stack allocation is reclaimed
1266 when control leaves the frame containing it.
1267
1268 The @dfn{heap} is a general term that describes two separate sets of boxes:
1269 shared boxes -- which may be subject to garbage collection -- and unique
1270 boxes.  The lifetime of an allocation in the heap depends on the lifetime of
1271 the box values pointing to it. Since box values may themselves be passed in
1272 and out of frames, or stored in the heap, heap allocations may outlive the
1273 frame they are allocated within.
1274
1275
1276 @node       Ref.Mem.Own
1277 @subsection Ref.Mem.Own
1278 @c * Ref.Mem.Own::                 Memory ownership model.
1279 @cindex Ownership
1280
1281 A task owns all memory it can @emph{safely} reach through local variables,
1282 shared or unique boxes, and/or references. Sharing memory between tasks can
1283 only be accomplished using @emph{unsafe} constructs, such as raw pointer
1284 operations or calling C code.
1285
1286 When a task sends a value of @emph{unique} kind over a channel, it loses
1287 ownership of the value sent and can no longer refer to it. This is statically
1288 guaranteed by the combined use of ``move semantics'' and unique kinds, within
1289 the communication system.
1290
1291 When a stack frame is exited, its local allocations are all released, and its
1292 references to boxes (both shared and owned) are dropped.
1293
1294 A shared box may (in the case of a recursive, mutable shared type) be cyclic;
1295 in this case the release of memory inside the shared structure may be deferred
1296 until task-local garbage collection can reclaim it. Code can ensure no such
1297 delayed deallocation occurs by restricting itself to unique boxes and similar
1298 unshared kinds of data.
1299
1300 When a task finishes, its stack is necessarily empty and it therefore has no
1301 references to any boxes; the remainder of its heap is immediately freed.
1302
1303 @node       Ref.Mem.Slot
1304 @subsection Ref.Mem.Slot
1305 @c * Ref.Mem.Slot::                Stack memory model.
1306 @cindex Stack
1307 @cindex Slot
1308 @cindex Local slot
1309 @cindex Reference slot
1310
1311 A task's stack contains slots.
1312
1313 A @dfn{slot} is a component of a stack frame. A slot is either @emph{local} or
1314 an @emph{alias}.
1315
1316 A @dfn{local} slot (or @emph{stack-local} allocation) holds a value directly,
1317 allocated within the stack's memory. The value is a part of the stack frame.
1318
1319 A @dfn{reference} references a value outside the frame. It may refer to a
1320 value allocated in another frame @emph{or} a boxed value in the heap. The
1321 reference-formation rules ensure that the referent will outlive the reference.
1322
1323 Local slots are always implicitly mutable.
1324
1325 Local slots are not initialized when allocated; the entire frame worth of
1326 local slots are allocated at once, on frame-entry, in an uninitialized
1327 state. Subsequent statements within a function may or may not initialize the
1328 local slots. Local slots can be used only after they have been initialized;
1329 this condition is guaranteed by the typestate system.
1330
1331 References are created for function arguments. If the compiler can not prove
1332 that the referred-to value will outlive the reference, it will try to set
1333 aside a copy of that value to refer to. If this is not sematically safe (for
1334 example, if the referred-to value contains mutable fields), it will reject the
1335 program. If the compiler deems copying the value expensive, it will warn.
1336
1337 A function can be declared to take an argument by mutable reference. This
1338 allows the function to write to the slot that the reference refers to.
1339
1340 An example function that accepts an value by mutable reference:
1341 @example
1342 fn incr(&i: int) @{
1343     i = i + 1;
1344 @}
1345 @end example
1346
1347 @node       Ref.Mem.Box
1348 @subsection Ref.Mem.Box
1349 @c * Ref.Mem.Box::                 Heap memory model.
1350 @cindex Box
1351 @cindex Dereference operator
1352
1353 A @dfn{box} is a reference to a heap allocation holding another value. There
1354 are two kinds of boxes: @emph{shared boxes} and @emph{unique boxes}.
1355
1356 A @dfn{shared box} type or value is constructed by the prefix @emph{at} sigil @code{@@}.
1357
1358 A @dfn{unique box} type or value is constructed by the prefix @emph{tilde} sigil @code{~}.
1359
1360 Multiple shared box values can point to the same heap allocation; copying a
1361 shared box value makes a shallow copy of the pointer (optionally incrementing
1362 a reference count, if the shared box is implemented through
1363 reference-counting).
1364
1365 Unique box values exist in 1:1 correspondence with their heap allocation;
1366 copying a unique box value makes a deep copy of the heap allocation and
1367 produces a pointer to the new allocation.
1368
1369 An example of constructing one shared box type and value, and one unique box type and value:
1370 @example
1371 let x: @@int = @@10;
1372 let x: ~int = ~10;
1373 @end example
1374
1375 Some operations implicitly dereference boxes. Examples of such @dfn{implicit
1376 dereference} operations are:
1377 @itemize
1378 @item arithmetic operators (@code{x + y - z})
1379 @item field selection (@code{x.y.z})
1380 @end itemize
1381
1382 An example of an implicit-dereference operation performed on box values:
1383 @example
1384 let x: @@int = @@10;
1385 let y: @@int = @@12;
1386 assert (x + y == 22);
1387 @end example
1388
1389 Other operations act on box values as single-word-sized address values. For
1390 these operations, to access the value held in the box requires an explicit
1391 dereference of the box value. Explicitly dereferencing a box is indicated with
1392 the unary @emph{star} operator @code{*}. Examples of such @dfn{explicit
1393 dereference} operations are:
1394 @itemize
1395 @item copying box values (@code{x = y})
1396 @item passing box values to functions (@code{f(x,y)})
1397 @end itemize
1398
1399 An example of an explicit-dereference operation performed on box values:
1400 @example
1401 fn takes_boxed(b: @@int) @{
1402 @}
1403
1404 fn takes_unboxed(b: int) @{
1405 @}
1406
1407 fn main() @{
1408     let x: @@int = @@10;
1409     takes_boxed(x);
1410     takes_unboxed(*x);
1411 @}
1412 @end example
1413
1414
1415 @page
1416 @node    Ref.Task
1417 @section Ref.Task
1418 @c * Ref.Task::                    Semantic model of tasks.
1419 @cindex Task
1420 @cindex Process
1421
1422 An executing Rust program consists of a tree of tasks. A Rust @dfn{task}
1423 consists of an entry function, a stack, a set of outgoing communication
1424 channels and incoming communication ports, and ownership of some portion of
1425 the heap of a single operating-system process.
1426
1427 Multiple Rust tasks may coexist in a single operating-system
1428 process. Execution of multiple Rust tasks in a single operating-system process
1429 may be either truly concurrent or interleaved by the runtime scheduler. Rust
1430 tasks are lightweight: each consumes less memory than an operating-system
1431 process, and switching between Rust tasks is faster than switching between
1432 operating-system processes.
1433
1434 @menu
1435 * Ref.Task.Comm::               Inter-task communication.
1436 * Ref.Task.Life::               Task lifecycle and state transitions.
1437 * Ref.Task.Sched::              Task scheduling model.
1438 * Ref.Task.Spawn::              Library interface for making new tasks.
1439 * Ref.Task.Send::               Library interface for sending messages.
1440 * Ref.Task.Recv::               Library interface for receiving messages.
1441 @end menu
1442
1443 @node       Ref.Task.Comm
1444 @subsection Ref.Task.Comm
1445 @c * Ref.Task.Comm::               Inter-task communication.
1446
1447 @cindex Communication
1448 @cindex Port
1449 @cindex Channel
1450 @cindex Message passing
1451 @cindex Send expression
1452 @cindex Receive expression
1453
1454 With the exception of @emph{unsafe} blocks, Rust tasks are isolated from
1455 interfering with one another's memory directly. Instead of manipulating shared
1456 storage, Rust tasks communicate with one another using a typed, asynchronous,
1457 simplex message-passing system.
1458
1459 A @dfn{port} is a communication endpoint that can @emph{receive}
1460 messages. Ports receive messages from channels.
1461
1462 A @dfn{channel} is a communication endpoint that can @emph{send}
1463 messages. Channels send messages to ports.
1464
1465 Each port is implicitly boxed and mutable; as such a port has a unique
1466 per-task identity and cannot be replicated or transmitted. If a port value is
1467 copied, both copies refer to the @emph{same} port. New ports can be
1468 constructed dynamically and stored in data structures.
1469
1470 Each channel is bound to a port when the channel is constructed, so the
1471 destination port for a channel must exist before the channel itself. A channel
1472 cannot be rebound to a different port from the one it was constructed with.
1473
1474 Channels are weak: a channel does not keep the port it is bound to
1475 alive. Ports are owned by their allocating task and cannot be sent over
1476 channels; if a task dies its ports die with it, and all channels bound to
1477 those ports no longer function. Messages sent to a channel connected to a dead
1478 port will be dropped.
1479
1480 Channels are immutable types with meaning known to the runtime; channels can
1481 be sent over channels.
1482
1483 Many channels can be bound to the same port, but each channel is bound to a
1484 single port. In other words, channels and ports exist in an N:1 relationship,
1485 N channels to 1 port. @footnote{It may help to remember nautical terminology
1486 when differentiating channels from ports.  Many different waterways --
1487 channels -- may lead to the same port.}
1488
1489 Each port and channel can carry only one type of message. The message type is
1490 encoded as a parameter of the channel or port type. The message type of a
1491 channel is equal to the message type of the port it is bound to. The types of
1492 messages must be of @emph{unique} kind.
1493
1494 Messages are generally sent asynchronously, with optional rate-limiting on the
1495 transmit side. A channel contains a message queue and asynchronously sending a
1496 message merely inserts it into the sending channel's queue; message receipt is
1497 the responsibility of the receiving task.
1498
1499 Messages are sent on channels and received on ports using standard library
1500 functions.
1501
1502 @node       Ref.Task.Life
1503 @subsection Ref.Task.Life
1504 @c * Ref.Task.Life::               Task lifecycle and state transitions.
1505
1506 @cindex Lifecycle of task
1507 @cindex Scheduling
1508 @cindex Running, task state
1509 @cindex Blocked, task state
1510 @cindex Failing, task state
1511 @cindex Dead, task state
1512 @cindex Soft failure
1513 @cindex Hard failure
1514
1515 The @dfn{lifecycle} of a task consists of a finite set of states and events
1516 that cause transitions between the states. The lifecycle states of a task are:
1517
1518 @itemize
1519 @item running
1520 @item blocked
1521 @item failing
1522 @item dead
1523 @end itemize
1524
1525 A task begins its lifecycle -- once it has been spawned -- in the
1526 @emph{running} state. In this state it executes the statements of its entry
1527 function, and any functions called by the entry function.
1528
1529 A task may transition from the @emph{running} state to the @emph{blocked}
1530 state any time it evaluates a communication expression on a port or channel that
1531 cannot be immediately completed.  When the communication expression can be
1532 completed -- when a message arrives at a sender, or a queue drains
1533 sufficiently to complete a semi-synchronous send -- then the blocked task will
1534 unblock and transition back to @emph{running}.
1535
1536 A task may transition to the @emph{failing} state at any time, due to an
1537 un-trapped signal or the evaluation of a @code{fail} expression. Once
1538 @emph{failing}, a task unwinds its stack and transitions to the @emph{dead}
1539 state. Unwinding the stack of a task is done by the task itself, on its own
1540 control stack. If a value with a destructor is freed during unwinding, the
1541 code for the destructor is run, also on the task's control
1542 stack. Running the destructor code causes a temporary transition to a
1543 @emph{running} state, and allows the destructor code to cause any
1544 subsequent state transitions.  The original task of unwinding and
1545 failing thereby may suspend temporarily, and may involve (recursive)
1546 unwinding of the stack of a failed destructor. Nonetheless, the
1547 outermost unwinding activity will continue until the stack is unwound
1548 and the task transitions to the @emph{dead} state. There is no way to
1549 ``recover'' from task failure.  Once a task has temporarily suspended
1550 its unwinding in the @emph{failing} state, failure occurring from
1551 within this destructor results in @emph{hard} failure.  The unwinding
1552 procedure of hard failure frees resources but does not execute
1553 destructors.  The original (soft) failure is still resumed at the
1554 point where it was temporarily suspended.
1555
1556 A task in the @emph{dead} state cannot transition to other states; it exists
1557 only to have its termination status inspected by other tasks, and/or to await
1558 reclamation when the last reference to it drops.
1559
1560 @node       Ref.Task.Sched
1561 @subsection Ref.Task.Sched
1562 @c * Ref.Task.Sched::              Task scheduling model.
1563
1564 @cindex Scheduling
1565 @cindex Preemption
1566 @cindex Yielding control
1567
1568 The currently scheduled task is given a finite @emph{time slice} in which to
1569 execute, after which it is @emph{descheduled} at a loop-edge or similar
1570 preemption point, and another task within is scheduled, pseudo-randomly.
1571
1572 An executing task can @code{yield} control at any time, which deschedules it
1573 immediately. Entering any other non-executing state (blocked, dead) similarly
1574 deschedules the task.
1575
1576
1577
1578 @node       Ref.Task.Spawn
1579 @subsection Ref.Task.Spawn
1580 @c * Ref.Task.Spawn::               Calls for creating new tasks.
1581 @cindex Spawn expression
1582
1583 A call to @code{std::task::spawn}, passing a 0-argument function as its single
1584 argument, causes the runtime to construct a new task executing the passed
1585 function. The passed function is referred to as the @dfn{entry function} for
1586 the spawned task, and any captured environment is carries is moved from the
1587 spawning task to the spawned task before the spawned task begins execution.
1588
1589 The result of a @code{spawn} call is a @code{std::task::task} value.
1590
1591 An example of a @code{spawn} call:
1592 @example
1593 import std::task::*;
1594 import std::comm::*;
1595
1596 fn helper(c: chan<u8>) @{
1597     // do some work.
1598     let result = ...;
1599     send(c, result);
1600 @}
1601
1602 let p: port<u8>;
1603
1604 spawn(bind helper(chan(p)));
1605 // let task run, do other things.
1606 // ...
1607 let result = recv(p);
1608
1609 @end example
1610
1611 @node       Ref.Task.Send
1612 @subsection Ref.Task.Send
1613 @c * Ref.Task.Send::            Calls for sending a value into a channel.
1614 @cindex Send call
1615 @cindex Messages
1616 @cindex Communication
1617
1618 Sending a value into a channel is done by a library call to
1619 @code{std::comm::send}, which takes a channel and a value to send, and moves
1620 the value into the channel's outgoing buffer.
1621
1622 An example of a send:
1623 @example
1624 import std::comm::*;
1625 let c: chan<str> = @dots{};
1626 send(c, "hello, world");
1627 @end example
1628
1629 @node       Ref.Task.Recv
1630 @subsection Ref.Task.Recv
1631 @c * Ref.Task.Recv::           Calls for receiving a value from a channel.
1632 @cindex Receive call
1633 @cindex Messages
1634 @cindex Communication
1635
1636 Receiving a value is done by a call to the @code{recv} method, on an object of
1637 type @code{std::comm::port}. This call causes the receiving task to enter the
1638 @emph{blocked reading} state until a task is sending a value to the port, at
1639 which point the runtime pseudo-randomly selects a sending task and moves a
1640 value from the head of one of the task queues to the call's return value, and
1641 un-blocks the receiving task. @xref{Ref.Run.Comm}.
1642
1643 An example of a @emph{receive}:
1644 @example
1645 import std::comm::*;
1646 let p: port<str> = @dots{};
1647 let s: str = recv(p);
1648 @end example
1649
1650
1651
1652 @page
1653 @node    Ref.Item
1654 @section Ref.Item
1655 @c * Ref.Item::               The components of a module.
1656
1657 @cindex Item
1658 @cindex Type parameters
1659 @cindex Module item
1660
1661 An @dfn{item} is a component of a module. Items are entirely determined at
1662 compile-time, remain constant during execution, and may reside in read-only
1663 memory.
1664
1665 There are five primary kinds of item: modules, functions, iterators, objects and
1666 type definitions.
1667
1668 All items form an implicit scope for the declaration of sub-items. In other
1669 words, within a function, object or iterator, declarations of items can (in
1670 many cases) be mixed with the statements, control blocks, and similar
1671 artifacts that otherwise compose the item body. The meaning of these scoped
1672 items is the same as if the item was declared outside the scope, except that
1673 the item's @emph{path name} within the module namespace is qualified by the
1674 name of the enclosing item. The exact locations in which sub-items may be
1675 declared is given by the grammar.  @xref{Ref.Gram}.
1676
1677 Functions, iterators, objects and type definitions may be @emph{parametrized}
1678 by type. Type parameters are given as a comma-separated list of identifiers
1679 enclosed in angle brackets (@code{<>}), after the name of the item and before
1680 its definition.  The type parameters of an item are part of the name, not the
1681 type of the item; in order to refer to the type-parametrized item, a
1682 referencing name must in general provide type arguments as a list of
1683 comma-separated types enclosed within angle brackets. In practice, the
1684 type-inference system can usually infer such argument types from
1685 context. There are no general parametric types.
1686
1687 @menu
1688 * Ref.Item.Mod::                Items defining modules.
1689 * Ref.Item.Fn::                 Items defining functions.
1690 * Ref.Item.Pred::               Items defining predicates for typestates.
1691 * Ref.Item.Iter::               Items defining iterators.
1692 * Ref.Item.Obj::                Items defining objects.
1693 * Ref.Item.Type::               Items defining the types of values and slots.
1694 * Ref.Item.Tag::                Items defining the constructors of a tag type.
1695 @end menu
1696
1697 @node       Ref.Item.Mod
1698 @subsection Ref.Item.Mod
1699 @c * Ref.Item.Mod::           Items defining sub-modules.
1700
1701 @cindex Module item
1702 @cindex Importing names
1703 @cindex Exporting names
1704 @cindex Visibility control
1705
1706 A @dfn{module item} contains declarations of other @emph{items}. The items
1707 within a module may be functions, modules, objects or types. These
1708 declarations have both static and dynamic interpretation. The purpose of a
1709 module is to organize @emph{names} and control @emph{visibility}. Modules are
1710 declared with the keyword @code{mod}.
1711
1712 An example of a module:
1713 @example
1714 mod math @{
1715     type complex = (f64,f64);
1716     fn sin(f64) -> f64 @{
1717         @dots{}
1718     @}
1719     fn cos(f64) -> f64 @{
1720         @dots{}
1721     @}
1722     fn tan(f64) -> f64 @{
1723         @dots{}
1724     @}
1725     @dots{}
1726 @}
1727 @end example
1728
1729 Modules may also include any number of @dfn{import and export
1730 declarations}. These declarations must precede any module item declarations
1731 within the module, and control the visibility of names both within the module
1732 and outside of it.
1733
1734 @menu
1735 * Ref.Item.Mod.Import::            Declarations for module-local synonyms.
1736 * Ref.Item.Mod.Export::            Declarations for restricting visibility.
1737 @end menu
1738
1739 @node          Ref.Item.Mod.Import
1740 @subsubsection Ref.Item.Mod.Import
1741 @c * Ref.Item.Mod.Import::     Declarations for module-local synonyms.
1742
1743 @cindex Importing names
1744 @cindex Visibility control
1745
1746 An @dfn{import declaration} creates one or more local name bindings synonymous
1747 with some other name. Usually an import declaration is used to shorten the
1748 path required to refer to a module item.
1749
1750 @emph{Note}: unlike many languages, Rust's @code{import} declarations do
1751 @emph{not} declare linkage-dependency with external crates. Linkage
1752 dependencies are independently declared with @code{use}
1753 declarations. @xref{Ref.Comp.Crate}.
1754
1755 An example of imports:
1756 @example
1757 import std::math::sin;
1758 import std::option::*;
1759 import std::str::@{char_at, hash@};
1760
1761 fn main() @{
1762     // Equivalent to 'log std::math::sin(1.0);'
1763     log sin(1.0);
1764     // Equivalent to 'log std::option::some(1.0);'
1765     log some(1.0);
1766     // Equivalent to 'log std::str::hash(std::str::char_at("foo"));'
1767     log hash(char_at("foo"));
1768 @}
1769 @end example
1770
1771 @node          Ref.Item.Mod.Export
1772 @subsubsection Ref.Item.Mod.Export
1773 @c * Ref.Item.Mod.Import::     Declarations for restricting visibility.
1774
1775 @cindex Exporting names
1776 @cindex Visibility control
1777
1778 An @dfn{export declaration} restricts the set of local declarations within a
1779 module that can be accessed from code outside the module. By default, all
1780 local declarations in a module are exported. If a module contains an export
1781 declaration, this declaration replaces the default export with the export
1782 specified.
1783
1784 An example of an export:
1785 @example
1786 mod foo @{
1787     export primary;
1788
1789     fn primary() @{
1790         helper(1, 2);
1791         helper(3, 4);
1792     @}
1793
1794     fn helper(x: int, y: int) @{
1795         @dots{}
1796     @}
1797 @}
1798
1799 fn main() @{
1800     foo::primary();  // Will compile.
1801     foo::helper(2,3) // ERROR: will not compile.
1802 @}
1803 @end example
1804
1805 Multiple items may be exported from a single export declaration:
1806
1807 @example
1808 mod foo @{
1809     export primary, secondary;
1810
1811     fn primary() @{
1812         helper(1, 2);
1813         helper(3, 4);
1814     @}
1815
1816     fn secondary() @{
1817         @dots{}
1818     @}
1819
1820     fn helper(x: int, y: int) @{
1821         @dots{}
1822     @}
1823 @}
1824 @end example
1825
1826
1827 @node       Ref.Item.Fn
1828 @subsection Ref.Item.Fn
1829 @c * Ref.Item.Fn::            Items defining functions.
1830 @cindex Functions
1831 @cindex Slots, function input and output
1832
1833 A @dfn{function item} defines a sequence of statements associated with a name
1834 and a set of parameters. Functions are declared with the keyword
1835 @code{fn}. Functions declare a set of @emph{input slots} as parameters,
1836 through which the caller passes arguments into the function, and an
1837 @emph{output slot} through which the function passes results back to the
1838 caller.
1839
1840 A function may also be copied into a first class @emph{value}, in which case
1841 the value has the corresponding @emph{function type}, and can be used
1842 otherwise exactly as a function item (with a minor additional cost of calling
1843 the function, as such a call is indirect). @xref{Ref.Type.Fn}.
1844
1845 Every control path in a function ends with a @code{ret} or @code{be}
1846 expression or with a diverging expression (described later in this
1847 section). If a control path lacks a @code{ret} expression in source code, an
1848 implicit @code{ret} expression is appended to the end of the control path
1849 during compilation, returning the implicit @code{()} value.
1850
1851 An example of a function:
1852 @example
1853 fn add(x: int, y: int) -> int @{
1854     ret x + y;
1855 @}
1856 @end example
1857
1858 A special kind of function can be declared with a @code{!} character where the
1859 output slot type would normally be. For example:
1860 @example
1861 fn my_err(s: str) -> ! @{
1862     log s;
1863     fail;
1864 @}
1865 @end example
1866
1867 We call such functions ``diverging'' because they never return a value to the
1868 caller. Every control path in a diverging function must end with a @code{fail}
1869 or a call to another diverging function on every control path. The @code{!}
1870 annotation does @emph{not} denote a type. Rather, the result type
1871 of a diverging function is a special type called @math{\bot} (``bottom'') that
1872 unifies with any type. Rust has no syntax for @math{\bot}.
1873
1874 It might be necessary to declare a diverging function because as mentioned
1875 previously, the typechecker checks that every control path in a function ends
1876 with a @code{ret}, @code{be}, or diverging expression. So, if @code{my_err}
1877 were declared without the @code{!} annotation, the following code would not
1878 typecheck:
1879 @example
1880 fn f(i: int) -> int @{
1881    if i == 42 @{
1882      ret 42;
1883    @}
1884    else @{
1885      my_err("Bad number!");
1886    @}
1887 @}
1888 @end example
1889
1890 The typechecker would complain that @code{f} doesn't return a value in the
1891 @code{else} branch. Adding the @code{!} annotation on @code{my_err} would
1892 express that @code{f} requires no explicit @code{ret}, as if it returns
1893 control to the caller, it returns a value (true because it never returns
1894 control).
1895
1896 @node       Ref.Item.Pred
1897 @subsection Ref.Item.Pred
1898 @c * Ref.Item.Pred::            Items defining predicates.
1899 @cindex Predicate
1900
1901 Any pure boolean function is called a @emph{predicate}, and may be used
1902 as part of the static typestate system. @xref{Ref.Typestate.Constr}. A
1903 predicate declaration is identical to a function declaration, except that it
1904 is declared with the additional keyword @code{pure}. In addition,
1905 the typechecker checks the body of a predicate with a restricted set of
1906 typechecking rules. A predicate
1907 @itemize
1908 @item may not contain a @code{put}, @code{send}, @code{recv}, assignment, or
1909 self-call expression; and
1910 @item may only call other predicates, not general functions.
1911 @end itemize
1912
1913 An example of a predicate:
1914 @example
1915 pure fn lt_42(x: int) -> bool @{
1916     ret (x < 42);
1917 @}
1918 @end example
1919
1920 A non-boolean function may also be declared with @code{pure fn}. This allows
1921 predicates to call non-boolean functions as long as they are pure. For example:
1922 @example
1923 pure fn pure_length<@@T>(ls: list<T>) -> uint @{ /* ... */ @}
1924
1925 pure fn nonempty_list<@@T>(ls: list<T>) -> bool @{ pure_length(ls) > 0u @}
1926 @end example
1927
1928 In this example, @code{nonempty_list} is a predicate---it can be used in a
1929 typestate constraint---but the auxiliary function @code{pure_length}@ is
1930 not.
1931
1932 @emph{ToDo:} should actually define referential transparency.
1933
1934 The effect checking rules previously enumerated are a restricted set of
1935 typechecking rules meant to approximate the universe of observably
1936 referentially transparent Rust procedures conservatively. Sometimes, these
1937 rules are @emph{too} restrictive. Rust allows programmers to violate these
1938 rules by writing predicates that the compiler cannot prove to be referentially
1939 transparent, using an escape-hatch feature called ``unchecked blocks''. When
1940 writing code that uses unchecked blocks, programmers should always be aware
1941 that they have an obligation to show that the code @emph{behaves} referentially
1942 transparently at all times, even if the compiler cannot @emph{prove}
1943 automatically that the code is referentially transparent. In the presence of
1944 unchecked blocks, the compiler provides no static guarantee that the code will
1945 behave as expected at runtime. Rather, the programmer has an independent
1946 obligation to verify the semantics of the predicates they write.
1947
1948 @emph{ToDo:} last two sentences are vague.
1949
1950 An example of a predicate that uses an unchecked block:
1951 @example
1952 fn pure_foldl<@@T, @@U>(ls: list<T>, u: U, f: block(&T, &U) -> U) -> U @{
1953     alt ls @{
1954       nil. @{ u @}
1955       cons(hd, tl) @{ f(hd, pure_foldl(*tl, f(hd, u), f)) @}
1956     @}
1957 @}
1958
1959 pure fn pure_length<@@T>(ls: list<T>) -> uint @{
1960     fn count<T>(_t: T, u: uint) -> uint @{ u + 1u @}
1961     unchecked @{
1962         pure_foldl(ls, 0u, count)
1963     @}
1964 @}
1965 @end example
1966
1967 Despite its name, @code{pure_foldl} is a @code{fn}, not a @code{pure fn},
1968 because there is no way in Rust to specify that the higher-order function
1969 argument @code{f} is a pure function. So, to use @code{foldl} in a pure list
1970 length function that a predicate could then use, we must use an
1971 @code{unchecked} block wrapped around the call to @code{pure_foldl} in the
1972 definition of @code{pure_length}.
1973
1974 @node          Ref.Item.Iter
1975 @subsection    Ref.Item.Iter
1976 @c * Ref.Item.Iter::          Items defining iterators.
1977
1978 @cindex Iterators
1979 @cindex Put expression
1980 @cindex Put each expression
1981 @cindex Foreach expression
1982
1983 Iterators are function-like items that can @code{put} multiple values during
1984 their execution before returning.
1985
1986 Putting a value is similar to returning a value -- the argument to @code{put}
1987 is copied into the caller's frame and control transfers back to the caller --
1988 but the iterator frame is only @emph{suspended} during the put, and will be
1989 @emph{resumed} at the point after the @code{put}, on the next iteration of
1990 the caller's loop.
1991
1992 The output type of an iterator is the type of value that the function will
1993 @code{put}, before it eventually evaluates a @code{ret} or @code{be} expression
1994 of type @code{()} and completes its execution.
1995
1996 An iterator can be called only in the loop header of a matching @code{for
1997 each} loop or as the argument in a @code{put each} expression.
1998 @xref{Ref.Expr.Foreach}.
1999
2000 An example of an iterator:
2001 @example
2002 iter range(lo: int, hi: int) -> int @{
2003     let i: int = lo;
2004     while (i < hi) @{
2005         put i;
2006         i = i + 1;
2007     @}
2008 @}
2009
2010 let sum: int = 0;
2011 for each x: int in range(0,100) @{
2012     sum += x;
2013 @}
2014 @end example
2015
2016
2017 @node       Ref.Item.Obj
2018 @subsection Ref.Item.Obj
2019 @c * Ref.Item.Obj::          Items defining objects.
2020 @cindex Objects
2021 @cindex Object constructors
2022
2023 An @dfn{object item} defines the @emph{state} and @emph{methods} of a set of
2024 @emph{object values}. Object values have object types.  @xref{Ref.Type.Obj}.
2025
2026 An @emph{object item} declaration -- in addition to providing a scope for
2027 state and method declarations -- implicitly declares a static function called
2028 the @emph{object constructor}, as well as a named @emph{object type}. The name
2029 given to the object item is resolved to a type when used in type context, or a
2030 constructor function when used in value context (such as a call).
2031
2032 Example of an object item:
2033 @example
2034 obj counter(state: @@mutable int) @{
2035     fn incr() @{
2036        *state += 1;
2037     @}
2038     fn get() -> int @{
2039        ret *state;
2040     @}
2041 @}
2042
2043 let c: counter = counter(@@mutable 1);
2044
2045 c.incr();
2046 c.incr();
2047 assert c.get() == 3;
2048 @end example
2049
2050 Inside an object's methods, you can make @emph{self-calls} using the
2051 @code{self} keyword.
2052 @example
2053 obj my_obj() @{
2054   fn get() -> int @{
2055     ret 3;
2056   @}
2057   fn foo() -> int @{
2058     let c = self.get();
2059     ret c + 2;
2060   @}
2061 @}
2062
2063 let o = my_obj();
2064 assert o.foo() == 5;
2065 @end example
2066
2067 Rust objects are extendable with additional methods and fields using
2068 @emph{anonymous object} expressions.  @xref{Ref.Expr.AnonObj}.
2069
2070 @node       Ref.Item.Type
2071 @subsection Ref.Item.Type
2072 @c * Ref.Item.Type::          Items defining the types of values and slots.
2073 @cindex Type definitions
2074
2075 A @dfn{type definition} defines a set of possible values in
2076 memory. @xref{Ref.Type}. Type definitions are declared with the keyword
2077 @code{type}. Every value has a single, specific type; the type-specified
2078 aspects of a value include:
2079
2080 @itemize
2081 @item Whether the value is composed of sub-values or is indivisible.
2082 @item Whether the value represents textual or numerical information.
2083 @item Whether the value represents integral or floating-point information.
2084 @item The sequence of memory operations required to access the value.
2085 @item The @emph{kind} of the type (pinned, unique or shared).
2086 @end itemize
2087
2088 For example, the type @code{@{x: u8, y: u8@}} defines the set of immutable
2089 values that are composite records, each containing two unsigned 8-bit integers
2090 accessed through the components @code{x} and @code{y}, and laid out in memory
2091 with the @code{x} component preceding the @code{y} component. This type is of
2092 @emph{unique} kind, meaning that there is no shared substructure with other
2093 types, but it can be copied and moved freely.
2094
2095 @node       Ref.Item.Tag
2096 @subsection Ref.Item.Tag
2097 @c * Ref.Item.Type::          Items defining the constructors of a tag type.
2098 @cindex Tag types
2099
2100 A tag item simultaneously declares a new nominal tag type
2101 (@pxref{Ref.Type.Tag}) as well as a set of @emph{constructors} that can be
2102 used to create or pattern-match values of the corresponding tag type.
2103
2104 The constructors of a @code{tag} type may be recursive: that is, each constructor
2105 may take an argument that refers, directly or indirectly, to the tag type the constructor
2106 is a member of. Such recursion has restrictions:
2107 @itemize
2108 @item Recursive types can be introduced only through @code{tag} constructors.
2109 @item A recursive @code{tag} item must have at least one non-recursive
2110 constructor (in order to give the recursion a basis case).
2111 @item The recursive argument of recursive tag constructors must be @emph{box}
2112 values (in order to bound the in-memory size of the constructor).
2113 @item Recursive type definitions can cross module boundaries, but not module
2114 @emph{visibility} boundaries, nor crate boundaries (in order to simplify the
2115 module system).
2116 @end itemize
2117
2118 An example of a @code{tag} item and its use:
2119 @example
2120 tag animal @{
2121   dog;
2122   cat;
2123 @}
2124
2125 let a: animal = dog;
2126 a = cat;
2127 @end example
2128
2129 An example of a @emph{recursive} @code{tag} item and its use:
2130 @example
2131 tag list<T> @{
2132   nil;
2133   cons(T, @@list<T>);
2134 @}
2135
2136 let a: list<int> = cons(7, cons(13, nil));
2137 @end example
2138
2139
2140 @page
2141 @node    Ref.Type
2142 @section Ref.Type
2143 @cindex Types
2144
2145 Every slot and value in a Rust program has a type. The @dfn{type} of a
2146 @emph{value} defines the interpretation of the memory holding it. The type of
2147 a @emph{slot} may also include constraints. @xref{Ref.Type.Constr}.
2148
2149 Built-in types and type-constructors are tightly integrated into the language,
2150 in nontrivial ways that are not possible to emulate in user-defined
2151 types. User-defined types have limited capabilities. In addition, every
2152 built-in type or type-constructor name is reserved as a @emph{keyword} in
2153 Rust; they cannot be used as user-defined identifiers in any context.
2154
2155 @menu
2156 * Ref.Type.Any::                An open union of every possible type.
2157 * Ref.Type.Mach::               Machine-level types.
2158 * Ref.Type.Int::                The machine-dependent integer types.
2159 * Ref.Type.Float::              The machine-dependent floating-point types.
2160 * Ref.Type.Prim::               Primitive types.
2161 * Ref.Type.Big::                The arbitrary-precision integer type.
2162 * Ref.Type.Text::               Strings and characters.
2163 * Ref.Type.Rec::                Labeled products of heterogeneous types.
2164 * Ref.Type.Tup::                Unlabeled products of heterogeneous types.
2165 * Ref.Type.Vec::                Open products of homogeneous types.
2166 * Ref.Type.Tag::                Disjoint unions of heterogeneous types.
2167 * Ref.Type.Fn::                 Subroutine types.
2168 * Ref.Type.Iter::               Scoped coroutine types.
2169 * Ref.Type.Obj::                Abstract types.
2170 * Ref.Type.Constr::             Constrained types.
2171 * Ref.Type.Type::               Types describing types.
2172 @end menu
2173
2174 @node       Ref.Type.Any
2175 @subsection Ref.Type.Any
2176 @cindex Any type
2177 @cindex Dynamic type, see @i{Any type}
2178 @cindex Alt type expression
2179
2180 The type @code{any} is the union of all possible Rust types. A value of type
2181 @code{any} is represented in memory as a pair consisting of a boxed value of
2182 some non-@code{any} type @var{T} and a reflection of the type @var{T}.
2183
2184 Values of type @code{any} can be used in an @code{alt type} expression, in
2185 which the reflection is used to select a block corresponding to a particular
2186 type extraction. @xref{Ref.Expr.Alt}.
2187
2188 @node       Ref.Type.Mach
2189 @subsection Ref.Type.Mach
2190 @cindex Machine types
2191 @cindex Floating-point types
2192 @cindex Integer types
2193 @cindex Word types
2194
2195 The machine types are the following:
2196
2197 @itemize
2198 @item
2199 The unsigned word types @code{u8}, @code{u16}, @code{u32} and @code{u64},
2200 with values drawn from the integer intervals
2201 @iftex
2202 @math{[0, 2^8 - 1]},
2203 @math{[0, 2^{16} - 1]},
2204 @math{[0, 2^{32} - 1]} and
2205 @math{[0, 2^{64} - 1]}
2206 @end iftex
2207 @ifhtml
2208 @html
2209 [0, 2<sup>8</sup>-1],
2210 [0, 2<sup>16</sup>-1],
2211 [0, 2<sup>32</sup>-1] and
2212 [0, 2<sup>64</sup>-1]
2213 @end html
2214 @end ifhtml
2215  respectively.
2216 @item
2217 The signed two's complement word types @code{i8}, @code{i16}, @code{i32} and
2218 @code{i64}, with values drawn from the integer intervals
2219 @iftex
2220 @math{[-(2^7),(2^7)-1)]},
2221 @math{[-(2^{15}),2^{15}-1)]},
2222 @math{[-(2^{31}),2^{31}-1)]} and
2223 @math{[-(2^{63}),2^{63}-1)]}
2224 @end iftex
2225 @ifhtml
2226 @html
2227 [-(2<sup>7</sup>), 2<sup>7</sup>-1],
2228 [-(2<sup>15</sup>), 2<sup>15</sup>-1],
2229 [-(2<sup>31</sup>), 2<sup>31</sup>-1] and
2230 [-(2<sup>63</sup>), 2<sup>63</sup>-1]
2231 @end html
2232 @end ifhtml
2233  respectively.
2234 @item
2235 The IEEE 754-2008 @code{binary32} and @code{binary64} floating-point types:
2236 @code{f32} and @code{f64}, respectively.
2237 @end itemize
2238
2239 @node       Ref.Type.Int
2240 @subsection Ref.Type.Int
2241 @cindex Machine-dependent types
2242 @cindex Integer types
2243 @cindex Word types
2244
2245
2246 The Rust type @code{uint}@footnote{A Rust @code{uint} is analogous to a C99
2247 @code{uintptr_t}.} is an unsigned integer type with with
2248 target-machine-dependent size. Its size, in bits, is equal to the number of
2249 bits required to hold any memory address on the target machine.
2250
2251 The Rust type @code{int}@footnote{A Rust @code{int} is analogous to a C99
2252 @code{intptr_t}.} is a two's complement signed integer type with
2253 target-machine-dependent size. Its size, in bits, is equal to the size of the
2254 rust type @code{uint} on the same target machine.
2255
2256 @node       Ref.Type.Float
2257 @subsection Ref.Type.Float
2258 @cindex Machine-dependent types
2259 @cindex Floating-point types
2260
2261 The Rust type @code{float} is a machine-specific type equal to one of the
2262 supported Rust floating-point machine types (@code{f32} or @code{f64}). It is
2263 the largest floating-point type that is directly supported by hardware on the
2264 target machine, or if the target machine has no floating-point hardware
2265 support, the largest floating-point type supported by the software
2266 floating-point library used to support the other floating-point machine types.
2267
2268 Note that due to the preference for hardware-supported floating-point, the
2269 type @code{float} may not be equal to the largest @emph{supported}
2270 floating-point type.
2271
2272
2273 @node       Ref.Type.Prim
2274 @subsection Ref.Type.Prim
2275 @cindex Primitive types
2276 @cindex Integer types
2277 @cindex Floating-point types
2278 @cindex Character type
2279 @cindex Boolean type
2280
2281 The primitive types are the following:
2282
2283 @itemize
2284 @item
2285 The ``nil'' type @code{()}, having the single ``nil'' value
2286 @code{()}.@footnote{The ``nil'' value @code{()} is @emph{not} a sentinel
2287 ``null pointer'' value for reference slots; the ``nil'' type is the implicit
2288 return type from functions otherwise lacking a return type, and can be used in
2289 other contexts (such as message-sending or type-parametric code) as a
2290 zero-size type.}
2291 @item
2292 The boolean type @code{bool} with values @code{true} and @code{false}.
2293 @item
2294 The machine types.
2295 @item
2296 The machine-dependent integer and floating-point types.
2297 @end itemize
2298
2299
2300 @node       Ref.Type.Big
2301 @subsection Ref.Type.Big
2302 @cindex Integer types
2303 @cindex Big integer type
2304
2305 The Rust type @code{big}@footnote{A Rust @code{big} is analogous to a Lisp
2306 bignum or a Python long integer.} is an arbitrary precision integer type that
2307 fits in a machine word @emph{when possible} and transparently expands to a
2308 boxed ``big integer'' allocated in the run-time heap when it overflows or
2309 underflows outside of the range of a machine word.
2310
2311 A Rust @code{big} grows to accommodate extra binary digits as they are needed,
2312 by taking extra memory from the memory budget available to each Rust task, and
2313 should only exhaust its range due to memory exhaustion.
2314
2315 @node       Ref.Type.Text
2316 @subsection Ref.Type.Text
2317 @cindex Text types
2318 @cindex String type
2319 @cindex Character type
2320 @cindex Unicode
2321 @cindex UCS-4
2322 @cindex UTF-8
2323
2324 The types @code{char} and @code{str} hold textual data.
2325
2326 A value of type @code{char} is a Unicode character, represented as a 32-bit
2327 unsigned word holding a UCS-4 codepoint.
2328
2329 A value of type @code{str} is a Unicode string, represented as a vector of
2330 8-bit unsigned bytes holding a sequence of UTF-8 codepoints.
2331
2332 @node       Ref.Type.Rec
2333 @subsection Ref.Type.Rec
2334 @cindex Record types
2335 @cindex Structure types, see @i{Record types}
2336
2337 The record type-constructor forms a new heterogeneous product of
2338 values.@footnote{The record type-constructor is analogous to the @code{struct}
2339 type-constructor in the Algol/C family, the @emph{record} types of the ML
2340 family, or the @emph{structure} types of the Lisp family.} Fields of a record
2341 type are accessed by name and are arranged in memory in the order specified by
2342 the record type.
2343
2344 An example of a record type and its use:
2345 @example
2346 type point = @{x: int, y: int@};
2347 let p: point = @{x: 10, y: 11@};
2348 let px: int = p.x;
2349 @end example
2350
2351 @node       Ref.Type.Tup
2352 @subsection Ref.Type.Tup
2353 @cindex Tuple types
2354
2355 The tuple type-constructor forms a new heterogeneous product of
2356 values similar to the record type-constructor. The differences are as follows:
2357
2358 @itemize
2359 @item tuple elements cannot be mutable, unlike record fields
2360 @item tuple elements are not named and can be accessed only by pattern-matching
2361 @end itemize
2362
2363 Tuple types and values are denoted by listing the types or values of
2364 their elements, respectively, in a parenthesized, comma-separated
2365 list. Single-element tuples are not legal; all tuples have two or more values.
2366
2367 The members of a tuple are laid out in memory contiguously, like a record, in
2368 order specified by the tuple type.
2369
2370 An example of a tuple type and its use:
2371 @example
2372 type pair = (int,str);
2373 let p: pair = (10,"hello");
2374 let (a, b) = p;
2375 assert (b == "world");
2376 @end example
2377
2378
2379 @node       Ref.Type.Vec
2380 @subsection Ref.Type.Vec
2381 @cindex Vector types
2382 @cindex Array types, see @i{Vector types}
2383
2384 The vector type-constructor represents a homogeneous array of values of a
2385 given type. A vector has a fixed size. The kind of a vector type depends on
2386 the kind of its member type, as with other simple structural types.
2387
2388 An example of a vector type and its use:
2389 @example
2390 let v: [int] = [7, 5, 3];
2391 let i: int = v[2];
2392 assert (i == 3);
2393 @end example
2394
2395 Vectors always @emph{allocate} a storage region sufficient to store the first
2396 power of two worth of elements greater than or equal to the size of the
2397 vector. This behaviour supports idiomatic in-place ``growth'' of a mutable
2398 slot holding a vector:
2399
2400 @example
2401 let v: mutable [int] = [1, 2, 3];
2402 v += [4, 5, 6];
2403 @end example
2404
2405 Normal vector concatenation causes the allocation of a fresh vector to hold
2406 the result; in this case, however, the slot holding the vector recycles the
2407 underlying storage in-place (since the reference-count of the underlying
2408 storage is equal to 1).
2409
2410 All accessible elements of a vector are always initialized, and access to a
2411 vector is always bounds-checked.
2412
2413
2414 @node       Ref.Type.Tag
2415 @subsection Ref.Type.Tag
2416 @cindex Tag types
2417 @cindex Union types, see @i{Tag types}
2418
2419 A @emph{tag type} is a nominal, heterogeneous disjoint union
2420 type.@footnote{The @code{tag} type is analogous to a @code{data} constructor
2421 declaration in ML or a @emph{pick ADT} in Limbo.} A @code{tag} @emph{item}
2422 consists of a number of @emph{constructors}, each of which is independently
2423 named and takes an optional tuple of arguments.
2424
2425 Tag types cannot be denoted @emph{structurally} as types, but must be denoted
2426 by named reference to a @emph{tag item} declaration. @xref{Ref.Item.Tag}.
2427
2428 @node       Ref.Type.Fn
2429 @subsection Ref.Type.Fn
2430 @cindex Function types
2431
2432 The function type-constructor @code{fn} forms new function types. A function
2433 type consists of a sequence of input slots, an optional set of input
2434 constraints (@pxref{Ref.Typestate.Constr}) and an output
2435 slot. @xref{Ref.Item.Fn}.
2436
2437 An example of a @code{fn} type:
2438 @example
2439 fn add(x: int, y: int) -> int @{
2440   ret x + y;
2441 @}
2442
2443 let int x = add(5,7);
2444
2445 type binop = fn(int,int) -> int;
2446 let bo: binop = add;
2447 x = bo(5,7);
2448 @end example
2449
2450 @node       Ref.Type.Iter
2451 @subsection Ref.Type.Iter
2452 @cindex Iterator types
2453
2454 The iterator type-constructor @code{iter} forms new iterator types. An
2455 iterator type consists a sequence of input slots, an optional set of input
2456 constraints and an output slot. @xref{Ref.Item.Iter}.
2457
2458 An example of an @code{iter} type:
2459 @example
2460 iter range(x: int, y: int) -> int @{
2461   while (x < y) @{
2462     put x;
2463     x += 1;
2464   @}
2465 @}
2466
2467 for each i: int in range(5,7) @{
2468   @dots{};
2469 @}
2470 @end example
2471
2472 @node       Ref.Type.Obj
2473 @subsection Ref.Type.Obj
2474 @c * Ref.Type.Obj::                Object types.
2475 @cindex Object types
2476
2477 A @dfn{object type} describes values of abstract type, that carry some hidden
2478 @emph{fields} and are accessed through a set of un-ordered
2479 @emph{methods}. Every object item (@pxref{Ref.Item.Obj}) implicitly declares
2480 an object type carrying methods with types derived from all the methods of the
2481 object item.
2482
2483 Object types can also be declared in isolation, independent of any object item
2484 declaration. Such a ``plain'' object type can be used to describe an interface
2485 that a variety of particular objects may conform to, by supporting a superset
2486 of the methods.
2487
2488 The kind of an object type serves as a restriction to the kinds of fields that
2489 may be stored in it. Unique objects, for example, can only carry unique values
2490 in their fields.
2491
2492 An example of an object type with two separate object items supporting it, and
2493 a client function using both items via the object type:
2494
2495 @example
2496
2497 type taker =
2498     obj @{
2499         fn take(int);
2500     @};
2501
2502 obj adder(x: @@mutable int) @{
2503     fn take(y: int) @{
2504         *x += y;
2505     @}
2506 @}
2507
2508 obj sender(c: chan<int>) @{
2509     fn take(z: int) @{
2510         std::comm::send(c, z);
2511     @}
2512 @}
2513
2514 fn give_ints(t: taker) @{
2515     t.take(1);
2516     t.take(2);
2517     t.take(3);
2518 @}
2519
2520 let p: port<int> = std::comm::mk_port();
2521
2522 let t1: taker = adder(@@mutable 0);
2523 let t2: taker = sender(p.mk_chan());
2524
2525 give_ints(t1);
2526 give_ints(t2);
2527
2528 @end example
2529
2530
2531
2532 @node       Ref.Type.Constr
2533 @subsection Ref.Type.Constr
2534 @c * Ref.Type.Constr::             Constrained types.
2535 @cindex Constrained types
2536
2537 A @dfn{constrained type} is a type that carries a @emph{formal constraint}
2538 (@pxref{Ref.Typestate.Constr}), which is similar to a normal constraint except
2539 that the @emph{base name} of any slots mentioned in the constraint must be the
2540 special @emph{formal symbol} @emph{*}.
2541
2542 When a constrained type is instantiated in a particular slot declaration, the
2543 formal symbol in the constraint is replaced with the name of the declared slot
2544 and the resulting constraint is checked immediately after the slot is
2545 declared. @xref{Ref.Expr.Check}.
2546
2547 An example of a constrained type with two separate instantiations:
2548 @example
2549 type ordered_range = @{low: int, high: int@} : less_than(*.low, *.high);
2550
2551 let rng1: ordered_range = @{low: 5, high: 7@};
2552 // implicit: 'check less_than(rng1.low, rng1.high);'
2553
2554 let rng2: ordered_range = @{low: 15, high: 17@};
2555 // implicit: 'check less_than(rng2.low, rng2.high);'
2556 @end example
2557
2558 @node       Ref.Type.Type
2559 @subsection Ref.Type.Type
2560 @c * Ref.Type.Type::               Types describing types.
2561 @cindex Type type
2562
2563 @emph{TODO}.
2564
2565
2566
2567 @node       Ref.Typestate
2568 @section    Ref.Typestate
2569 @c * Ref.Typestate::         The static system of predicate analysis.
2570 @cindex Typestate system
2571
2572 Rust programs have a static semantics that determine the types of values
2573 produced by each expression, as well as the @emph{predicates} that hold over
2574 slots in the environment at each point in time during execution.
2575
2576 The latter semantics -- the dataflow analysis of predicates holding over slots
2577 -- is called the @emph{typestate} system.
2578
2579 @menu
2580 * Ref.Typestate.Point::         Discrete positions in execution.
2581 * Ref.Typestate.CFG::           The control-flow graph formed by points.
2582 * Ref.Typestate.Constr::        Predicates applied to slots.
2583 * Ref.Typestate.Cond::          Constraints required and implied by a point.
2584 * Ref.Typestate.State::         Constraints that hold at points.
2585 * Ref.Typestate.Check::         Relating dynamic state to static typestate.
2586 @end menu
2587
2588 @node       Ref.Typestate.Point
2589 @subsection Ref.Typestate.Point
2590 @c * Ref.Typestate.Point::         Discrete positions in execution.
2591 @cindex Points
2592
2593 Control flows from statement to statement in a block, and through the
2594 evaluation of each expression, from one sub-expression to another. This
2595 sequential control flow is specified as a set of @dfn{points}, each of which
2596 has a set of points before and after it in the implied control flow.
2597
2598 For example, this code:
2599
2600 @example
2601  s = "hello, world";
2602  print(s);
2603 @end example
2604
2605 Consists of 2 statements, 3 expressions and 12 points:
2606
2607 @itemize
2608 @item the point before the first statement
2609 @item the point before evaluating the static initializer @code{"hello, world"}
2610 @item the point after evaluating the static initializer @code{"hello, world"}
2611 @item the point after the first statement
2612 @item the point before the second statement
2613 @item the point before evaluating the function value @code{print}
2614 @item the point after evaluating the function value @code{print}
2615 @item the point before evaluating the arguments to @code{print}
2616 @item the point before evaluating the symbol @code{s}
2617 @item the point after evaluating the symbol @code{s}
2618 @item the point after evaluating the arguments to @code{print}
2619 @item the point after the second statement
2620 @end itemize
2621
2622 Whereas this code:
2623
2624 @example
2625  print(x() + y());
2626 @end example
2627
2628 Consists of 1 statement, 7 expressions and 14 points:
2629
2630 @itemize
2631 @item the point before the statement
2632 @item the point before evaluating the function value @code{print}
2633 @item the point after evaluating the function value @code{print}
2634 @item the point before evaluating the arguments to @code{print}
2635 @item the point before evaluating the arguments to @code{+}
2636 @item the point before evaluating the function value @code{x}
2637 @item the point after evaluating the function value @code{x}
2638 @item the point before evaluating the arguments to @code{x}
2639 @item the point after evaluating the arguments to @code{x}
2640 @item the point before evaluating the function value @code{y}
2641 @item the point after evaluating the function value @code{y}
2642 @item the point before evaluating the arguments to @code{y}
2643 @item the point after evaluating the arguments to @code{y}
2644 @item the point after evaluating the arguments to @code{+}
2645 @item the point after evaluating the arguments to @code{print}
2646 @end itemize
2647
2648
2649 The typestate system reasons over points, rather than statements or
2650 expressions. This may seem counter-intuitive, but points are the more
2651 primitive concept. Another way of thinking about a point is as a set of
2652 @emph{instants in time} at which the state of a task is fixed. By contrast, a
2653 statement or expression represents a @emph{duration in time}, during which the
2654 state of the task changes. The typestate system is concerned with constraining
2655 the possible states of a task's memory at @emph{instants}; it is meaningless
2656 to speak of the state of a task's memory ``at'' a statement or expression, as
2657 each statement or expression is likely to change the contents of memory.
2658
2659 @node       Ref.Typestate.CFG
2660 @subsection Ref.Typestate.CFG
2661 @c * Ref.Typestate.CFG::           The control-flow graph formed by points.
2662 @cindex Control-flow graph
2663
2664 Each @emph{point} can be considered a vertex in a directed @emph{graph}. Each
2665 kind of expression or statement implies a number of points @emph{and edges} in
2666 this graph. The edges connect the points within each statement or expression,
2667 as well as between those points and those of nearby statements and expressions
2668 in the program. The edges between points represent @emph{possible} indivisible
2669 control transfers that might occur during execution.
2670
2671 This implicit graph is called the @dfn{control-flow graph}, or @dfn{CFG}.
2672
2673 @node       Ref.Typestate.Constr
2674 @subsection Ref.Typestate.Constr
2675 @c * Ref.Typestate.Constr::          Predicates applied to slots.
2676 @cindex Predicate
2677 @cindex Constraint
2678
2679 A @dfn{predicate} is a pure boolean function declared with the keyword
2680 @code{pred}. @xref{Ref.Item.Pred}.
2681
2682 A @dfn{constraint} is a predicate applied to specific slots.
2683
2684 For example, consider the following code:
2685
2686 @example
2687 pure fn is_less_than(int a, int b) -> bool @{
2688      ret a < b;
2689 @}
2690
2691 fn test() @{
2692    let x: int = 10;
2693    let y: int = 20;
2694    check is_less_than(x,y);
2695 @}
2696 @end example
2697
2698 This example defines the predicate @code{is_less_than}, and applies it to the
2699 slots @code{x} and @code{y}. The constraint being checked on the third line of
2700 the function is @code{is_less_than(x,y)}.
2701
2702 Predicates can only apply to slots holding immutable values. The slots a
2703 predicate applies to can themselves be mutable, but the types of values held
2704 in those slots must be immutable.
2705
2706 @node       Ref.Typestate.Cond
2707 @subsection Ref.Typestate.Cond
2708 @c * Ref.Typestate.Cond::          Constraints required and implied by a point.
2709 @cindex Condition
2710 @cindex Precondition
2711 @cindex Postcondition
2712
2713 A @dfn{condition} is a set of zero or more constraints.
2714
2715 Each @emph{point} has an associated @emph{condition}:
2716
2717 @itemize
2718 @item The @dfn{precondition} of a statement or expression is the condition
2719 required at in the point before it.
2720 @item The @dfn{postcondition} of a statement or expression is the condition
2721 enforced in the point after it.
2722 @end itemize
2723
2724 Any constraint present in the precondition and @emph{absent} in the
2725 postcondition is considered to be @emph{dropped} by the statement or
2726 expression.
2727
2728 @node       Ref.Typestate.State
2729 @subsection Ref.Typestate.State
2730 @c * Ref.Typestate.State::     Constraints that hold at points.
2731 @cindex Typestate
2732 @cindex Prestate
2733 @cindex Poststate
2734
2735 The typestate checking system @emph{calculates} an additional condition for
2736 each point called its typestate. For a given statement or expression, we call
2737 the two typestates associated with its two points the prestate and a
2738 poststate.
2739
2740 @itemize
2741 @item The @dfn{prestate} of a statement or expression is the typestate of the
2742 point before it.
2743 @item The @dfn{poststate} of a statement or expression is the typestate of the
2744 point after it.
2745 @end itemize
2746
2747 A @dfn{typestate} is a condition that has @emph{been determined by the
2748 typestate algorithm} to hold at a point. This is a subtle but important point
2749 to understand: preconditions and postconditions are @emph{inputs} to the
2750 typestate algorithm; prestates and poststates are @emph{outputs} from the
2751 typestate algorithm.
2752
2753 The typestate algorithm analyses the preconditions and postconditions of every
2754 statement and expression in a block, and computes a condition for each
2755 typestate. Specifically:
2756
2757 @itemize
2758 @item Initially, every typestate is empty.
2759 @item Each statement or expression's poststate is given the union of the its
2760 prestate, precondition, and postcondition.
2761 @item Each statement or expression's poststate has the difference between its
2762 precondition and postcondition removed.
2763 @item Each statement or expression's prestate is given the intersection of the
2764 poststates of every predecessor point in the CFG.
2765 @item The previous three steps are repeated until no typestates in the
2766 block change.
2767 @end itemize
2768
2769 The typestate algorithm is a very conventional dataflow calculation, and can
2770 be performed using bit-set operations, with one bit per predicate and one
2771 bit-set per condition.
2772
2773 After the typestates of a block are computed, the typestate algorithm checks
2774 that every constraint in the precondition of a statement is satisfied by its
2775 prestate. If any preconditions are not satisfied, the mismatch is considered a
2776 static (compile-time) error.
2777
2778
2779 @node       Ref.Typestate.Check
2780 @subsection Ref.Typestate.Check
2781 @c * Ref.Typestate.Check::         Relating dynamic state to static typestate.
2782 @cindex Check statement
2783 @cindex Assertions, see @i{Check statement}
2784
2785 The key mechanism that connects run-time semantics and compile-time analysis
2786 of typestates is the use of @code{check} expressions. @xref{Ref.Expr.Check}. A
2787 @code{check} expression guarantees that @emph{if} control were to proceed past
2788 it, the predicate associated with the @code{check} would have succeeded, so
2789 the constraint being checked @emph{statically} holds in subsequent
2790 points.@footnote{A @code{check} expression is similar to an @code{assert}
2791 call in a C program, with the significant difference that the Rust compiler
2792 @emph{tracks} the constraint that each @code{check} expression
2793 enforces. Naturally, @code{check} expressions cannot be omitted from a
2794 ``production build'' of a Rust program the same way @code{asserts} are
2795 frequently disabled in deployed C programs.}
2796
2797 It is important to understand that the typestate system has @emph{no insight}
2798 into the meaning of a particular predicate. Predicates and constraints are not
2799 evaluated in any way at compile time. Predicates are treated as specific (but
2800 unknown) functions applied to specific (also unknown) slots. All the typestate
2801 system does is track which of those predicates -- whatever they calculate --
2802 @emph{must have been checked already} in order for program control to reach a
2803 particular point in the CFG. The fundamental building block, therefore, is the
2804 @code{check} statement, which tells the typestate system ``if control passes
2805 this point, the checked predicate holds''.
2806
2807 From this building block, constraints can be propagated to function signatures
2808 and constrained types, and the responsibility to @code{check} a constraint
2809 pushed further and further away from the site at which the program requires it
2810 to hold in order to execute properly.
2811
2812
2813 @page
2814 @node    Ref.Stmt
2815 @section Ref.Stmt
2816 @c * Ref.Stmt::               Components of an executable block.
2817 @cindex Statements
2818
2819 A @dfn{statement} is a component of a block, which is in turn a component of
2820 an outer block-expression, a function or an iterator. When a function is
2821 spawned into a task, the task @emph{executes} statements in an order
2822 determined by the body of the enclosing structure. Each statement causes the
2823 task to perform certain actions.
2824
2825 Rust has two kinds of statement: declarations and expressions.
2826
2827 A declaration serves to introduce a @emph{name} that can be used in the block
2828 @emph{scope} enclosing the statement: all statements before and after the
2829 name, from the previous opening curly-brace (@code{@{}) up to the next closing
2830 curly-brace (@code{@}}).
2831
2832 An expression serves the dual roles of causing side effects and producing a
2833 @emph{value}. Expressions are said to @emph{evaluate to} a value, and the side
2834 effects are caused during @emph{evaluation}. Many expressions contain
2835 sub-expressions as operands; the definition of each kind of expression
2836 dictates whether or not, and in which order, it will evaluate its
2837 sub-expressions, and how the expression's value derives from the value of its
2838 sub-expressions.
2839
2840 In this way, the structure of execution -- both the overall sequence of
2841 observable side effects and the final produced value -- is dictated by the
2842 structure of expressions. Blocks themselves are expressions, so the nesting
2843 sequence of block, statement, expression, and block can repeatedly nest to an
2844 arbitrary depth.
2845
2846 @menu
2847 * Ref.Stmt.Decl::               Statement declaring an item or slot.
2848 * Ref.Stmt.Expr::               Statement evaluating an expression.
2849 @end menu
2850
2851 @node       Ref.Stmt.Decl
2852 @subsection Ref.Stmt.Decl
2853 @c * Ref.Stmt.Decl::                Statement declaring an item or slot.
2854 @cindex Declaration statement
2855
2856 A @dfn{declaration statement} is one that introduces a @emph{name} into the
2857 enclosing statement block. The declared name may denote a new slot or a new
2858 item. The scope of the name extends to the entire containing block, both
2859 before and after the declaration.
2860
2861 @menu
2862 * Ref.Stmt.Decl.Item::              Statement declaring an item.
2863 * Ref.Stmt.Decl.Slot::              Statement declaring a slot.
2864 @end menu
2865
2866 @node          Ref.Stmt.Decl.Item
2867 @subsubsection Ref.Stmt.Decl.Item
2868 @c * Ref.Stmt.Decl.Item::                Statement declaring an item.
2869
2870 An @dfn{item declaration statement} has a syntactic form identical to an item
2871 declaration within a module. Declaring an item -- a function, iterator,
2872 object, type or module -- locally within a statement block is simply a way of
2873 restricting its scope to a narrow region containing all of its uses; it is
2874 otherwise identical in meaning to declaring the item outside the statement
2875 block.
2876
2877 Note: there is no implicit capture of the function's dynamic environment when
2878 declaring a function-local item.
2879
2880 @node          Ref.Stmt.Decl.Slot
2881 @subsubsection Ref.Stmt.Decl.Slot
2882 @c * Ref.Stmt.Decl.Slot::                Statement declaring an slot.
2883 @cindex Local slot
2884 @cindex Variable, see @i{Local slot}
2885 @cindex Type inference
2886
2887 A @code{slot declaration statement} has one one of two forms:
2888
2889 @itemize
2890 @item @code{let} @var{pattern} @var{optional-init};
2891 @item @code{let} @var{pattern} : @var{type} @var{optional-init};
2892 @end itemize
2893
2894 Where @var{type} is a type expression, @var{pattern} is an irrefutable pattern
2895 (often just the name of a single slot), and @var{optional-init} is an optional
2896 initializer. If present, the initializer consists of either an equals sign
2897 (@code{=}) or move operator (@code{<-}), followed by an expression.
2898
2899 Both forms introduce a new slot into the containing block scope. The new slot
2900 is visible across the entire scope, but is initialized only at the point
2901 following the declaration statement.
2902
2903 The former form, with no type annotation, causes the compiler to infer the
2904 static type of the slot through unification with the types of values assigned
2905 to the slot in the remaining code in the block scope. Inference only occurs on
2906 frame-local slots, not argument slots. Function, iterator and object
2907 signatures must always declared types for all argument slots.
2908 @xref{Ref.Mem.Slot}.
2909
2910 @node       Ref.Stmt.Expr
2911 @subsection Ref.Stmt.Expr
2912 @c * Ref.Stmt.Expr::                Statement evaluating an expression
2913 @cindex Expression statement
2914
2915 An @dfn{expression statement} is one that evaluates an expression and drops
2916 its result. The purpose of an expression statement is often to cause the side
2917 effects of the expression's evaluation.
2918
2919 @page
2920 @node    Ref.Expr
2921 @section Ref.Expr
2922 @c * Ref.Expr::               Parsed and primitive expressions.
2923 @cindex Expressions
2924
2925
2926 @menu
2927 * Ref.Expr.Copy::               Expression for copying a value.
2928 * Ref.Expr.Call::               Expression for calling a function.
2929 * Ref.Expr.Bind::               Expression for binding arguments to functions.
2930 * Ref.Expr.Ret::                Expression for stopping and producing a value.
2931 @c * Ref.Expr.Be::                 Expression for stopping and executing a tail call.
2932 * Ref.Expr.Put::                Expression for pausing and producing a value.
2933 * Ref.Expr.Fail::               Expression for causing task failure.
2934 * Ref.Expr.Log::                Expression for logging values to diagnostic buffers.
2935 * Ref.Expr.Note::               Expression for logging values during failure.
2936 * Ref.Expr.While::              Expression for simple conditional looping.
2937 * Ref.Expr.Break::              Expression for terminating a loop.
2938 * Ref.Expr.Cont::               Expression for terminating a single loop iteration.
2939 * Ref.Expr.For::                Expression for looping over strings and vectors.
2940 * Ref.Expr.Foreach::            Expression for looping via an iterator.
2941 * Ref.Expr.If::                 Expression for simple conditional branching.
2942 * Ref.Expr.Alt::                Expression for complex conditional branching.
2943 * Ref.Expr.Prove::              Expression for static assertion of typestate.
2944 * Ref.Expr.Check::              Expression for dynamic assertion of typestate.
2945 * Ref.Expr.Claim::              Expression for static (unsafe) or dynamic assertion of typestate.
2946 * Ref.Expr.Assert::             Expression for halting the program if a boolean condition fails to hold.
2947 * Ref.Expr.IfCheck::            Expression for dynamic testing of typestate.
2948 * Ref.Expr.AnonObj::            Expression for extending objects with additional methods.
2949 @end menu
2950
2951
2952 @node       Ref.Expr.Copy
2953 @subsection Ref.Expr.Copy
2954 @c * Ref.Expr.Copy::                Expression for copying a value.
2955 @cindex Copy expression
2956 @cindex Assignment operator, see @i{Copy expression}
2957
2958 A @dfn{copy expression} consists of an @emph{lval} followed by an equals-sign
2959 (@code{=}) and a primitive expression. @xref{Ref.Expr}.
2960
2961 Executing a copy expression causes the value denoted by the expression --
2962 either a value or a primitive combination of values -- to be copied into the
2963 memory location denoted by the @emph{lval}.
2964
2965 A copy may entail the adjustment of reference counts, execution of destructors,
2966 or similar adjustments in order to respect the path through the memory graph
2967 implied by the @code{lval}, as well as any existing value held in the memory
2968 being written-to. All such adjustment is automatic and implied by the @code{=}
2969 operator.
2970
2971 An example of three different copy expressions:
2972 @example
2973 x = y;
2974 x.y = z;
2975 x.y = z + 2;
2976 @end example
2977
2978 @node       Ref.Expr.Call
2979 @subsection Ref.Expr.Call
2980 @c * Ref.Expr.Call::               Expression for calling a function.
2981 @cindex Call expression
2982 @cindex Function calls
2983
2984 A @dfn{call expression} invokes a function, providing a tuple of input slots
2985 and an reference slot to serve as the function's output, bound to the @var{lval}
2986 on the right hand side of the call. If the function eventually returns, then
2987 the expression completes.
2988
2989 A call expression statically requires that the precondition declared in the
2990 callee's signature is satisfied by the expression prestate. In this way,
2991 typestates propagate through function boundaries. @xref{Ref.Typestate}.
2992
2993 An example of a call expression:
2994 @example
2995 let x: int = add(1, 2);
2996 @end example
2997
2998 @node       Ref.Expr.Bind
2999 @subsection Ref.Expr.Bind
3000 @c * Ref.Expr.Bind::          Expression for binding arguments to functions.
3001 @cindex Bind expression
3002 @cindex Closures
3003 @cindex Currying
3004
3005 A @dfn{bind expression} constructs a new function from an existing
3006 function.@footnote{The @code{bind} expression is analogous to the @code{bind}
3007 expression in the Sather language.} The new function has zero or more of its
3008 arguments @emph{bound} into a new, hidden boxed tuple that holds the
3009 bindings. For each concrete argument passed in the @code{bind} expression, the
3010 corresponding parameter in the existing function is @emph{omitted} as a
3011 parameter of the new function. For each argument passed the placeholder symbol
3012 @code{_} in the @code{bind} expression, the corresponding parameter of the
3013 existing function is @emph{retained} as a parameter of the new function.
3014
3015 Any subsequent invocation of the new function with residual arguments causes
3016 invocation of the existing function with the combination of bound arguments
3017 and residual arguments that was specified during the binding.
3018
3019 An example of a @code{bind} expression:
3020 @example
3021 fn add(x: int, y: int) -> int @{
3022     ret x + y;
3023 @}
3024 type single_param_fn = fn(int) -> int;
3025
3026 let add4: single_param_fn = bind add(4, _);
3027
3028 let add5: single_param_fn = bind add(_, 5);
3029
3030 assert (add(4,5) == add4(5));
3031 assert (add(4,5) == add5(4));
3032
3033 @end example
3034
3035 A @code{bind} expression generally stores a copy of the bound arguments in the
3036 hidden, boxed tuple, owned by the resulting first-class function. For each
3037 bound slot in the bound function's signature, space is allocated in the hidden
3038 tuple and populated with a copy of the bound value.
3039
3040 The @code{bind} expression is a lightweight mechanism for simulating the more
3041 elaborate construct of @emph{lexical closures} that exist in other
3042 languages. Rust has no support for lexical closures, but many realistic uses
3043 of them can be achieved with @code{bind} expressions.
3044
3045
3046 @node       Ref.Expr.Ret
3047 @subsection Ref.Expr.Ret
3048 @c * Ref.Expr.Ret::                Expression for stopping and producing a value.
3049 @cindex Return expression
3050
3051 Executing a @code{ret} expression@footnote{A @code{ret} expression is analogous
3052 to a @code{return} expression in the C family.}  copies a value into the output
3053 slot of the current function, destroys the current function activation frame,
3054 and transfers control to the caller frame.
3055
3056 An example of a @code{ret} expression:
3057 @example
3058 fn max(a: int, b: int) -> int @{
3059    if a > b @{
3060       ret a;
3061    @}
3062    ret b;
3063 @}
3064 @end example
3065
3066 @ignore
3067 @node       Ref.Expr.Be
3068 @subsection Ref.Expr.Be
3069 @c * Ref.Expr.Be::                 Expression for stopping and executing a tail call.
3070 @cindex Be expression
3071 @cindex Tail calls
3072
3073 Executing a @code{be} expression @footnote{A @code{be} expression in is
3074 analogous to a @code{become} expression in Newsqueak or Alef.}  destroys the
3075 current function activation frame and replaces it with an activation frame for
3076 the called function. In other words, @code{be} executes a tail-call. The
3077 syntactic form of a @code{be} expression is therefore limited to @emph{tail
3078 position}: its argument must be a @emph{call expression}, and it must be the
3079 last expression in a block.
3080
3081 An example of a @code{be} expression:
3082 @example
3083 fn print_loop(n: int) @{
3084   if n <= 0 @{
3085     ret;
3086   @} else @{
3087     print_int(n);
3088     be print_loop(n-1);
3089   @}
3090 @}
3091 @end example
3092
3093 The above example executes in constant space, replacing each frame with a new
3094 copy of itself.
3095 @end ignore
3096
3097
3098 @node       Ref.Expr.Put
3099 @subsection Ref.Expr.Put
3100 @c * Ref.Expr.Put::                Expression for pausing and producing a value.
3101 @cindex Put expression
3102 @cindex Iterators
3103
3104 Executing a @code{put} expression copies a value into the output slot of the
3105 current iterator, suspends execution of the current iterator, and transfers
3106 control to the current put-recipient frame.
3107
3108 A @code{put} expression is only valid within an iterator.  @footnote{A
3109 @code{put} expression is analogous to a @code{yield} expression in the CLU, and
3110 Sather languages, or in more recent languages providing a ``generator''
3111 facility, such as Python, Javascript or C#. Like the generators of CLU and
3112 Sather but @emph{unlike} these later languages, Rust's iterators reside on the
3113 stack and obey a strict stack discipline.} The current put-recipient will
3114 eventually resume the suspended iterator containing the @code{put} expression,
3115 either continuing execution after the @code{put} expression, or terminating its
3116 execution and destroying the iterator frame.
3117
3118
3119 @node       Ref.Expr.Fail
3120 @subsection Ref.Expr.Fail
3121 @c * Ref.Expr.Fail::               Expression for causing task failure.
3122 @cindex Fail expression
3123 @cindex Failure
3124 @cindex Unwinding
3125
3126 Executing a @code{fail} expression causes a task to enter the @emph{failing}
3127 state. In the @emph{failing} state, a task unwinds its stack, destroying all
3128 frames and freeing all resources until it reaches its entry frame, at which
3129 point it halts execution in the @emph{dead} state.
3130
3131 @node       Ref.Expr.Log
3132 @subsection Ref.Expr.Log
3133 @c * Ref.Expr.Log::                Expression for logging values to diagnostic buffers.
3134 @cindex Log expression
3135 @cindex Logging
3136
3137 Executing a @code{log} expression may, depending on runtime configuration,
3138 cause a value to be appended to an internal diagnostic logging buffer provided
3139 by the runtime or emitted to a system console. Log expressions are enabled or
3140 disabled dynamically at run-time on a per-task and per-item
3141 basis. @xref{Ref.Run.Log}.
3142
3143 @example
3144 @end example
3145
3146 @node       Ref.Expr.Note
3147 @subsection Ref.Expr.Note
3148 @c * Ref.Expr.Note::                Expression for logging values during failure.
3149 @cindex Note expression
3150 @cindex Logging
3151 @cindex Unwinding
3152 @cindex Failure
3153
3154 A @code{note} expression has no effect during normal execution. The purpose of
3155 a @code{note} expression is to provide additional diagnostic information to the
3156 logging subsystem during task failure. @xref{Ref.Expr.Log}. Using @code{note}
3157 expressions, normal diagnostic logging can be kept relatively sparse, while
3158 still providing verbose diagnostic ``back-traces'' when a task fails.
3159
3160 When a task is failing, control frames @emph{unwind} from the innermost frame
3161 to the outermost, and from the innermost lexical block within an unwinding
3162 frame to the outermost. When unwinding a lexical block, the runtime processes
3163 all the @code{note} expressions in the block sequentially, from the first
3164 expression of the block to the last.  During processing, a @code{note}
3165 expression has equivalent meaning to a @code{log} expression: it causes the
3166 runtime to append the argument of the @code{note} to the internal logging
3167 diagnostic buffer.
3168
3169 An example of a @code{note} expression:
3170 @example
3171 fn read_file_lines(path: str) -> [str] @{
3172     note path;
3173     let r: [str];
3174     let f: file = open_read(path);
3175     for each s: str in lines(f) @{
3176         vec::append(r,s);
3177     @}
3178     ret r;
3179 @}
3180 @end example
3181
3182 In this example, if the task fails while attempting to open or read a file,
3183 the runtime will log the path name that was being read. If the function
3184 completes normally, the runtime will not log the path.
3185
3186 A value that is marked by a @code{note} expression is @emph{not} copied aside
3187 when control passes through the @code{note}. In other words, if a @code{note}
3188 expression notes a particular @var{lval}, and code after the @code{note}
3189 mutates that slot, and then a subsequent failure occurs, the @emph{mutated}
3190 value will be logged during unwinding, @emph{not} the original value that was
3191 denoted by the @var{lval} at the moment control passed through the @code{note}
3192 expression.
3193
3194 @node       Ref.Expr.While
3195 @subsection Ref.Expr.While
3196 @c * Ref.Expr.While::              Expression for simple conditional looping.
3197 @cindex While expression
3198 @cindex Loops
3199 @cindex Control-flow
3200
3201 A @code{while} expression is a loop construct. A @code{while} loop may be
3202 either a simple @code{while} or a @code{do}-@code{while} loop.
3203
3204 In the case of a simple @code{while}, the loop begins by evaluating the
3205 boolean loop conditional expression. If the loop conditional expression
3206 evaluates to @code{true}, the loop body block executes and control returns to
3207 the loop conditional expression. If the loop conditional expression evaluates
3208 to @code{false}, the @code{while} expression completes.
3209
3210 In the case of a @code{do}-@code{while}, the loop begins with an execution of
3211 the loop body. After the loop body executes, it evaluates the loop conditional
3212 expression. If it evaluates to @code{true}, control returns to the beginning
3213 of the loop body. If it evaluates to @code{false}, control exits the loop.
3214
3215 An example of a simple @code{while} expression:
3216 @example
3217 while (i < 10) @{
3218     print("hello\n");
3219     i = i + 1;
3220 @}
3221 @end example
3222
3223 An example of a @code{do}-@code{while} expression:
3224 @example
3225 do @{
3226     print("hello\n");
3227     i = i + 1;
3228 @} while (i < 10);
3229 @end example
3230
3231 @node       Ref.Expr.Break
3232 @subsection Ref.Expr.Break
3233 @c * Ref.Expr.Break::              Expression for terminating a loop.
3234 @cindex Break expression
3235 @cindex Loops
3236 @cindex Control-flow
3237
3238 Executing a @code{break} expression immediately terminates the innermost loop
3239 enclosing it. It is only permitted in the body of a loop.
3240
3241 @node       Ref.Expr.Cont
3242 @subsection Ref.Expr.Cont
3243 @c * Ref.Expr.Cont::               Expression for terminating a single loop iteration.
3244 @cindex Continue expression
3245 @cindex Loops
3246 @cindex Control-flow
3247
3248 Executing a @code{cont} expression immediately terminates the current iteration
3249 of the innermost loop enclosing it, returning control to the loop
3250 @emph{head}. In the case of a @code{while} loop, the head is the conditional
3251 expression controlling the loop. In the case of a @code{for} or @code{for
3252 each} loop, the head is the iterator or vector-element increment controlling the
3253 loop.
3254
3255 A @code{cont} expression is only permitted in the body of a loop.
3256
3257
3258 @node       Ref.Expr.For
3259 @subsection Ref.Expr.For
3260 @c * Ref.Expr.For::                Expression for looping over strings and vectors.
3261 @cindex For expression
3262 @cindex Loops
3263 @cindex Control-flow
3264
3265 A @dfn{for loop} is controlled by a vector or string. The for loop
3266 bounds-checks the underlying sequence @emph{once} when initiating the loop,
3267 then repeatedly copies each value of the underlying sequence into the element
3268 variable, executing the loop body once per copy.
3269
3270 Example a for loop:
3271 @example
3272 let v: [foo] = [a, b, c];
3273
3274 for e: foo in v @{
3275     bar(e);
3276 @}
3277 @end example
3278
3279 @node          Ref.Expr.Foreach
3280 @subsection    Ref.Expr.Foreach
3281 @c * Ref.Expr.Foreach::           Expression for general conditional looping.
3282 @cindex Foreach expression
3283 @cindex Loops
3284 @cindex Control-flow
3285
3286 An @dfn{foreach loop} is denoted by the @code{for each} keywords, and is
3287 controlled by an iterator. The loop executes once for each value @code{put} by
3288 the iterator.  When the iterator returns or fails, the loop terminates.
3289
3290 Example of a foreach loop:
3291 @example
3292 let txt: str;
3293 let lines: [str];
3294 for each s: str in str::split(txt, "\n") @{
3295     vec::push(lines, s);
3296 @}
3297 @end example
3298
3299
3300 @node       Ref.Expr.If
3301 @subsection Ref.Expr.If
3302 @c * Ref.Expr.If::                 Expression for simple conditional branching.
3303 @cindex If expression
3304 @cindex Control-flow
3305
3306 An @code{if} expression is a conditional branch in program control. The form of
3307 an @code{if} expression is a condition expression, followed by a consequent
3308 block, any number of @code{else if} conditions and blocks, and an optional
3309 trailing @code{else} block. The condition expressions must have type
3310 @code{bool}. If a condition expression evaluates to @code{true}, the
3311 consequent block is executed and any subsequent @code{else if} or @code{else}
3312 block is skipped. If a condition expression evaluates to @code{false}, the
3313 consequent block is skipped and any subsequent @code{else if} condition is
3314 evaluated. If all @code{if} and @code{else if} conditions evaluate to @code{false}
3315 then any @code{else} block is executed.
3316
3317 @node       Ref.Expr.Alt
3318 @subsection Ref.Expr.Alt
3319 @c * Ref.Expr.Alt::                Expression for complex conditional branching.
3320 @cindex Alt expression
3321 @cindex Control-flow
3322 @cindex Switch expression, see @i{Alt expression}
3323
3324 An @code{alt} expression is a multi-directional branch in program control.
3325 There are two kinds of @code{alt} expression: pattern @code{alt} expressions
3326 and @code{alt type} expressions.
3327
3328 The form of each kind of @code{alt} is similar: an initial @emph{head} that
3329 describes the criteria for branching, followed by a sequence of zero or more
3330 @emph{arms}, each of which describes a @emph{case} and provides a @emph{block}
3331 of expressions associated with the case. When an @code{alt} is executed,
3332 control enters the head, determines which of the cases to branch to, branches
3333 to the block associated with the chosen case, and then proceeds to the
3334 expression following the @code{alt} when the case block completes.
3335
3336 @menu
3337 * Ref.Expr.Alt.Pat::          Expression for branching on pattern matches.
3338 * Ref.Expr.Alt.Type::         Expression for branching on types.
3339 @end menu
3340
3341 @node          Ref.Expr.Alt.Pat
3342 @subsubsection Ref.Expr.Alt.Pat
3343 @c * Ref.Expr.Alt.Pat::            Expression for branching on pattern matches.
3344 @cindex Pattern alt expression
3345 @cindex Control-flow
3346
3347 A pattern @code{alt} expression branches on a @emph{pattern}. The exact form of
3348 matching that occurs depends on the pattern. Patterns consist of some
3349 combination of literals, tag constructors, variable binding specifications and
3350 placeholders (@code{_}). A pattern @code{alt} has a @emph{head expression},
3351 which is the value to compare to the patterns. The type of the patterns must
3352 equal the type of the head expression.
3353
3354 To execute a pattern @code{alt} expression, first the head expression is
3355 evaluated, then its value is sequentially compared to the patterns in the arms
3356 until a match is found. The first arm with a matching @code{case} pattern is
3357 chosen as the branch target of the @code{alt}, any variables bound by the
3358 pattern are assigned to local slots in the arm's block, and control enters the
3359 block.
3360
3361 An example of a pattern @code{alt} expression:
3362
3363 @example
3364 type list<X> = tag(nil, cons(X, @@list<X>));
3365
3366 let x: list<int> = cons(10, cons(11, nil));
3367
3368 alt x @{
3369     case (cons(a, cons(b, _))) @{
3370         process_pair(a,b);
3371     @}
3372     case (cons(v=10, _)) @{
3373         process_ten(v);
3374     @}
3375     case (_) @{
3376         fail;
3377     @}
3378 @}
3379 @end example
3380
3381
3382 @node          Ref.Expr.Alt.Type
3383 @subsubsection Ref.Expr.Alt.Type
3384 @c * Ref.Expr.Alt.Type::           Expression for branching on type.
3385 @cindex Type alt expression
3386 @cindex Control-flow
3387
3388 An @code{alt type} expression is similar to a pattern @code{alt}, but branches
3389 on the @emph{type} of its head expression, rather than the value. The head
3390 expression of an @code{alt type} expression must be of type @code{any}, and the
3391 arms of the expression are slot patterns rather than value patterns. Control
3392 branches to the arm with a @code{case} that matches the @emph{actual type} of
3393 the value in the @code{any}.
3394
3395 An example of an @code{alt type} expression:
3396
3397 @example
3398 let x: any = foo();
3399
3400 alt type (x) @{
3401     case (int i) @{
3402         ret i;
3403     @}
3404     case (list<int> li) @{
3405         ret int_list_sum(li);
3406     @}
3407     case (list<X> lx) @{
3408         ret list_len(lx);
3409     @}
3410     case (_) @{
3411         ret 0;
3412     @}
3413 @}
3414 @end example
3415
3416
3417 @node       Ref.Expr.Prove
3418 @subsection Ref.Expr.Prove
3419 @c * Ref.Expr.Prove::              Expression for static assertion of typestate.
3420 @cindex Prove expression
3421 @cindex Typestate system
3422
3423 A @code{prove} expression has no run-time effect. Its purpose is to statically
3424 check (and document) that its argument constraint holds at its expression entry
3425 point. If its argument typestate does not hold, under the typestate algorithm,
3426 the program containing it will fail to compile.
3427
3428 @node       Ref.Expr.Check
3429 @subsection Ref.Expr.Check
3430 @c * Ref.Expr.Check::              Expression for dynamic assertion of typestate.
3431 @cindex Check expression
3432 @cindex Typestate system
3433
3434 A @code{check} expression connects dynamic assertions made at run-time to the
3435 static typestate system. A @code{check} expression takes a constraint to check
3436 at run-time. If the constraint holds at run-time, control passes through the
3437 @code{check} and on to the next expression in the enclosing block. If the
3438 condition fails to hold at run-time, the @code{check} expression behaves as a
3439 @code{fail} expression.
3440
3441 The typestate algorithm is built around @code{check} expressions, and in
3442 particular the fact that control @emph{will not pass} a check expression with a
3443 condition that fails to hold. The typestate algorithm can therefore assume
3444 that the (static) postcondition of a @code{check} expression includes the
3445 checked constraint itself. From there, the typestate algorithm can perform
3446 dataflow calculations on subsequent expressions, propagating conditions forward
3447 and statically comparing implied states and their
3448 specifications. @xref{Ref.Typestate}.
3449
3450 @example
3451 pure fn even(x: int) -> bool @{
3452     ret x & 1 == 0;
3453 @}
3454
3455 fn print_even(x: int) : even(x) @{
3456     print(x);
3457 @}
3458
3459 fn test() @{
3460     let y: int = 8;
3461
3462     // Cannot call print_even(y) here.
3463
3464     check even(y);
3465
3466     // Can call print_even(y) here, since even(y) now holds.
3467     print_even(y);
3468 @}
3469 @end example
3470
3471 @node       Ref.Expr.Claim
3472 @subsection Ref.Expr.Claim
3473 @c * Ref.Expr.Claim::              Expression for static (unsafe) or dynamic assertion of typestate.
3474 @cindex Claim expression
3475 @cindex Typestate system
3476
3477 A @code{claim} expression is an unsafe variant on a @code{check} expression
3478 that is not actually checked at runtime. Thus, using a @code{claim} implies a
3479 proof obligation to ensure---without compiler assistance---that an assertion
3480 always holds.
3481
3482 Setting a runtime flag can turn all @code{claim} expressions
3483 into @code{check} expressions in a compiled Rust program, but the default is to not check the assertion
3484 contained in a @code{claim}. The idea behind @code{claim} is that performance profiling might identify a
3485 few bottlenecks in the code where actually checking a given callee's predicate
3486 is too expensive; @code{claim} allows the code to typecheck without removing
3487 the predicate check at every other call site.
3488
3489 @node       Ref.Expr.IfCheck
3490 @subsection Ref.Expr.IfCheck
3491 @c * Ref.Expr.IfCheck::            Expression for dynamic testing of typestate.
3492 @cindex If check expression
3493 @cindex Typestate system
3494 @cindex Control-flow
3495
3496 An @code{if check} expression combines a @code{if} expression and a @code{check}
3497 expression in an indivisible unit that can be used to build more complex
3498 conditional control-flow than the @code{check} expression affords.
3499
3500 In fact, @code{if check} is a ``more primitive'' expression than @code{check};
3501 instances of the latter can be rewritten as instances of the former. The
3502 following two examples are equivalent:
3503
3504 @sp 1
3505 Example using @code{check}:
3506 @example
3507 check even(x);
3508 print_even(x);
3509 @end example
3510
3511 @sp 1
3512 Equivalent example using @code{if check}:
3513 @example
3514 if check even(x) @{
3515     print_even(x);
3516 @} else @{
3517     fail;
3518 @}
3519 @end example
3520
3521 @node       Ref.Expr.Assert
3522 @subsection Ref.Expr.Assert
3523 @c * Ref.Expr.Assert::            Expression that halts the program if a boolean condition fails to hold.
3524 @cindex Assertions
3525
3526 An @code{assert} expression is similar to a @code{check} expression, except
3527 the condition may be any boolean-typed expression, and the compiler makes no
3528 use of the knowledge that the condition holds if the program continues to
3529 execute after the @code{assert}.
3530
3531 @node       Ref.Expr.AnonObj
3532 @subsection Ref.Expr.AnonObj
3533 @c * Ref.Expr.AnonObj::           Expression that extends an object with additional methods.
3534 @cindex Anonymous objects
3535
3536 An @emph{anonymous object} expression extends an existing object with methods.
3537
3538 @page
3539 @node    Ref.Run
3540 @section Ref.Run
3541 @c * Ref.Run::                     Organization of runtime services.
3542 @cindex Runtime library
3543
3544 The Rust @dfn{runtime} is a relatively compact collection of C and Rust code
3545 that provides fundamental services and datatypes to all Rust tasks at
3546 run-time. It is smaller and simpler than many modern language runtimes. It is
3547 tightly integrated into the language's execution model of memory, tasks,
3548 communication, reflection, logging and signal handling.
3549
3550 @menu
3551 * Ref.Run.Mem::                 Runtime memory management service.
3552 * Ref.Run.Type::                Runtime built-in type services.
3553 * Ref.Run.Comm::                Runtime communication service.
3554 * Ref.Run.Log::                 Runtime logging system.
3555 * Ref.Run.Sig::                 Runtime signal handler.
3556 @end menu
3557
3558 @node       Ref.Run.Mem
3559 @subsection Ref.Run.Mem
3560 @c * Ref.Run.Mem::                 Runtime memory management service.
3561 @cindex Memory allocation
3562
3563 The runtime memory-management system is based on a @emph{service-provider
3564 interface}, through which the runtime requests blocks of memory from its
3565 environment and releases them back to its environment when they are no longer
3566 in use. The default implementation of the service-provider interface consists
3567 of the C runtime functions @code{malloc} and @code{free}.
3568
3569 The runtime memory-management system in turn supplies Rust tasks with
3570 facilities for allocating, extending and releasing stacks, as well as
3571 allocating and freeing boxed values.
3572
3573 @node       Ref.Run.Type
3574 @subsection Ref.Run.Type
3575 @c * Ref.Run.Mem::                 Runtime built-in type services.
3576 @cindex Built-in types
3577
3578 The runtime provides C and Rust code to assist with various built-in types,
3579 such as vectors, strings, bignums, and the low level communication system
3580 (ports, channels, tasks).
3581
3582 Support for other built-in types such as simple types, tuples, records, and
3583 tags is open-coded by the Rust compiler.
3584
3585 @node       Ref.Run.Comm
3586 @subsection Ref.Run.Comm
3587 @c * Ref.Run.Comm::                Runtime communication service.
3588 @cindex Communication
3589 @cindex Process
3590 @cindex Thread
3591
3592 The runtime provides code to manage inter-task communication.  This includes
3593 the system of task-lifecycle state transitions depending on the contents of
3594 queues, as well as code to copy values between queues and their recipients and
3595 to serialize values for transmission over operating-system inter-process
3596 communication facilities.
3597
3598 @node       Ref.Run.Log
3599 @subsection Ref.Run.Log
3600 @c * Ref.Run.Log::                 Runtime logging system.
3601 @cindex Logging
3602
3603 The runtime contains a system for directing logging expressions to a logging
3604 console and/or internal logging buffers. @xref{Ref.Expr.Log}.  Logging
3605 expressions can be enabled or disabled via a two-dimensional filtering process:
3606
3607 @itemize
3608
3609 @sp 1
3610 @item
3611 By Item
3612
3613 Each @emph{item} (module, function, iterator, object, type) in Rust has a
3614 static path within its crate module, and can have logging enabled or
3615 disabled on a path-prefix basis.
3616
3617 @sp 1
3618 @item
3619 By Task
3620
3621 Each @emph{task} in a running Rust program has a unique ownership relation
3622 through the task ownership tree, and can have logging enabled or disabled on
3623 an ownership-ancestry basis.
3624 @end itemize
3625
3626 Logging is integrated into the language for efficiency reasons, as well as the
3627 need to filter logs based on these two built-in dimensions.
3628
3629 @node       Ref.Run.Sig
3630 @subsection Ref.Run.Sig
3631 @c * Ref.Run.Sig::               Runtime signal handler.
3632 @cindex Signals
3633
3634 The runtime signal-handling system is driven by a signal-dispatch table and a
3635 signal queue associated with each task. Sending a signal to a task inserts the
3636 signal into the task's signal queue and marks the task as having a pending
3637 signal. At the next scheduling opportunity, the runtime processes signals in
3638 the task's queue using its dispatch table. The signal queue memory is charged
3639 to the task; if the queue grows too big, the task will fail.
3640
3641 @c ############################################################
3642 @c end main body of nodes
3643 @c ############################################################
3644
3645 @page
3646 @node    Index
3647 @chapter Index
3648
3649 @printindex cp
3650
3651 @bye
3652
3653 @c Local Variables:
3654 @c mode: texinfo
3655 @c fill-column: 78;
3656 @c indent-tabs-mode: nil
3657 @c buffer-file-coding-system: utf-8-unix
3658 @c compile-command: "make -C $RBUILD -k 2>&1 | sed -e 's/\\/x\\//x:\\//g'";
3659 @c End: