% The Rust References and Lifetimes Guide

References are one of the more flexible and powerful tools available in
Rust. A reference can point anywhere: into the managed or exchange
heap, into the stack, and even into the interior of another data structure. A
reference is as flexible as a C pointer or C++ reference. However,
unlike C and C++ compilers, the Rust compiler includes special static checks
that ensure that programs use references safely. Another advantage of
references is that they are invisible to the garbage collector, so
working with references helps reduce the overhead of automatic memory
management.

Despite their complete safety, a reference's representation at runtime
is the same as that of an ordinary pointer in a C program. References
introduce zero overhead: the compiler does all safety checks at compile time.

Although references have rather elaborate theoretical
underpinnings (region pointers), the core concepts will be familiar to
anyone who has worked with C or C++. Therefore, the best way to explain
how they are used, and what their limitations are, is probably just to work
through several examples.
References, sometimes known as *borrowed pointers*, are only valid for
a limited duration. References never claim any kind of ownership
over the data that they point to: instead, they are used for cases
where you would like to use data for a short time.

As an example, consider a simple struct type `Point`:

```
struct Point {x: f64, y: f64}
```

We can use this simple definition to allocate points in many different ways. For
example, in this code, each of these three local variables contains a
point, but allocated in a different place:

```
# struct Point {x: f64, y: f64}
let on_the_stack : Point      = Point {x: 3.0, y: 4.0};
let managed_box  : @Point     = @Point {x: 5.0, y: 1.0};
let owned_box    : Box<Point> = box Point {x: 7.0, y: 9.0};
```

Suppose we wanted to write a procedure that computed the distance between any
two points, no matter where they were stored. For example, we might like to
compute the distance between `on_the_stack` and `managed_box`, or between
`managed_box` and `owned_box`. One option is to define a function that takes
two arguments of type `Point` (that is, it takes the points by value). But if we
define it this way, calling the function will cause the points to be
copied. For points, this is probably not so bad, but often copies are
expensive. Worse, if the data type contains mutable fields, copying can change
the semantics of your program in unexpected ways. So we'd like to define a
function that takes the points by pointer. We can use references to do
this:

```
# struct Point {x: f64, y: f64}
# fn sqrt(f: f64) -> f64 { 0.0 }
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    sqrt(x_d * x_d + y_d * y_d)
}
```

Now we can call `compute_distance()` in various ways:

```
# struct Point {x: f64, y: f64}
# let on_the_stack : Point      = Point{x: 3.0, y: 4.0};
# let managed_box  : @Point     = @Point{x: 5.0, y: 1.0};
# let owned_box    : Box<Point> = box Point{x: 7.0, y: 9.0};
# fn compute_distance(p1: &Point, p2: &Point) -> f64 { 0.0 }
compute_distance(&on_the_stack, managed_box);
compute_distance(managed_box, owned_box);
```

Here, the `&` operator takes the address of the variable
`on_the_stack`; this is because `on_the_stack` has the type `Point`
(that is, a struct value) and we have to take its address to get a
reference. We also call this _borrowing_ the local variable
`on_the_stack`, because we have created an alias: that is, another
name for the same data.

In contrast, we can pass the boxes `managed_box` and `owned_box` to
`compute_distance` directly. The compiler automatically converts a box like
`@Point` or `Box<Point>` to a reference like `&Point`. This is another form
of borrowing: in this case, the caller lends the contents of the managed or
owned box to the callee.
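
For readers on a current toolchain, the same idea can be sketched as follows. This is an illustration under assumptions, not the guide's own code: current Rust has no managed boxes (`@T`), so `Rc<Point>` stands in for the managed box, `Box::new` replaces the `box` keyword, and the helper name `distances` is hypothetical. Note that current Rust expects an explicit `&` on the boxes as well; the conversion to `&Point` then happens automatically.

```rust
use std::rc::Rc;

struct Point { x: f64, y: f64 }

// Takes both points by reference, so nothing is copied or moved.
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Hypothetical helper: borrows points stored in three different places.
fn distances() -> (f64, f64) {
    let on_the_stack = Point { x: 3.0, y: 4.0 };
    // Assumption: `Rc` plays the role of the managed box here.
    let shared_box = Rc::new(Point { x: 0.0, y: 0.0 });
    let owned_box = Box::new(Point { x: 6.0, y: 8.0 });

    // `&Rc<Point>` and `&Box<Point>` both coerce to `&Point`.
    (compute_distance(&on_the_stack, &shared_box),
     compute_distance(&shared_box, &owned_box))
}
```

The callee never learns where each point lives; it only sees two `&Point` values that are valid for the duration of the call.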
Whenever a caller lends data to a callee, there are some limitations on what
the caller can do with the original. For example, if the contents of a
variable have been lent out, you cannot send that variable to another task. In
addition, the compiler will reject any code that might cause the borrowed
value to be freed or overwrite its component fields with values of different
types (we'll get into what kinds of actions those are shortly). This rule
should make intuitive sense: you must wait for a borrower to return the value
that you lent it (that is, wait for the reference to go out of scope)
before you can make full use of it again.

# Other uses for the & operator

In the previous example, the value `on_the_stack` was defined like so:

```
# struct Point {x: f64, y: f64}
let on_the_stack: Point = Point {x: 3.0, y: 4.0};
```

This declaration means that code can only pass `Point` by value to other
functions. As a consequence, we had to explicitly take the address of
`on_the_stack` to get a reference. Sometimes, however, it is more
convenient to move the `&` operator into the definition of `on_the_stack`:

```
# struct Point {x: f64, y: f64}
let on_the_stack2: &Point = &Point {x: 3.0, y: 4.0};
```

Applying `&` to an rvalue (non-assignable location) is just a convenient
shorthand for creating a temporary and taking its address. A more verbose
way to write the same code is:

```
# struct Point {x: f64, y: f64}
let tmp = Point {x: 3.0, y: 4.0};
let on_the_stack2 : &Point = &tmp;
```

# Taking the address of fields

As in C, the `&` operator is not limited to taking the address of
local variables. It can also take the address of fields or
individual array elements. For example, consider this type definition:

```
struct Point {x: f64, y: f64} // as before
struct Size {w: f64, h: f64}
struct Rectangle {origin: Point, size: Size}
```

Now, as before, we can define rectangles in a few different ways:

```
# struct Point {x: f64, y: f64}
# struct Size {w: f64, h: f64} // as before
# struct Rectangle {origin: Point, size: Size}
let rect_stack   = &Rectangle {origin: Point {x: 1.0, y: 2.0},
                               size: Size {w: 3.0, h: 4.0}};
let rect_managed = @Rectangle {origin: Point {x: 3.0, y: 4.0},
                               size: Size {w: 3.0, h: 4.0}};
let rect_owned   = box Rectangle {origin: Point {x: 5.0, y: 6.0},
                                  size: Size {w: 3.0, h: 4.0}};
```

In each case, we can extract out individual subcomponents with the `&`
operator. For example, we could write:

```
# struct Point {x: f64, y: f64} // as before
# struct Size {w: f64, h: f64} // as before
# struct Rectangle {origin: Point, size: Size}
# let rect_stack   = &Rectangle {origin: Point {x: 1.0, y: 2.0}, size: Size {w: 3.0, h: 4.0}};
# let rect_managed = @Rectangle {origin: Point {x: 3.0, y: 4.0}, size: Size {w: 3.0, h: 4.0}};
# let rect_owned   = box Rectangle {origin: Point {x: 5.0, y: 6.0}, size: Size {w: 3.0, h: 4.0}};
# fn compute_distance(p1: &Point, p2: &Point) -> f64 { 0.0 }
compute_distance(&rect_stack.origin, &rect_managed.origin);
```

This borrows the field `origin` from the rectangle on the stack
as well as from the managed box, and then computes the distance between them.
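
A minimal sketch of field borrowing in current Rust might look like this. The helper names `origin_distance` and `area` are hypothetical and exist only for illustration; the struct definitions mirror the ones above.

```rust
struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }
struct Rectangle { origin: Point, size: Size }

fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Hypothetical helper: borrows only the `origin` field of each rectangle.
fn origin_distance(a: &Rectangle, b: &Rectangle) -> f64 {
    compute_distance(&a.origin, &b.origin)
}

// Hypothetical helper: borrows only the `size` field.
fn area(r: &Rectangle) -> f64 {
    let size = &r.size; // a reference into the interior of `r`
    size.w * size.h
}
```

Each `&a.field` expression produces a pointer into the interior of the rectangle, so the rest of the struct is never copied.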
# Borrowing managed boxes and rooting

We've seen a few examples so far of borrowing heap boxes, both managed
and owned. Up till this point, we've glossed over issues of
safety. As stated in the introduction, at runtime a reference
is simply a pointer, nothing more. Therefore, avoiding C's problems
with dangling pointers requires a compile-time safety check.

The basis for the check is the notion of _lifetimes_. A lifetime is a
static approximation of the span of execution during which the pointer
is valid: it always corresponds to some expression or block within the
program. Code inside that expression can use the pointer without
restrictions. But if the pointer escapes from that expression (for
example, if the expression contains an assignment expression that
assigns the pointer to a mutable field of a data structure with a
broader scope than the pointer itself), the compiler reports an
error. We'll be discussing lifetimes more in the examples to come, and
a more thorough introduction is also available.
When the `&` operator creates a reference, the compiler must
ensure that the pointer remains valid for its entire
lifetime. Sometimes this is relatively easy, such as when taking the
address of a local variable or a field that is stored on the stack:

```
# struct X { f: int }
fn example1() {
    let mut x = X { f: 3 };
    let y = &mut x.f;  // -+ L
    // ...             //  |
}                      // -+
```

Here, the lifetime of the reference `y` is simply L, the
remainder of the function body. The compiler need not do any other
work to prove that code will not free `x.f`. This is true even if the
code mutates `x`.
The situation gets more complex when borrowing data inside heap boxes:

```
# struct X { f: int }
fn example2() {
    let mut x = @X { f: 3 };
    let y = &x.f;      // -+ L
    // ...             //  |
}                      // -+
```

In this example, the value `x` is a heap box, and `y` is therefore a
pointer into that heap box. Again the lifetime of `y` is L, the
remainder of the function body. But there is a crucial difference:
what if `x` were reassigned during the lifetime L? If the
compiler isn't careful, the managed box could become *unrooted*, and
would therefore be subject to garbage collection. An unrooted heap box
is one that no pointer values in the heap point to. It would violate
memory safety for the box that was originally
assigned to `x` to be garbage-collected, since a non-heap
pointer, `y`, still points into it.

> *Note:* Our current implementation implements the garbage collector
> using reference counting and cycle detection.

For this reason, whenever an `&` expression borrows the interior of a
managed box stored in a mutable location, the compiler inserts a
temporary that ensures that the managed box remains live for the
entire lifetime. So, the above example would be compiled as if it were
written as follows:

```
# struct X { f: int }
fn example2() {
    let mut x = @X {f: 3};
    let x1 = x;
    let y = &x1.f;     // -+ L
    // ...             //  |
}                      // -+
```

Now if `x` is reassigned, the pointer `y` will still remain valid. This
process is called *rooting*.
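
Current Rust has no managed boxes, but a loosely analogous effect can be sketched with reference counting. This is an analogy under that assumption, not the compiler's rooting mechanism: here the extra handle `x1` is written by hand with `Rc::clone`, whereas the guide describes a temporary that the compiler inserts on its own. The function name `rooted_read` is hypothetical.

```rust
use std::rc::Rc;

struct X { f: i64 }

// Hypothetical helper showing that a second handle keeps the first box alive.
fn rooted_read() -> i64 {
    let mut x = Rc::new(X { f: 3 });
    // `x1` plays the role of the temporary described above: it shares
    // ownership of the first box, so that box outlives any change to `x`.
    let x1 = Rc::clone(&x);
    let y = &x1.f;

    // Reassigning `x` drops only one of the two handles to the first box.
    x = Rc::new(X { f: 4 });

    // `y` still points at valid memory holding 3; `x.f` is now 4.
    *y + x.f
}
```

Because `y` borrows from `x1` rather than from `x`, the reassignment does not conflict with the borrow.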
# Borrowing owned boxes

The previous example demonstrated *rooting*, the process by which the
compiler ensures that managed boxes remain live for the duration of a
borrow. Unfortunately, rooting does not work for borrows of owned
boxes, because it is not possible to have two references to an owned
box.

For owned boxes, therefore, the compiler will only allow a borrow *if
the compiler can guarantee that the owned box will not be reassigned
or moved for the lifetime of the pointer*. This does not necessarily
mean that the owned box is stored in immutable memory. For example,
the following function is legal:

```
# fn some_condition() -> bool { true }
# struct Foo { f: int }
fn example3() -> int {
    let mut x = box Foo {f: 3};
    if some_condition() {
        let y = &x.f;      // -+ L
        return *y;         //  |
    }                      // -+
    x = box Foo {f: 4};
    x.f
}
```

Here, as before, the interior of the variable `x` is being borrowed
and `x` is declared as mutable. However, the compiler can prove that
`x` is not assigned anywhere in the lifetime L of the variable
`y`. Therefore, it accepts the function, even though `x` is mutable
and in fact is mutated later in the function.
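
The same pattern can be sketched in current Rust, assuming `Box::new` in place of the `box` keyword and `i64` in place of `int`. The function name `read_then_replace` and its boolean parameter are hypothetical stand-ins for `example3` and `some_condition()`.

```rust
struct Foo { f: i64 }

// Hypothetical variant of the example above, parameterized for testing.
fn read_then_replace(take_early_exit: bool) -> i64 {
    let mut x = Box::new(Foo { f: 3 });
    if take_early_exit {
        let y = &x.f; // the borrow of the box's interior starts here...
        return *y;    // ...and ends when this block is left
    }
    // No borrow of `x` is live at this point, so reassignment is accepted.
    x = Box::new(Foo { f: 4 });
    x.f
}
```

The borrow and the reassignment never overlap on any path through the function, which is what the compiler checks.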
It may not be clear why we are so concerned about mutating a borrowed
variable. The reason is that the runtime system frees any owned box
_as soon as its owning reference changes or goes out of
scope_. Therefore, a program like this is illegal (and would be
rejected by the compiler):

```
fn example3() -> int {
    let mut x = box X {f: 3};
    let y = &x.f;
    x = box X {f: 4};  // Error reported here.
    *y
}
```
To make this clearer, consider this diagram showing the state of
memory immediately before the re-assignment of `x`:

```
  x +-------------+
    | box {f:int} | ----+
  y +-------------+     |
    | &int        | ----+
    +-------------+     |    +---------+
                        +--> |  f: 3   |
                             +---------+
```

Once the reassignment occurs, the memory will look like this:

```
  x +-------------+          +---------+
    | box {f:int} | -------> |  f: 4   |
  y +-------------+          +---------+
    | &int        | ----+
    +-------------+     |    +---------+
                        +--> | (freed) |
                             +---------+
```

Here you can see that the variable `y` still points at the old box,
which has been freed.
In fact, the compiler can apply the same kind of reasoning to any
memory that is _(uniquely) owned by the stack frame_. So we could
modify the previous example to introduce additional owned pointers
and structs, and the compiler will still be able to detect possible
mutations:

```
fn example3() -> int {
    struct R { g: int }
    struct S { f: Box<R> }

    let mut x = box S {f: box R {g: 3}};
    let y = &x.f.g;
    x = box S {f: box R {g: 4}};  // Error reported here.
    x.f = box R {g: 5};           // Error reported here.
    *y
}
```

In this case, two errors are reported, one when the variable `x` is
modified and another when `x.f` is modified. Either modification would
invalidate the pointer `y`.
# Borrowing and enums

The previous example showed that the type system forbids any borrowing
of owned boxes found in aliasable, mutable memory. This restriction
prevents pointers from pointing into freed memory. There is one other
case where the compiler must be very careful to ensure that pointers
remain valid: pointers into the interior of an `enum`.

As an example, let's look at the following `Shape` type that can
represent both rectangles and circles:

```
struct Point {x: f64, y: f64} // as before
struct Size {w: f64, h: f64} // as before
enum Shape {
    Circle(Point, f64),   // origin, radius
    Rectangle(Point, Size)  // upper-left, dimensions
}
```

Now we might write a function to compute the area of a shape. This
function takes a reference to a shape, to avoid the need for
copying it:

```
# struct Point {x: f64, y: f64} // as before
# struct Size {w: f64, h: f64} // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# static tau: f64 = 6.28;
fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        Circle(_, radius) => 0.5 * tau * radius * radius,
        Rectangle(_, ref size) => size.w * size.h
    }
}
```

The first case matches against circles. Here, the pattern extracts the
radius from the shape variant and the action uses it to compute the
area of the circle. (Like any up-to-date engineer, we use the [tau
circle constant][tau] and not that dreadfully outdated notion of pi).

[tau]: http://www.math.utah.edu/~palais/pi.html

The second match is more interesting. Here we match against a
rectangle and extract its size: but rather than copy the `size`
struct, we use a by-reference binding to create a pointer to it. In
other words, a pattern binding like `ref size` binds the name `size`
to a pointer of type `&Size` into the _interior of the enum_.
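
As a sketch under assumptions, the same function can be written for a current toolchain. The differences from the guide's code are all assumptions about modern Rust rather than part of the original: variants are written with the `Shape::` prefix, and the standard library constant `std::f64::consts::TAU` replaces the hand-written `tau`.

```rust
use std::f64::consts::TAU;

#[allow(dead_code)] // the point's fields are stored but never read in this sketch
struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }

enum Shape {
    Circle(Point, f64),     // origin, radius
    Rectangle(Point, Size), // upper-left, dimensions
}

fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        // `radius` is an `f64`, which is copied out of the enum.
        Shape::Circle(_, radius) => 0.5 * TAU * radius * radius,
        // `ref size` borrows the `Size` in place: `size` has type `&Size`.
        Shape::Rectangle(_, ref size) => size.w * size.h,
    }
}
```

The `ref` keyword is what keeps the `Size` value inside the enum; without it, the pattern would try to move the struct out from behind the reference.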
To make this clearer, let's look at a diagram of memory layout in
the case where `shape` points at a rectangle:

```
+-------+         +---------------+
| shape | ------> | Rectangle(    |
+-------+         |   {x: f64,    |
| size  | -+      |    y: f64},   |
+-------+  +----> |   {w: f64,    |
                  |    h: f64})   |
                  +---------------+
```

Here you can see that rectangular shapes are composed of five words of
memory. The first is a tag indicating which variant this enum is
(`Rectangle`, in this case). The next two words are the `x` and `y`
fields for the point and the remaining two are the `w` and `h` fields
for the size. The binding `size` is then a pointer into the inside of
the shape.
Perhaps you can see where the danger lies: if the shape were somehow
to be reassigned, perhaps to a circle, then although the memory used
to store that shape value would still be valid, _it would have a
different type_! The following diagram shows what memory would look
like if code overwrote `shape` with a circle:

```
+-------+         +---------------+
| shape | ------> | Circle(       |
+-------+         |   {x: f64,    |
| size  | -+      |    y: f64},   |
+-------+  +----> |   f64)        |
                  |               |
                  +---------------+
```

As you can see, the `size` pointer would be pointing at an `f64`
instead of a struct. This is not good: dereferencing the second field
of an `f64` as if it were a struct with two fields would be a memory
safety violation.
So, for every `ref` binding, the compiler imposes the
same rules as the ones we saw for borrowing the interior of an owned
box: it must be able to guarantee that the `enum` will not be
overwritten for the duration of the borrow. The compiler
would accept the example we gave earlier. The example is safe because
the shape pointer has type `&Shape`, which means "reference to
immutable memory containing a `Shape`". If, however, the type of that
pointer were `&mut Shape`, then the `ref` binding would be ill-typed.
Just as with owned boxes, the compiler will permit `ref` bindings
into data owned by the stack frame even if the data are mutable,
but otherwise it requires that the data reside in immutable memory.
# Returning references

So far, all of the examples we have looked at use references in a
"downward" direction. That is, a method or code block creates a
reference, then uses it within the same scope. It is also
possible to return references as the result of a function, but
as we'll see, doing so requires some explicit annotation.

For example, we could write a subroutine like this:

```
struct Point {x: f64, y: f64}
fn get_x<'r>(p: &'r Point) -> &'r f64 { &p.x }
```

Here, the function `get_x()` returns a pointer into the structure it
was given. The type of the parameter (`&'r Point`) and return type
(`&'r f64`) both use a new syntactic form that we have not seen so
far. Here the identifier `r` names the lifetime of the pointer
explicitly. So in effect, this function declares that it takes a
pointer with lifetime `r` and returns a pointer with that same
lifetime.

In general, it is only possible to return references if they
are derived from a parameter to the procedure. In that case, the
pointer result will always have the same lifetime as one of the
parameters; named lifetimes indicate which parameter that
is.

In the previous examples, function parameter types did not include a
lifetime name. In those examples, the compiler simply creates a fresh
name for the lifetime automatically: that is, the lifetime name is
guaranteed to refer to a distinct lifetime from the lifetimes of all
other parameters.

Named lifetimes that appear in function signatures are conceptually
the same as the other lifetimes we have seen before, but they are a bit
abstract: they don't refer to a specific expression within `get_x()`,
but rather to some expression within the *caller of `get_x()`*. The
lifetime `r` is actually a kind of *lifetime parameter*: it is defined
by the caller to `get_x()`, just as the value for the parameter `p` is
defined by that caller.

In any case, whatever the lifetime of `r` is, the pointer produced by
`&p.x` always has the same lifetime as `p` itself: a pointer to a
field of a struct is valid as long as the struct is valid. Therefore,
the compiler accepts the function `get_x()`.
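
To see the lifetime parameter from the caller's side, here is a small sketch. `get_x` is the function from the text; `larger_x` is a hypothetical caller added only for illustration.

```rust
#[allow(dead_code)] // `y` is never read in this sketch
struct Point { x: f64, y: f64 }

// The returned reference lives exactly as long as the borrowed `Point`.
fn get_x<'r>(p: &'r Point) -> &'r f64 { &p.x }

// Hypothetical caller: both inputs share the lifetime `r`, so either
// result of `get_x` may be returned.
fn larger_x<'r>(a: &'r Point, b: &'r Point) -> &'r f64 {
    let ax = get_x(a);
    let bx = get_x(b);
    if *ax >= *bx { ax } else { bx }
}
```

In `larger_x`, the lifetime `r` is chosen by whoever calls it, and it is passed straight through to each call of `get_x()`.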
To emphasize this point, let's look at a variation on the example, this
time one that does not compile:

```
struct Point {x: f64, y: f64}
fn get_x_sh(p: @Point) -> &f64 {
    &p.x // Error reported here
}
```

Here, the function `get_x_sh()` takes a managed box as input and
returns a reference. As before, the lifetime of the reference
that will be returned is a parameter (specified by the
caller). That means that `get_x_sh()` promises to return a reference
that is valid for as long as the caller would like: this is
subtly different from the first example, which promised to return a
pointer that was valid for as long as its pointer argument was valid.

Within `get_x_sh()`, we see the expression `&p.x` which takes the
address of a field of a managed box. The presence of this expression
implies that the compiler must guarantee that, so long as the
resulting pointer is valid, the managed box will not be reclaimed by
the garbage collector. But recall that `get_x_sh()` also promised to
return a pointer that was valid for as long as the caller wanted it to
be. Clearly, `get_x_sh()` is not in a position to make both of these
guarantees; in fact, it cannot guarantee that the pointer will remain
valid at all once it returns, as the parameter `p` may or may not be
live in the caller. Therefore, the compiler will report an error here.

In general, if you borrow a managed (or owned) box to create a
reference, it will only be valid within the function
and cannot be returned. This is why the typical way to return references
is to take references as input (the only other case in
which it can be legal to return a reference is if it
points at a static constant).
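
The static-constant case mentioned in the parenthesis can be sketched like this in current Rust. The names `ORIGIN_X`, `origin_x` and `x_or_origin` are hypothetical and chosen only for the illustration.

```rust
// A static item lives for the whole program, so references to it
// have the lifetime `'static`.
static ORIGIN_X: f64 = 0.0;

// No reference parameter is needed: the result points at the static.
fn origin_x() -> &'static f64 { &ORIGIN_X }

// Hypothetical helper: a `'static` reference is valid for any shorter
// lifetime `r`, so it can be returned where `&'r f64` is expected.
fn x_or_origin<'r>(x: Option<&'r f64>) -> &'r f64 {
    match x {
        Some(value) => value,
        None => origin_x(),
    }
}
```

Because the static never goes out of scope, the function can keep its promise to the caller no matter which lifetime the caller picks.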
Lifetimes can be named and referenced. For example, the special lifetime
`'static`, which does not go out of scope, can be used to create global
variables and communicate between tasks (see the manual for use cases).

## Parameter Lifetimes

Named lifetimes allow for grouping of parameters by lifetime.
For example, consider this function:

```
# struct Point {x: f64, y: f64} // as before
# struct Size {w: f64, h: f64} // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
fn select<'r, T>(shape: &'r Shape, threshold: f64,
                 a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold {a} else {b}
}
```

This function takes three references and assigns each the same
lifetime `r`. In practice, this means that, in the caller, the
lifetime `r` will be the *intersection of the lifetimes of the three
region parameters*. This may be overly conservative, as in this
example:

```
# struct Point {x: f64, y: f64} // as before
# struct Size {w: f64, h: f64} // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
# fn select<'r, T>(shape: &Shape, threshold: f64,
#                  a: &'r T, b: &'r T) -> &'r T {
#     if compute_area(shape) > threshold {a} else {b}
# }
                                                     // -+ r
fn select_based_on_unit_circle<'r, T>(               //  |-+ B
    threshold: f64, a: &'r T, b: &'r T) -> &'r T {   //  | |
                                                     //  | |
    let shape = Circle(Point {x: 0., y: 0.}, 1.);    //  | |
    select(&shape, threshold, a, b)                  //  | |
}                                                    //  |-+
                                                     // -+
```

In this call to `select()`, the lifetime of the first parameter `shape`
is B, the function body. The other two parameters, `a` and `b`,
share the same lifetime, `r`, which is a lifetime parameter of
`select_based_on_unit_circle()`. The caller will infer the
intersection of these two lifetimes as the lifetime of the returned
value, and hence the return value of `select()` will be assigned a
lifetime of B. This will in turn lead to a compilation error, because
`select_based_on_unit_circle()` is supposed to return a value with the
lifetime `r`.
To address this, we can modify the definition of `select()` to
distinguish the lifetime of the first parameter from the lifetime of
the latter two. After all, the first parameter is not being
returned. Here is how the new `select()` might look:

```
# struct Point {x: f64, y: f64} // as before
# struct Size {w: f64, h: f64} // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
fn select<'r, 'tmp, T>(shape: &'tmp Shape, threshold: f64,
                       a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold {a} else {b}
}
```

Here you can see that `shape`'s lifetime is now named `tmp`. The
parameters `a`, `b`, and the return value all have the lifetime `r`.
However, since the lifetime `tmp` is not returned, it would be more
concise to just omit the named lifetime for `shape` altogether:

```
# struct Point {x: f64, y: f64} // as before
# struct Size {w: f64, h: f64} // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
fn select<'r, T>(shape: &Shape, threshold: f64,
                 a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold {a} else {b}
}
```

This is equivalent to the previous definition.
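
Putting the pieces together, the following sketch shows the corrected `select()` and its caller compiling on a current toolchain. The `Shape::` prefixes and the `std::f64::consts::TAU` constant are assumptions about modern Rust, not part of the guide's code.

```rust
use std::f64::consts::TAU;

#[allow(dead_code)] // the point's fields are stored but never read here
struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }

#[allow(dead_code)] // only `Circle` is constructed in this sketch
enum Shape {
    Circle(Point, f64),
    Rectangle(Point, Size),
}

fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        Shape::Circle(_, radius) => 0.5 * TAU * radius * radius,
        Shape::Rectangle(_, ref size) => size.w * size.h,
    }
}

// `shape` gets its own anonymous lifetime; only `a`, `b` and the
// result are tied together by `r`.
fn select<'r, T>(shape: &Shape, threshold: f64, a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold { a } else { b }
}

fn select_based_on_unit_circle<'r, T>(threshold: f64, a: &'r T, b: &'r T) -> &'r T {
    // `shape` lives only for this function body, which is fine because
    // the returned reference no longer depends on its lifetime.
    let shape = Shape::Circle(Point { x: 0.0, y: 0.0 }, 1.0);
    select(&shape, threshold, a, b)
}
```

If `shape` were declared as `&'r Shape` instead, the call inside `select_based_on_unit_circle()` would be rejected, because the local `shape` does not live as long as `r`.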
## Labeled Control Structures

Named lifetime notation can also be used to control the flow of execution:

```
'h: for i in range(0,10) {
    if i % 2 == 0 { continue 'h; }
    if i == 9 { break 'h; }
}
```

> *Note:* Labeled breaks are not currently supported within `while` loops.

Named labels are hygienic and can be used safely within macros.
See the macros guide section on hygiene for more details.
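
Labels are most useful when loops are nested, since an unlabeled `break` or `continue` only affects the innermost loop. The sketch below assumes current Rust range syntax (`0..10`); the function name `odd_values_before_nine` and the label `'outer` are hypothetical.

```rust
// Hypothetical helper: collects the values of `i` that reach the push.
fn odd_values_before_nine() -> Vec<u32> {
    let mut seen = Vec::new();
    'outer: for i in 0..10 {
        loop {
            // Skips the rest of the inner loop and moves the outer
            // loop on to the next `i`.
            if i % 2 == 0 { continue 'outer; }
            // Leaves both loops at once.
            if i == 9 { break 'outer; }
            seen.push(i);
            break; // unlabeled: leaves only the inner loop
        }
    }
    seen
}
```

Without the label, `continue` and `break` inside the inner `loop` could not reach the `for` loop at all.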
So there you have it: a (relatively) brief tour of the lifetime
system. For more details, we refer to the (yet to be written) reference
document on references, which will explain the full notation
and give more examples.