1 % Rust Borrowed Pointers Tutorial
5 Borrowed pointers are one of the more flexible and powerful tools
6 available in Rust. A borrowed pointer can be used to point anywhere:
7 into the shared and exchange heaps, into the stack, and even into the
8 interior of another data structure. With regard to flexibility, it is
9 comparable to a C pointer or C++ reference. However, unlike C and C++,
10 the Rust compiler includes special checks that ensure that borrowed
11 pointers are being used safely. Another advantage of borrowed pointers
12 is that they are invisible to the garbage collector, so working with
13 borrowed pointers helps keep things efficient.
15 Despite the fact that they are completely safe, at runtime, a borrowed
16 pointer is “just a pointer”. They introduce zero overhead. All safety
17 checks are done at compilation time.
Although borrowed pointers have rather elaborate theoretical
underpinnings (region pointers), the core concepts will be familiar to
anyone who has worked with C or C++. Therefore, the best way to explain
how they are used, and their limitations, is probably just to work
through several examples.
Borrowed pointers are called borrowed because they are only valid for
a limited duration. Borrowed pointers never claim any kind of ownership
over the data that they point at: instead, they are used for cases
where you would like to make use of data for a short time.
32 As an example, consider a simple record type `point`:
35 type point = {x: float, y: float};
38 We can use this simple definition to allocate points in many ways. For
39 example, in this code, each of these three local variables contains a
40 point, but allocated in a different place:
43 # type point = {x: float, y: float};
44 let on_the_stack : point = {x: 3.0, y: 4.0};
45 let shared_box : @point = @{x: 5.0, y: 1.0};
46 let unique_box : ~point = ~{x: 7.0, y: 9.0};
49 Suppose we wanted to write a procedure that computed the distance
50 between any two points, no matter where they were stored. For example,
51 we might like to compute the distance between `on_the_stack` and
52 `shared_box`, or between `shared_box` and `unique_box`. One option is
53 to define a function that takes two arguments of type point—that is,
54 it takes the points by value. But this will cause the points to be
55 copied when we call the function. For points, this is probably not so
56 bad, but often copies are expensive or, worse, if there are mutable
57 fields, they can change the semantics of your program. So we’d like to
58 define a function that takes the points by pointer. We can use
59 borrowed pointers to do this:
62 # type point = {x: float, y: float};
63 # fn sqrt(f: float) -> float { 0f }
fn compute_distance(p1: &point, p2: &point) -> float {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    sqrt(x_d * x_d + y_d * y_d)
}
71 Now we can call `compute_distance()` in various ways:
74 # type point = {x: float, y: float};
75 # let on_the_stack : point = {x: 3.0, y: 4.0};
76 # let shared_box : @point = @{x: 5.0, y: 1.0};
77 # let unique_box : ~point = ~{x: 7.0, y: 9.0};
78 # fn compute_distance(p1: &point, p2: &point) -> float { 0f }
79 compute_distance(&on_the_stack, shared_box)
80 compute_distance(shared_box, unique_box)
Here the `&` operator is used to take the address of the variable
`on_the_stack`; this is because `on_the_stack` has the type `point`
(that is, a record value) and we have to take its address to get a
borrowed pointer. We also call this _borrowing_ the local variable
`on_the_stack`, because we are creating an alias: that is, another
route to the same data.
In the case of the boxes `shared_box` and `unique_box`, however, no
explicit action is necessary. The compiler will automatically convert
a box like `@point` or `~point` to a borrowed pointer like
`&point`. This is another form of borrowing; in this case, the
contents of the shared/unique box are being lent out.
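For readers on a current toolchain, here is a minimal sketch of the same idea. It rests on assumptions that are mine, not the tutorial's: it uses present-day Rust syntax, where `&T` references play the role of borrowed pointers, `Rc<T>` stands in for the shared box `@T`, `Box<T>` stands in for the unique box `~T`, and `f64` replaces `float`. The `Point` struct and the `distances()` helper are hypothetical names.

```rust
use std::rc::Rc;

// Hypothetical modern stand-in for the tutorial's `point` record.
struct Point {
    x: f64,
    y: f64,
}

// Takes both points by reference, so nothing is copied or moved.
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Builds one point in each kind of storage and measures two distances.
fn distances() -> (f64, f64) {
    let on_the_stack = Point { x: 3.0, y: 4.0 };
    let shared_box: Rc<Point> = Rc::new(Point { x: 5.0, y: 1.0 });
    let unique_box: Box<Point> = Box::new(Point { x: 7.0, y: 9.0 });

    // `&on_the_stack` borrows the local directly; `&shared_box` and
    // `&unique_box` are turned into `&Point` by deref coercion.
    (
        compute_distance(&on_the_stack, &shared_box),
        compute_distance(&shared_box, &unique_box),
    )
}
```

One difference worth noting: today's compiler still asks for an explicit `&` in front of the boxes, whereas the conversion described in this tutorial is fully implicit.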
Whenever a value is borrowed, there are some limitations on what you
can do with the original. For example, if the contents of a variable
have been lent out, you cannot send that variable to another task, nor
will you be permitted to take actions that might cause the borrowed
value to be freed or to change its type (I’ll get into what kinds of
actions those are shortly). This rule should make intuitive sense: you
must wait for a borrowed value to be returned (that is, for the
borrowed pointer to go out of scope) before you can make full use of
it again.
106 # Other uses for the & operator
108 In the previous example, the value `on_the_stack` was defined like so:
111 # type point = {x: float, y: float};
112 let on_the_stack : point = {x: 3.0, y: 4.0};
This results in a by-value variable. As a consequence, we had to
explicitly take the address of `on_the_stack` to get a borrowed
pointer. Sometimes, however, it is more convenient to move the `&`
operator into the definition of `on_the_stack`:
121 # type point = {x: float, y: float};
122 let on_the_stack2 : &point = &{x: 3.0, y: 4.0};
125 Applying `&` to an rvalue (non-assignable location) is just a convenient
126 shorthand for creating a temporary and taking its address:
129 # type point = {x: float, y: float};
130 let tmp = {x: 3.0, y: 4.0};
131 let on_the_stack2 : &point = &tmp;
# Taking the address of fields
As in C, the `&` operator is not limited to taking the address of
local variables. It can also be used to take the address of fields or
individual array elements. For example, consider this type definition
for `rectangle`:
142 type point = {x: float, y: float}; // as before
143 type size = {w: float, h: float}; // as before
144 type rectangle = {origin: point, size: size};
147 Now again I can define rectangles in a few different ways:
150 let rect_stack = &{origin: {x: 1, y: 2}, size: {w: 3, h: 4}};
151 let rect_shared = @{origin: {x: 3, y: 4}, size: {w: 3, h: 4}};
152 let rect_unique = ~{origin: {x: 5, y: 6}, size: {w: 3, h: 4}};
In each case I can use the `&` operator to extract individual
subcomponents. For example, I could write:
159 # type point = {x: float, y: float};
160 # type size = {w: float, h: float}; // as before
161 # type rectangle = {origin: point, size: size};
162 # let rect_stack = &{origin: {x: 1, y: 2}, size: {w: 3, h: 4}};
163 # let rect_shared = @{origin: {x: 3, y: 4}, size: {w: 3, h: 4}};
164 # let rect_unique = ~{origin: {x: 5, y: 6}, size: {w: 3, h: 4}};
165 # fn compute_distance(p1: &point, p2: &point) -> float { 0f }
166 compute_distance(&rect_stack.origin, &rect_shared.origin);
which would borrow the field `origin` from the rectangle on the stack
as well as from the shared box, and then compute the distance between them.
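A rough modern-Rust sketch of borrowing fields follows. The assumptions are mine: present-day syntax, with `Rc<T>` and `Box<T>` standing in for `@T` and `~T`, `f64` for `float`, and hypothetical names `Point`, `Size`, `Rectangle`, and `borrow_fields()`. It is meant only to show that `&rect.origin` yields a pointer into the interior of a larger value, wherever that value lives.

```rust
use std::rc::Rc;

struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }
struct Rectangle { origin: Point, size: Size }

fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Returns the distance between two origins and the area of a third rectangle,
// all computed through pointers into the interior of the rectangles.
fn borrow_fields() -> (f64, f64) {
    let rect_stack = &Rectangle {
        origin: Point { x: 1.0, y: 2.0 },
        size: Size { w: 3.0, h: 4.0 },
    };
    let rect_shared = Rc::new(Rectangle {
        origin: Point { x: 3.0, y: 4.0 },
        size: Size { w: 3.0, h: 4.0 },
    });
    let rect_unique = Box::new(Rectangle {
        origin: Point { x: 5.0, y: 6.0 },
        size: Size { w: 3.0, h: 4.0 },
    });

    // Each `&….origin` borrows just the `origin` field.
    let distance = compute_distance(&rect_stack.origin, &rect_shared.origin);

    // A field of a uniquely owned rectangle can be borrowed the same way.
    let size: &Size = &rect_unique.size;
    (distance, size.w * size.h)
}
```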
172 # Borrowing shared boxes and rooting
174 We’ve seen a few examples so far where heap boxes (both shared and
175 unique) are borrowed. Up till this point, we’ve glossed over issues of
176 safety. As stated in the introduction, at runtime a borrowed pointer
177 is simply a pointer, nothing more. Therefore, if we wish to avoid the
178 issues that C has with dangling pointers (and we do!), a compile-time
179 safety check is required.
181 The basis for the check is the notion of _lifetimes_. A lifetime is
182 basically a static approximation of the period in which the pointer is
183 valid: it always corresponds to some expression or block within the
184 program. Within that expression, the pointer can be used freely, but
185 if the pointer somehow leaks outside of that expression, the compiler
186 will report an error. We’ll be discussing lifetimes more in the
187 examples to come, and a more thorough introduction is also available.
189 When a borrowed pointer is created, the compiler must ensure that it
190 will remain valid for its entire lifetime. Sometimes this is
191 relatively easy, such as when taking the address of a local variable
192 or a field that is stored on the stack:
197 let y = &mut x.f; // -+ L
202 Here, the lifetime of the borrowed pointer is simply L, the remainder
203 of the function body. No extra work is required to ensure that `x.f`
204 will not be freed. This is true even if `x` is mutated.
The situation gets more complex when borrowing data that resides in
heap boxes:
212 let y = &x.f; // -+ L
In this example, the value `x` is in fact a heap box, and `y` is
therefore a pointer into that heap box. Again the lifetime of `y` will
be L, the remainder of the function body. But there is a crucial
difference: suppose `x` were reassigned during the lifetime L? If
we’re not careful, that could mean that the shared box would become
unrooted and therefore be subject to garbage collection.
> ***Note:*** In our current implementation, the garbage collector is
> implemented using reference counting and cycle detection.
227 For this reason, whenever the interior of a shared box stored in a
228 mutable location is borrowed, the compiler will insert a temporary
229 that ensures that the shared box remains live for the entire
230 lifetime. So, the above example would be compiled as:
236 let y = &x1.f; // -+ L
241 Now if `x` is reassigned, the pointer `y` will still remain valid. This
242 process is called “rooting”.
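The effect of rooting can be imitated by hand in present-day Rust. This is only an analogy under my own assumptions: `Rc<T>` stands in for the shared box, the `Record` struct, the field values, and the `rooted_borrow()` function are hypothetical, and today's compiler does not insert the extra handle for you. The explicit `x1` below plays the part of the temporary described above.

```rust
use std::rc::Rc;

// Hypothetical record with a single field, mirroring `x.f` in the text.
struct Record {
    f: i32,
}

fn rooted_borrow() -> (i32, i32) {
    let mut x = Rc::new(Record { f: 3 });

    // Hand-written counterpart of the temporary `x1`: a second handle
    // that keeps the original box alive.
    let x1 = Rc::clone(&x);
    let y = &x1.f;

    // Reassigning `x` releases only one handle to the old box, so the
    // memory behind `y` is still live.
    x = Rc::new(Record { f: 4 });

    (*y, x.f)
}
```

Because `y` borrows from `x1` rather than from `x`, the reassignment does not conflict with the borrow.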
244 # Borrowing unique boxes
The previous example demonstrated _rooting_, the process by which the
compiler ensures that shared boxes remain live for the duration of a
borrow. Unfortunately, rooting does not work if the data being
borrowed is a unique box, as it is not possible to have two references
to a unique box.
For unique boxes, therefore, the compiler will only allow a borrow _if
the compiler can guarantee that the unique box will not be reassigned
or moved for the lifetime of the pointer_. This does not necessarily
mean that the unique box is stored in immutable memory. For example,
the following function is legal:
259 # fn some_condition() -> bool { true }
260 fn example3() -> int {
262 if some_condition() {
263 let y = &x.f; // -+ L
272 Here, as before, the interior of the variable `x` is being borrowed
273 and `x` is declared as mutable. However, the compiler can clearly see
274 that `x` is not assigned anywhere in the lifetime L of the variable
275 `y`. Therefore, it accepts the function, even though `x` is mutable
276 and in fact is mutated later in the function.
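The same pattern can be sketched in present-day Rust. The assumptions are mine: `Box<T>` stands in for the unique box, and the `Record` struct, the field values, and the function body are hypothetical rather than the tutorial's own code. The point is only that a mutable variable holding a unique box may be borrowed, provided it is not reassigned while the borrow is in use.

```rust
// Hypothetical record with a single field, mirroring `x.f` in the text.
struct Record {
    f: i32,
}

fn some_condition() -> bool {
    true
}

fn example3() -> i32 {
    let mut x = Box::new(Record { f: 3 });
    let mut seen = 0;

    if some_condition() {
        let y = &x.f; // the borrow of `x` is confined to this block
        seen = *y;
    }

    // No borrow of `x` is in use any more, so reassigning it is accepted.
    x = Box::new(Record { f: 4 });
    seen + x.f
}
```

Moving the reassignment inside the `if` block, between the creation of `y` and its use, would be rejected by today's compiler as well.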
278 It may not be clear why we are so concerned about the variable which
279 was borrowed being mutated. The reason is that unique boxes are freed
280 _as soon as their owning reference is changed or goes out of
281 scope_. Therefore, a program like this is illegal (and would be
282 rejected by the compiler):
285 fn example3() -> int {
288 x = ~{f: 4}; // Error reported here.
293 To make this clearer, consider this diagram showing the state of
294 memory immediately before the re-assignment of `x`:
[Diagram: `x` on the stack owns a unique box in the exchange heap, and `y` points at the field `f` inside that box.]
308 Once the reassignment occurs, the memory will look like this:
    Stack               Exchange Heap

  x +----------+          +---------+
    | ~{f:int} | -------> |  f: 4   |
  y +----------+          +---------+
    | &int     | ----+
    +----------+     |    +---------+
                     +--> | (freed) |
                          +---------+
322 Here you can see that the variable `y` still points at the old box,
323 which has been freed.
In fact, the compiler can apply this same kind of reasoning to any
memory which is _(uniquely) owned by the stack
frame_. So we could modify the previous example to introduce
additional unique pointers and records, and the compiler will still be
able to detect possible mutations:
332 fn example3() -> int {
333 let mut x = ~{mut f: ~{g: 3}};
335 x = ~{mut f: ~{g: 4}}; // Error reported here.
336 x.f = ~{g: 5}; // Error reported here.
341 In this case, two errors are reported, one when the variable `x` is
342 modified and another when `x.f` is modified. Either modification would
343 cause the pointer `y` to be invalidated.
Things get trickier when the unique box is not uniquely owned by the
stack frame (or when the compiler doesn’t know who the owner
is). Consider a program like this:
350 fn example5a(x: @{mut f: ~{g: int}} ...) -> int {
351 let y = &x.f.g; // Error reported here.
356 Here the heap looks something like:
    Stack            Shared Heap       Exchange Heap

  x +------+        +-------------+       +------+
    | @... | ---->  | mut f: ~... | --+-> | g: 3 |
  y +------+        +-------------+   |   +------+
    | &int | -------------------------+
    +------+
368 In this case, the owning reference to the value being borrowed is in
369 fact `x.f`. Moreover, `x.f` is both mutable and aliasable. Aliasable
370 means that it is possible that there are other pointers to that same
371 shared box, so even if the compiler were to prevent `x.f` from being
372 mutated, the field might still be changed through some alias of
373 `x`. Therefore, to be safe, the compiler only accepts pure actions
during the lifetime of `y`. We’ll return to purity in a final example.

Besides ensuring purity, the only way to borrow the interior of a
unique box found in aliasable memory is to ensure that it is stored in
an immutable field, as in the following example:
382 fn example5b(x: @{f: ~{g: int}}) -> int {
Here, the field `f` is not declared as mutable. That is enough for
the compiler to know that, even if aliases to `x` exist, the field `f`
cannot be changed and hence the unique box it owns will remain valid.
If you do have a unique box in a mutable field, and you wish to borrow
it, one option is to use the swap operator to bring that unique box
onto your stack:
398 fn example5c(x: @{mut f: ~int}) -> int {
400 v <-> x.f; // Swap v and x.f
403 x.f <- v; // Replace x.f
409 Of course, this has the side effect of modifying your shared box for
410 the duration of the borrow, so it works best when you know that you
411 won’t be accessing that same box again.
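A comparable trick can be sketched in present-day Rust. Everything here is an assumption of mine rather than the tutorial's code: the `Holder` struct is a hypothetical stand-in for `@{mut f: ~int}`, `RefCell::replace` takes the place of the swap operator, and the placeholder value `0` is arbitrary.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for `@{mut f: ~int}`: a shared record whose field
// holds a uniquely owned box and can be changed through any alias.
struct Holder {
    f: RefCell<Box<i32>>,
}

fn example5c(x: &Rc<Holder>) -> i32 {
    // Swap a placeholder into `x.f`, moving the real box onto our stack.
    let v: Box<i32> = x.f.replace(Box::new(0));

    // `v` is now owned by this stack frame, so borrowing it is unproblematic.
    let result = {
        let y: &i32 = &v;
        *y + 1
    };

    // Put the original box back and discard the placeholder.
    drop(x.f.replace(v));
    result
}
```

As with the swap operator, any alias that looks at the field during the borrow sees the placeholder, not the real value.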
413 # Borrowing and enums
The previous example showed that borrowing unique boxes found in
aliasable, mutable memory is not permitted, so as to prevent pointers
into freed memory. There is one other case where the compiler must be
very careful to ensure that pointers remain valid: pointers into the
interior of an enum.
421 As an example, let’s look at the following `shape` type that can
422 represent both rectangles and circles:
type point = {x: float, y: float}; // as before
type size = {w: float, h: float}; // as before
enum shape {
    circle(point, float),   // origin, radius
    rectangle(point, size)  // upper-left, dimensions
}
Now I might write a function to compute the area of a shape. This
function takes a borrowed pointer to a shape to avoid the need of
copying it:
438 # type point = {x: float, y: float}; // as before
439 # type size = {w: float, h: float}; // as before
441 # circle(point, float), // origin, radius
442 # rectangle(point, size) // upper-left, dimensions
444 # const tau: float = 6.28f;
fn compute_area(shape: &shape) -> float {
    match *shape {
        circle(_, radius) => 0.5 * tau * radius * radius,
        rectangle(_, ref size) => size.w * size.h
    }
}
453 The first case matches against circles. Here the radius is extracted
454 from the shape variant and used to compute the area of the circle
455 (Like any up-to-date engineer, we use the [tau circle constant][tau]
456 and not that dreadfully outdated notion of pi).
458 [tau]: http://www.math.utah.edu/~palais/pi.html
460 The second match is more interesting. Here we match against a
461 rectangle and extract its size: but rather than copy the `size` struct,
462 we use a by-reference binding to create a pointer to it. In other
463 words, a pattern binding like `ref size` in fact creates a pointer of
464 type `&size` into the _interior of the enum_.
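For comparison, here is a sketch of the same function in present-day Rust. The assumptions are mine: capitalised type and variant names, `f64` in place of `float`, and the standard library constant `std::f64::consts::TAU` in place of the locally defined `tau`. The `ref size` binding still creates a pointer into the interior of the enum rather than a copy.

```rust
use std::f64::consts::TAU;

struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }

enum Shape {
    Circle(Point, f64),     // origin, radius
    Rectangle(Point, Size), // upper-left, dimensions
}

fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        // `radius` is a plain copy of the float stored in the variant.
        Shape::Circle(_, radius) => 0.5 * TAU * radius * radius,
        // `ref size` has type `&Size` and points inside `*shape`.
        Shape::Rectangle(_, ref size) => size.w * size.h,
    }
}
```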
466 To make this more clear, let’s look at a diagram of how things are
467 laid out in memory in the case where `shape` points at a rectangle:
+-------+         +---------------+
| shape | ------> | rectangle(    |
+-------+         |   {x: float,  |
| size  | -+      |    y: float}, |
+-------+  +----> |   {w: float,  |
                  |    h: float}) |
                  +---------------+
Here you can see that rectangular shapes are composed of five words of
memory. The first is a tag indicating which variant this enum is
(`rectangle`, in this case). The next two words are the `x` and `y`
fields for the point and the remaining two are the `w` and `h` fields
for the size. The binding `size` is then a pointer into the inside of
the shape.
488 Perhaps you can see where the danger lies: if the shape were somehow
489 to be reassigned, perhaps to a circle, then although the memory used
490 to store that shape value would still be valid, _it would have a
491 different type_! This is shown in the following diagram, depicting what
492 the state of memory would be if shape were overwritten with a circle:
+-------+         +---------------+
| shape | ------> | circle(       |
+-------+         |   {x: float,  |
| size  | -+      |    y: float}, |
+-------+  +----> |   float)      |
                  |               |
                  +---------------+
As you can see, the `size` pointer would now be pointing at a `float` and
not a record. This is not good.
509 So, in fact, for every `ref` binding, the compiler will impose the
510 same rules as the ones we saw for borrowing the interior of a unique
511 box: it must be able to guarantee that the enum will not be
512 overwritten for the duration of the borrow. In fact, the example I
513 gave earlier would be considered safe. This is because the shape
514 pointer has type `&shape`, which means “borrowed pointer to immutable
515 memory containing a shape”. If however the type of that pointer were
516 `&const shape` or `&mut shape`, then the ref binding would not be
517 permitted. Just as with unique boxes, the compiler will permit ref
518 bindings into data owned by the stack frame even if it is mutable, but
519 otherwise it requires that the data reside in immutable memory.
521 > ***Note:*** Right now, all pattern bindings are by-reference. We
522 > expect this to change so that copies are the default and references
523 > must be noted explicitly.
525 # Returning borrowed pointers
527 So far, all of the examples we’ve looked at use borrowed pointers in a
528 “downward” direction. That is, the borrowed pointer is created and
529 then used during the method or code block which created it. In some
530 cases, it is also possible to return borrowed pointers to the caller,
531 but as we’ll see this is more limited.
533 For example, we could write a subroutine like this:
536 type point = {x: float, y: float};
537 fn get_x(p: &point) -> &float { &p.x }
540 Here, the function `get_x()` returns a pointer into the structure it was
541 given. You’ll note that _both_ the parameter and the return value are
542 borrowed pointers; this is important. In general, it is only possible
543 to return borrowed pointers if they are derived from a borrowed
544 pointer which was given as input to the procedure.
546 In the example, `get_x()` took a borrowed pointer to a `point` as
547 input. In general, for all borrowed pointers that appear in the
548 signature of a function (such as the parameter and return types), the
549 compiler assigns the same symbolic lifetime L (we will see later that
550 there are ways to differentiate the lifetimes of different parameters
551 if that should be necessary). This means that, from the compiler’s
552 point of view, `get_x()` takes and returns two pointers with the same
553 lifetime. Now, unlike other lifetimes, this lifetime is a bit
554 abstract: it doesn’t refer to a specific expression within `get_x()`,
555 but rather to some expression within the caller. This is called a
556 _lifetime parameter_, because the lifetime L is effectively defined by
557 the caller to `get_x()`, just as the value for the parameter `p` is
558 defined by the caller.
560 In any case, whatever the lifetime L is, the pointer produced by
561 `&p.x` always has the same lifetime as `p` itself, as a pointer to a
562 field of a record is valid as long as the record is valid. Therefore,
563 the compiler is satisfied with the function `get_x()`.
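In present-day Rust the same function can be written almost unchanged. This sketch assumes modern syntax, a hypothetical `Point` struct, and `f64` in place of `float`. The second function, `get_x_explicit()`, is a hypothetical variant that spells out the shared lifetime that the first one leaves implicit.

```rust
struct Point {
    x: f64,
    y: f64,
}

// The lifetime is left implicit: the result is tied to the borrow of `p`.
fn get_x(p: &Point) -> &f64 {
    &p.x
}

// The same signature with the shared lifetime written out by name.
fn get_x_explicit<'a>(p: &'a Point) -> &'a f64 {
    &p.x
}
```

In both versions the returned pointer is valid exactly as long as the pointer that was passed in.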
565 To drill in this point, let’s look at a variation on the example, this
566 time one which does not compile:
569 type point = {x: float, y: float};
fn get_x_sh(p: @point) -> &float {
    &p.x // Error reported here
}
Here, the function `get_x_sh()` takes a shared box as input and
returns a borrowed pointer. As before, the lifetime of the borrowed
pointer that will be returned is a parameter (specified by the
caller). That means that effectively `get_x_sh()` is promising to
return a borrowed pointer that is valid for as long as the caller
would like: this is subtly different from the first example, which
promised to return a pointer that was valid for as long as the pointer
it was given.
Within `get_x_sh()`, we see the expression `&p.x` which takes the
address of a field of a shared box. This implies that the compiler
must guarantee that, so long as the resulting pointer is valid, the
shared box will not be reclaimed by the garbage collector. But recall
that `get_x_sh()` also promised to return a pointer that was valid for
as long as the caller wanted it to be. Clearly, `get_x_sh()` is not in
a position to make both of these guarantees; in fact, it cannot
guarantee that the pointer will remain valid at all once it returns,
as the parameter `p` may or may not be live in the caller. Therefore,
the compiler will report an error here.
595 In general, if you borrow a shared (or unique) box to create a
596 borrowed pointer, the pointer will only be valid within the function
597 and cannot be returned. Generally, the only way to return borrowed
598 pointers is to take borrowed pointers as input.
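One way to act on this rule, sketched in present-day Rust under my own assumptions (`Rc<T>` standing in for the shared box, a hypothetical `Point` struct, `f64` for `float`), is to borrow the box itself instead of taking it by value. The function then receives a borrowed pointer as input, so it has something to tie its result to.

```rust
use std::rc::Rc;

struct Point {
    x: f64,
    y: f64,
}

// Taking `p: Rc<Point>` by value and returning `&f64` is rejected by
// today's compiler too: the result would have no input borrow to outlive.
//
// Taking a borrowed pointer to the box works, because the result is
// tied to the lifetime of that borrow.
fn get_x_sh(p: &Rc<Point>) -> &f64 {
    &p.x
}
```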
602 So far we have always used the notation `&T` for a borrowed
603 pointer. However, sometimes if a function takes many parameters, it is
604 useful to be able to group those parameters by lifetime. For example,
605 consider this function:
# type point = {x: float, y: float}; // as before
# type size = {w: float, h: float}; // as before
# enum shape {
#     circle(point, float),   // origin, radius
#     rectangle(point, size)  // upper-left, dimensions
# }
# fn compute_area(shape: &shape) -> float { 0f }
fn select<T>(shape: &shape, threshold: float,
             a: &T, b: &T) -> &T {
    if compute_area(shape) > threshold {a} else {b}
}
This function takes three borrowed pointers. Because of the way that
the system works, each will be assigned the same lifetime: the default
lifetime parameter. In practice, this means that, in the caller, the
lifetime of the returned value will be the intersection of the
lifetimes of the three region parameters. This may be overly
conservative, as in this example:
# type point = {x: float, y: float}; // as before
# type size = {w: float, h: float}; // as before
# enum shape {
#     circle(point, float),   // origin, radius
#     rectangle(point, size)  // upper-left, dimensions
# }
# fn compute_area(shape: &shape) -> float { 0f }
# fn select<T>(shape: &shape, threshold: float,
#              a: &T, b: &T) -> &T {
#     if compute_area(shape) > threshold {a} else {b}
# }
                                              // -+ L
fn select_based_on_unit_circle<T>(            //  |-+ B
    threshold: float, a: &T, b: &T) -> &T {   //  | |
                                              //  | |
    let shape = circle({x: 0, y: 0}, 1);      //  | |
    select(&shape, threshold, a, b)           //  | |
}                                             //  |-+
                                              // -+
In this call to `select()`, the lifetime of the first parameter `shape`
is B, the function body. Both of the second two parameters `a` and `b`
share the same lifetime, L, which is the lifetime parameter of
`select_based_on_unit_circle()`. The caller will infer the
intersection of these three lifetimes as the lifetime of the returned
value, and hence the return value of `select()` will be assigned a
lifetime of B. This will in turn lead to a compilation error,
because `select_based_on_unit_circle()` is supposed to return a value
with the lifetime L.
To address this, we could modify the definition of `select()` to
distinguish the lifetime of the first parameter from the lifetime of
the latter two. After all, the first parameter is not being
returned. To do so, we make use of the notation `&lt/T`, which is a
borrowed pointer with an explicit lifetime `lt`. This effectively creates a
second lifetime parameter for the function; named lifetime parameters
do not need to be declared, you just use them. Here is how the new
`select()` might look:
# type point = {x: float, y: float}; // as before
# type size = {w: float, h: float}; // as before
# enum shape {
#     circle(point, float),   // origin, radius
#     rectangle(point, size)  // upper-left, dimensions
# }
# fn compute_area(shape: &shape) -> float { 0f }
fn select<T>(shape: &tmp/shape, threshold: float,
             a: &T, b: &T) -> &T {
    if compute_area(shape) > threshold {a} else {b}
}
Here you can see the lifetime of `shape` is now being called `tmp`. The
parameters `a`, `b`, and the return value all remain with the default
lifetime parameter.
688 You could also write `select()` using all named lifetime parameters,
689 which might look like:
# type point = {x: float, y: float}; // as before
# type size = {w: float, h: float}; // as before
# enum shape {
#     circle(point, float),   // origin, radius
#     rectangle(point, size)  // upper-left, dimensions
# }
# fn compute_area(shape: &shape) -> float { 0f }
fn select<T>(shape: &tmp/shape, threshold: float,
             a: &r/T, b: &r/T) -> &r/T {
    if compute_area(shape) > threshold {a} else {b}
}
705 This is equivalent to the previous definition.
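The named-lifetime version translates fairly directly to present-day Rust. The sketch below is written under my own assumptions: modern syntax, where lifetimes are spelled `'tmp` and `'r` and, unlike in the notation above, must be declared in the function's parameter list; capitalised type names; `f64` in place of `float`; and `std::f64::consts::TAU` in place of the local `tau`.

```rust
use std::f64::consts::TAU;

struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }

enum Shape {
    Circle(Point, f64),     // origin, radius
    Rectangle(Point, Size), // upper-left, dimensions
}

fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        Shape::Circle(_, radius) => 0.5 * TAU * radius * radius,
        Shape::Rectangle(_, ref size) => size.w * size.h,
    }
}

// `'tmp` covers only `shape`; `'r` ties `a`, `b`, and the result together.
fn select<'tmp, 'r, T>(shape: &'tmp Shape, threshold: f64,
                       a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold { a } else { b }
}

fn select_based_on_unit_circle<'r, T>(threshold: f64,
                                      a: &'r T, b: &'r T) -> &'r T {
    // `shape` lives only for this function body. That is acceptable
    // because the result of `select()` is not tied to `'tmp`.
    let shape = Shape::Circle(Point { x: 0.0, y: 0.0 }, 1.0);
    select(&shape, threshold, a, b)
}
```

If `shape`, `a`, `b`, and the result all shared one lifetime, the call inside `select_based_on_unit_circle()` would be rejected, for the reason given earlier in this section.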
709 As mentioned before, the Rust compiler offers a kind of escape hatch
710 that permits borrowing of any data, but only if the actions that occur
711 during the lifetime of the borrow are pure. Pure actions are those
712 which only modify data owned by the current stack frame. The compiler
713 can therefore permit arbitrary pointers into the heap, secure in the
714 knowledge that no pure action will ever cause them to become
715 invalidated (the compiler must still track data on the stack which is
716 borrowed and enforce those rules normally, of course).
718 Let’s revisit a previous example and show how purity can affect the
719 compiler’s result. Here is `example5a()`, which borrows the interior of
720 a unique box found in an aliasable, mutable location, only now we’ve
721 replaced the `...` with some specific code:
724 fn example5a(x: @{mut f: ~{g: int}} ...) -> int {
725 let y = &x.f.g; // Unsafe
The new code simply returns an incremented version of `y`. This clearly
doesn’t mutate anything in the heap, so the compiler is satisfied.

But suppose we wanted to pull the increment code into a helper, like
this:
737 fn add_one(x: &int) -> int { *x + 1 }
740 We can now update `example5a()` to use `add_one()`:
743 # fn add_one(x: &int) -> int { *x + 1 }
744 fn example5a(x: @{mut f: ~{g: int}} ...) -> int {
746 add_one(y) // Error reported here
750 But now the compiler will report an error again. The reason is that it
751 only considers one function at a time (like most type checkers), and
752 so it does not know that `add_one()` only takes pure actions. We can
753 help the compiler by labeling `add_one()` as pure:
756 pure fn add_one(x: &int) -> int { *x + 1 }
759 With this change, the modified version of `example5a()` will again compile.
So there you have it. A (relatively) brief tour of the borrowed pointer
system. For more details, I refer you to the (yet to be written) reference
document on borrowed pointers, which will explain the full notation
and give more examples.