src/doc/trpl/ownership.md

   1 % The Rust Ownership Guide
   2
   3 This guide presents Rust's ownership system. This is one of Rust's most distinctive
   4 and compelling features, with which Rust developers should become quite
   5 acquainted. Ownership is how Rust achieves its largest goal, memory safety.
   6 The ownership system has a few distinct concepts: *ownership*, *borrowing*,
   7 and *lifetimes*. We'll talk about each one in turn.
   8
   9 # Meta
  10
  11 Before we get to the details, two important notes about the ownership system.
  12
  13 Rust has a focus on safety and speed. It accomplishes these goals through many
  14 *zero-cost abstractions*, which means that in Rust, abstractions cost as little
  15 as possible in order to make them work. The ownership system is a prime example
  16 of a zero-cost abstraction. All of the analysis we'll talk about in this guide
  17 is _done at compile time_. You do not pay any run-time cost for any of these
  18 features.
  19
  20 However, this system does have a certain cost: a learning curve. Many new users
  21 to Rust experience something we like to call "fighting with the borrow
  22 checker," where the Rust compiler refuses to compile a program that the author
  23 thinks is valid. This often happens because the programmer's mental model of
  24 how ownership should work doesn't match the actual rules that Rust implements.
  25 You probably will experience similar things at first. There is good news,
  26 however: more experienced Rust developers report that once they work with the
  27 rules of the ownership system for a period of time, they fight the borrow
  28 checker less and less.
  29
  30 With that in mind, let's learn about ownership.
  31
  32 # Ownership
  33
  34 At its core, ownership is about *resources*. For the purposes of the vast
  35 majority of this guide, we will talk about a specific resource: memory. The
  36 concept generalizes to any kind of resource, like a file handle, but to make it
  37 more concrete, we'll focus on memory.
  38
  39 When your program allocates some memory, it needs some way to deallocate that
  40 memory. Imagine a function `foo` that allocates four bytes of memory, and then
  41 never deallocates that memory. We call this problem *leaking* memory, because
  42 each time we call `foo`, we're allocating another four bytes. Eventually, with
  43 enough calls to `foo`, we will run our system out of memory. That's no good. So
  44 we need some way for `foo` to deallocate those four bytes. It's also important
  45 that we don't deallocate too many times, either. Without getting into the
  46 details, attempting to deallocate memory multiple times can lead to problems.
  47 In other words, any time some memory is allocated, we need to make sure that we
  48 deallocate that memory once and only once. Too many times is bad, not enough
  49 times is bad. The counts must match.
  50
  51 There's one other important detail with regard to allocating memory. Whenever
  52 we request some amount of memory, what we are given is a handle to that memory.
  53 This handle (often called a *pointer*, when we're referring to memory) is how
  54 we interact with the allocated memory. As long as we have that handle, we can
  55 do something with the memory. Once we're done with the handle, we're also done
  56 with the memory, as we can't do anything useful without a handle to it.
  57
  58 Historically, systems programming languages require you to track these
  59 allocations, deallocations, and handles yourself. For example, if we want some
  60 memory from the heap in a language like C, we do this:
  61
  62 ```c
  63 {
  64     int *x = malloc(sizeof(int));
  65
  66     // we can now do stuff with our handle x
  67     *x = 5;
  68
  69     free(x);
  70 }
  71 ```
  72
  73 The call to `malloc` allocates some memory. The call to `free` deallocates the
  74 memory. There's also bookkeeping about allocating the correct amount of memory.
  75
  76 Rust combines these two aspects of allocating memory (and other resources) into
  77 a concept called *ownership*. Whenever we request some memory, the handle we
  78 receive is called the *owning handle*. Whenever that handle goes out of scope,
  79 Rust knows that you cannot do anything with the memory anymore, and so it
  80 deallocates the memory for you. Here's the equivalent example in
  81 Rust:
  82
  83 ```rust
  84 # use std::boxed::Box;
  85 {
  86     let x = Box::new(5i);
  87 }
  88 ```
  89
  90 The `Box::new` function creates a `Box<T>` (specifically `Box<int>` in this
  91 case) by allocating a small segment of memory on the heap with enough space to
  92 fit an `int`. But where in the code is the box deallocated? We said before that
  93 we must have a deallocation for each allocation. Rust handles this for you. It
  94 knows that our handle, `x`, is the owning reference to our box. Rust knows that
  95 `x` will go out of scope at the end of the block, and so it inserts a call to
  96 deallocate the memory at the end of the scope. Because the compiler does this
  97 for us, it's impossible to forget. We always have exactly one deallocation
  98 paired with each of our allocations.
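
To watch the compiler-inserted deallocation happen, the sketch below boxes a
value whose destructor writes to a log. It targets current stable Rust, and
`Noisy` and `drop_log` are hypothetical names made up for this illustration;
they are not part of the guide or the standard library.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: records a message when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

// Runs a block that owns a boxed `Noisy` and returns what was logged.
fn drop_log() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _x = Box::new(Noisy { name: "x", log: &log });
        log.borrow_mut().push("end of block".to_string());
    } // `_x` goes out of scope here: `drop` runs and the box is freed
    log.borrow_mut().push("after block".to_string());
    log.into_inner()
}

fn main() {
    for line in drop_log() {
        println!("{}", line);
    }
}
```

The log comes back as `end of block`, `dropped x`, `after block`: the
destructor runs exactly once, at the closing brace of the inner block, without
any explicit call to free the memory.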
  99
 100 This is pretty straightforward, but what happens when we want to pass our box
 101 to a function? Let's look at some code:
 102
 103 ```rust
 104 # use std::boxed::Box;
 105 fn main() {
 106     let x = Box::new(5i);
 107
 108     add_one(x);
 109 }
 110
 111 fn add_one(mut num: Box<int>) {
 112     *num += 1;
 113 }
 114 ```
 115
 116 This code works, but it's not ideal. For example, let's add one more line of
 117 code, where we print out the value of `x`:
 118
 119 ```{rust,ignore}
 120 # use std::boxed::Box;
 121 fn main() {
 122     let x = Box::new(5i);
 123
 124     add_one(x);
 125
 126     println!("{}", x);
 127 }
 128
 129 fn add_one(mut num: Box<int>) {
 130     *num += 1;
 131 }
 132 ```
 133
 134 This does not compile, and gives us an error:
 135
 136 ```text
 137 error: use of moved value: `x`
 138    println!("{}", x);
 139                   ^
 140 ```
 141
 142 Remember, we need one deallocation for every allocation. When we try to pass
 143 our box to `add_one`, we would have two handles to the memory: `x` in `main`,
 144 and `num` in `add_one`. If we deallocated the memory when each handle went out
 145 of scope, we would have two deallocations and one allocation, and that's wrong.
 146 So when we call `add_one`, Rust defines `num` as the owner of the handle. And
 147 so, now that we've given ownership to `num`, `x` is invalid. `x`'s value has
 148 "moved" from `x` to `num`. Hence the error: use of moved value `x`.
 149
 150 To fix this, we can have `add_one` give ownership back when it's done with the
 151 box:
 152
 153 ```rust
 154 # use std::boxed::Box;
 155 fn main() {
 156     let x = Box::new(5i);
 157
 158     let y = add_one(x);
 159
 160     println!("{}", y);
 161 }
 162
 163 fn add_one(mut num: Box<int>) -> Box<int> {
 164     *num += 1;
 165
 166     num
 167 }
 168 ```
 169
 170 This code will compile and run just fine. Now, we return a `Box`, and so the
 171 ownership is transferred back to `y` in `main`. We only have ownership for the
 172 duration of our function before giving it back. This pattern is very common,
 173 and so Rust introduces a concept to describe a handle which temporarily refers
 174 to something another handle owns. It's called *borrowing*, and it's done with
 175 *references*, designated by the `&` symbol.
 176
 177 # Borrowing
 178
 179 Here's the current state of our `add_one` function:
 180
 181 ```rust
 182 fn add_one(mut num: Box<int>) -> Box<int> {
 183     *num += 1;
 184
 185     num
 186 }
 187 ```
 188
 189 This function takes ownership, because it takes a `Box`, which owns its
 190 contents. But then we give ownership right back.
 191
 192 In the physical world, you can give one of your possessions to someone for a
 193 short period of time. You still own your possession, you're just letting someone
 194 else use it for a while. We call that *lending* something to someone, and that
 195 person is said to be *borrowing* that something from you.
 196
 197 Rust's ownership system also allows an owner to lend out a handle for a limited
 198 period. This is also called *borrowing*. Here's a version of `add_one` which
 199 borrows its argument rather than taking ownership:
 200
 201 ```rust
 202 fn add_one(num: &mut int) {
 203     *num += 1;
 204 }
 205 ```
 206
 207 This function borrows an `int` from its caller, and then increments it. When
 208 the function is over, and `num` goes out of scope, the borrow is over.
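
A short usage sketch may help here. It calls a borrowing `add_one` from a
function that owns a `Box`, written for current stable Rust (so `i32` stands in
for the guide's `int`). `borrow_twice` is a hypothetical helper name.

```rust
// Borrows an integer mutably and increments it in place.
fn add_one(num: &mut i32) {
    *num += 1;
}

// Lends a boxed value out twice, then reads it through the owner.
fn borrow_twice() -> i32 {
    let mut x = Box::new(5);

    add_one(&mut x);  // `&mut Box<i32>` coerces to `&mut i32`
    add_one(&mut *x); // the same borrow, written out explicitly

    *x // each borrow ended when `add_one` returned, so `x` is still usable
}

fn main() {
    println!("{}", borrow_twice()); // prints 7
}
```

Unlike the version that took a `Box` by value, nothing has to be handed back:
ownership never left `x`.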
 209
 210 # Lifetimes
 211
 212 Lending out a reference to a resource that someone else owns can be
 213 complicated, however. For example, imagine this set of operations:
 214
 215 1. I acquire a handle to some kind of resource.
 216 2. I lend you a reference to the resource.
 217 3. I decide I'm done with the resource, and deallocate it, while you still have
 218    your reference.
 219 4. You decide to use the resource.
 220
 221 Uh oh! Your reference is pointing to an invalid resource. This is called a
 222 *dangling pointer* or "use after free," when the resource is memory.
 223
 224 To fix this, we have to make sure that step four never happens after step
 225 three. The ownership system in Rust does this through a concept called
 226 *lifetimes*, which describe the scope that a reference is valid for.
 227
 228 Let's look at a function which borrows an `int` again, this time immutably:
 229
 230 ```rust
 231 fn add_one(num: &int) -> int {
 232     *num + 1
 233 }
 234 ```
 235
 236 Rust has a feature called *lifetime elision*, which allows you to not write
 237 lifetime annotations in certain circumstances. This is one of them. We will
 238 cover the others later. Without eliding the lifetimes, `add_one` looks like
 239 this:
 240
 241 ```rust
 242 fn add_one<'a>(num: &'a int) -> int {
 243     *num + 1
 244 }
 245 ```
 246
 247 The `'a` is called a *lifetime*. Most lifetimes are used in places where
 248 short names like `'a`, `'b` and `'c` are clearest, but it's often useful to
 249 have more descriptive names. Let's dig into the syntax in a bit more detail:
 250
 251 ```{rust,ignore}
 252 fn add_one<'a>(...)
 253 ```
 254
 255 This part _declares_ our lifetimes. This says that `add_one` has one lifetime,
 256 `'a`. If we had two, it would look like this:
 257
 258 ```{rust,ignore}
 259 fn add_two<'a, 'b>(...)
 260 ```
 261
 262 Then in our parameter list, we use the lifetimes we've named:
 263
 264 ```{rust,ignore}
 265 ...(num: &'a int) -> ...
 266 ```
 267
 268 If you compare `&int` to `&'a int`, they're the same; it's just that the
 269 lifetime `'a` has snuck in between the `&` and the `int`. We read `&int` as "a
 270 reference to an int" and `&'a int` as "a reference to an int with the lifetime 'a.'"
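
To see the elided and annotated spellings side by side, here is a small sketch
for current stable Rust (`i32` in place of `int`). `add_one_explicit` and
`sum_refs` are hypothetical names; the second exists only to show two declared
lifetimes in use.

```rust
// Elided form: the compiler supplies the lifetime of `num`.
fn add_one(num: &i32) -> i32 {
    *num + 1
}

// Explicit form: `'a` is declared in the angle brackets, then used in `&'a i32`.
fn add_one_explicit<'a>(num: &'a i32) -> i32 {
    *num + 1
}

// Two declared lifetimes, one per reference parameter.
fn sum_refs<'a, 'b>(x: &'a i32, y: &'b i32) -> i32 {
    *x + *y
}

fn main() {
    let n = 5;
    println!("{} {} {}", add_one(&n), add_one_explicit(&n), sum_refs(&n, &n));
}
```

Both spellings of `add_one` accept exactly the same callers; the annotation
names a lifetime that was already there.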
 271
 272 Why do lifetimes matter? Well, for example, here's some code:
 273
 274 ```rust
 275 struct Foo<'a> {
 276     x: &'a int,
 277 }
 278
 279 fn main() {
 280     let y = &5i; // this is the same as `let _y = 5; let y = &_y;`
 281     let f = Foo { x: y };
 282
 283     println!("{}", f.x);
 284 }
 285 ```
 286
 287 As you can see, `struct`s can also have lifetimes. In a similar way to functions,
 288
 289 ```{rust}
 290 struct Foo<'a> {
 291 # x: &'a int,
 292 # }
 293 ```
 294
 295 declares a lifetime, and
 296
 297 ```rust
 298 # struct Foo<'a> {
 299 x: &'a int,
 300 # }
 301 ```
 302
 303 uses it. So why do we need a lifetime here? We need to ensure that any reference
 304 to a `Foo` cannot outlive the reference to an `int` it contains.
 305
 306 ## Thinking in scopes
 307
 308 A way to think about lifetimes is to visualize the scope that a reference is
 309 valid for. For example:
 310
 311 ```rust
 312 fn main() {
 313     let y = &5i;    // -+ y goes into scope
 314                     //  |
 315     // stuff        //  |
 316                     //  |
 317 }                   // -+ y goes out of scope
 318 ```
 319
 320 Adding in our `Foo`:
 321
 322 ```rust
 323 struct Foo<'a> {
 324     x: &'a int,
 325 }
 326
 327 fn main() {
 328     let y = &5i;          // -+ y goes into scope
 329     let f = Foo { x: y }; // -+ f goes into scope
 330     // stuff              //  |
 331                           //  |
 332 }                         // -+ f and y go out of scope
 333 ```
 334
 335 Our `f` lives within the scope of `y`, so everything works. What if it didn't?
 336 This code won't work:
 337
 338 ```{rust,ignore}
 339 struct Foo<'a> {
 340     x: &'a int,
 341 }
 342
 343 fn main() {
 344     let x;                    // -+ x goes into scope
 345                               //  |
 346     {                         //  |
 347         let y = &5i;          // ---+ y goes into scope
 348         let f = Foo { x: y }; // ---+ f goes into scope
 349         x = &f.x;             //  | | error here
 350     }                         // ---+ f and y go out of scope
 351                               //  |
 352     println!("{}", x);        //  |
 353 }                             // -+ x goes out of scope
 354 ```
 355
 356 Whew! As you can see here, the scopes of `f` and `y` are smaller than the scope
 357 of `x`. But when we do `x = &f.x`, we make `x` a reference to something that's
 358 about to go out of scope.
 359
 360 Named lifetimes are a way of giving these scopes a name. Giving something a
 361 name is the first step towards being able to talk about it.
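
As a counterpart to the failing example, the sketch below (current stable Rust,
`i32` for `int`) shows a case the compiler accepts. The reference stored in a
`Foo` carries the lifetime `'a` of the integer it points at, so it can be
copied out and used after the `Foo` itself is gone. `read_after_foo` is a
hypothetical helper name.

```rust
struct Foo<'a> {
    x: &'a i32,
}

// Reads a value through a reference that outlives the `Foo` it was copied
// from, but not the integer it points at.
fn read_after_foo() -> i32 {
    let y = 5;                    // -+ y goes into scope
    let r;                        //  |
    {                             //  |
        let f = Foo { x: &y };    // ---+ f goes into scope
        r = f.x;                  //  | | copies the `&'a i32`, no borrow of f
    }                             // ---+ f goes out of scope
    *r                            //  | r is still valid: it points at y
}                                 // -+ y goes out of scope

fn main() {
    println!("{}", read_after_foo());
}
```

Writing `r = &f.x` instead would borrow from `f` rather than from `y`, and the
compiler would reject it because `f` does not live long enough.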
 362
 363 ## 'static
 364
 365 The lifetime named `'static` is a special lifetime. It signals that something
 366 has the lifetime of the entire program. Most Rust programmers first come across
 367 `'static` when dealing with strings:
 368
 369 ```rust
 370 let x: &'static str = "Hello, world.";
 371 ```
 372
 373 String literals have the type `&'static str` because the reference is always
 374 alive: they are baked into the data segment of the final binary. Another
 375 example is globals:
 376
 377 ```rust
 378 static FOO: int = 5i;
 379 let x: &'static int = &FOO;
 380 ```
 381
 382 This adds an `int` to the data segment of the binary, and `x` is a reference to
 383 it.
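
A minimal sketch of both cases in current stable Rust (where the guide's `int`
is written `i32`). The helper names `greeting` and `foo_ref` are hypothetical;
the point is that a function can return a `&'static` reference without
borrowing from any argument.

```rust
static FOO: i32 = 5;

// String literals are stored in the binary, so no input borrow is needed.
fn greeting() -> &'static str {
    "Hello, world."
}

// A reference to a `static` item is valid for the whole program.
fn foo_ref() -> &'static i32 {
    &FOO
}

fn main() {
    println!("{} {}", greeting(), foo_ref());
}
```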
 384
 385 # Shared Ownership
 386
 387 In all the examples we've considered so far, we've assumed that each resource
 388 has a single owner. But sometimes, this doesn't work. Consider a car. Cars have
 389 four wheels. We would want a wheel to know which car it was attached to. But
 390 this won't work:
 391
 392 ```{rust,ignore}
 393 struct Car {
 394     name: String,
 395 }
 396
 397 struct Wheel {
 398     size: int,
 399     owner: Car,
 400 }
 401
 402 fn main() {
 403     let car = Car { name: "DeLorean".to_string() };
 404
 405     for _ in range(0u, 4) {
 406         Wheel { size: 360, owner: car };
 407     }
 408 }
 409 ```
 410
 411 We try to make four `Wheel`s, each with a `Car` that it's attached to. But the
 412 compiler knows that on the second iteration of the loop, there's a problem:
 413
 414 ```text
 415 error: use of moved value: `car`
 416     Wheel { size: 360, owner: car };
 417                               ^~~
 418 note: `car` moved here because it has type `Car`, which is non-copyable
 419     Wheel { size: 360, owner: car };
 420                               ^~~
 421 ```
 422
 423 We need our `Car` to be pointed to by multiple `Wheel`s. We can't do that with
 424 `Box<T>`, because it has a single owner. We can do it with `Rc<T>` instead:
 425
 426 ```rust
 427 use std::rc::Rc;
 428
 429 struct Car {
 430     name: String,
 431 }
 432
 433 struct Wheel {
 434     size: int,
 435     owner: Rc<Car>,
 436 }
 437
 438 fn main() {
 439     let car = Car { name: "DeLorean".to_string() };
 440
 441     let car_owner = Rc::new(car);
 442
 443     for _ in range(0u, 4) {
 444         Wheel { size: 360, owner: car_owner.clone() };
 445     }
 446 }
 447 ```
 448
 449 We wrap our `Car` in an `Rc<T>`, getting an `Rc<Car>`, and then use the
 450 `clone()` method to make new references. We've also changed our `Wheel` to have
 451 an `Rc<Car>` rather than just a `Car`.
 452
 453 This is the simplest kind of multiple ownership possible. There's also
 454 `Arc<T>`, the thread-safe counterpart of `Rc<T>`, which uses more expensive
 455 atomic instructions.
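
The reference counting can be observed directly. This sketch reworks the `Car`
and `Wheel` example for current stable Rust (`i32` for `int`, `0..4` for
`range(0u, 4)`) and uses `Rc::strong_count` to read the number of owners.
`build_wheels`, `owner_count`, and `name_len_on_other_thread` are hypothetical
helper names.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

struct Car {
    name: String,
}

struct Wheel {
    size: i32,
    owner: Rc<Car>,
}

// Builds four wheels that all share ownership of one car.
fn build_wheels() -> Vec<Wheel> {
    let car_owner = Rc::new(Car { name: "DeLorean".to_string() });
    (0..4)
        .map(|_| Wheel { size: 360, owner: Rc::clone(&car_owner) })
        .collect()
} // `car_owner` is dropped here; the wheels keep the car alive

// Number of `Rc` handles currently sharing this wheel's car.
fn owner_count(wheel: &Wheel) -> usize {
    Rc::strong_count(&wheel.owner)
}

// `Arc<T>` can be moved to another thread; `Rc<T>` cannot.
fn name_len_on_other_thread() -> usize {
    let car = Arc::new(Car { name: "DeLorean".to_string() });
    let for_thread = Arc::clone(&car);
    let handle = thread::spawn(move || for_thread.name.len());
    handle.join().unwrap()
}

fn main() {
    let mut wheels = build_wheels();
    println!("{} wheels of size {}", wheels.len(), wheels[0].size);
    println!("owners: {}", owner_count(&wheels[0])); // 4
    wheels.pop(); // dropping a wheel releases its share of the car
    println!("owners: {}", owner_count(&wheels[0])); // 3
    println!("{}: {}", wheels[0].owner.name, name_len_on_other_thread());
}
```

The car is deallocated only when the last `Wheel` holding an `Rc<Car>` goes out
of scope, which keeps the rule of one deallocation per allocation intact.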
 456
 457 ## Lifetime Elision
 458
 459 Earlier, we mentioned *lifetime elision*, a feature of Rust which allows you to
 460 not write lifetime annotations in certain circumstances. All references have a
 461 lifetime, and so if you elide a lifetime (like `&T` instead of `&'a T`), Rust
 462 will do three things to determine what those lifetimes should be.
 463
 464 When talking about lifetime elision, we use the terms *input lifetime* and
 465 *output lifetime*. An *input lifetime* is a lifetime associated with a parameter
 466 of a function, and an *output lifetime* is a lifetime associated with the return
 467 value of a function. For example, this function has an input lifetime:
 468
 469 ```{rust,ignore}
 470 fn foo<'a>(bar: &'a str)
 471 ```
 472
 473 This one has an output lifetime:
 474
 475 ```{rust,ignore}
 476 fn foo<'a>() -> &'a str
 477 ```
 478
 479 This one has a lifetime in both positions:
 480
 481 ```{rust,ignore}
 482 fn foo<'a>(bar: &'a str) -> &'a str
 483 ```
 484
 485 Here are the three rules:
 486
 487 * Each elided lifetime in a function's arguments becomes a distinct lifetime
 488   parameter.
 489
 490 * If there is exactly one input lifetime, elided or not, that lifetime is
 491   assigned to all elided lifetimes in the return values of that function.
 492
 493 * If there are multiple input lifetimes, but one of them is `&self` or `&mut
 494   self`, the lifetime of `self` is assigned to all elided output lifetimes.
 495
 496 Otherwise, it is an error to elide an output lifetime.
 497
 498 ### Examples
 499
 500 Here are some examples of functions with elided lifetimes, and what the elided
 501 lifetimes expand to:
 502
 503 ```{rust,ignore}
 504 fn print(s: &str);                                      // elided
 505 fn print<'a>(s: &'a str);                               // expanded
 506
 507 fn debug(lvl: uint, s: &str);                           // elided
 508 fn debug<'a>(lvl: uint, s: &'a str);                    // expanded
 509
 510 // In the preceding example, `lvl` doesn't need a lifetime because it's not a
 511 // reference (`&`). Only things relating to references (such as a `struct`
 512 // which contains a reference) need lifetimes.
 513
 514 fn substr(s: &str, until: uint) -> &str;                // elided
 515 fn substr<'a>(s: &'a str, until: uint) -> &'a str;      // expanded
 516
 517 fn get_str() -> &str;                                   // ILLEGAL, no inputs
 518
 519 fn frob(s: &str, t: &str) -> &str;                      // ILLEGAL, two inputs
 520
 521 fn get_mut(&mut self) -> &mut T;                        // elided
 522 fn get_mut<'a>(&'a mut self) -> &'a mut T;              // expanded
 523
 524 fn args<T:ToCStr>(&mut self, args: &[T]) -> &mut Command;                  // elided
 525 fn args<'a, 'b, T:ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command; // expanded
 526
 527 fn new(buf: &mut [u8]) -> BufWriter;                    // elided
 528 fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a>;         // expanded
 529 ```
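
The signatures above have no bodies, so here is a runnable sketch of the same
rules for current stable Rust (`usize` in place of `uint`). `first_nonempty`,
`Buffer`, and `push_all` are hypothetical names chosen for illustration, and
`substr` is given a simple slicing body that the guide does not specify.

```rust
// One input lifetime, so the elided output lifetime is tied to `s`.
fn substr(s: &str, until: usize) -> &str {
    &s[..until] // panics if `until` is out of range or splits a character
}

// The same signature with the lifetime written out.
fn substr_expanded<'a>(s: &'a str, until: usize) -> &'a str {
    &s[..until]
}

// Two input references and no `self`: eliding the output lifetime is an
// error, so it has to be named.
fn first_nonempty<'a>(s: &'a str, t: &'a str) -> &'a str {
    if s.is_empty() { t } else { s }
}

struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    // Two input lifetimes, but one is `&mut self`, so the output borrows
    // from `self` and not from `extra`.
    fn push_all(&mut self, extra: &[u8]) -> &mut Vec<u8> {
        self.data.extend_from_slice(extra);
        &mut self.data
    }
}

fn main() {
    println!("{}", substr("Hello, world.", 5));
    println!("{}", substr_expanded("Hello, world.", 5));
    println!("{}", first_nonempty("", "fallback"));

    let mut buf = Buffer { data: vec![1] };
    let extra = vec![2, 3];
    let data = buf.push_all(&extra);
    drop(extra); // fine: `data` does not borrow from `extra`
    data.push(4);
    println!("{:?}", data);
}
```

The `drop(extra)` line is the observable effect of the `self` rule: the
returned reference keeps `buf` borrowed, while `extra` is free as soon as the
call returns.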
 530
 531 # Related Resources
 532
 533 Coming Soon.