src/doc/guide-testing.md

   1 % The Rust Testing Guide
   2
   3 > Program testing can be a very effective way to show the presence of bugs, but
   4 > it is hopelessly inadequate for showing their absence.
   5 >
   6 > Edsger W. Dijkstra, "The Humble Programmer" (1972)
   7
   8 Let's talk about how to test Rust code. What we will not be talking about is
   9 the right way to test Rust code. There are many schools of thought regarding
  10 the right and wrong way to write tests. All of these approaches use the same
  11 basic tools, and so we'll show you the syntax for using them.
  12
  13 # The `test` attribute
  14
  15 At its simplest, a test in Rust is a function that's annotated with the `test`
  16 attribute. Let's make a new project with Cargo called `adder`:
  17
  18 ```bash
  19 $ cargo new adder
  20 $ cd adder
  21 ```
  22
  23 Cargo will automatically generate a simple test when you make a new project.
  24 Here are the contents of `src/lib.rs`:
  25
  26 ```rust
  27 #[test]
  28 fn it_works() {
  29 }
  30 ```
  31
  32 Note the `#[test]`. This attribute indicates that this is a test function. It
  33 currently has no body. That's good enough to pass! We can run the tests with
  34 `cargo test`:
  35
  36 ```bash
  37 $ cargo test
  38    Compiling adder v0.0.1 (file:///home/you/projects/adder)
  39      Running target/adder-91b3e234d4ed382a
  40
  41 running 1 test
  42 test it_works ... ok
  43
  44 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
  45
  46    Doc-tests adder
  47
  48 running 0 tests
  49
  50 test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
  51 ```
  52
  53 Cargo compiled and ran our tests. There are two sets of output here: one
  54 for the test we wrote, and another for documentation tests. We'll talk about
  55 those later. For now, see this line:
  56
  57 ```text
  58 test it_works ... ok
  59 ```
  60
  61 Note the `it_works`. This comes from the name of our function:
  62
  63 ```rust
  64 fn it_works() {
  65 # }
  66 ```
  67
  68 We also get a summary line:
  69
  70 ```text
  71 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
  72 ```
  73
  74 So why does our do-nothing test pass? Any test which doesn't `panic!` passes,
  75 and any test that does `panic!` fails. Let's make our test fail:
  76
  77 ```rust
  78 #[test]
  79 fn it_works() {
  80     assert!(false);
  81 }
  82 ```
  83
  84 `assert!` is a macro provided by Rust which takes one argument: if the argument
  85 is `true`, nothing happens. If the argument is `false`, it `panic!`s. Let's run
  86 our tests again:
  87
  88 ```bash
  89 $ cargo test
  90    Compiling adder v0.0.1 (file:///home/you/projects/adder)
  91      Running target/adder-91b3e234d4ed382a
  92
  93 running 1 test
  94 test it_works ... FAILED
  95
  96 failures:
  97
  98 ---- it_works stdout ----
  99         task 'it_works' panicked at 'assertion failed: false', /home/steve/tmp/adder/src/lib.rs:3
 100
 101
 102
 103 failures:
 104     it_works
 105
 106 test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured
 107
 108 task '<main>' panicked at 'Some tests failed', /home/steve/src/rust/src/libtest/lib.rs:247
 109 ```
 110
 111 Rust indicates that our test failed:
 112
 113 ```text
 114 test it_works ... FAILED
 115 ```
 116
 117 And that's reflected in the summary line:
 118
 119 ```text
 120 test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured
 121 ```
 122
 123 We also get a non-zero status code:
 124
 125 ```bash
 126 $ echo $?
 127 101
 128 ```
 129
 130 This is useful if you want to integrate `cargo test` into other tooling.
 131
 132 We can invert our test's failure with another attribute, `should_fail`:
 133
 134 ```rust
 135 #[test]
 136 #[should_fail]
 137 fn it_works() {
 138     assert!(false);
 139 }
 140 ```
 141
 142 This test will now succeed if we `panic!` and fail if we complete. Let's try it:
 143
 144 ```bash
 145 $ cargo test
 146    Compiling adder v0.0.1 (file:///home/you/projects/adder)
 147      Running target/adder-91b3e234d4ed382a
 148
 149 running 1 test
 150 test it_works ... ok
 151
 152 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
 153
 154    Doc-tests adder
 155
 156 running 0 tests
 157
 158 test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
 159 ```
 160
 161 Rust provides another macro, `assert_eq!`, that compares two arguments for
 162 equality:
 163
 164 ```rust
 165 #[test]
 166 #[should_fail]
 167 fn it_works() {
 168     assert_eq!("Hello", "world");
 169 }
 170 ```
 171
 172 Does this test pass or fail? Because of the `should_fail` attribute, it
 173 passes:
 174
 175 ```bash
 176 $ cargo test
 177    Compiling adder v0.0.1 (file:///home/you/projects/adder)
 178      Running target/adder-91b3e234d4ed382a
 179
 180 running 1 test
 181 test it_works ... ok
 182
 183 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
 184
 185    Doc-tests adder
 186
 187 running 0 tests
 188
 189 test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
 190 ```
 191
 192 `should_fail` tests can be fragile, as it's hard to guarantee that the test
 193 didn't fail for an unexpected reason. To help with this, an optional `expected`
 194 parameter can be added to the `should_fail` attribute. The test harness will
 195 make sure that the failure message contains the provided text. A safer version
 196 of the example above would be:
 197
 198 ```rust
 199 #[test]
 200 #[should_fail(expected = "assertion failed")]
 201 fn it_works() {
 202     assert_eq!("Hello", "world");
 203 }
 204 ```
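
To see why matching on the message helps, here's a small sketch. The `checked_divide` function is made up for this illustration, and the attribute is spelled `should_panic`, which is the name newer Rust releases use; on the toolchain this guide was written against, you'd write `should_fail` instead. The test passes only if the panic message contains the expected text, so a panic for some unrelated reason would be reported as a failure:

```rust
/// Hypothetical helper for illustration: divides `a` by `b`,
/// panicking with a known message when `b` is zero.
pub fn checked_divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

// Passes only if the test panics AND the panic message
// contains the text "division by zero".
#[test]
#[should_panic(expected = "division by zero")]
fn divide_by_zero_panics() {
    checked_divide(1, 0);
}
```

Keeping the expected text short and specific makes the test less likely to break when the rest of the message changes.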
 205
 206 That's all there is to the basics! Let's write one 'real' test:
 207
 208 ```{rust,ignore}
 209 pub fn add_two(a: i32) -> i32 {
 210     a + 2
 211 }
 212
 213 #[test]
 214 fn it_works() {
 215     assert_eq!(4, add_two(2));
 216 }
 217 ```
 218
 219 This is a very common use of `assert_eq!`: call some function with
 220 some known arguments and compare its result to the expected output.
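
The same pattern scales to several known inputs. Here's one possible sketch: the input/expected pairs and the test name are chosen just for illustration, and the optional message argument to `assert_eq!` assumes a reasonably recent Rust release:

```rust
pub fn add_two(a: i32) -> i32 {
    a + 2
}

#[test]
fn add_two_handles_several_inputs() {
    // (input, expected) pairs, picked for illustration.
    let cases = [(0, 2), (2, 4), (-2, 0), (-5, -3)];

    for &(input, expected) in cases.iter() {
        // The trailing arguments are only formatted if the assertion fails.
        assert_eq!(expected, add_two(input), "wrong result for add_two({})", input);
    }
}
```

Because the first failing pair panics, the remaining pairs are not checked in that run; the message tells you which input was at fault.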
 221
 222 # The `tests` module
 223
 224 There is one way in which our existing example is not idiomatic: it's
 225 missing the `tests` module. The idiomatic way of writing our example
 226 looks like this:
 227
 228 ```{rust,ignore}
 229 pub fn add_two(a: i32) -> i32 {
 230     a + 2
 231 }
 232
 233 #[cfg(test)]
 234 mod tests {
 235     use super::add_two;
 236
 237     #[test]
 238     fn it_works() {
 239         assert_eq!(4, add_two(2));
 240     }
 241 }
 242 ```
 243
 244 There are a few changes here. The first is the introduction of a `mod tests` with
 245 a `cfg` attribute. The module lets us group all of our tests together, and
 246 also define any helper functions they need without making them part of the rest
 247 of our crate. The `cfg` attribute only compiles our test code if we're
 248 currently trying to run the tests. This can save compile time, and also ensures
 249 that our tests are entirely left out of a normal build.
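
As a rough sketch of what a test-only helper might look like, here's a variation with a made-up `is_even` function inside the module. Since the whole module is behind `cfg(test)`, the helper is only compiled when we run the tests:

```rust
pub fn add_two(a: i32) -> i32 {
    a + 2
}

#[cfg(test)]
mod tests {
    use super::add_two;

    // Hypothetical helper, used only by the tests below. It lives inside
    // the `cfg(test)` module, so a normal build never compiles it.
    fn is_even(n: i32) -> bool {
        n % 2 == 0
    }

    #[test]
    fn add_two_preserves_parity() {
        // Adding two should never turn an even number odd, or vice versa.
        for n in -10..10 {
            assert_eq!(is_even(n), is_even(add_two(n)));
        }
    }
}
```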
 250
 251 The second change is the `use` declaration. Because we're in an inner module,
 252 we need to bring the function under test into scope. This can be annoying if you have
 253 a large module, and so this is a common use of the `glob` feature. Let's change
 254 our `src/lib.rs` to make use of it:
 255
 256 ```{rust,ignore}
 257 #![feature(globs)]
 258
 259 pub fn add_two(a: i32) -> i32 {
 260     a + 2
 261 }
 262
 263 #[cfg(test)]
 264 mod tests {
 265     use super::*;
 266
 267     #[test]
 268     fn it_works() {
 269         assert_eq!(4, add_two(2));
 270     }
 271 }
 272 ```
 273
 274 Note the `feature` attribute, as well as the different `use` line. Now we run
 275 our tests:
 276
 277 ```bash
 278 $ cargo test
 279     Updating registry `https://github.com/rust-lang/crates.io-index`
 280    Compiling adder v0.0.1 (file:///home/you/projects/adder)
 281      Running target/adder-91b3e234d4ed382a
 282
 283 running 1 test
 284 test tests::it_works ... ok
 285
 286 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
 287
 288    Doc-tests adder
 289
 290 running 0 tests
 291
 292 test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
 293 ```
 294
 295 It works!
 296
 297 The current convention is to use the `tests` module to hold your "unit"-style
 298 tests. Anything that just tests one small bit of functionality makes sense to
 299 go here. But what about "integration"-style tests instead? For that, we have
 300 the `tests` directory.
 301
 302 # The `tests` directory
 303
 304 To write an integration test, let's make a `tests` directory, and
 305 put a `tests/lib.rs` file inside, with this as its contents:
 306
 307 ```{rust,ignore}
 308 extern crate adder;
 309
 310 #[test]
 311 fn it_works() {
 312     assert_eq!(4, adder::add_two(2));
 313 }
 314 ```
 315
 316 This looks similar to our previous tests, but slightly different. We now have
 317 an `extern crate adder` at the top. This is because the tests in the `tests`
 318 directory are an entirely separate crate, and so we need to import our library.
 319 This is also why `tests` is a suitable place to write integration-style tests:
 320 they use the library like any other consumer of it would.
 321
 322 Let's run them:
 323
 324 ```bash
 325 $ cargo test
 326    Compiling adder v0.0.1 (file:///home/you/projects/adder)
 327      Running target/adder-91b3e234d4ed382a
 328
 329 running 1 test
 330 test tests::it_works ... ok
 331
 332 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
 333
 334      Running target/lib-c18e7d3494509e74
 335
 336 running 1 test
 337 test it_works ... ok
 338
 339 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
 340
 341    Doc-tests adder
 342
 343 running 0 tests
 344
 345 test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
 346 ```
 347
 348 Now we have three sections: our previous test is also run, as well as our new
 349 one.
 350
 351 That's all there is to the `tests` directory. The `tests` module isn't needed
 352 here, since the whole thing is focused on tests.
 353
 354 Let's finally check out that third section: documentation tests.
 355
 356 # Documentation tests
 357
 358 Nothing is better than documentation with examples. Nothing is worse than
 359 examples that don't actually work, because the code has changed since the
 360 documentation has been written. To this end, Rust supports automatically
 361 running examples in your documentation. Here's a fleshed-out `src/lib.rs`
 362 with examples:
 363
 364 ```{rust,ignore}
 365 //! The `adder` crate provides functions that add numbers to other numbers.
 366 //!
 367 //! # Examples
 368 //!
 369 //! ```
 370 //! assert_eq!(4, adder::add_two(2));
 371 //! ```
 372
 373 #![feature(globs)]
 374
 375 /// This function adds two to its argument.
 376 ///
 377 /// # Examples
 378 ///
 379 /// ```
 380 /// use adder::add_two;
 381 ///
 382 /// assert_eq!(4, add_two(2));
 383 /// ```
 384 pub fn add_two(a: i32) -> i32 {
 385     a + 2
 386 }
 387
 388 #[cfg(test)]
 389 mod tests {
 390     use super::*;
 391
 392     #[test]
 393     fn it_works() {
 394         assert_eq!(4, add_two(2));
 395     }
 396 }
 397 ```
 398
 399 Note the module-level documentation with `//!` and the function-level
 400 documentation with `///`. Rust's documentation supports Markdown in comments,
 401 and so triple backticks mark code blocks. It is conventional to include the
 402 `# Examples` section, exactly like that, with examples following.
 403
 404 Let's run the tests again:
 405
 406 ```bash
 407 $ cargo test
 408    Compiling adder v0.0.1 (file:///home/steve/tmp/adder)
 409      Running target/adder-91b3e234d4ed382a
 410
 411 running 1 test
 412 test tests::it_works ... ok
 413
 414 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
 415
 416      Running target/lib-c18e7d3494509e74
 417
 418 running 1 test
 419 test it_works ... ok
 420
 421 test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
 422
 423    Doc-tests adder
 424
 425 running 2 tests
 426 test add_two_0 ... ok
 427 test _0 ... ok
 428
 429 test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured
 430 ```
 431
 432 Now we have all three kinds of tests running! Note the names of the
 433 documentation tests: the `_0` is generated for the module test, and `add_two_0`
 434 for the function test. These will auto-increment with names like `add_two_1` as
 435 you add more examples.
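
For instance, a function documented with two examples might look like the sketch below. It assumes the crate is still named `adder`; the second example is added here purely for illustration. Going by the naming described above, the two blocks would presumably be reported as `add_two_0` and `add_two_1`, though newer toolchains may label documentation tests differently (for example by file and line):

```rust
/// This function adds two to its argument.
///
/// # Examples
///
/// ```
/// use adder::add_two;
///
/// assert_eq!(4, add_two(2));
/// ```
///
/// Negative arguments work too:
///
/// ```
/// use adder::add_two;
///
/// assert_eq!(0, add_two(-2));
/// ```
pub fn add_two(a: i32) -> i32 {
    a + 2
}
```

Each fenced block in the comment is compiled and run as its own small program, which is why both examples repeat the `use` line.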
 436
 437 # Benchmark tests
 438
 439 Rust also supports benchmark tests, which can test the performance of your
 440 code. Let's make our `src/lib.rs` look like this (comments elided):
 441
 442 ```{rust,ignore}
 443 #![feature(globs)]
 444
 445 extern crate test;
 446
 447 pub fn add_two(a: i32) -> i32 {
 448     a + 2
 449 }
 450
 451 #[cfg(test)]
 452 mod tests {
 453     use super::*;
 454     use test::Bencher;
 455
 456     #[test]
 457     fn it_works() {
 458         assert_eq!(4, add_two(2));
 459     }
 460
 461     #[bench]
 462     fn bench_add_two(b: &mut Bencher) {
 463         b.iter(|| add_two(2));
 464     }
 465 }
 466 ```
 467
 468 We've imported the `test` crate, which contains our benchmarking support.
 469 We have a new function as well, with the `bench` attribute. Unlike regular
 470 tests, which take no arguments, benchmark tests take a `&mut Bencher`. This
 471 `Bencher` provides an `iter` method, which takes a closure. This closure
 472 contains the code we'd like to benchmark.
 473
 474 We can run benchmark tests with `cargo bench`:
 475
 476 ```bash
 477 $ cargo bench
 478    Compiling adder v0.0.1 (file:///home/steve/tmp/adder)
 479      Running target/release/adder-91b3e234d4ed382a
 480
 481 running 2 tests
 482 test tests::it_works ... ignored
 483 test tests::bench_add_two ... bench:         1 ns/iter (+/- 0)
 484
 485 test result: ok. 0 passed; 0 failed; 1 ignored; 1 measured
 486 ```
 487
 488 Our non-benchmark test was ignored. You may have noticed that `cargo bench`
 489 takes a bit longer than `cargo test`. This is because Rust runs our benchmark
 490 a number of times, and then takes the average. Because we're doing so little
 491 work in this example, we have a `1 ns/iter (+/- 0)`, but this would show
 492 the variance if there were one.
 493
 494 Advice on writing benchmarks:
 495
 496
 497 * Move setup code outside the `iter` loop; only put the part you want to measure inside
 498 * Make the code do "the same thing" on each iteration; do not accumulate or change state
 499 * Make the outer function idempotent too; the benchmark runner is likely to run
 500   it many times
 501 * Make the inner `iter` loop short and fast so benchmark runs are fast and the
 502   calibrator can adjust the run-length at fine resolution
 503 * Make the code in the `iter` loop do something simple, to assist in pinpointing
 504   performance improvements (or regressions)
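
To make the first two points concrete, here's a minimal sketch that does not use `Bencher` at all. It assumes a recent stable toolchain and times the work by hand with `std::time::Instant`; the names `time_iters`, `sum`, and `bench_sum` are invented for this illustration, and `time_iters` is only a rough stand-in for what `iter` does, not how the real runner is implemented:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `Bencher::iter`: calls `f` `iters` times
/// and returns the total elapsed time.
fn time_iters<T, F: FnMut() -> T>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // Treat the result as used so the call is not optimized away.
        black_box(f());
    }
    start.elapsed()
}

/// The only work we want to measure.
pub fn sum(values: &[u64]) -> u64 {
    values.iter().sum()
}

pub fn bench_sum() -> Duration {
    // Setup happens once, outside the timed closure.
    let values: Vec<u64> = (0..1_000).collect();

    // The closure only reads `values`, so every iteration does the same
    // thing and no state accumulates between runs.
    time_iters(10_000, || sum(black_box(values.as_slice())))
}
```

If the `Vec` were built inside the closure instead, the allocation would be timed on every iteration and would likely dominate the measurement.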
 505
 506 ## Gotcha: optimizations
 507
 508 There's another tricky part to writing benchmarks: benchmarks compiled with
 509 optimizations activated can be dramatically changed by the optimizer so that
 510 the benchmark is no longer benchmarking what one expects. For example, the
 511 compiler might recognize that some calculation has no external effects and
 512 remove it entirely.
 513
 514 ```{rust,ignore}
 515 extern crate test;
 516 use test::Bencher;
 517
 518 #[bench]
 519 fn bench_xor_1000_ints(b: &mut Bencher) {
 520     b.iter(|| {
 521         range(0u, 1000).fold(0, |old, new| old ^ new);
 522     });
 523 }
 524 ```
 525
 526 This gives the following results:
 527
 528 ```text
 529 running 1 test
 530 test bench_xor_1000_ints ... bench:         0 ns/iter (+/- 0)
 531
 532 test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
 533 ```
 534
 535 The benchmarking runner offers two ways to avoid this. First, the closure that
 536 the `iter` method receives can return an arbitrary value, which forces the
 537 optimizer to consider the result used and ensures it cannot remove the
 538 computation entirely. This could be done for the example above by adjusting the
 539 `b.iter` call to:
 540
 541 ```rust
 542 # struct X;
 543 # impl X { fn iter<T, F>(&self, _: F) where F: FnMut() -> T {} } let b = X;
 544 b.iter(|| {
 545     // note lack of `;` (could also use an explicit `return`).
 546     range(0u, 1000).fold(0, |old, new| old ^ new)
 547 });
 548 ```
 549
 550 The other option is to call the generic `test::black_box` function, which
 551 is an opaque "black box" to the optimizer and so forces it to consider any
 552 argument as used:
 553
 554 ```rust
 555 extern crate test;
 556
 557 # fn main() {
 558 # struct X;
 559 # impl X { fn iter<T, F>(&self, _: F) where F: FnMut() -> T {} } let b = X;
 560 b.iter(|| {
 561     let mut n = 1000_u32;
 562
 563     test::black_box(&mut n); // pretend to modify `n`
 564
 565     range(0, n).fold(0, |a, b| a ^ b)
 566 })
 567 # }
 568 ```
 569
 570 Neither of these reads or modifies the value, and both are very cheap for small values.
 571 Larger values can be passed indirectly to reduce overhead (e.g.
 572 `black_box(&huge_struct)`).
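
Here's a small sketch of passing a large value indirectly. It uses `std::hint::black_box`, which recent stable toolchains provide, in place of `test::black_box`; `HugeStruct`, `xor_all`, and `measured_body` are hypothetical names made up for the example:

```rust
use std::hint::black_box;

/// Hypothetical large value, for illustration only (32 KiB of data).
pub struct HugeStruct {
    pub data: [u64; 4096],
}

/// XORs every element together.
pub fn xor_all(huge: &HugeStruct) -> u64 {
    huge.data.iter().fold(0, |acc, x| acc ^ x)
}

/// The body we would hand to `iter`.
pub fn measured_body(huge: &HugeStruct) -> u64 {
    // Pass a reference, so only a pointer goes through `black_box`
    // rather than the whole 32 KiB struct.
    let opaque: &HugeStruct = black_box(huge);

    // Return the result so that it counts as used.
    xor_all(opaque)
}
```

The struct itself is never copied here; the optimizer just has to assume that whatever is behind the reference may have been looked at.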
 573
 574 Performing either of the above changes gives the following benchmarking results:
 575
 576 ```text
 577 running 1 test
 578 test bench_xor_1000_ints ... bench:       1 ns/iter (+/- 0)
 579
 580 test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
 581 ```
 582
 583 However, the optimizer can still modify a test case in an undesirable manner
 584 even when using either of the above.