src/doc/style/ownership/builders.md

% The builder pattern

Some data structures are complicated to construct because their construction needs:

* a large number of inputs
* compound data (e.g. slices)
* optional configuration data
* a choice between several flavors

which can easily lead to a large number of distinct constructors with
many arguments each.

If `T` is such a data structure, consider introducing a `T` _builder_:

1. Introduce a separate data type `TBuilder` for incrementally configuring a `T`
   value. When possible, choose a better name: e.g. `Command` is the builder for
   `Process`.
2. The builder constructor should take as parameters only the data _required_ to
   make a `T`.
3. The builder should offer a suite of convenient methods for configuration,
   including setting up compound inputs (like slices) incrementally.
   These methods should return `self` to allow chaining.
4. The builder should provide one or more "_terminal_" methods for actually building a `T`.

The builder pattern is especially appropriate when building a `T` involves side
effects, such as spawning a thread or launching a process.

In Rust, there are two variants of the builder pattern, differing in the
treatment of ownership, as described below.

### Non-consuming builders (preferred):

In some cases, constructing the final `T` does not require the builder itself to
be consumed. The following variant on
[`std::process::Command`](https://doc.rust-lang.org/stable/std/process/struct.Command.html)
is one example:

```rust,ignore
// NOTE: the actual Command API does not use owned Strings;
// this is a simplified version.

pub struct Command {
    program: String,
    args: Vec<String>,
    cwd: Option<String>,
    // etc
}

impl Command {
    pub fn new(program: String) -> Command {
        Command {
            program: program,
            args: Vec::new(),
            cwd: None,
        }
    }

    /// Add an argument to pass to the program.
    pub fn arg<'a>(&'a mut self, arg: String) -> &'a mut Command {
        self.args.push(arg);
        self
    }

    /// Add multiple arguments to pass to the program.
    pub fn args<'a>(&'a mut self, args: &[String])
                    -> &'a mut Command {
        // Clone each argument out of the slice and append it.
        self.args.extend_from_slice(args);
        self
    }

    /// Set the working directory for the child process.
    pub fn cwd<'a>(&'a mut self, dir: String) -> &'a mut Command {
        self.cwd = Some(dir);
        self
    }

    /// Executes the command as a child process, which is returned.
    pub fn spawn(&self) -> std::io::Result<Process> {
        ...
    }
}
```

Note that the `spawn` method, which actually uses the builder configuration to
spawn a process, takes the builder by immutable reference. This is possible
because spawning the process does not require ownership of the configuration
data.

Because the terminal `spawn` method only needs a reference, the configuration
methods take and return a mutable borrow of `self`.
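
As a minimal, self-contained sketch of the same ownership arrangement, consider the example below. `Request` and `RequestBuilder` are hypothetical names invented for illustration and are not part of any library. The configuration method borrows the builder mutably, while the terminal `build` method only reads it, so the builder stays usable after building.

```rust
/// Hypothetical product type, used only for illustration.
#[derive(Debug, PartialEq)]
pub struct Request {
    url: String,
    headers: Vec<String>,
}

/// Hypothetical non-consuming builder for `Request`.
pub struct RequestBuilder {
    url: String,
    headers: Vec<String>,
}

impl RequestBuilder {
    /// Takes only the data required to make a `Request`.
    pub fn new(url: &str) -> RequestBuilder {
        RequestBuilder { url: url.to_string(), headers: Vec::new() }
    }

    /// Configuration method: takes and returns `&mut self` so calls can chain.
    pub fn header(&mut self, header: &str) -> &mut RequestBuilder {
        self.headers.push(header.to_string());
        self
    }

    /// Terminal method: only reads the configuration, so `&self` suffices.
    pub fn build(&self) -> Request {
        Request { url: self.url.clone(), headers: self.headers.clone() }
    }
}

fn main() {
    let mut builder = RequestBuilder::new("http://example.com");
    builder.header("Accept: */*");

    // The builder is still usable after a terminal call.
    let first = builder.build();
    let second = builder.header("X-Debug: 1").build();

    assert_eq!(first.headers.len(), 1);
    assert_eq!(second.headers.len(), 2);
    assert_eq!(second.url, "http://example.com");
}
```

The cost of this design in the sketch is that `build` clones the configuration; that is acceptable only when the terminal method does not need to take ownership of anything held by the builder.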

#### The benefit

By using borrows throughout, `Command` can be used conveniently for both
one-liner and more complex constructions:

```rust,ignore
// One-liners
Command::new("/bin/cat").arg("file.txt").spawn();

// Complex configuration
let mut cmd = Command::new("/bin/ls");
cmd.arg(".");

if size_sorted {
    cmd.arg("-S");
}

cmd.spawn();
```

### Consuming builders:

Sometimes builders must transfer ownership when constructing the final type
`T`, meaning that the terminal methods must take `self` rather than `&self`:

```rust,ignore
// A simplified builder loosely modeled on std::thread::Builder;
// the current std API differs.

use std::io::Write;

impl ThreadBuilder {
    /// Name the thread-to-be. Currently the name is used for identification
    /// only in panic messages.
    pub fn named(mut self, name: String) -> ThreadBuilder {
        self.name = Some(name);
        self
    }

    /// Redirect thread-local stdout.
    pub fn stdout(mut self, stdout: Box<dyn Write + Send>) -> ThreadBuilder {
        self.stdout = Some(stdout);
        //   ^~~~~~ this is owned and cannot be cloned/re-used
        self
    }

    /// Creates and executes a new child thread.
    pub fn spawn<F>(self, f: F)
        where F: FnOnce() + Send + 'static
    {
        // consume self
        ...
    }
}
```

Here, the `stdout` configuration involves passing ownership of a writer,
which must be transferred to the thread upon construction (in `spawn`).

When the terminal methods of the builder require ownership, there is a basic tradeoff:

* If the other builder methods take/return a mutable borrow, the complex
  configuration case will work well, but one-liner configuration becomes
  _impossible_.

* If the other builder methods take/return an owned `self`, one-liners
  continue to work well but complex configuration is less convenient.
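
The first case can be illustrated with a small sketch. `Job` and `JobBuilder` are hypothetical names invented for this example. The configuration method borrows the builder while the terminal `build` method consumes it, so the chained one-liner is rejected by the compiler and only the multi-statement form works.

```rust
/// Hypothetical product type, used only for illustration.
pub struct Job {
    pub name: String,
    pub payload: Vec<u8>,
}

/// Hypothetical builder mixing borrowing configuration methods
/// with a consuming terminal method.
pub struct JobBuilder {
    name: String,
    payload: Vec<u8>,
}

impl JobBuilder {
    pub fn new(payload: Vec<u8>) -> JobBuilder {
        JobBuilder { name: String::new(), payload }
    }

    /// Borrowing configuration method.
    pub fn name(&mut self, name: &str) -> &mut JobBuilder {
        self.name = name.to_string();
        self
    }

    /// Consuming terminal method: moves the payload into the `Job`.
    pub fn build(self) -> Job {
        Job { name: self.name, payload: self.payload }
    }
}

fn main() {
    // Does not compile: `name` yields `&mut JobBuilder`, and `build` cannot
    // move the builder out from behind that reference.
    // let job = JobBuilder::new(vec![1, 2, 3]).name("one-liner").build();

    // The multi-statement form works, because `builder` is an owned value here.
    let mut builder = JobBuilder::new(vec![1, 2, 3]);
    builder.name("multi-step");
    let job = builder.build();

    assert_eq!(job.name, "multi-step");
    assert_eq!(job.payload, vec![1, 2, 3]);
}
```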

Under the rubric of making easy things easy and hard things possible, _all_
builder methods for a consuming builder should take and return an owned
`self`. Then client code works as follows:

```rust,ignore
// One-liners
ThreadBuilder::new().named("my_thread".to_string()).spawn(|| { ... });

// Complex configuration
let mut thread = ThreadBuilder::new();
thread = thread.named("my_thread_2".to_string()); // must re-assign to retain ownership

if reroute {
    thread = thread.stdout(mywriter);
}

thread.spawn(|| { ... });
```

One-liners work as before, because ownership is threaded through each of the
builder methods until being consumed by `spawn`. Complex configuration,
however, is more verbose: it requires re-assigning the builder at each step.
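
The re-assignment style can be tried against the standard library's own `std::thread::Builder`, whose `name`, `stack_size` and `spawn` methods all take `self` by value. The sketch below is one possible usage; the helper `spawn_worker`, the thread name, and the stack size are assumptions chosen for illustration.

```rust
use std::io;
use std::thread;

/// Hypothetical helper: spawns a worker that reports its own thread name,
/// optionally requesting a larger stack.
fn spawn_worker(big_stack: bool) -> io::Result<thread::JoinHandle<Option<String>>> {
    let mut builder = thread::Builder::new();
    builder = builder.name("worker".to_string()); // re-assign to retain ownership

    if big_stack {
        builder = builder.stack_size(4 * 1024 * 1024);
    }

    // `spawn` consumes the builder and returns a join handle on success.
    builder.spawn(|| thread::current().name().map(|n| n.to_string()))
}

fn main() {
    let handle = spawn_worker(true).expect("failed to spawn thread");
    let name = handle.join().expect("worker panicked");
    assert_eq!(name.as_deref(), Some("worker"));
}
```

Because every method returns the builder by value, the same calls also chain as a one-liner, e.g. `thread::Builder::new().name("worker".to_string()).spawn(|| ())`.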