README.md

# rustfmt

A tool for formatting Rust code according to style guidelines.


## How to use

You'll need a reasonably recent **nightly** version of Rust.
You will also need a `default.toml` file in the current working directory when
you run the rustfmt command. This repo contains an example `default.toml` file.

`cargo build` to build.

`cargo test` to run all tests.

`cargo run -- filename` to run on a file. If the file includes out-of-line
modules, we reformat those too, so to run on a whole module or crate you only
need to run on the top file.

You'll probably want to specify the write mode. Currently, there are three
modes: replace, overwrite, and display. Replace is the default; it renames the
original files and then writes the formatted output in their place. In overwrite
mode, rustfmt does not back up the source files. To print the output to stdout,
use display mode. The write mode can be set by passing the `--write-mode` flag
on the command line.

For example, `cargo run -- filename --write-mode=display` prints the output of
rustfmt to the screen.
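
One way to picture the three modes is as a small enum parsed from the value given to `--write-mode`. This is only a sketch: the `WriteMode` type, `parse_write_mode`, and the `replace` and `overwrite` strings are assumptions made for illustration (only `display` appears in the command above), and none of it is taken from rustfmt's source.

```rust
/// Hypothetical model of the three write modes described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WriteMode {
    /// Rename the original file, then write the formatted output in its place.
    Replace,
    /// Write over the original file without keeping a backup.
    Overwrite,
    /// Print the formatted output to stdout and leave the files alone.
    Display,
}

/// Parse the value passed to `--write-mode`.
/// `None` means the flag was absent, so we use the default (replace).
/// The accepted strings for replace and overwrite are assumed.
fn parse_write_mode(value: Option<&str>) -> Result<WriteMode, String> {
    match value {
        None | Some("replace") => Ok(WriteMode::Replace),
        Some("overwrite") => Ok(WriteMode::Overwrite),
        Some("display") => Ok(WriteMode::Display),
        Some(other) => Err(format!("unknown write mode: {}", other)),
    }
}
```

Modelling the default as the `None` case keeps the "replace unless told otherwise" rule in one place.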

## Use cases

A formatting tool can be used in different ways and the different use cases can
affect the design of the tool. The use cases I'm particularly concerned with are:

* running on a whole repo before check-in
  - in particular, to replace the `make tidy` pass on the Rust distro
* running on code from another project that you are adding to your own
* using for mass changes in code style over a project

Some valid use cases for a formatting tool which I am explicitly not trying to
address (although it would be nice, if possible):

* running 'as you type' in an IDE
* running on arbitrary snippets of code
* running on Rust-like code, specifically code which doesn't parse
* use as a pretty printer inside the compiler
* refactoring
* formatting totally unformatted source code


## Scope and vision

I do not subscribe to the notion that a formatting tool should only change
whitespace. I believe that we should be semantics preserving, but not
necessarily syntax preserving, i.e., we can change the AST of a program.

For example, we might change glob imports to list or single imports, re-order
imports, move bounds to where clauses, combine multiple impls into a single
impl, etc.
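
To make one of those concrete, here is a small illustration of moving a bound to a where clause. The two functions have different syntax trees but the same meaning to the type checker. The function names are made up for this example.

```rust
use std::fmt::Display;

// Before: the bound is written inline on the type parameter.
fn describe_inline<T: Display>(value: T) -> String {
    format!("<{}>", value)
}

// After: the same bound moved to a where clause. The AST is different,
// but the function accepts exactly the same types and behaves the same.
fn describe_where<T>(value: T) -> String
where
    T: Display,
{
    format!("<{}>", value)
}
```

Calling either function with the same argument, say `describe_inline(42)` and `describe_where(42)`, produces the same string.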

However, we will not change the names of variables or make any changes which
*could* change the semantics. To be ever so slightly formal, we might imagine
a compiler's high-level intermediate representation (HIR); we should strive to
only make changes which do not change the HIR, even if they do change the AST.

I would like to be able to output refactoring scripts for making deeper changes
though (e.g., renaming variables to satisfy our style guidelines).

My long term goal is that all style lints can be moved from the compiler to
rustfmt and, as well as warning, can either fix problems or emit refactoring
scripts to do so.

### Configurability

I believe reformatting should be configurable to some extent. We should read in
options from a configuration file and reformat accordingly. We should supply at
least a config file which matches the Rust style guidelines.

There should be multiple modes for running the tool. As well as simply replacing
each file, we should be able to show the user a list of the changes we would
make, or show a list of violations without corrections (the difference being
that there are multiple ways to satisfy a given set of style guidelines, and we
should distinguish violations from deviations from our own model).


## Implementation philosophy

Some details of the philosophy behind the implementation.


### Operate on the AST

A reformatting tool can be based on either the AST or a token stream (in Rust
this is actually a stream of token trees, but it's not a fundamental difference).
There are pros and cons to the two approaches. I have chosen to use the AST
approach. The primary reasons are that it allows us to do more sophisticated
manipulations, rather than just change whitespace, and it gives us more context
when making those changes.

The advantage of the tokens approach is that you can operate on non-parsable
code. I don't care too much about that. It would be nice, but I think being able
to perform sophisticated transformations is more important. In the future I hope to
(optionally) be able to use type information for informing reformatting too. One
specific case of unparsable code is macros. Using tokens is certainly easier
here, but I believe it is perfectly solvable with the AST approach. At the limit,
we can operate on just tokens in the macro case.

I believe that there is not in fact that much difference between the two
approaches. Due to imperfect span information, under the AST approach, we
are sometimes reduced to examining tokens or doing some re-lexing of our own.
Under the tokens approach you need to implement your own (much simpler) parser.
I believe that as the tool gets more sophisticated, you end up doing more at the
token-level, or having an increasingly sophisticated parser, until at the limit
you have the same tool.

However, I believe starting from the AST gets you more quickly to a usable and
useful tool.


### Heuristic rather than algorithmic

Many formatting tools use a very general algorithmic or even algebraic tool for
pretty printing. This results in very elegant code, but I believe does not give
the best results. I prefer a more ad hoc approach where each expression/item is
formatted using custom rules. This will give a bigger code base, but hopefully a
better result, and with good old-fashioned abstraction and code sharing we
hopefully don't end up with too much code.

It also means that there will be some cases we can't format and we have to give
up. I think that is OK. Hopefully they are rare enough that manually fixing them
is not painful. Better to have a tool that gives great code in 99% of cases and
fails in 1% than a tool which gives 50% great code and 50% ugly code, but never
fails.


### Incremental development

I want rustfmt to be useful as soon as possible and to always be useful. I
specifically don't want to have to wait for a feature (or worse, the whole tool)
to be perfect before it is useful. The main ways to achieve this are to output
the source code unchanged where we can't yet reformat, to be able to turn off
new features until they are ready, and the 'do no harm' principle (see next
section).


### First, do no harm

Until rustfmt is perfect, there will always be a trade-off between doing more and
doing existing things well. I want to err on the side of the latter.
Specifically, rustfmt should never take OK code and make it look worse. If we
can't make it better, we should leave it as is. That might mean being less
aggressive than we would like or using configurability.


### Use the source code as guidance

There are often multiple ways to format code and satisfy standards. Where this
is the case, we should use the source code as a hint for reformatting.
Furthermore, where the code has been formatted in a particular way that satisfies
the coding standard, it should not be changed (this is sometimes not possible or
not worthwhile due to uniformity being desirable, but it is a useful goal).


### Architecture details

We use the AST from libsyntax. We use libsyntax's visit module to walk the AST
to find starting points for reformatting. Eventually, we should reformat everything
and we shouldn't need the visit module. We keep track of the last formatted
position in the code, and when we reformat the next piece of code we make sure
to output the span for all the code in between (handled by `missed_spans.rs`).
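
The idea of tracking the last formatted position can be sketched as follows. This is a simplified illustration, not the code in `missed_spans.rs`: the `Rewrite` struct and `apply_rewrites` function are hypothetical, and the sketch works on byte offsets into a string rather than on libsyntax spans.

```rust
/// A reformatted region: a byte range in the original source plus its new text.
/// (Hypothetical type; offsets are assumed to fall on character boundaries.)
struct Rewrite {
    start: usize,
    end: usize,
    text: String,
}

/// Stitch the rewritten regions back into the source, copying everything
/// between them unchanged. `rewrites` must be sorted and non-overlapping.
fn apply_rewrites(source: &str, rewrites: &[Rewrite]) -> String {
    let mut out = String::new();
    let mut last_pos = 0; // end of the last region we have output
    for r in rewrites {
        // The "missed span": code we did not reformat, output as is.
        out.push_str(&source[last_pos..r.start]);
        // The reformatted replacement for this region.
        out.push_str(&r.text);
        last_pos = r.end;
    }
    // Whatever follows the final rewritten region.
    out.push_str(&source[last_pos..]);
    out
}
```

Anything we don't know how to format yet simply never appears in `rewrites`, so it passes through untouched.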

Our visitor keeps track of the desired current indent due to blocks
(`block_indent`). Each `visit_*` method reformats code according to this indent
and `IDEAL_WIDTH` and `MAX_WIDTH` (which should one day be supplied from a
config file). Most reformatting done in the `visit_*` methods is a bit hacky
and is meant to be temporary until it can be done properly.

There are a bunch of methods called `rewrite_*`. These do the bulk of the
reformatting. They take the AST node to be reformatted (this may not literally
be an AST node from libsyntax; there might be multiple parameters describing a
logical node), the current indent, and the current width budget. They return a
`String` (or sometimes an `Option<String>`) which formats the code in the box
given by the indent and width budget. If the method fails, it returns `None` and
the calling method then has to fall back in some way to give the callee more space.
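
A minimal sketch of that shape, under heavy assumptions: `Node` is a toy stand-in for an AST node, `rewrite_node` is a made-up name, and the sketch only handles single-line output, so it takes a width budget but no indent.

```rust
/// Toy stand-in for an AST node: a leaf token, or a call with arguments.
enum Node {
    Leaf(String),
    Call { name: String, args: Vec<Node> },
}

/// Rewrite `node` on a single line using at most `width` columns.
/// Returns `None` if it doesn't fit, so the caller can fall back.
fn rewrite_node(node: &Node, width: usize) -> Option<String> {
    match node {
        // At a leaf we produce an actual string, if the budget allows.
        Node::Leaf(text) => {
            if text.len() <= width {
                Some(text.clone())
            } else {
                None
            }
        }
        Node::Call { name, args } => {
            // Budget left for the arguments after `name(` and `)`.
            let mut remaining = width.checked_sub(name.len() + 2)?;
            let mut parts = Vec::new();
            for (i, arg) in args.iter().enumerate() {
                if i > 0 {
                    // Room for the ", " separator.
                    remaining = remaining.checked_sub(2)?;
                }
                // Recurse with the reduced budget; any failure propagates up.
                let s = rewrite_node(arg, remaining)?;
                remaining -= s.len();
                parts.push(s);
            }
            // Combine the child strings on the way back up the tree.
            Some(format!("{}({})", name, parts.join(", ")))
        }
    }
}
```

The `?` operator does the propagation: as soon as any child fails to fit, the whole rewrite returns `None` and the caller decides how to find more space.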

So, in summary, to format a node, we calculate the width budget and then walk down
the tree from the node. At a leaf, we generate an actual string and then unwind,
combining these strings as we go back up the tree.

For example, consider a method definition:

```
    fn foo(a: A, b: B) {
        ...
    }
```

We start at indent 4. The rewrite function for the whole function knows it must
write `fn foo(` before the arguments and `) {` after them. Assuming the max width
is 100, it asks the rewrite function for the argument list to rewrite with an
indent of 11 and a width of 86. Assuming that is possible (as it obviously is in
this case), it returns a string for the arguments, which the caller uses to
build the string for the function header. If the arguments couldn't be fitted in
that space, we might try to fall back to a hanging indent, so we try again with
indent 8 and width 89.
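
The numbers in that example come from simple arithmetic, which the sketch below spells out. The `Budget` struct and both function names are hypothetical, and the tab width of 4 for the hanging indent is inferred from the example (indent 8 = 4 + 4) rather than stated anywhere.

```rust
/// Indent and width handed to the function that rewrites the argument list.
#[derive(Debug, PartialEq)]
struct Budget {
    indent: usize,
    width: usize,
}

/// Arguments start right after `prefix` (e.g. "fn foo(") on the same line,
/// and must leave room for `suffix` (e.g. ") {") before `max_width`.
fn visual_budget(block_indent: usize, prefix: &str, suffix: &str, max_width: usize) -> Option<Budget> {
    let indent = block_indent + prefix.len();
    // `None` if there is no room at all.
    let width = max_width.checked_sub(indent + suffix.len())?;
    Some(Budget { indent, width })
}

/// Fallback: arguments go on their own line, one extra level (`tab` columns) in.
fn hanging_budget(block_indent: usize, tab: usize, suffix: &str, max_width: usize) -> Option<Budget> {
    let indent = block_indent + tab;
    let width = max_width.checked_sub(indent + suffix.len())?;
    Some(Budget { indent, width })
}
```

With the values from the example, `visual_budget(4, "fn foo(", ") {", 100)` gives indent 11 and width 86 (100 - 11 - 3), and `hanging_budget(4, 4, ") {", 100)` gives indent 8 and width 89 (100 - 8 - 3).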


## Contributing

### Test and file issues

It would be really useful to have people use rustfmt on their projects and file
issues where it does something you don't expect.

A really useful thing to do is to run rustfmt on a crate from the Rust repo. If
it does something unexpected, file an issue; if not, make a PR to the Rust repo
with the reformatted code. I hope to get the whole repo consistently rustfmt'ed
and to replace `make tidy` with rustfmt as a medium-term goal.

### Create test cases

Having a strong test suite for a tool like this is essential. It is very easy
to create regressions. Any tests you can add are very much appreciated.

### Hack!

Here are some [good starting issues](https://github.com/nrc/rustfmt/issues?q=is%3Aopen+is%3Aissue+label%3Aeasy).
Note that some of those issues tagged 'easy' are not that easy and might be better
second issues, rather than good first issues to fix.

If you've found areas which need polish and don't have issues, please submit a
PR; don't feel there needs to be an issue.