# Contributing

There are many ways to contribute to Rustfmt. This document lays out what they
are and has information on how to get started. If you have any questions about
contributing or need help with anything, please ping nrc on IRC; #rust-tools is
probably the best channel. Feel free to also ask questions on issues, or file
new issues specifically to get help.


## Test and file issues

It would be really useful to have people use rustfmt on their projects and file
issues where it does something unexpected.

A really useful thing to do is to run rustfmt on a crate from the Rust repo. If
it does something unexpected, file an issue; if not, make a PR to the Rust repo
with the reformatted code. We hope to get the whole repo consistently rustfmt'ed
and to replace `make tidy` with rustfmt as a medium-term goal. Issues with stack
traces for bugs and/or minimal test cases are especially useful.

See this [blog post](http://ncameron.org/blog/rustfmt-ing-rust/) for more details.


## Create test cases

Having a strong test suite for a tool like this is essential. It is very easy
to create regressions. Any tests you can add are very much appreciated.

The tests can be run with `cargo test`. This does a number of things:
* runs the unit tests for a number of internal functions;
* makes sure that the output of running rustfmt on every file in
  `./tests/source/` is equal to its associated file in `./tests/target/`;
* runs idempotence tests on the files in `./tests/target/`. These files should
  not be changed by rustfmt;
* checks that rustfmt's code is not changed by running on itself. This ensures
  that the project bootstraps.
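
The source/target and idempotence checks in the list above can be sketched with
a toy formatter. This is only an illustration: `toy_format`, `matches_target`
and `is_idempotent` are made-up names, and the real test harness works on files
and calls rustfmt itself.

```rust
/// Toy stand-in for a formatter: trims trailing whitespace on every line
/// and terminates every line with a newline.
fn toy_format(input: &str) -> String {
    let mut out = String::new();
    for line in input.lines() {
        out.push_str(line.trim_end());
        out.push('\n');
    }
    out
}

/// Source/target check: formatting `source` must yield exactly `target`.
fn matches_target(source: &str, target: &str) -> bool {
    toy_format(source) == target
}

/// Idempotence check: an already formatted file must not change when it is
/// formatted again.
fn is_idempotent(target: &str) -> bool {
    toy_format(target) == target
}
```

A target file that still changes under the formatter fails the idempotence
check even if some source file happens to produce it.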

Creating a test is as easy as creating a new file in `./tests/source/` and an
equally named one in `./tests/target/`. If it is only required that rustfmt
leaves a piece of code unformatted, it may suffice to only create a target file.

Whenever there's a discrepancy between the expected and actual output when
running tests, a colourised diff will be printed so that the offending line(s)
can quickly be identified.

Without explicit settings, the tests will be run using rustfmt's default
configuration. It is possible to run a test using non-default settings by
including configuration parameters in comments at the top of the file. For
example, to use 3 spaces per tab, start your test with
`// rustfmt-tab_spaces: 3`. Just remember that the comment is part of the input,
so include it in both the source and target files! It is also possible to
explicitly specify the name of the expected output file in the target directory.
Use `// rustfmt-target: filename.rs` for this. Finally, you can use a custom
configuration by using the `rustfmt-config` directive. Rustfmt will then use
the TOML file of that name located in `./tests/config/` for its configuration.
For example, including `// rustfmt-config: small_tabs.toml` will run your test
with the configuration file found at `./tests/config/small_tabs.toml`.
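
To make the directive comments concrete, here is a rough sketch of how such
`// rustfmt-key: value` lines could be collected from the top of a test file.
`read_directives` is a hypothetical helper written for this document, not the
function the test harness actually uses, and stopping at the first line that is
not a directive is an assumption of the sketch.

```rust
use std::collections::HashMap;

/// Collects `// rustfmt-key: value` comments from the top of a test file.
/// Hypothetical helper: stops at the first line that is not such a comment.
fn read_directives(text: &str) -> HashMap<String, String> {
    let mut directives = HashMap::new();
    for line in text.lines() {
        // Anything that does not start with the directive prefix ends the header.
        let rest = match line.trim().strip_prefix("// rustfmt-") {
            Some(rest) => rest,
            None => break,
        };
        // Split `key: value` at the first colon.
        match rest.split_once(':') {
            Some((key, value)) => {
                directives.insert(key.trim().to_string(), value.trim().to_string());
            }
            None => break,
        }
    }
    directives
}
```

A test file that begins with `// rustfmt-tab_spaces: 3` would yield the key
`tab_spaces` with the value `3`.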


## Hack!

Here are some [good starting issues](https://github.com/nrc/rustfmt/issues?q=is%3Aopen+is%3Aissue+label%3Aeasy).

If you've found areas which need polish and don't have issues, please submit a
PR; don't feel there needs to be an issue.


### Guidelines

Rustfmt bootstraps; that is, part of its test suite is running itself on its
own source code. So, basically, the only style guideline is that you must pass
the tests. That ensures that the Rustfmt source code adheres to our own
conventions.

Talking of tests, if you add a new feature or fix a bug, please also add a test.
It's really easy; see above for details. Please run `cargo test` before
submitting a PR to ensure your patch passes all tests. It's pretty quick.

Please try to avoid leaving `TODO`s in the code. There are a few around, but I
wish there weren't. You can leave `FIXME`s, preferably with an issue number.


### A quick tour of Rustfmt

Rustfmt is basically a pretty printer - that is, its mode of operation is to
take an AST (abstract syntax tree) and print it in a nice way (including staying
under the maximum permitted width for a line). In order to get that AST, we
first have to parse the source text; we use the Rust compiler's parser to do
that (see [src/lib.rs]). We shy away from doing anything too fancy, such as
algebraic approaches to pretty printing, instead relying on a heuristic
approach, 'manually' crafting a string for each AST node. This results in quite
a lot of code, but it is relatively simple.

The AST is a tree view of source code. It carries all the semantic information
about the code, but not all of the syntax. In particular, we lose white space
and comments (although doc comments are preserved). Rustfmt uses a view of the
AST before macros are expanded, so there are still macro uses in the code. The
arguments to macros are not an AST, but raw tokens - this makes them harder to
format.

There are different nodes for every kind of item and expression in Rust. For
more details see the source code in the compiler -
[ast.rs](https://dxr.mozilla.org/rust/source/src/libsyntax/ast.rs) - and/or the
[docs](http://manishearth.github.io/rust-internals-docs/syntax/ast/index.html).

Many nodes in the AST (but not all, annoyingly) have a `Span`. A `Span` is a
range in the source code; it can easily be converted to a snippet of source
text. When the AST does not contain enough information for us, we rely heavily
on `Span`s. For example, we can look between spans to try and find comments, or
parse a snippet to see how the user wrote their source code.

The downside of using the AST is that we miss some information - primarily white
space and comments. White space is sometimes significant, although mostly we
want to ignore it and make our own. We strive to reproduce all comments, but
this is sometimes difficult. The crufty corners of Rustfmt are where we hack
around the absence of comments in the AST and try to recreate them as best we
can.

Our primary tool here is to look between spans for text we've missed. For
example, in a function call `foo(a, b)`, we have spans for `a` and `b`. In this
case there is only a comma and a single space between the end of `a` and the
start of `b`, so there is nothing much to do. But if we look at
`foo(a /* a comment */, b)`, then between `a` and `b` we find the comment.
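
The idea of looking between spans can be sketched as follows. The `Span` here
is a simplified stand-in that holds plain byte offsets (the compiler's type is
richer), and `snippet`, `between` and `comment_between` are names invented for
this example.

```rust
/// Simplified stand-in for the compiler's `Span`: byte offsets into the source.
#[derive(Clone, Copy)]
struct Span {
    lo: usize,
    hi: usize,
}

/// Returns the source text covered by `span`.
fn snippet(src: &str, span: Span) -> &str {
    &src[span.lo..span.hi]
}

/// Returns the text between the end of `first` and the start of `second`.
fn between(src: &str, first: Span, second: Span) -> &str {
    &src[first.hi..second.lo]
}

/// Extracts a `/* ... */` comment from the gap between two spans, if any.
fn comment_between(src: &str, first: Span, second: Span) -> Option<&str> {
    let gap = between(src, first, second);
    let start = gap.find("/*")?;
    // `+ 2` keeps the closing `*/` in the returned slice.
    let end = gap[start..].find("*/")? + start + 2;
    Some(&gap[start..end])
}
```

For `foo(a /* a comment */, b)` the gap between the two argument spans is
` /* a comment */, `, from which the comment can be picked out; for `foo(a, b)`
the gap is just `, ` and nothing is found.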

At a higher level, Rustfmt has machinery so that we account for text between
'top level' items. Then we can reproduce that text pretty much verbatim. We only
count spans we actually reformat, so if we can't format a span it is not missed
completely, but is reproduced in the output without being formatted. This is
mostly handled in [src/missed_spans.rs]. See also `FmtVisitor::last_pos` in
[src/visitor.rs].


#### Some important elements

At the highest level, Rustfmt uses a `Visitor` implementation called `FmtVisitor`
to walk the AST. This is in [src/visitor.rs]. This is really just used to walk
items, rather than the bodies of functions. We also cover macros and attributes
here. Most methods of the visitor call out to `Rewrite` implementations that
then walk their own children.

The `Rewrite` trait is defined in [src/rewrite.rs]. It is implemented for many
things that can be rewritten, mostly AST nodes. It has a single function,
`rewrite`, which is called to rewrite `self` into an `Option<String>`. The
arguments are `width`, which is the horizontal space we write into, and
`offset`, which is how much we are currently indented from the left-hand side of
the page. We also take a context which contains information used for parsing,
the current block indent, and a configuration (see below).
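
As a rough picture of the shape described above, here is a simplified sketch.
The parameter order and the contents of `RewriteContext` are assumptions made
for illustration, and `Ident` is a made-up leaf node; see [src/rewrite.rs] for
the real definitions.

```rust
/// Simplified context. The real one also carries parsing information and the
/// configuration.
struct RewriteContext {
    block_indent: usize,
}

/// Simplified version of the trait: rewrite `self` into a string that fits in
/// `width` columns, starting `offset` columns from the left-hand side.
trait Rewrite {
    fn rewrite(&self, context: &RewriteContext, width: usize, offset: usize) -> Option<String>;
}

/// Hypothetical leaf node: an identifier that is printed verbatim.
struct Ident(String);

impl Rewrite for Ident {
    fn rewrite(&self, _context: &RewriteContext, width: usize, _offset: usize) -> Option<String> {
        // An identifier cannot be broken up, so it either fits or it fails.
        if self.0.len() <= width {
            Some(self.0.clone())
        } else {
            None
        }
    }
}
```

Composite nodes would call `rewrite` on their children with a smaller `width`
and a larger `offset`, then assemble the results.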

To understand the indents, consider

```
impl Foo {
    fn foo(...) {
        bar(argument_one,
            baz());
    }
}
```

When formatting the `bar` call we will format the arguments in order. After the
first one we know we are working on multiple lines (imagine it is longer than
written). So, when we come to the second argument, the indent we pass to
`rewrite` is 12, which puts us under the first argument. The current block
indent (stored in the context) is 8. The former is used for visual indenting
(when objects are vertically aligned with some marker), the latter is used for
block indenting (when objects are tabbed in from the left-hand side). The width
available for `baz()` will be the maximum width, minus the space used for
indenting, minus the space used for the `);`. (Note that actual argument
formatting does not quite work like this, but it's close enough.)
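
The numbers in this example can be worked through in a few lines. This is only
a sketch of the arithmetic: `visual_offset` and `available_width` are made-up
helpers, and the maximum width of 100 used below is an assumed value.

```rust
/// Column at which arguments are visually aligned: just after `callee(`.
fn visual_offset(block_indent: usize, callee: &str) -> usize {
    block_indent + callee.len() + 1 // +1 for the opening parenthesis
}

/// Width available at `offset`, leaving room for `trailing` on the same line.
/// Saturates at zero rather than underflowing when there is no room at all.
fn available_width(max_width: usize, offset: usize, trailing: &str) -> usize {
    max_width.saturating_sub(offset + trailing.len())
}
```

With a block indent of 8 and the callee `bar`, the visual offset is 8 + 3 + 1 =
12. With an assumed maximum width of 100, the width left for `baz()` is
100 - 12 - 2 = 86, the final 2 being the `);`.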

The `rewrite` function returns an `Option` - either we successfully rewrite and
return the rewritten string for the caller to use, or we fail to rewrite and
return `None`. This could be because Rustfmt encounters something it doesn't
know how to reformat, but more often it is because Rustfmt can't fit the item
into the required width. How to handle this is up to the caller. Often the
caller just gives up, ultimately relying on the missed spans system to paste in
the un-formatted source. A better solution (although not performed in many
places) is for the caller to shuffle around some of its other items to make
more width, then call the function again with more space.
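
The two strategies for handling `None` can be sketched like this. Everything
here is invented for the example: `fit` stands in for a real `rewrite` call, and
moving the item to a fresh line is just one possible way of making more width.

```rust
/// Toy rewrite: succeeds only if `text` fits in `width` columns.
fn fit(text: &str, width: usize) -> Option<String> {
    if text.len() <= width {
        Some(text.to_string())
    } else {
        None
    }
}

/// Tries the same line first, then a fresh line with more room, and finally
/// gives up and returns the original source text unchanged.
fn rewrite_or_fall_back(text: &str, original: &str, same_line: usize, next_line: usize) -> String {
    fit(text, same_line)
        .or_else(|| fit(text, next_line).map(|s| format!("\n{}", s)))
        .unwrap_or_else(|| original.to_string())
}
```

The last step mirrors the common case where the caller gives up and the
unformatted source is pasted in.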

Since it is common for callers to bail out when a callee fails, we often use a
`try_opt!` macro to make this pattern more succinct.

One way we might find out that we don't have enough space is when computing how
much space we have. Something like `available_space = budget - overhead`. Since
widths are unsigned integers, this would underflow whenever `overhead` exceeds
`budget`. Therefore we use checked subtraction:
`available_space = try_opt!(budget.checked_sub(overhead))`. `checked_sub`
returns an `Option`, and if we would underflow `try_opt!` returns `None`,
otherwise we proceed with the computed space.
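
For readers who have not seen the pattern, here is a minimal sketch. The macro
body below is an assumed definition written for this example (the project's own
macro may differ in detail), and `available_space` is a made-up function.

```rust
/// Assumed definition of a `try_opt!`-style macro: unwrap a `Some`, or return
/// `None` from the enclosing function.
macro_rules! try_opt {
    ($expr:expr) => {
        match $expr {
            Some(value) => value,
            None => return None,
        }
    };
}

/// Computes the space left after `overhead`, or `None` if it does not fit.
fn available_space(budget: usize, overhead: usize) -> Option<usize> {
    // `checked_sub` yields `None` instead of underflowing.
    let space = try_opt!(budget.checked_sub(overhead));
    Some(space)
}
```

Here `available_space(10, 4)` gives `Some(6)`, while `available_space(3, 4)`
gives `None` instead of wrapping around or panicking. On current Rust the `?`
operator has the same effect on an `Option`.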

Much syntax in Rust is lists: lists of arguments, lists of fields, lists of
array elements, etc. We have some generic code to handle lists, including how to
space them in horizontal and vertical space, indentation, comments between
items, trailing separators, etc. However, since there are so many options, the
code is a bit complex. Look in [src/lists.rs]. `write_list` is the key function,
and `ListFormatting` the key structure for configuration. You'll need to make a
`ListItems` for input; this is usually done using `itemize_list`.
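
The horizontal versus vertical choice can be illustrated with a toy function.
This is not `write_list` and does not use `ListFormatting` or `ListItems`;
`write_toy_list` and its parameters are invented for the sketch, and it ignores
comments and trailing separators entirely.

```rust
/// Toy list writer: joins the items on one line if they fit in `width`,
/// otherwise puts one item per line, each indented by `indent` spaces.
fn write_toy_list(items: &[&str], width: usize, indent: usize) -> String {
    let horizontal = items.join(", ");
    if horizontal.len() <= width {
        return horizontal;
    }
    let padding = " ".repeat(indent);
    items
        .iter()
        .map(|item| format!("{}{}", padding, item))
        .collect::<Vec<_>>()
        .join(",\n")
}
```

Short lists stay on one line, while a list that exceeds `width` is laid out
vertically with one item per line.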

Rustfmt strives to be highly configurable. Often the first part of a patch is
creating a configuration option for the feature you are implementing. All
handling of configuration options is done in [src/config.rs]. Look for the
`create_config!` macro at the end of the file for all the options. The rest of
the file defines a bunch of enums used for options, and the machinery to produce
the config struct and parse a config file, etc. Checking an option is done by
accessing the correct field on the config struct, e.g., `config.max_width`. Most
functions have a `Config`, or one can be accessed via a visitor or context of
some kind.
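
As a small illustration of checking options by field access, here is a toy
configuration struct. It is hand-written for this example (the real struct is
generated by `create_config!` and has many more options), the default values
are assumed, and `fits_on_line` and `block_indent` are made-up helpers.

```rust
/// Toy configuration struct with two of the options mentioned in this document.
struct Config {
    max_width: usize,
    tab_spaces: usize,
}

impl Default for Config {
    fn default() -> Config {
        // Values assumed for this sketch.
        Config { max_width: 100, tab_spaces: 4 }
    }
}

/// Checks an option by reading the field directly.
fn fits_on_line(config: &Config, line: &str) -> bool {
    line.len() <= config.max_width
}

/// Builds a block indent of `depth` levels from the `tab_spaces` option.
fn block_indent(config: &Config, depth: usize) -> String {
    " ".repeat(config.tab_spaces * depth)
}
```

Code that formats a node would typically receive such a `Config` through its
context and read the fields it needs.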