
prettyplease::unparse
=====================

[<img alt="github" src="https://img.shields.io/badge/github-dtolnay/prettyplease-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/prettyplease)
[<img alt="crates.io" src="https://img.shields.io/crates/v/prettyplease.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/prettyplease)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-prettyplease-66c2a5?style=for-the-badge&labelColor=555555&logo=docs.rs" height="20">](https://docs.rs/prettyplease)
[<img alt="build status" src="https://img.shields.io/github/actions/workflow/status/dtolnay/prettyplease/ci.yml?branch=master&style=for-the-badge" height="20">](https://github.com/dtolnay/prettyplease/actions?query=branch%3Amaster)

A minimal `syn` syntax tree pretty-printer.

<br>

## Overview

This is a pretty-printer to turn a `syn` syntax tree into a `String` of
well-formatted source code. In contrast to rustfmt, this library is intended to
be suitable for arbitrary generated code.

Rustfmt prioritizes high-quality output that is impeccable enough that you'd be
comfortable spending your career staring at its output &mdash; but that means
some heavyweight algorithms, and it has a tendency to bail out on code that is
hard to format (for example [rustfmt#3697], and there are dozens more issues
like it). That's not necessarily a big deal for human-generated code because
when code gets highly nested, the human will naturally be inclined to refactor
into more easily formattable code. But for generated code, having the formatter
just give up leaves it totally unreadable.

[rustfmt#3697]: https://github.com/rust-lang/rustfmt/issues/3697

This library is designed using the simplest possible algorithm and data
structures that can deliver about 95% of the quality of rustfmt-formatted
output. In my experience testing real-world code, approximately 97-98% of output
lines come out identical between rustfmt's formatting and this crate's. The rest
have slightly different linebreak decisions, but still clearly follow the
dominant modern Rust style.

The tradeoffs made by this crate are a good fit for generated code that you will
*not* spend your career staring at. For example, the output of `bindgen`, or the
output of `cargo-expand`. In those cases it's more important that the whole
thing be formattable without the formatter giving up, than that it be flawless.

<br>

## Feature matrix

Here are a few superficial comparisons of this crate against the AST
pretty-printer built into rustc, and rustfmt. The sections below go into more
detail comparing the output of each of these libraries.

| | prettyplease | rustc | rustfmt |
|:---|:---:|:---:|:---:|
| non-pathological behavior on big or generated code | 💚 | ❌ | ❌ |
| idiomatic modern formatting ("locally indistinguishable from rustfmt") | 💚 | ❌ | 💚 |
| throughput | 60 MB/s | 39 MB/s | 2.8 MB/s |
| number of dependencies | 3 | 72 | 66 |
| compile time including dependencies | 2.4 sec | 23.1 sec | 29.8 sec |
| buildable using a stable Rust compiler | 💚 | ❌ | ❌ |
| published to crates.io | 💚 | ❌ | ❌ |
| extensively configurable output | ❌ | ❌ | 💚 |
| intended to accommodate hand-maintained source code | ❌ | ❌ | 💚 |

<br>

## Comparison to rustfmt

- [input.rs](https://github.com/dtolnay/prettyplease/blob/0.1.0/examples/input.rs)
- [output.prettyplease.rs](https://github.com/dtolnay/prettyplease/blob/0.1.0/examples/output.prettyplease.rs)
- [output.rustfmt.rs](https://github.com/dtolnay/prettyplease/blob/0.1.0/examples/output.rustfmt.rs)

If you weren't told which output file is which, it would be practically
impossible to tell &mdash; **except** for line 435 in the rustfmt output, which
is more than 1000 characters long because rustfmt just gave up formatting that
part of the file:

```rust
            match segments[5] {
                0 => write!(f, "::{}", ipv4),
                0xffff => write!(f, "::ffff:{}", ipv4),
                _ => unreachable!(),
            }
        } else { # [derive (Copy , Clone , Default)] struct Span { start : usize , len : usize , } let zeroes = { let mut longest = Span :: default () ; let mut current = Span :: default () ; for (i , & segment) in segments . iter () . enumerate () { if segment == 0 { if current . len == 0 { current . start = i ; } current . len += 1 ; if current . len > longest . len { longest = current ; } } else { current = Span :: default () ; } } longest } ; # [doc = " Write a colon-separated part of the address"] # [inline] fn fmt_subslice (f : & mut fmt :: Formatter < '_ > , chunk : & [u16]) -> fmt :: Result { if let Some ((first , tail)) = chunk . split_first () { write ! (f , "{:x}" , first) ? ; for segment in tail { f . write_char (':') ? ; write ! (f , "{:x}" , segment) ? ; } } Ok (()) } if zeroes . len > 1 { fmt_subslice (f , & segments [.. zeroes . start]) ? ; f . write_str ("::") ? ; fmt_subslice (f , & segments [zeroes . start + zeroes . len ..]) } else { fmt_subslice (f , & segments) } }
    } else {
        const IPV6_BUF_LEN: usize = (4 * 8) + 7;
        let mut buf = [0u8; IPV6_BUF_LEN];
        let mut buf_slice = &mut buf[..];
```

This is a pretty typical manifestation of rustfmt bailing out in generated code
&mdash; a chunk of the input ends up on one line. The other manifestation is
that you're working on some code, running rustfmt on save like a conscientious
developer, but after a while notice it isn't doing anything. You introduce an
intentional formatting issue, like a stray indent or semicolon, and run rustfmt
to check your suspicion. Nope, it doesn't get cleaned up &mdash; rustfmt is just
not formatting the part of the file you are working on.

The prettyplease library is designed to have no pathological cases that force a
bail out; the entire input you give it will get formatted in some "good enough"
form.

Separately, rustfmt can be problematic to integrate into projects. It's written
using rustc's internal syntax tree, so it can't be built by a stable compiler.
Its releases are not regularly published to crates.io, so in Cargo builds you'd
need to depend on it as a git dependency, which precludes publishing your crate
to crates.io also. You can shell out to a `rustfmt` binary, but that'll be
whatever rustfmt version is installed on each developer's system (if any), which
can lead to spurious diffs in checked-in generated code formatted by different
versions. In contrast prettyplease is designed to be easy to pull in as a
library, and compiles fast.

<br>

## Comparison to rustc_ast_pretty

- [input.rs](https://github.com/dtolnay/prettyplease/blob/0.1.0/examples/input.rs)
- [output.prettyplease.rs](https://github.com/dtolnay/prettyplease/blob/0.1.0/examples/output.prettyplease.rs)
- [output.rustc.rs](https://github.com/dtolnay/prettyplease/blob/0.1.0/examples/output.rustc.rs)

This is the pretty-printer that gets used when rustc prints source code, such as
`rustc -Zunpretty=expanded`. It's used also by the standard library's
`stringify!` when stringifying an interpolated macro_rules AST fragment, like an
$:expr, and transitively by `dbg!` and many macros in the ecosystem.
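
As a small illustration of the `stringify!` path described above, the sketch
below passes an expression through a `macro_rules` `$e:expr` fragment and
stringifies it. The macro name `expr_text` and the helper `without_whitespace`
are made up for this example. The exact spacing of the result is decided by
rustc's printer and can differ between compiler versions, so the sketch only
compares the text with whitespace removed.

```rust
// Hypothetical macro for illustration: the argument is captured as an `expr`
// fragment, which is the case the paragraph above describes.
macro_rules! expr_text {
    ($e:expr) => {
        stringify!($e)
    };
}

// Hypothetical helper that drops all whitespace, so the comparison does not
// depend on how a particular rustc version spaces its output.
fn without_whitespace(text: &str) -> String {
    text.chars().filter(|c| !c.is_whitespace()).collect()
}

fn main() {
    // The operands are never evaluated, so they do not need to exist.
    let text = expr_text!(a+b   *   (c-d));
    println!("{}", text);
    assert_eq!(without_whitespace(text), "a+b*(c-d)");
}
```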

Rustc's formatting is mostly okay, but does not hew closely to the dominant
contemporary style of Rust formatting. Some things wouldn't ever be written on
one line, like this `match` expression, and certainly not with a comma in front
of the closing brace:

```rust
fn eq(&self, other: &IpAddr) -> bool {
    match other { IpAddr::V4(v4) => self == v4, IpAddr::V6(_) => false, }
}
```

Some places use non-multiple-of-4 indentation, which is definitely not the norm:

```rust
pub const fn to_ipv6_mapped(&self) -> Ipv6Addr {
    let [a, b, c, d] = self.octets();
    Ipv6Addr{inner:
                 c::in6_addr{s6_addr:
                                 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xFF,
                                  0xFF, a, b, c, d],},}
}
```

And although there isn't an egregious example of it in the link because the
input code is pretty tame, in general rustc_ast_pretty has pathological behavior
on generated code. It has a tendency to use excessive horizontal indentation and
rapidly run out of width:

```rust
::std::io::_print(::core::fmt::Arguments::new_v1(&[""],
                                                 &match (&msg,) {
                                                      _args =>
                                                      [::core::fmt::ArgumentV1::new(_args.0,
                                                                                    ::core::fmt::Display::fmt)],
                                                  }));
```

The snippets above are clearly different from modern rustfmt style. In contrast,
prettyplease is designed to have output that is practically indistinguishable
from rustfmt-formatted code.

<br>

## Example

```rust
// [dependencies]
// prettyplease = "0.2"
// syn = { version = "2", default-features = false, features = ["full", "parsing"] }

const INPUT: &str = stringify! {
    use crate::{
          lazy::{Lazy, SyncLazy, SyncOnceCell}, panic,
        sync::{ atomic::{AtomicUsize, Ordering::SeqCst},
            mpsc::channel, Mutex, },
      thread,
    };
    impl<T, U> Into<U> for T where U: From<T> {
        fn into(self) -> U { U::from(self) }
    }
};

fn main() {
    let syntax_tree = syn::parse_file(INPUT).unwrap();
    let formatted = prettyplease::unparse(&syntax_tree);
    print!("{}", formatted);
}
```

<br>

## Algorithm notes

The approach and terminology used in the implementation are derived from [*Derek
C. Oppen, "Pretty Printing" (1979)*][paper], on which rustc_ast_pretty is also
based, and from rustc_ast_pretty's implementation written by Graydon Hoare in
2011 (and modernized over the years by dozens of volunteer maintainers).

[paper]: http://i.stanford.edu/pub/cstr/reports/cs/tr/79/770/CS-TR-79-770.pdf

The paper describes two language-agnostic interacting procedures `Scan()` and
`Print()`. Language-specific code decomposes an input data structure into a
stream of `string` and `break` tokens, and `begin` and `end` tokens for
grouping. Each `begin`&ndash;`end` range may be identified as either "consistent
breaking" or "inconsistent breaking". If a group is consistently breaking, then
if the whole contents do not fit on the line, *every* `break` token in the group
will receive a linebreak. This is appropriate, for example, for Rust struct
literals, or arguments of a function call. If a group is inconsistently
breaking, then the `string` tokens in the group are greedily placed on the line
until out of space, and linebroken only at those `break` tokens for which the
next string would not fit. For example, this is appropriate for the contents of
a braced `use` statement in Rust.
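
The difference between the two kinds of group can be sketched with a toy
printer for a single, un-nested group. Everything here (`Token`, `Breaks`,
`render`) is a hypothetical simplification made up for illustration, not this
crate's implementation: it ignores indentation and nesting, measures width in
bytes, and looks ahead directly instead of using precomputed sizes.

```rust
// Tokens of one flat group. Names are hypothetical, chosen for this sketch.
#[derive(Clone, Copy)]
enum Token<'a> {
    Text(&'a str),
    // A possible linebreak, rendered as a single space when not taken.
    Break,
}

#[derive(Clone, Copy, PartialEq)]
enum Breaks {
    Consistent,
    Inconsistent,
}

// Width of a token when the group is laid out on one line.
fn width_of(token: &Token) -> usize {
    match token {
        Token::Text(text) => text.len(),
        Token::Break => 1,
    }
}

fn render(tokens: &[Token], mode: Breaks, width: usize) -> String {
    // If the whole group fits, neither kind of group breaks anywhere.
    let fits_flat = tokens.iter().map(width_of).sum::<usize>() <= width;
    let mut out = String::new();
    let mut column = 0;
    for (i, token) in tokens.iter().enumerate() {
        match token {
            Token::Text(text) => {
                out.push_str(text);
                column += text.len();
            }
            Token::Break => {
                // Width of the text between this break and the next one.
                let next: usize = tokens[i + 1..]
                    .iter()
                    .take_while(|t| matches!(t, Token::Text(_)))
                    .map(width_of)
                    .sum();
                let newline = !fits_flat
                    && match mode {
                        // All-or-nothing: every break becomes a newline.
                        Breaks::Consistent => true,
                        // Greedy: break only if the next text would not fit.
                        Breaks::Inconsistent => column + 1 + next > width,
                    };
                if newline {
                    out.push('\n');
                    column = 0;
                } else {
                    out.push(' ');
                    column += 1;
                }
            }
        }
    }
    out
}

fn main() {
    let items = [
        Token::Text("aaa,"),
        Token::Break,
        Token::Text("bbb,"),
        Token::Break,
        Token::Text("ccc,"),
        Token::Break,
        Token::Text("ddd"),
    ];
    println!("{}\n", render(&items, Breaks::Consistent, 10));
    println!("{}", render(&items, Breaks::Inconsistent, 10));
}
```

With a width of 10, the consistent group puts each item on its own line, while
the inconsistent group packs two items per line.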

Scan's job is to efficiently accumulate sizing information about groups and
breaks. For every `begin` token we compute the distance to the matched `end`
token, and for every `break` we compute the distance to the next `break`. The
algorithm uses a ringbuffer to hold tokens whose size is not yet ascertained.
The maximum size of the ringbuffer is bounded by the target line length and does
not grow indefinitely, regardless of deep nesting in the input stream. That's
because once a group is sufficiently big, the precise size can no longer make a
difference to linebreak decisions and we can effectively treat it as "infinity".
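
As a rough sketch of the sizes Scan computes, the function below assigns them
in one pass over a complete token slice. The names `Token` and `measure` are
assumptions for illustration, and the ring buffer and the "infinity" cutoff are
deliberately left out, so this shows only the bookkeeping and not the streaming
behavior of the real procedure.

```rust
// Hypothetical token type for this sketch; a `Break` is one column wide
// when it is not turned into a newline.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Token<'a> {
    Begin,
    End,
    Break,
    Text(&'a str),
}

// Returns one size per token:
// - `Begin`: flat width of everything up to its matching `End`
// - `Break`: flat width from the break up to the next `Break` in the same
//   group, or up to the end of that group
// - `Text`: its own length; `End`: 0
fn measure(tokens: &[Token]) -> Vec<usize> {
    let mut sizes = vec![0isize; tokens.len()];
    // Indices of `Begin` and `Break` tokens whose size is not yet known.
    let mut pending: Vec<usize> = Vec::new();
    // Total flat width seen so far.
    let mut total: isize = 0;

    for (i, token) in tokens.iter().enumerate() {
        // An open break is closed by the next break or by the end of its group.
        if matches!(token, Token::Break | Token::End) {
            if let Some(&top) = pending.last() {
                if tokens[top] == Token::Break {
                    sizes[top] += total;
                    pending.pop();
                }
            }
        }
        match token {
            Token::Begin => {
                // Store the negated running total; adding the total at the
                // matching `End` yields the width in between.
                sizes[i] = -total;
                pending.push(i);
            }
            Token::Break => {
                sizes[i] = -total;
                pending.push(i);
                total += 1;
            }
            Token::Text(text) => {
                sizes[i] = text.len() as isize;
                total += text.len() as isize;
            }
            Token::End => {
                if let Some(begin) = pending.pop() {
                    sizes[begin] += total;
                }
            }
        }
    }
    // Anything still open runs to the end of the stream.
    for index in pending {
        sizes[index] += total;
    }
    sizes.into_iter().map(|size| size as usize).collect()
}

fn main() {
    // Token stream for `f(` begin `a,` break `b` end `)`.
    let tokens = [
        Token::Text("f("),
        Token::Begin,
        Token::Text("a,"),
        Token::Break,
        Token::Text("b"),
        Token::End,
        Token::Text(")"),
    ];
    println!("{:?}", measure(&tokens));
}
```

Storing a negated running total and correcting it later means each token is
visited once, which is the same trick that lets the streaming version resolve
sizes as soon as the closing token arrives.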

Print's job is to use the sizing information to efficiently assign a "broken" or
"not broken" status to every `begin` token. At that point the output is easily
constructed by concatenating `string` tokens and breaking at `break` tokens
contained within a broken group.

Leveraging these primitives (i.e. cleverly placing the all-or-nothing consistent
breaks and greedy inconsistent breaks) to yield rustfmt-compatible formatting
for all of Rust's syntax tree nodes is a fun challenge.

Here is a visualization of some Rust tokens fed into the pretty printing
algorithm. Consistently breaking `begin`&ndash;`end` pairs are represented by
`«`&#8288;`»`, inconsistently breaking by `‹`&#8288;`›`, `break` by `·`, and the
rest of the non-whitespace are `string`.

```text
use crate::«{·
‹    lazy::«{·‹Lazy,· SyncLazy,· SyncOnceCell›·}»,·
    panic,·
    sync::«{·
‹        atomic::«{·‹AtomicUsize,· Ordering::SeqCst›·}»,·
        mpsc::channel,· Mutex›,·
    }»,·
    thread›,·
}»;·
«‹«impl<«·T‹›,· U‹›·»>» Into<«·U·»>· for T›·
where·
    U:‹ From<«·T·»>›,·
{·
«    fn into(·«·self·») -> U {·
‹        U::from(«·self·»)›·
»    }·
»}·
```

The algorithm described in the paper is not quite sufficient for producing
well-formatted Rust code that is locally indistinguishable from rustfmt's style.
The reason is that in the paper, the complete non-whitespace contents are
assumed to be independent of linebreak decisions, with Scan and Print being only
in control of the whitespace (spaces and line breaks). In Rust as idiomatically
formatted by rustfmt, that is not the case. Trailing commas are one example;
the punctuation is only known *after* the broken vs non-broken status of the
surrounding group is known:

```rust
let _ = Struct { x: 0, y: true };

let _ = Struct {
    x: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
    y: yyyyyyyyyyyyyyyyyyyyyyyyyyyyyy,   //<- trailing comma if the expression wrapped
};
```

The formatting of `match` expressions is another case; we want small arms on the
same line as the pattern, and big arms wrapped in a brace. The presence of the
brace punctuation, comma, and semicolon are all dependent on whether the arm
fits on the line:

```rust
match total_nanos.checked_add(entry.nanos as u64) {
    Some(n) => tmp = n,   //<- small arm, inline with comma
    None => {
        total_secs = total_secs
            .checked_add(total_nanos / NANOS_PER_SEC as u64)
            .expect("overflow in iter::sum over durations");
    }   //<- big arm, needs brace added, and also semicolon^
}
```

The printing algorithm implementation in this crate accommodates all of these
situations with conditional punctuation tokens whose selection can be deferred
and populated after it's known that the group is or is not broken.
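
The idea of deferred punctuation can be sketched as follows for the trailing
comma case. The `Conditional` type and the `struct_literal` function are
hypothetical names invented for this illustration and do not reflect how the
crate represents such tokens internally; the sketch also assumes a non-empty
field list and decides "broken" with a simple length check.

```rust
// A punctuation token with two spellings. Which one is printed is decided
// only after the surrounding group's broken status is known.
struct Conditional {
    broken: &'static str,
    flat: &'static str,
}

impl Conditional {
    fn resolve(&self, group_is_broken: bool) -> &'static str {
        if group_is_broken {
            self.broken
        } else {
            self.flat
        }
    }
}

// Prints `Name { a: x, b: y }` on one line if it fits within `width`,
// otherwise one field per line with a trailing comma after the last field.
fn struct_literal(name: &str, fields: &[(&str, &str)], width: usize) -> String {
    let trailing_comma = Conditional { broken: ",", flat: "" };
    let items: Vec<String> = fields
        .iter()
        .map(|(key, value)| format!("{}: {}", key, value))
        .collect();

    // Decide the group's status first, using the flat layout's width.
    let flat_width = name.len() + " { ".len() + items.join(", ").len() + " }".len();
    let broken = flat_width > width;

    // Only now is the punctuation after the last field chosen.
    let (open, separator, close) = if broken {
        (" {\n    ", ",\n    ", "\n}")
    } else {
        (" { ", ", ", " }")
    };
    format!(
        "{}{}{}{}{}",
        name,
        open,
        items.join(separator),
        trailing_comma.resolve(broken),
        close
    )
}

fn main() {
    let fields = [("x", "0"), ("y", "true")];
    println!("{}", struct_literal("Struct", &fields, 40));
    println!("{}", struct_literal("Struct", &fields, 10));
}
```

The same two-spelling approach extends to the `match` arm case, where the
brace, comma, and semicolon would each be a conditional token.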

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this crate by you, as defined in the Apache-2.0 license, shall
be dual licensed as above, without any additional terms or conditions.
</sub>