//! Propagates assignment destinations backwards in the CFG to eliminate redundant assignments.
//!
//! # Motivation
//!
//! MIR building can insert a lot of redundant copies, and Rust code in general often tends to move
//! values around a lot. The result is a lot of assignments of the form `dest = {move} src;` in MIR.
//! MIR building for constants in particular tends to create additional locals that are only used
//! inside a single block to shuffle a value around unnecessarily.
//!
//! LLVM by itself is not good enough at eliminating these redundant copies (e.g. see
//! <https://github.com/rust-lang/rust/issues/32966>), so this leaves some performance on the table
//! that we can regain by implementing an optimization for removing these assign statements in rustc
//! itself. When this optimization runs fast enough, it can also speed up the constant evaluation
//! and code generation phases of rustc due to the reduced number of statements and locals.
//!
//! # The Optimization
//!
//! Conceptually, this optimization is "destination propagation". It is similar to the Named Return
//! Value Optimization, or NRVO, known from the C++ world, except that it isn't limited to return
//! values or the return place `_0`. On a very high level, independent of the actual implementation
//! details, it does the following:
//!
//! 1) Identify `dest = src;` statements that can be soundly eliminated.
//! 2) Replace all mentions of `src` with `dest` ("unifying" them and propagating the destination
//!    backwards).
//! 3) Delete the `dest = src;` statement (by making it a `nop`).
//!
//! Step 1) is by far the hardest, so it is explained in more detail below.
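//!
//! As a small sketch (hypothetical MIR, locals chosen for illustration only), the three steps
//! turn
//!
//! ```text
//! _2 = move _1;
//! _0 = move _2;    // candidate `dest = src;` with `dest = _0`, `src = _2`
//! ```
//!
//! into
//!
//! ```text
//! _0 = move _1;    // every mention of `_2` is replaced by `_0`
//! nop;             // the candidate became the self-assignment `_0 = move _0` and was deleted
//! ```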
//!
//! ## Soundness
//!
//! Given an `Assign` statement `dest = src;`, where `dest` is a `Place` and `src` is an `Rvalue`,
//! there are a few requirements that must hold for the optimization to be sound:
//!
//! * `dest` must not contain any *indirection* through a pointer. It must access part of the base
//!   local. Otherwise it might point to arbitrary memory that is hard to track.
//!
//!   It must also not contain any indexing projections, since those take an arbitrary `Local` as
//!   the index, and that local might only be initialized shortly before `dest` is used.
//!
//! * `src` must be a bare `Local` without any indirections or field projections (FIXME: Is this a
//!   fundamental restriction or just current impl state?). It can be copied or moved by the
//!   assignment.
//!
//! * The `dest` and `src` locals must never be [*live*][liveness] at the same time. If they are, it
//!   means that they both hold a (potentially different) value that is needed by a future use of
//!   the locals. Unifying them would overwrite one of the values.
//!
//!   Note that computing liveness of locals that have had their address taken is more difficult:
//!   Short of doing full escape analysis on the address/pointer/reference, the pass would need to
//!   assume that any operation that can potentially involve opaque user code (such as function
//!   calls, destructors, and inline assembly) may access any local that had its address taken
//!   before that point.
//!
//! Here, the first two conditions are simple structural requirements on the `Assign` statements
//! that can be trivially checked. The liveness requirement however is more difficult and costly to
//! check.
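//!
//! To see why the liveness requirement matters, consider this sketch (hypothetical MIR, made up
//! for illustration):
//!
//! ```text
//! _2 = _1;                      // candidate with `dest = _2`, `src = _1`
//! _1 = const 5_i32;             // now `_1` *and* `_2` are live at the same time
//! _0 = Add(move _1, move _2);
//! ```
//!
//! Replacing `_1` with `_2` here would turn the sum into `Add(_2, _2)`, overwriting the value
//! that the first statement saved into `_2`.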
//!
//! ## Previous Work
//!
//! A [previous attempt] at implementing an optimization like this turned out to be a significant
//! regression in compiler performance. Fixing the regressions introduced a lot of undesirable
//! complexity to the implementation.
//!
//! A [subsequent approach] tried to avoid the costly computation by limiting itself to acyclic
//! CFGs, but still turned out to be far too costly to run due to suboptimal performance within
//! individual basic blocks, requiring a walk across the entire block for every assignment found
//! within the block. For the `tuple-stress` benchmark, which has 458745 statements in a single
//! block, this proved to be far too costly.
//!
//! Since the first attempt at this, the compiler has improved dramatically, and new analysis
//! frameworks have been added that should make this approach viable without restricting it to
//! certain classes of CFGs:
//! - rustc now has a powerful dataflow analysis framework that can handle forwards and backwards
//!   analyses efficiently.
//! - Layout optimizations for generators have been added to improve code generation for
//!   async/await, which are very similar in spirit to what this optimization does. Both walk the
//!   MIR and record conflicting uses of locals in a `BitMatrix`.
//!
//! Also, rustc now has a simple NRVO pass (see `nrvo.rs`), which handles a subset of the cases that
//! this destination propagation pass handles, proving that similar optimizations can be performed
//! on MIR.
//!
//! ## Pre/Post Optimization
//!
//! It is recommended to run `SimplifyCfg` and then `SimplifyLocals` some time after this pass, as
//! it replaces the eliminated assign statements with `nop`s and leaves unused locals behind.
//!
//! [liveness]: https://en.wikipedia.org/wiki/Live_variable_analysis
//! [previous attempt]: https://github.com/rust-lang/rust/pull/47954
//! [subsequent approach]: https://github.com/rust-lang/rust/pull/71003

use crate::MirPass;
use itertools::Itertools;
use rustc_data_structures::unify::{InPlaceUnificationTable, UnifyKey};
use rustc_index::{
    bit_set::{BitMatrix, BitSet},
    vec::IndexVec,
};
use rustc_middle::mir::visit::{MutVisitor, PlaceContext, Visitor};
use rustc_middle::mir::{dump_mir, PassWhere};
use rustc_middle::mir::{
    traversal, Body, InlineAsmOperand, Local, LocalKind, Location, Operand, Place, PlaceElem,
    Rvalue, Statement, StatementKind, Terminator, TerminatorKind,
};
use rustc_middle::ty::TyCtxt;
use rustc_mir_dataflow::impls::{MaybeInitializedLocals, MaybeLiveLocals};
use rustc_mir_dataflow::Analysis;

// Empirical measurements have resulted in some observations:
// - Running on a body with a single block and 500 locals takes barely any time
// - Running on a body with ~400 blocks and ~300 relevant locals takes "too long"
// ...so we just limit both to somewhat reasonable-ish looking values.
const MAX_LOCALS: usize = 500;
const MAX_BLOCKS: usize = 250;

pub struct DestinationPropagation;

impl<'tcx> MirPass<'tcx> for DestinationPropagation {
    fn is_enabled(&self, sess: &rustc_session::Session) -> bool {
        // FIXME(#79191, #82678): This is unsound.
        //
        // Only run at mir-opt-level=3 or higher for now (we don't fix up debuginfo and remove
        // storage statements at the moment).
        sess.opts.debugging_opts.unsound_mir_opts && sess.mir_opt_level() >= 3
    }

    fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
        let def_id = body.source.def_id();

        let candidates = find_candidates(body);
        if candidates.is_empty() {
            debug!("{:?}: no dest prop candidates, done", def_id);
            return;
        }

        // Collect all locals we care about. We only compute conflicts for these to save time.
        let mut relevant_locals = BitSet::new_empty(body.local_decls.len());
        for CandidateAssignment { dest, src, loc: _ } in &candidates {
            relevant_locals.insert(dest.local);
            relevant_locals.insert(*src);
        }

        // This pass unfortunately has `O(l² * s)` performance, where `l` is the number of locals
        // and `s` is the number of statements and terminators in the function.
        // To prevent blowing up compile times too much, we bail out when there are too many locals.
        let relevant = relevant_locals.count();
        debug!(
            "{:?}: {} locals ({} relevant), {} blocks",
            def_id,
            body.local_decls.len(),
            relevant,
            body.basic_blocks().len()
        );
        if relevant > MAX_LOCALS {
            warn!(
                "too many candidate locals in {:?} ({}, max is {}), not optimizing",
                def_id, relevant, MAX_LOCALS
            );
            return;
        }
        if body.basic_blocks().len() > MAX_BLOCKS {
            warn!(
                "too many blocks in {:?} ({}, max is {}), not optimizing",
                def_id,
                body.basic_blocks().len(),
                MAX_BLOCKS
            );
            return;
        }

        let mut conflicts = Conflicts::build(tcx, body, &relevant_locals);

        let mut replacements = Replacements::new(body.local_decls.len());
        for candidate @ CandidateAssignment { dest, src, loc } in candidates {
            // Merge locals that don't conflict.
            if !conflicts.can_unify(dest.local, src) {
                debug!("at assignment {:?}, conflict {:?} vs. {:?}", loc, dest.local, src);
                continue;
            }

            if replacements.for_src(candidate.src).is_some() {
                debug!("src {:?} already has replacement", candidate.src);
                continue;
            }

            if !tcx.consider_optimizing(|| {
                format!("DestinationPropagation {:?} {:?}", def_id, candidate)
            }) {
                break;
            }

            replacements.push(candidate);
            conflicts.unify(candidate.src, candidate.dest.local);
        }

        replacements.flatten(tcx);

        debug!("replacements {:?}", replacements.map);

        Replacer { tcx, replacements, place_elem_cache: Vec::new() }.visit_body(body);

        // FIXME fix debug info
    }
}

#[derive(Debug, Eq, PartialEq, Copy, Clone)]
struct UnifyLocal(Local);

impl From<Local> for UnifyLocal {
    fn from(l: Local) -> Self {
        Self(l)
    }
}

impl UnifyKey for UnifyLocal {
    type Value = ();
    #[inline]
    fn index(&self) -> u32 {
        self.0.as_u32()
    }
    #[inline]
    fn from_index(u: u32) -> Self {
        Self(Local::from_u32(u))
    }
    fn tag() -> &'static str {
        "UnifyLocal"
    }
}

struct Replacements<'tcx> {
    /// Maps locals to their replacement.
    map: IndexVec<Local, Option<Place<'tcx>>>,

    /// Locals whose live ranges are killed (their storage statements get removed).
    kill: BitSet<Local>,
}

impl<'tcx> Replacements<'tcx> {
    fn new(locals: usize) -> Self {
        Self { map: IndexVec::from_elem_n(None, locals), kill: BitSet::new_empty(locals) }
    }

    fn push(&mut self, candidate: CandidateAssignment<'tcx>) {
        trace!("Replacements::push({:?})", candidate);
        let entry = &mut self.map[candidate.src];
        assert!(entry.is_none());

        *entry = Some(candidate.dest);
        self.kill.insert(candidate.src);
        self.kill.insert(candidate.dest.local);
    }

    /// Applies the stored replacements to themselves, resolving chains until no replacement
    /// results in a local that itself needs further replacement.
    fn flatten(&mut self, tcx: TyCtxt<'tcx>) {
        // Note: This assumes that there are no cycles in the replacements, which is enforced via
        // `self.unified_locals`. Otherwise this can cause an infinite loop.

        for local in self.map.indices() {
            if let Some(replacement) = self.map[local] {
                // Substitute the base local of `replacement` until fixpoint.
                let mut base = replacement.local;
                let mut reversed_projection_slices = Vec::with_capacity(1);
                while let Some(replacement_for_replacement) = self.map[base] {
                    base = replacement_for_replacement.local;
                    reversed_projection_slices.push(replacement_for_replacement.projection);
                }

                let projection: Vec<_> = reversed_projection_slices
                    .iter()
                    .rev()
                    .flat_map(|projs| projs.iter())
                    .chain(replacement.projection.iter())
                    .collect();
                let projection = tcx.intern_place_elems(&projection);

                // Replace with the final `Place`.
                self.map[local] = Some(Place { local: base, projection });
            }
        }
    }

    fn for_src(&self, src: Local) -> Option<Place<'tcx>> {
        self.map[src]
    }
}

struct Replacer<'tcx> {
    tcx: TyCtxt<'tcx>,
    replacements: Replacements<'tcx>,
    place_elem_cache: Vec<PlaceElem<'tcx>>,
}

impl<'tcx> MutVisitor<'tcx> for Replacer<'tcx> {
    fn tcx(&self) -> TyCtxt<'tcx> {
        self.tcx
    }

    fn visit_local(&mut self, local: &mut Local, context: PlaceContext, location: Location) {
        if context.is_use() && self.replacements.for_src(*local).is_some() {
            bug!(
                "use of local {:?} should have been replaced by visit_place; context={:?}, loc={:?}",
                local,
                context,
                location,
            );
        }
    }

    fn visit_place(&mut self, place: &mut Place<'tcx>, context: PlaceContext, location: Location) {
        if let Some(replacement) = self.replacements.for_src(place.local) {
            // Rebase `place`'s projections onto `replacement`'s.
            self.place_elem_cache.clear();
            self.place_elem_cache.extend(replacement.projection.iter().chain(place.projection));
            let projection = self.tcx.intern_place_elems(&self.place_elem_cache);
            let new_place = Place { local: replacement.local, projection };

            debug!("Replacer: {:?} -> {:?}", place, new_place);
            *place = new_place;
        }

        self.super_place(place, context, location);
    }

    fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) {
        self.super_statement(statement, location);

        match &statement.kind {
            // FIXME: Don't delete storage statements, merge the live ranges instead
            StatementKind::StorageDead(local) | StatementKind::StorageLive(local)
                if self.replacements.kill.contains(*local) =>
            {
                statement.make_nop()
            }

            StatementKind::Assign(box (dest, rvalue)) => {
                match rvalue {
                    Rvalue::Use(Operand::Copy(place) | Operand::Move(place)) => {
                        // These might've been turned into self-assignments by the replacement
                        // (this includes the original statement we wanted to eliminate).
                        if dest == place {
                            debug!("{:?} turned into self-assignment, deleting", location);
                            statement.make_nop();
                        }
                    }
                    _ => {}
                }
            }

            _ => {}
        }
    }
}

struct Conflicts<'a> {
    relevant_locals: &'a BitSet<Local>,

    /// The conflict matrix. It is always symmetric and is the adjacency matrix of the
    /// corresponding conflict graph.
    matrix: BitMatrix<Local, Local>,

    /// Preallocated `BitSet` used by `unify`.
    unify_cache: BitSet<Local>,

    /// Tracks locals that have been merged together to prevent cycles and propagate conflicts.
    unified_locals: InPlaceUnificationTable<UnifyLocal>,
}

impl<'a> Conflicts<'a> {
    fn build<'tcx>(
        tcx: TyCtxt<'tcx>,
        body: &'_ Body<'tcx>,
        relevant_locals: &'a BitSet<Local>,
    ) -> Self {
        // We don't have to look out for locals that have their address taken, since
        // `find_candidates` already takes care of that.

        let conflicts = BitMatrix::from_row_n(
            &BitSet::new_empty(body.local_decls.len()),
            body.local_decls.len(),
        );

        let mut init = MaybeInitializedLocals
            .into_engine(tcx, body)
            .iterate_to_fixpoint()
            .into_results_cursor(body);
        let mut live =
            MaybeLiveLocals.into_engine(tcx, body).iterate_to_fixpoint().into_results_cursor(body);

        let mut reachable = None;
        dump_mir(tcx, None, "DestinationPropagation-dataflow", &"", body, |pass_where, w| {
            let reachable = reachable.get_or_insert_with(|| traversal::reachable_as_bitset(body));

            match pass_where {
                PassWhere::BeforeLocation(loc) if reachable.contains(loc.block) => {
                    init.seek_before_primary_effect(loc);
                    live.seek_after_primary_effect(loc);

                    writeln!(w, "        // init: {:?}", init.get())?;
                    writeln!(w, "        // live: {:?}", live.get())?;
                }
                PassWhere::AfterTerminator(bb) if reachable.contains(bb) => {
                    let loc = body.terminator_loc(bb);
                    init.seek_after_primary_effect(loc);
                    live.seek_before_primary_effect(loc);

                    writeln!(w, "        // init: {:?}", init.get())?;
                    writeln!(w, "        // live: {:?}", live.get())?;
                }

                PassWhere::BeforeBlock(bb) if reachable.contains(bb) => {
                    init.seek_to_block_start(bb);
                    live.seek_to_block_start(bb);

                    writeln!(w, "    // init: {:?}", init.get())?;
                    writeln!(w, "    // live: {:?}", live.get())?;
                }

                PassWhere::BeforeCFG | PassWhere::AfterCFG | PassWhere::AfterLocation(_) => {}

                PassWhere::BeforeLocation(_) | PassWhere::AfterTerminator(_) => {
                    writeln!(w, "        // init: <unreachable>")?;
                    writeln!(w, "        // live: <unreachable>")?;
                }

                PassWhere::BeforeBlock(_) => {
                    writeln!(w, "    // init: <unreachable>")?;
                    writeln!(w, "    // live: <unreachable>")?;
                }
            }

            Ok(())
        });

        let mut this = Self {
            relevant_locals,
            matrix: conflicts,
            unify_cache: BitSet::new_empty(body.local_decls.len()),
            unified_locals: {
                let mut table = InPlaceUnificationTable::new();
                // Pre-fill table with all locals (this creates N nodes / "connected" components,
                // "graph"-ically speaking).
                for local in 0..body.local_decls.len() {
                    assert_eq!(table.new_key(()), UnifyLocal(Local::from_usize(local)));
                }
                table
            },
        };

        let mut live_and_init_locals = Vec::new();

        // Visit only reachable basic blocks. The exact order is not important.
        for (block, data) in traversal::preorder(body) {
            // We need to observe the dataflow state *before* all possible locations (statement or
            // terminator) in each basic block, and then observe the state *after* the terminator
            // effect is applied. As long as neither `init` nor `live` has a "before" effect,
            // we will observe all possible dataflow states.

            // Since liveness is a backwards analysis, we need to walk the results backwards. To do
            // that, we first collect the `MaybeInitializedLocals` results in a forwards
            // traversal.

            live_and_init_locals.resize_with(data.statements.len() + 1, || {
                BitSet::new_empty(body.local_decls.len())
            });

            // First, go forwards for `MaybeInitializedLocals` and apply intra-statement/terminator
            // conflicts.
            for (i, statement) in data.statements.iter().enumerate() {
                this.record_statement_conflicts(statement);

                let loc = Location { block, statement_index: i };
                init.seek_before_primary_effect(loc);

                live_and_init_locals[i].clone_from(init.get());
            }

            this.record_terminator_conflicts(data.terminator());
            let term_loc = Location { block, statement_index: data.statements.len() };
            init.seek_before_primary_effect(term_loc);
            live_and_init_locals[term_loc.statement_index].clone_from(init.get());

            // Now, go backwards and intersect with the liveness results.
            for statement_index in (0..=data.statements.len()).rev() {
                let loc = Location { block, statement_index };
                live.seek_after_primary_effect(loc);

                live_and_init_locals[statement_index].intersect(live.get());

                trace!("record conflicts at {:?}", loc);

                this.record_dataflow_conflicts(&mut live_and_init_locals[statement_index]);
            }

            init.seek_to_block_end(block);
            live.seek_to_block_end(block);
            let mut conflicts = init.get().clone();
            conflicts.intersect(live.get());
            trace!("record conflicts at end of {:?}", block);

            this.record_dataflow_conflicts(&mut conflicts);
        }

        this
    }

    fn record_dataflow_conflicts(&mut self, new_conflicts: &mut BitSet<Local>) {
        // Remove all locals that are not candidates.
        new_conflicts.intersect(self.relevant_locals);

        for local in new_conflicts.iter() {
            self.matrix.union_row_with(&new_conflicts, local);
        }
    }

    fn record_local_conflict(&mut self, a: Local, b: Local, why: &str) {
        trace!("conflict {:?} <-> {:?} due to {}", a, b, why);
        self.matrix.insert(a, b);
        self.matrix.insert(b, a);
    }

    /// Records locals that must not overlap during the evaluation of `stmt`. These locals conflict
    /// and must not be merged.
    fn record_statement_conflicts(&mut self, stmt: &Statement<'_>) {
        match &stmt.kind {
            // While the left and right sides of an assignment must not overlap, we do not mark
            // conflicts here as that would make this optimization useless. When we optimize, we
            // eliminate the resulting self-assignments automatically.
            StatementKind::Assign(_) => {}

            StatementKind::SetDiscriminant { .. }
            | StatementKind::StorageLive(..)
            | StatementKind::StorageDead(..)
            | StatementKind::Retag(..)
            | StatementKind::FakeRead(..)
            | StatementKind::AscribeUserType(..)
            | StatementKind::Coverage(..)
            | StatementKind::CopyNonOverlapping(..)
            | StatementKind::Nop => {}
        }
    }

    fn record_terminator_conflicts(&mut self, term: &Terminator<'_>) {
        match &term.kind {
            TerminatorKind::DropAndReplace {
                place: dropped_place,
                value,
                target: _,
                unwind: _,
            } => {
                if let Some(place) = value.place()
                    && !place.is_indirect()
                    && !dropped_place.is_indirect()
                {
                    self.record_local_conflict(
                        place.local,
                        dropped_place.local,
                        "DropAndReplace operand overlap",
                    );
                }
            }
            TerminatorKind::Yield { value, resume: _, resume_arg, drop: _ } => {
                if let Some(place) = value.place() {
                    if !place.is_indirect() && !resume_arg.is_indirect() {
                        self.record_local_conflict(
                            place.local,
                            resume_arg.local,
                            "Yield operand overlap",
                        );
                    }
                }
            }
            TerminatorKind::Call {
                func,
                args,
                destination: Some((dest_place, _)),
                cleanup: _,
                from_hir_call: _,
                fn_span: _,
            } => {
                // No arguments may overlap with the destination.
                for arg in args.iter().chain(Some(func)) {
                    if let Some(place) = arg.place() {
                        if !place.is_indirect() && !dest_place.is_indirect() {
                            self.record_local_conflict(
                                dest_place.local,
                                place.local,
                                "call dest/arg overlap",
                            );
                        }
                    }
                }
            }
            TerminatorKind::InlineAsm {
                template: _,
                operands,
                options: _,
                line_spans: _,
                destination: _,
                cleanup: _,
            } => {
                // The intended semantics here aren't documented; we just assume that nothing that
                // could be written to by the assembly may overlap with any other operands.
                for op in operands {
                    match op {
                        InlineAsmOperand::Out { reg: _, late: _, place: Some(dest_place) }
                        | InlineAsmOperand::InOut {
                            reg: _,
                            late: _,
                            in_value: _,
                            out_place: Some(dest_place),
                        } => {
                            // For output place `place`, add all places accessed by the inline asm.
                            for op in operands {
                                match op {
                                    InlineAsmOperand::In { reg: _, value } => {
                                        if let Some(p) = value.place()
                                            && !p.is_indirect()
                                            && !dest_place.is_indirect()
                                        {
                                            self.record_local_conflict(
                                                p.local,
                                                dest_place.local,
                                                "asm! operand overlap",
                                            );
                                        }
                                    }
                                    InlineAsmOperand::Out {
                                        reg: _,
                                        late: _,
                                        place: Some(place),
                                    } => {
                                        if !place.is_indirect() && !dest_place.is_indirect() {
                                            self.record_local_conflict(
                                                place.local,
                                                dest_place.local,
                                                "asm! operand overlap",
                                            );
                                        }
                                    }
                                    InlineAsmOperand::InOut {
                                        reg: _,
                                        late: _,
                                        in_value,
                                        out_place,
                                    } => {
                                        if let Some(place) = in_value.place()
                                            && !place.is_indirect()
                                            && !dest_place.is_indirect()
                                        {
                                            self.record_local_conflict(
                                                place.local,
                                                dest_place.local,
                                                "asm! operand overlap",
                                            );
                                        }

                                        if let Some(place) = out_place
                                            && !place.is_indirect()
                                            && !dest_place.is_indirect()
                                        {
                                            self.record_local_conflict(
                                                place.local,
                                                dest_place.local,
                                                "asm! operand overlap",
                                            );
                                        }
                                    }
                                    InlineAsmOperand::Out { reg: _, late: _, place: None }
                                    | InlineAsmOperand::Const { value: _ }
                                    | InlineAsmOperand::SymFn { value: _ }
                                    | InlineAsmOperand::SymStatic { def_id: _ } => {}
                                }
                            }
                        }
                        InlineAsmOperand::InOut {
                            reg: _,
                            late: _,
                            in_value: _,
                            out_place: None,
                        }
                        | InlineAsmOperand::In { reg: _, value: _ }
                        | InlineAsmOperand::Out { reg: _, late: _, place: None }
                        | InlineAsmOperand::Const { value: _ }
                        | InlineAsmOperand::SymFn { value: _ }
                        | InlineAsmOperand::SymStatic { def_id: _ } => {}
                    }
                }
            }

            TerminatorKind::Goto { .. }
            | TerminatorKind::Call { destination: None, .. }
            | TerminatorKind::SwitchInt { .. }
            | TerminatorKind::Resume
            | TerminatorKind::Abort
            | TerminatorKind::Return
            | TerminatorKind::Unreachable
            | TerminatorKind::Drop { .. }
            | TerminatorKind::Assert { .. }
            | TerminatorKind::GeneratorDrop
            | TerminatorKind::FalseEdge { .. }
            | TerminatorKind::FalseUnwind { .. } => {}
        }
    }

    /// Checks whether `a` and `b` may be merged. Returns `false` if there's a conflict.
    fn can_unify(&mut self, a: Local, b: Local) -> bool {
        // After some locals have been unified, their conflicts are only tracked in the root key,
        // so look that up.
        let a = self.unified_locals.find(a).0;
        let b = self.unified_locals.find(b).0;

        if a == b {
            // Already merged (part of the same connected component).
            return false;
        }

        if self.matrix.contains(a, b) {
            // Conflict (derived via dataflow, intra-statement conflicts, or inherited from another
            // local during unification).
            return false;
        }

        true
    }

    /// Merges the conflicts of `a` and `b`, so that each one inherits all conflicts of the other.
    ///
    /// `can_unify` must have returned `true` for the same locals, or this may panic or lead to
    /// miscompiles.
    ///
    /// This is called when the pass makes the decision to unify `a` and `b` (or parts of `a` and
    /// `b`) and is needed to ensure that future unification decisions take potentially newly
    /// introduced conflicts into account.
    ///
    /// As an example, assume we have locals `_0`, `_1`, `_2`, and `_3`, with these conflicts:
    ///
    /// * `_0` <-> `_1`
    /// * `_1` <-> `_2`
    /// * `_3` <-> `_0`
    ///
    /// We then decide to merge `_2` with `_3` since they don't conflict. Then we decide to merge
    /// `_2` with `_0`, which also doesn't conflict according to the list above. However, `_2` has
    /// been unified with `_3`, which *does* conflict with `_0`.
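    ///
    /// Purely as an illustration (this restates the example above, it is not additional behavior
    /// of the pass), the conflict sets evolve roughly like this:
    ///
    /// ```text
    /// initial conflicts:    _0 <-> _1,  _1 <-> _2,  _3 <-> _0
    /// unify(_2, _3):        the root of {_2, _3} now conflicts with _1 (via _2)
    ///                       and with _0 (via _3)
    /// can_unify(_2, _0):    find(_2) yields the root of {_2, _3}, whose row
    ///                       contains _0, so the merge is rejected
    /// ```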
    fn unify(&mut self, a: Local, b: Local) {
        trace!("unify({:?}, {:?})", a, b);

        // Get the root local of the connected components. The root local stores the conflicts of
        // all locals in the connected component (and *is stored* as the conflicting local of other
        // locals).
        let a = self.unified_locals.find(a).0;
        let b = self.unified_locals.find(b).0;
        assert_ne!(a, b);

        trace!("roots: a={:?}, b={:?}", a, b);
        trace!("{:?} conflicts: {:?}", a, self.matrix.iter(a).format(", "));
        trace!("{:?} conflicts: {:?}", b, self.matrix.iter(b).format(", "));

        self.unified_locals.union(a, b);

        let root = self.unified_locals.find(a).0;
        assert!(root == a || root == b);

        // Make all locals that conflict with `a` also conflict with `b`, and vice versa.
        self.unify_cache.clear();
        for conflicts_with_a in self.matrix.iter(a) {
            self.unify_cache.insert(conflicts_with_a);
        }
        for conflicts_with_b in self.matrix.iter(b) {
            self.unify_cache.insert(conflicts_with_b);
        }
        for conflicts_with_a_or_b in self.unify_cache.iter() {
            // Set both `a` and `b` for this local's row.
            self.matrix.insert(conflicts_with_a_or_b, a);
            self.matrix.insert(conflicts_with_a_or_b, b);
        }

        // Write the locals `a` conflicts with to `b`'s row.
        self.matrix.union_rows(a, b);
        // Write the locals `b` conflicts with to `a`'s row.
        self.matrix.union_rows(b, a);
    }
}

/// A `dest = {move} src;` statement at `loc`.
///
/// We want to consider merging `dest` and `src` due to this assignment.
#[derive(Debug, Copy, Clone)]
struct CandidateAssignment<'tcx> {
    /// Does not contain indirection or indexing (so the only local it contains is the place base).
    dest: Place<'tcx>,
    src: Local,
    loc: Location,
}

/// Scans the MIR for assignments between locals that we might want to consider merging.
///
/// This will filter out assignments that do not match the right form (as described in the top-level
/// comment) and also throw out assignments that involve a local that has its address taken or is
/// otherwise ineligible (eg. locals used as array indices are ignored because we cannot propagate
/// arbitrary places into array indices).
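///
/// As a hypothetical sketch (made-up MIR statements, not taken from a real crate), these are the
/// kinds of assignments that are accepted or rejected:
///
/// ```text
/// _3.field = move _2;   // candidate: dest = _3.field, src = _2
/// (*_3) = move _2;      // rejected: dest has pointer indirection
/// _3 = move _2.field;   // rejected: src is not a plain local
/// _3[_4] = move _2;     // rejected: dest contains an indexing projection
/// ```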
fn find_candidates<'tcx>(body: &Body<'tcx>) -> Vec<CandidateAssignment<'tcx>> {
    let mut visitor = FindAssignments {
        body,
        candidates: Vec::new(),
        ever_borrowed_locals: ever_borrowed_locals(body),
        locals_used_as_array_index: locals_used_as_array_index(body),
    };
    visitor.visit_body(body);
    visitor.candidates
}

struct FindAssignments<'a, 'tcx> {
    body: &'a Body<'tcx>,
    candidates: Vec<CandidateAssignment<'tcx>>,
    ever_borrowed_locals: BitSet<Local>,
    locals_used_as_array_index: BitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for FindAssignments<'_, 'tcx> {
    fn visit_statement(&mut self, statement: &Statement<'tcx>, location: Location) {
        if let StatementKind::Assign(box (
            dest,
            Rvalue::Use(Operand::Copy(src) | Operand::Move(src)),
        )) = &statement.kind
        {
            // `dest` must not have pointer indirection.
            if dest.is_indirect() {
                return;
            }

            // `src` must be a plain local.
            if !src.projection.is_empty() {
                return;
            }

            // Since we want to replace `src` with `dest`, `src` must not be required.
            if is_local_required(src.local, self.body) {
                return;
            }

            // Can't optimize if either local ever has its address taken. This optimization
            // performs liveness analysis based only on assignments, and a local can be live even
            // if it's never assigned to again, because a reference to it might be live.
            // FIXME: This can be smarter and take `StorageDead` into account (which invalidates
            // borrows).
            if self.ever_borrowed_locals.contains(dest.local)
                || self.ever_borrowed_locals.contains(src.local)
            {
                return;
            }

            assert_ne!(dest.local, src.local, "self-assignments are UB");

            // We can't replace locals occurring in `PlaceElem::Index` for now.
            if self.locals_used_as_array_index.contains(src.local) {
                return;
            }

            for elem in dest.projection {
                if let PlaceElem::Index(_) = elem {
                    // `dest` contains an indexing projection.
                    return;
                }
            }

            self.candidates.push(CandidateAssignment {
                dest: *dest,
                src: src.local,
                loc: location,
            });
        }
    }
}

/// Some locals are part of the function's interface and cannot be removed.
///
/// Note that these locals *can* still be merged with non-required locals by removing that other
/// local.
fn is_local_required(local: Local, body: &Body<'_>) -> bool {
    match body.local_kind(local) {
        LocalKind::Arg | LocalKind::ReturnPointer => true,
        LocalKind::Var | LocalKind::Temp => false,
    }
}

/// Walks MIR to find all locals that have their address taken anywhere.
fn ever_borrowed_locals(body: &Body<'_>) -> BitSet<Local> {
    let mut visitor = BorrowCollector { locals: BitSet::new_empty(body.local_decls.len()) };
    visitor.visit_body(body);
    visitor.locals
}

struct BorrowCollector {
    locals: BitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for BorrowCollector {
    fn visit_rvalue(&mut self, rvalue: &Rvalue<'tcx>, location: Location) {
        self.super_rvalue(rvalue, location);

        match rvalue {
            Rvalue::AddressOf(_, borrowed_place) | Rvalue::Ref(_, _, borrowed_place) => {
                if !borrowed_place.is_indirect() {
                    self.locals.insert(borrowed_place.local);
                }
            }

            Rvalue::Cast(..)
            | Rvalue::ShallowInitBox(..)
            | Rvalue::Use(..)
            | Rvalue::Repeat(..)
            | Rvalue::Len(..)
            | Rvalue::BinaryOp(..)
            | Rvalue::CheckedBinaryOp(..)
            | Rvalue::NullaryOp(..)
            | Rvalue::UnaryOp(..)
            | Rvalue::Discriminant(..)
            | Rvalue::Aggregate(..)
            | Rvalue::ThreadLocalRef(..) => {}
        }
    }

    fn visit_terminator(&mut self, terminator: &Terminator<'tcx>, location: Location) {
        self.super_terminator(terminator, location);

        match terminator.kind {
            TerminatorKind::Drop { place: dropped_place, .. }
            | TerminatorKind::DropAndReplace { place: dropped_place, .. } => {
                self.locals.insert(dropped_place.local);
            }

            TerminatorKind::Abort
            | TerminatorKind::Assert { .. }
            | TerminatorKind::Call { .. }
            | TerminatorKind::FalseEdge { .. }
            | TerminatorKind::FalseUnwind { .. }
            | TerminatorKind::GeneratorDrop
            | TerminatorKind::Goto { .. }
            | TerminatorKind::Resume
            | TerminatorKind::Return
            | TerminatorKind::SwitchInt { .. }
            | TerminatorKind::Unreachable
            | TerminatorKind::Yield { .. }
            | TerminatorKind::InlineAsm { .. } => {}
        }
    }
}

/// `PlaceElem::Index` only stores a `Local`, so we can't replace that with a full `Place`.
///
/// Collect locals used as indices so we don't generate candidates that are impossible to apply
/// later.
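///
/// As a hypothetical example, in `_1 = _2[_3];` the index is stored as `PlaceElem::Index(_3)`,
/// which can only name a `Local`; replacing `_3` with a non-trivial place such as `_4.field`
/// would be unrepresentable:
///
/// ```text
/// _1 = _2[_3];   // `_3` is collected here and excluded from merging
/// ```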
fn locals_used_as_array_index(body: &Body<'_>) -> BitSet<Local> {
    let mut visitor = IndexCollector { locals: BitSet::new_empty(body.local_decls.len()) };
    visitor.visit_body(body);
    visitor.locals
}

struct IndexCollector {
    locals: BitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for IndexCollector {
    fn visit_projection_elem(
        &mut self,
        local: Local,
        proj_base: &[PlaceElem<'tcx>],
        elem: PlaceElem<'tcx>,
        context: PlaceContext,
        location: Location,
    ) {
        if let PlaceElem::Index(i) = elem {
            self.locals.insert(i);
        }
        self.super_projection_elem(local, proj_base, elem, context, location);
    }
}