src/tools/miri/src/concurrency/weak_memory.rs

   1 //! Implementation of C++11-consistent weak memory emulation using store buffers
   2 //! based on Dynamic Race Detection for C++ ("the paper"):
   3 //! <https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2017/POPL.pdf>
   4 //!
   5 //! This implementation will never generate weak memory behaviours forbidden by the C++11 model,
   6 //! but it is incapable of producing all possible weak behaviours allowed by the model. There are
   7 //! certain weak behaviours observable on real hardware but not while using this.
   8 //!
   9 //! Note that this implementation does not fully take into account of C++20's memory model revision to SC accesses
  10 //! and fences introduced by P0668 (<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0668r5.html>).
  11 //! This implementation is not fully correct under the revised C++20 model and may generate behaviours C++20
  12 //! disallows (<https://github.com/rust-lang/miri/issues/2301>).
  13 //!
  14 //! A modification is made to the paper's model to partially address C++20 changes.
  15 //! Specifically, if an SC load reads from an atomic store of any ordering, then a later SC load cannot read from
  16 //! an earlier store in the location's modification order. This is to prevent creating a backwards S edge from the second
  17 //! load to the first, as a result of C++20's coherence-ordered before rules.
  18 //!
  19 //! Rust follows the C++20 memory model (except for the Consume ordering and some operations not performable through C++'s
  20 //! `std::atomic<T>` API). It is therefore possible for this implementation to generate behaviours never observable when the
  21 //! same program is compiled and run natively. Unfortunately, no literature exists at the time of writing which proposes
  22 //! an implementable and C++20-compatible relaxed memory model that supports all atomic operation existing in Rust. The closest one is
  23 //! A Promising Semantics for Relaxed-Memory Concurrency by Jeehoon Kang et al. (<https://www.cs.tau.ac.il/~orilahav/papers/popl17.pdf>)
  24 //! However, this model lacks SC accesses and is therefore unusable by Miri (SC accesses are everywhere in library code).
  25 //!
  26 //! If you find anything that proposes a relaxed memory model that is C++20-consistent, supports all orderings Rust's atomic accesses
  27 //! and fences accept, and is implementable (with operational semanitcs), please open a GitHub issue!
  28 //!
  29 //! One characteristic of this implementation, in contrast to some other notable operational models such as ones proposed in
  30 //! Taming Release-Acquire Consistency by Ori Lahav et al. (<https://plv.mpi-sws.org/sra/paper.pdf>) or Promising Semantics noted above,
  31 //! is that this implementation does not require each thread to hold an isolated view of the entire memory. Here, store buffers are per-location
  32 //! and shared across all threads. This is more memory efficient but does require store elements (representing writes to a location) to record
  33 //! information about reads, whereas in the other two models it is the other way round: reads points to the write it got its value from.
  34 //! Additionally, writes in our implementation do not have globally unique timestamps attached. In the other two models this timestamp is
  35 //! used to make sure a value in a thread's view is not overwritten by a write that occured earlier than the one in the existing view.
  36 //! In our implementation, this is detected using read information attached to store elements, as there is no data strucutre representing reads.
  37 //!
  38 //! The C++ memory model is built around the notion of an 'atomic object', so it would be natural
  39 //! to attach store buffers to atomic objects. However, Rust follows LLVM in that it only has
  40 //! 'atomic accesses'. Therefore Miri cannot know when and where atomic 'objects' are being
  41 //! created or destroyed, to manage its store buffers. Instead, we hence lazily create an
  42 //! atomic object on the first atomic access to a given region, and we destroy that object
  43 //! on the next non-atomic or imperfectly overlapping atomic access to that region.
  44 //! These lazy (de)allocations happen in memory_accessed() on non-atomic accesses, and
  45 //! get_or_create_store_buffer() on atomic accesses. This mostly works well, but it does
  46 //! lead to some issues (<https://github.com/rust-lang/miri/issues/2164>).
  47 //!
  48 //! One consequence of this difference is that safe/sound Rust allows for more operations on atomic locations
  49 //! than the C++20 atomic API was intended to allow, such as non-atomically accessing
  50 //! a previously atomically accessed location, or accessing previously atomically accessed locations with a differently sized operation
  51 //! (such as accessing the top 16 bits of an AtomicU32). These senarios are generally undiscussed in formalisations of C++ memory model.
  52 //! In Rust, these operations can only be done through a `&mut AtomicFoo` reference or one derived from it, therefore these operations
  53 //! can only happen after all previous accesses on the same locations. This implementation is adapted to allow these operations.
  54 //! A mixed atomicity read that races with writes, or a write that races with reads or writes will still cause UBs to be thrown.
  55 //! Mixed size atomic accesses must not race with any other atomic access, whether read or write, or a UB will be thrown.
  56 //! You can refer to test cases in weak_memory/extra_cpp.rs and weak_memory/extra_cpp_unsafe.rs for examples of these operations.
  57
  58 // Our and the author's own implementation (tsan11) of the paper have some deviations from the provided operational semantics in §5.3:
  59 // 1. In the operational semantics, store elements keep a copy of the atomic object's vector clock (AtomicCellClocks::sync_vector in miri),
  60 // but this is not used anywhere so it's omitted here.
  61 //
  62 // 2. In the operational semantics, each store element keeps the timestamp of a thread when it loads from the store.
  63 // If the same thread loads from the same store element multiple times, then the timestamps at all loads are saved in a list of load elements.
  64 // This is not necessary as later loads by the same thread will always have greater timetstamp values, so we only need to record the timestamp of the first
  65 // load by each thread. This optimisation is done in tsan11
  66 // (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.h#L35-L37)
  67 // and here.
  68 //
  69 // 3. §4.5 of the paper wants an SC store to mark all existing stores in the buffer that happens before it
  70 // as SC. This is not done in the operational semantics but implemented correctly in tsan11
  71 // (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.cc#L160-L167)
  72 // and here.
  73 //
  74 // 4. W_SC ; R_SC case requires the SC load to ignore all but last store maked SC (stores not marked SC are not
  75 // affected). But this rule is applied to all loads in ReadsFromSet from the paper (last two lines of code), not just SC load.
  76 // This is implemented correctly in tsan11
  77 // (https://github.com/ChrisLidbury/tsan11/blob/ecbd6b81e9b9454e01cba78eb9d88684168132c7/lib/tsan/rtl/tsan_relaxed.cc#L295)
  78 // and here.
  79
  80 use std::{
  81     cell::{Ref, RefCell},
  82     collections::VecDeque,
  83 };
  84
  85 use rustc_const_eval::interpret::{alloc_range, AllocRange, InterpResult, MPlaceTy, Scalar};
  86 use rustc_data_structures::fx::FxHashMap;
  87
  88 use crate::*;
  89
  90 use super::{
  91     data_race::{GlobalState as DataRaceState, ThreadClockSet},
  92     range_object_map::{AccessType, RangeObjectMap},
  93     vector_clock::{VClock, VTimestamp, VectorIdx},
  94 };
  95
  96 pub type AllocExtra = StoreBufferAlloc;
  97
  98 // Each store buffer must be bounded otherwise it will grow indefinitely.
  99 // However, bounding the store buffer means restricting the amount of weak
 100 // behaviours observable. The author picked 128 as a good tradeoff
 101 // so we follow them here.
 102 const STORE_BUFFER_LIMIT: usize = 128;
 103
 104 #[derive(Debug, Clone)]
 105 pub struct StoreBufferAlloc {
 106     /// Store buffer of each atomic object in this allocation
 107     // Behind a RefCell because we need to allocate/remove on read access
 108     store_buffers: RefCell<RangeObjectMap<StoreBuffer>>,
 109 }
 110
 111 impl VisitTags for StoreBufferAlloc {
 112     fn visit_tags(&self, visit: &mut dyn FnMut(SbTag)) {
 113         let Self { store_buffers } = self;
 114         for val in store_buffers
 115             .borrow()
 116             .iter()
 117             .flat_map(|buf| buf.buffer.iter().map(|element| &element.val))
 118         {
 119             val.visit_tags(visit);
 120         }
 121     }
 122 }
 123
 124 #[derive(Debug, Clone, PartialEq, Eq)]
 125 pub(super) struct StoreBuffer {
 126     // Stores to this location in modification order
 127     buffer: VecDeque<StoreElement>,
 128 }
 129
 130 /// Whether a load returned the latest value or not.
 131 #[derive(PartialEq, Eq)]
 132 enum LoadRecency {
 133     Latest,
 134     Outdated,
 135 }
 136
 137 #[derive(Debug, Clone, PartialEq, Eq)]
 138 struct StoreElement {
 139     /// The identifier of the vector index, corresponding to a thread
 140     /// that performed the store.
 141     store_index: VectorIdx,
 142
 143     /// Whether this store is SC.
 144     is_seqcst: bool,
 145
 146     /// The timestamp of the storing thread when it performed the store
 147     timestamp: VTimestamp,
 148     /// The value of this store
 149     // FIXME: this means the store must be fully initialized;
 150     // we will have to change this if we want to support atomics on
 151     // (partially) uninitialized data.
 152     val: Scalar<Provenance>,
 153
 154     /// Metadata about loads from this store element,
 155     /// behind a RefCell to keep load op take &self
 156     load_info: RefCell<LoadInfo>,
 157 }
 158
 159 #[derive(Debug, Clone, PartialEq, Eq, Default)]
 160 struct LoadInfo {
 161     /// Timestamp of first loads from this store element by each thread
 162     timestamps: FxHashMap<VectorIdx, VTimestamp>,
 163     /// Whether this store element has been read by an SC load
 164     sc_loaded: bool,
 165 }
 166
 167 impl StoreBufferAlloc {
 168     pub fn new_allocation() -> Self {
 169         Self { store_buffers: RefCell::new(RangeObjectMap::new()) }
 170     }
 171
 172     /// Checks if the range imperfectly overlaps with existing buffers
 173     /// Used to determine if mixed-size atomic accesses
 174     fn is_overlapping(&self, range: AllocRange) -> bool {
 175         let buffers = self.store_buffers.borrow();
 176         let access_type = buffers.access_type(range);
 177         matches!(access_type, AccessType::ImperfectlyOverlapping(_))
 178     }
 179
 180     /// When a non-atomic access happens on a location that has been atomically accessed
 181     /// before without data race, we can determine that the non-atomic access fully happens
 182     /// after all the prior atomic accesses so the location no longer needs to exhibit
 183     /// any weak memory behaviours until further atomic accesses.
 184     pub fn memory_accessed(&self, range: AllocRange, global: &DataRaceState) {
 185         if !global.ongoing_action_data_race_free() {
 186             let mut buffers = self.store_buffers.borrow_mut();
 187             let access_type = buffers.access_type(range);
 188             match access_type {
 189                 AccessType::PerfectlyOverlapping(pos) => {
 190                     buffers.remove_from_pos(pos);
 191                 }
 192                 AccessType::ImperfectlyOverlapping(pos_range) => {
 193                     buffers.remove_pos_range(pos_range);
 194                 }
 195                 AccessType::Empty(_) => {
 196                     // The range had no weak behaivours attached, do nothing
 197                 }
 198             }
 199         }
 200     }
 201
 202     /// Gets a store buffer associated with an atomic object in this allocation,
 203     /// or creates one with the specified initial value if no atomic object exists yet.
 204     fn get_or_create_store_buffer<'tcx>(
 205         &self,
 206         range: AllocRange,
 207         init: Scalar<Provenance>,
 208     ) -> InterpResult<'tcx, Ref<'_, StoreBuffer>> {
 209         let access_type = self.store_buffers.borrow().access_type(range);
 210         let pos = match access_type {
 211             AccessType::PerfectlyOverlapping(pos) => pos,
 212             AccessType::Empty(pos) => {
 213                 let mut buffers = self.store_buffers.borrow_mut();
 214                 buffers.insert_at_pos(pos, range, StoreBuffer::new(init));
 215                 pos
 216             }
 217             AccessType::ImperfectlyOverlapping(pos_range) => {
 218                 // Once we reach here we would've already checked that this access is not racy
 219                 let mut buffers = self.store_buffers.borrow_mut();
 220                 buffers.remove_pos_range(pos_range.clone());
 221                 buffers.insert_at_pos(pos_range.start, range, StoreBuffer::new(init));
 222                 pos_range.start
 223             }
 224         };
 225         Ok(Ref::map(self.store_buffers.borrow(), |buffer| &buffer[pos]))
 226     }
 227
 228     /// Gets a mutable store buffer associated with an atomic object in this allocation
 229     fn get_or_create_store_buffer_mut<'tcx>(
 230         &mut self,
 231         range: AllocRange,
 232         init: Scalar<Provenance>,
 233     ) -> InterpResult<'tcx, &mut StoreBuffer> {
 234         let buffers = self.store_buffers.get_mut();
 235         let access_type = buffers.access_type(range);
 236         let pos = match access_type {
 237             AccessType::PerfectlyOverlapping(pos) => pos,
 238             AccessType::Empty(pos) => {
 239                 buffers.insert_at_pos(pos, range, StoreBuffer::new(init));
 240                 pos
 241             }
 242             AccessType::ImperfectlyOverlapping(pos_range) => {
 243                 buffers.remove_pos_range(pos_range.clone());
 244                 buffers.insert_at_pos(pos_range.start, range, StoreBuffer::new(init));
 245                 pos_range.start
 246             }
 247         };
 248         Ok(&mut buffers[pos])
 249     }
 250 }
 251
 252 impl<'mir, 'tcx: 'mir> StoreBuffer {
 253     fn new(init: Scalar<Provenance>) -> Self {
 254         let mut buffer = VecDeque::new();
 255         buffer.reserve(STORE_BUFFER_LIMIT);
 256         let mut ret = Self { buffer };
 257         let store_elem = StoreElement {
 258             // The thread index and timestamp of the initialisation write
 259             // are never meaningfully used, so it's fine to leave them as 0
 260             store_index: VectorIdx::from(0),
 261             timestamp: 0,
 262             val: init,
 263             is_seqcst: false,
 264             load_info: RefCell::new(LoadInfo::default()),
 265         };
 266         ret.buffer.push_back(store_elem);
 267         ret
 268     }
 269
 270     /// Reads from the last store in modification order
 271     fn read_from_last_store(
 272         &self,
 273         global: &DataRaceState,
 274         thread_mgr: &ThreadManager<'_, '_>,
 275         is_seqcst: bool,
 276     ) {
 277         let store_elem = self.buffer.back();
 278         if let Some(store_elem) = store_elem {
 279             let (index, clocks) = global.current_thread_state(thread_mgr);
 280             store_elem.load_impl(index, &clocks, is_seqcst);
 281         }
 282     }
 283
 284     fn buffered_read(
 285         &self,
 286         global: &DataRaceState,
 287         thread_mgr: &ThreadManager<'_, '_>,
 288         is_seqcst: bool,
 289         rng: &mut (impl rand::Rng + ?Sized),
 290         validate: impl FnOnce() -> InterpResult<'tcx>,
 291     ) -> InterpResult<'tcx, (Scalar<Provenance>, LoadRecency)> {
 292         // Having a live borrow to store_buffer while calling validate_atomic_load is fine
 293         // because the race detector doesn't touch store_buffer
 294
 295         let (store_elem, recency) = {
 296             // The `clocks` we got here must be dropped before calling validate_atomic_load
 297             // as the race detector will update it
 298             let (.., clocks) = global.current_thread_state(thread_mgr);
 299             // Load from a valid entry in the store buffer
 300             self.fetch_store(is_seqcst, &clocks, &mut *rng)
 301         };
 302
 303         // Unlike in buffered_atomic_write, thread clock updates have to be done
 304         // after we've picked a store element from the store buffer, as presented
 305         // in ATOMIC LOAD rule of the paper. This is because fetch_store
 306         // requires access to ThreadClockSet.clock, which is updated by the race detector
 307         validate()?;
 308
 309         let (index, clocks) = global.current_thread_state(thread_mgr);
 310         let loaded = store_elem.load_impl(index, &clocks, is_seqcst);
 311         Ok((loaded, recency))
 312     }
 313
 314     fn buffered_write(
 315         &mut self,
 316         val: Scalar<Provenance>,
 317         global: &DataRaceState,
 318         thread_mgr: &ThreadManager<'_, '_>,
 319         is_seqcst: bool,
 320     ) -> InterpResult<'tcx> {
 321         let (index, clocks) = global.current_thread_state(thread_mgr);
 322
 323         self.store_impl(val, index, &clocks.clock, is_seqcst);
 324         Ok(())
 325     }
 326
 327     #[allow(clippy::if_same_then_else, clippy::needless_bool)]
 328     /// Selects a valid store element in the buffer.
 329     fn fetch_store<R: rand::Rng + ?Sized>(
 330         &self,
 331         is_seqcst: bool,
 332         clocks: &ThreadClockSet,
 333         rng: &mut R,
 334     ) -> (&StoreElement, LoadRecency) {
 335         use rand::seq::IteratorRandom;
 336         let mut found_sc = false;
 337         // FIXME: we want an inclusive take_while (stops after a false predicate, but
 338         // includes the element that gave the false), but such function doesn't yet
 339         // exist in the standard libary https://github.com/rust-lang/rust/issues/62208
 340         // so we have to hack around it with keep_searching
 341         let mut keep_searching = true;
 342         let candidates = self
 343             .buffer
 344             .iter()
 345             .rev()
 346             .take_while(move |&store_elem| {
 347                 if !keep_searching {
 348                     return false;
 349                 }
 350
 351                 keep_searching = if store_elem.timestamp <= clocks.clock[store_elem.store_index] {
 352                     // CoWR: if a store happens-before the current load,
 353                     // then we can't read-from anything earlier in modification order.
 354                     // C++20 §6.9.2.2 [intro.races] paragraph 18
 355                     false
 356                 } else if store_elem.load_info.borrow().timestamps.iter().any(
 357                     |(&load_index, &load_timestamp)| load_timestamp <= clocks.clock[load_index],
 358                 ) {
 359                     // CoRR: if there was a load from this store which happened-before the current load,
 360                     // then we cannot read-from anything earlier in modification order.
 361                     // C++20 §6.9.2.2 [intro.races] paragraph 16
 362                     false
 363                 } else if store_elem.timestamp <= clocks.fence_seqcst[store_elem.store_index] {
 364                     // The current load, which may be sequenced-after an SC fence, cannot read-before
 365                     // the last store sequenced-before an SC fence in another thread.
 366                     // C++17 §32.4 [atomics.order] paragraph 6
 367                     false
 368                 } else if store_elem.timestamp <= clocks.write_seqcst[store_elem.store_index]
 369                     && store_elem.is_seqcst
 370                 {
 371                     // The current non-SC load, which may be sequenced-after an SC fence,
 372                     // cannot read-before the last SC store executed before the fence.
 373                     // C++17 §32.4 [atomics.order] paragraph 4
 374                     false
 375                 } else if is_seqcst
 376                     && store_elem.timestamp <= clocks.read_seqcst[store_elem.store_index]
 377                 {
 378                     // The current SC load cannot read-before the last store sequenced-before
 379                     // the last SC fence.
 380                     // C++17 §32.4 [atomics.order] paragraph 5
 381                     false
 382                 } else if is_seqcst && store_elem.load_info.borrow().sc_loaded {
 383                     // The current SC load cannot read-before a store that an earlier SC load has observed.
 384                     // See https://github.com/rust-lang/miri/issues/2301#issuecomment-1222720427
 385                     // Consequences of C++20 §31.4 [atomics.order] paragraph 3.1, 3.3 (coherence-ordered before)
 386                     // and 4.1 (coherence-ordered before between SC makes global total order S)
 387                     false
 388                 } else {
 389                     true
 390                 };
 391
 392                 true
 393             })
 394             .filter(|&store_elem| {
 395                 if is_seqcst && store_elem.is_seqcst {
 396                     // An SC load needs to ignore all but last store maked SC (stores not marked SC are not
 397                     // affected)
 398                     let include = !found_sc;
 399                     found_sc = true;
 400                     include
 401                 } else {
 402                     true
 403                 }
 404             });
 405
 406         let chosen = candidates.choose(rng).expect("store buffer cannot be empty");
 407         if std::ptr::eq(chosen, self.buffer.back().expect("store buffer cannot be empty")) {
 408             (chosen, LoadRecency::Latest)
 409         } else {
 410             (chosen, LoadRecency::Outdated)
 411         }
 412     }
 413
 414     /// ATOMIC STORE IMPL in the paper (except we don't need the location's vector clock)
 415     fn store_impl(
 416         &mut self,
 417         val: Scalar<Provenance>,
 418         index: VectorIdx,
 419         thread_clock: &VClock,
 420         is_seqcst: bool,
 421     ) {
 422         let store_elem = StoreElement {
 423             store_index: index,
 424             timestamp: thread_clock[index],
 425             // In the language provided in the paper, an atomic store takes the value from a
 426             // non-atomic memory location.
 427             // But we already have the immediate value here so we don't need to do the memory
 428             // access
 429             val,
 430             is_seqcst,
 431             load_info: RefCell::new(LoadInfo::default()),
 432         };
 433         self.buffer.push_back(store_elem);
 434         if self.buffer.len() > STORE_BUFFER_LIMIT {
 435             self.buffer.pop_front();
 436         }
 437         if is_seqcst {
 438             // Every store that happens before this needs to be marked as SC
 439             // so that in a later SC load, only the last SC store (i.e. this one) or stores that
 440             // aren't ordered by hb with the last SC is picked.
 441             self.buffer.iter_mut().rev().for_each(|elem| {
 442                 if elem.timestamp <= thread_clock[elem.store_index] {
 443                     elem.is_seqcst = true;
 444                 }
 445             })
 446         }
 447     }
 448 }
 449
 450 impl StoreElement {
 451     /// ATOMIC LOAD IMPL in the paper
 452     /// Unlike the operational semantics in the paper, we don't need to keep track
 453     /// of the thread timestamp for every single load. Keeping track of the first (smallest)
 454     /// timestamp of each thread that has loaded from a store is sufficient: if the earliest
 455     /// load of another thread happens before the current one, then we must stop searching the store
 456     /// buffer regardless of subsequent loads by the same thread; if the earliest load of another
 457     /// thread doesn't happen before the current one, then no subsequent load by the other thread
 458     /// can happen before the current one.
 459     fn load_impl(
 460         &self,
 461         index: VectorIdx,
 462         clocks: &ThreadClockSet,
 463         is_seqcst: bool,
 464     ) -> Scalar<Provenance> {
 465         let mut load_info = self.load_info.borrow_mut();
 466         load_info.sc_loaded |= is_seqcst;
 467         let _ = load_info.timestamps.try_insert(index, clocks.clock[index]);
 468         self.val
 469     }
 470 }
 471
 472 impl<'mir, 'tcx: 'mir> EvalContextExt<'mir, 'tcx> for crate::MiriInterpCx<'mir, 'tcx> {}
 473 pub(super) trait EvalContextExt<'mir, 'tcx: 'mir>:
 474     crate::MiriInterpCxExt<'mir, 'tcx>
 475 {
 476     // If weak memory emulation is enabled, check if this atomic op imperfectly overlaps with a previous
 477     // atomic read or write. If it does, then we require it to be ordered (non-racy) with all previous atomic
 478     // accesses on all the bytes in range
 479     fn validate_overlapping_atomic(
 480         &self,
 481         place: &MPlaceTy<'tcx, Provenance>,
 482     ) -> InterpResult<'tcx> {
 483         let this = self.eval_context_ref();
 484         let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr)?;
 485         if let crate::AllocExtra {
 486             weak_memory: Some(alloc_buffers),
 487             data_race: Some(alloc_clocks),
 488             ..
 489         } = this.get_alloc_extra(alloc_id)?
 490         {
 491             let range = alloc_range(base_offset, place.layout.size);
 492             if alloc_buffers.is_overlapping(range)
 493                 && !alloc_clocks.race_free_with_atomic(
 494                     range,
 495                     this.machine.data_race.as_ref().unwrap(),
 496                     &this.machine.threads,
 497                 )
 498             {
 499                 throw_unsup_format!(
 500                     "racy imperfectly overlapping atomic access is not possible in the C++20 memory model, and not supported by Miri's weak memory emulation"
 501                 );
 502             }
 503         }
 504         Ok(())
 505     }
 506
 507     fn buffered_atomic_rmw(
 508         &mut self,
 509         new_val: Scalar<Provenance>,
 510         place: &MPlaceTy<'tcx, Provenance>,
 511         atomic: AtomicRwOrd,
 512         init: Scalar<Provenance>,
 513     ) -> InterpResult<'tcx> {
 514         let this = self.eval_context_mut();
 515         let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr)?;
 516         if let (
 517             crate::AllocExtra { weak_memory: Some(alloc_buffers), .. },
 518             crate::MiriMachine { data_race: Some(global), threads, .. },
 519         ) = this.get_alloc_extra_mut(alloc_id)?
 520         {
 521             if atomic == AtomicRwOrd::SeqCst {
 522                 global.sc_read(threads);
 523                 global.sc_write(threads);
 524             }
 525             let range = alloc_range(base_offset, place.layout.size);
 526             let buffer = alloc_buffers.get_or_create_store_buffer_mut(range, init)?;
 527             buffer.read_from_last_store(global, threads, atomic == AtomicRwOrd::SeqCst);
 528             buffer.buffered_write(new_val, global, threads, atomic == AtomicRwOrd::SeqCst)?;
 529         }
 530         Ok(())
 531     }
 532
 533     fn buffered_atomic_read(
 534         &self,
 535         place: &MPlaceTy<'tcx, Provenance>,
 536         atomic: AtomicReadOrd,
 537         latest_in_mo: Scalar<Provenance>,
 538         validate: impl FnOnce() -> InterpResult<'tcx>,
 539     ) -> InterpResult<'tcx, Scalar<Provenance>> {
 540         let this = self.eval_context_ref();
 541         if let Some(global) = &this.machine.data_race {
 542             let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr)?;
 543             if let Some(alloc_buffers) = this.get_alloc_extra(alloc_id)?.weak_memory.as_ref() {
 544                 if atomic == AtomicReadOrd::SeqCst {
 545                     global.sc_read(&this.machine.threads);
 546                 }
 547                 let mut rng = this.machine.rng.borrow_mut();
 548                 let buffer = alloc_buffers.get_or_create_store_buffer(
 549                     alloc_range(base_offset, place.layout.size),
 550                     latest_in_mo,
 551                 )?;
 552                 let (loaded, recency) = buffer.buffered_read(
 553                     global,
 554                     &this.machine.threads,
 555                     atomic == AtomicReadOrd::SeqCst,
 556                     &mut *rng,
 557                     validate,
 558                 )?;
 559                 if global.track_outdated_loads && recency == LoadRecency::Outdated {
 560                     this.emit_diagnostic(NonHaltingDiagnostic::WeakMemoryOutdatedLoad);
 561                 }
 562
 563                 return Ok(loaded);
 564             }
 565         }
 566
 567         // Race detector or weak memory disabled, simply read the latest value
 568         validate()?;
 569         Ok(latest_in_mo)
 570     }
 571
 572     fn buffered_atomic_write(
 573         &mut self,
 574         val: Scalar<Provenance>,
 575         dest: &MPlaceTy<'tcx, Provenance>,
 576         atomic: AtomicWriteOrd,
 577         init: Scalar<Provenance>,
 578     ) -> InterpResult<'tcx> {
 579         let this = self.eval_context_mut();
 580         let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(dest.ptr)?;
 581         if let (
 582             crate::AllocExtra { weak_memory: Some(alloc_buffers), .. },
 583             crate::MiriMachine { data_race: Some(global), threads, .. },
 584         ) = this.get_alloc_extra_mut(alloc_id)?
 585         {
 586             if atomic == AtomicWriteOrd::SeqCst {
 587                 global.sc_write(threads);
 588             }
 589
 590             // UGLY HACK: in write_scalar_atomic() we don't know the value before our write,
 591             // so init == val always. If the buffer is fresh then we would've duplicated an entry,
 592             // so we need to remove it.
 593             // See https://github.com/rust-lang/miri/issues/2164
 594             let was_empty = matches!(
 595                 alloc_buffers
 596                     .store_buffers
 597                     .borrow()
 598                     .access_type(alloc_range(base_offset, dest.layout.size)),
 599                 AccessType::Empty(_)
 600             );
 601             let buffer = alloc_buffers
 602                 .get_or_create_store_buffer_mut(alloc_range(base_offset, dest.layout.size), init)?;
 603             if was_empty {
 604                 buffer.buffer.pop_front();
 605             }
 606
 607             buffer.buffered_write(val, global, threads, atomic == AtomicWriteOrd::SeqCst)?;
 608         }
 609
 610         // Caller should've written to dest with the vanilla scalar write, we do nothing here
 611         Ok(())
 612     }
 613
 614     /// Caller should never need to consult the store buffer for the latest value.
 615     /// This function is used exclusively for failed atomic_compare_exchange_scalar
 616     /// to perform load_impl on the latest store element
 617     fn perform_read_on_buffered_latest(
 618         &self,
 619         place: &MPlaceTy<'tcx, Provenance>,
 620         atomic: AtomicReadOrd,
 621         init: Scalar<Provenance>,
 622     ) -> InterpResult<'tcx> {
 623         let this = self.eval_context_ref();
 624
 625         if let Some(global) = &this.machine.data_race {
 626             if atomic == AtomicReadOrd::SeqCst {
 627                 global.sc_read(&this.machine.threads);
 628             }
 629             let size = place.layout.size;
 630             let (alloc_id, base_offset, ..) = this.ptr_get_alloc_id(place.ptr)?;
 631             if let Some(alloc_buffers) = this.get_alloc_extra(alloc_id)?.weak_memory.as_ref() {
 632                 let buffer = alloc_buffers
 633                     .get_or_create_store_buffer(alloc_range(base_offset, size), init)?;
 634                 buffer.read_from_last_store(
 635                     global,
 636                     &this.machine.threads,
 637                     atomic == AtomicReadOrd::SeqCst,
 638                 );
 639             }
 640         }
 641         Ok(())
 642     }
 643 }