// option. This file may not be copied, modified, or distributed
// except according to those terms.
+/// This file includes the logic for exhaustiveness and usefulness checking for
+/// pattern-matching. Specifically, given a list of patterns for a type, we can
+/// tell whether:
+/// (a) the patterns cover every possible constructor for the type [exhaustiveness]
+/// (b) each pattern is necessary [usefulness]
+///
+/// The algorithm implemented here is a modified version of the one described in:
+/// http://moscova.inria.fr/~maranget/papers/warn/index.html
+/// However, to save future implementors from reading the original paper, I'm going
+/// to summarise the algorithm here to hopefully save time and be a little clearer
+/// (without being so rigorous).
+///
+/// The core of the algorithm revolves about a "usefulness" check. In particular, we
+/// are trying to compute a predicate `U(P, p_{m + 1})` where `P` is a list of patterns
+/// of length `m` for a compound (product) type with `n` components (we refer to this as
+/// a matrix). `U(P, p_{m + 1})` represents whether, given an existing list of patterns
+/// `p_1 ..= p_m`, adding a new pattern will be "useful" (that is, cover previously-
+/// uncovered values of the type).
+///
+/// If we have this predicate, then we can easily compute both exhaustiveness of an
+/// entire set of patterns and the individual usefulness of each one.
+/// (a) the set of patterns is exhaustive iff `U(P, _)` is false (i.e. adding a wildcard
+/// match doesn't increase the number of values we're matching)
+/// (b) a pattern `p_i` is not useful if `U(P[0..=(i-1), p_i)` is false (i.e. adding a
+/// pattern to those that have come before it doesn't increase the number of values
+/// we're matching).
+///
+/// For example, say we have the following:
+/// ```
+/// // x: (Option<bool>, Result<()>)
+/// match x {
+/// (Some(true), _) => {}
+/// (None, Err(())) => {}
+/// (None, Err(_)) => {}
+/// }
+/// ```
+/// Here, the matrix `P` is 3 x 2 (rows x columns).
+/// [
+/// [Some(true), _],
+/// [None, Err(())],
+/// [None, Err(_)],
+/// ]
+/// We can tell it's not exhaustive, because `U(P, _)` is true (we're not covering
+/// `[Some(false), _]`, for instance). In addition, row 3 is not useful, because
+/// all the values it covers are already covered by row 2.
+///
+/// To compute `U`, we must have two other concepts.
+/// 1. `S(c, P)` is a "specialised matrix", where `c` is a constructor (like `Some` or
+/// `None`). You can think of it as filtering `P` to just the rows whose *first* pattern
+/// can cover `c` (and expanding OR-patterns into distinct patterns), and then expanding
+/// the constructor into all of its components.
+/// The specialisation of a row vector is computed by `specialize`.
+///
+/// It is computed as follows. For each row `p_i` of P, we have four cases:
+/// 1.1. `p_(i,1) = c(r_1, .., r_a)`. Then `S(c, P)` has a corresponding row:
+/// r_1, .., r_a, p_(i,2), .., p_(i,n)
+/// 1.2. `p_(i,1) = c'(r_1, .., r_a')` where `c ≠ c'`. Then `S(c, P)` has no
+/// corresponding row.
+/// 1.3. `p_(i,1) = _`. Then `S(c, P)` has a corresponding row:
+/// _, .., _, p_(i,2), .., p_(i,n)
+/// 1.4. `p_(i,1) = r_1 | r_2`. Then `S(c, P)` has corresponding rows inlined from:
+/// S(c, (r_1, p_(i,2), .., p_(i,n)))
+/// S(c, (r_2, p_(i,2), .., p_(i,n)))
+///
+/// 2. `D(P)` is a "default matrix". This is used when we know there are missing
+/// constructor cases, but there might be existing wildcard patterns, so to check the
+/// usefulness of the matrix, we have to check all its *other* components.
+/// The default matrix is computed inline in `is_useful`.
+///
+/// It is computed as follows. For each row `p_i` of P, we have three cases:
+/// 1.1. `p_(i,1) = c(r_1, .., r_a)`. Then `D(P)` has no corresponding row.
+/// 1.2. `p_(i,1) = _`. Then `D(P)` has a corresponding row:
+/// p_(i,2), .., p_(i,n)
+/// 1.3. `p_(i,1) = r_1 | r_2`. Then `D(P)` has corresponding rows inlined from:
+/// D((r_1, p_(i,2), .., p_(i,n)))
+/// D((r_2, p_(i,2), .., p_(i,n)))
+///
+/// Note that the OR-patterns are not always used directly in Rust, but are used to derive
+/// the exhaustive integer matching rules, so they're written here for posterity.
+///
+/// The algorithm for computing `U`
+/// -------------------------------
+/// The algorithm is inductive (on the number of columns: i.e. components of tuple patterns).
+/// That means we're going to check the components from left-to-right, so the algorithm
+/// operates principally on the first component of the matrix and new pattern `p_{m + 1}`.
+/// This algorithm is realised in the `is_useful` function.
+///
+/// Base case. (`n = 0`, i.e. an empty tuple pattern)
+/// - If `P` already contains an empty pattern (i.e. if the number of patterns `m > 0`),
+/// then `U(P, p_{m + 1})` is false.
+/// - Otherwise, `P` must be empty, so `U(P, p_{m + 1})` is true.
+///
+/// Inductive step. (`n > 0`, i.e. whether there's at least one column
+/// [which may then be expanded into further columns later])
+/// We're going to match on the new pattern, `p_{m + 1}`.
+/// - If `p_{m + 1} == c(r_1, .., r_a)`, then we have a constructor pattern.
+/// Thus, the usefulness of `p_{m + 1}` can be reduced to whether it is useful when
+/// we ignore all the patterns in `P` that involve other constructors. This is where
+/// `S(c, P)` comes in:
+/// `U(P, p_{m + 1}) := U(S(c, P), S(c, p_{m + 1}))`
+/// This special case is handled in `is_useful_specialized`.
+/// - If `p_{m + 1} == _`, then we have two more cases:
+/// + All the constructors of the first component of the type exist within
+/// all the rows (after having expanded OR-patterns). In this case:
+/// `U(P, p_{m + 1}) := ∨(k ϵ constructors) U(S(k, P), S(k, p_{m + 1}))`
+/// I.e. the pattern `p_{m + 1}` is only useful when all the constructors are
+/// present *if* its later components are useful for the respective constructors
+/// covered by `p_{m + 1}` (usually a single constructor, but all in the case of `_`).
+/// + Some constructors are not present in the existing rows (after having expanded
+/// OR-patterns). However, there might be wildcard patterns (`_`) present. Thus, we
+/// are only really concerned with the other patterns leading with wildcards. This is
+/// where `D` comes in:
+/// `U(P, p_{m + 1}) := U(D(P), p_({m + 1},2), .., p_({m + 1},n))`
+/// - If `p_{m + 1} == r_1 | r_2`, then the usefulness depends on each separately:
+/// `U(P, p_{m + 1}) := U(P, (r_1, p_({m + 1},2), .., p_({m + 1},n)))
+/// || U(P, (r_2, p_({m + 1},2), .., p_({m + 1},n)))`
+///
+/// Modifications to the algorithm
+/// ------------------------------
+/// The algorithm in the paper doesn't cover some of the special cases that arise in Rust, for
+/// example uninhabited types and variable-length slice patterns. These are drawn attention to
+/// throughout the code below. I'll make a quick note here about how exhaustive integer matching
+/// is accounted for, though.
+///
+/// Exhaustive integer matching
+/// ---------------------------
+/// An integer type can be thought of as a (huge) sum type: 1 | 2 | 3 | ...
+/// So to support exhaustive integer matching, we can make use of the logic in the paper for
+/// OR-patterns. However, we obviously can't just treat ranges x..=y as individual sums, because
+/// they are likely gigantic. So we instead treat ranges as constructors of the integers. This means
+/// that we have a constructor *of* constructors (the integers themselves). We then need to work
+/// through all the inductive step rules above, deriving how the ranges would be treated as
+/// OR-patterns, and making sure that they're treated in the same way even when they're ranges.
+/// There are really only four special cases here:
+/// - When we match on a constructor that's actually a range, we have to treat it as if we would
+/// an OR-pattern.
+/// + It turns out that we can simply extend the case for single-value patterns in
+/// `specialize` to either be *equal* to a value constructor, or *contained within* a range
+/// constructor.
+/// + When the pattern itself is a range, you just want to tell whether any of the values in
+/// the pattern range coincide with values in the constructor range, which is precisely
+/// intersection.
+/// Since when encountering a range pattern for a value constructor, we also use inclusion, it
+/// means that whenever the constructor is a value/range and the pattern is also a value/range,
+/// we can simply use intersection to test usefulness.
+/// - When we're testing for usefulness of a pattern and the pattern's first component is a
+/// wildcard.
+/// + If all the constructors appear in the matrix, we have a slight complication. By default,
+/// the behaviour (i.e. a disjunction over specialised matrices for each constructor) is
+/// invalid, because we want a disjunction over every *integer* in each range, not just a
+/// disjunction over every range. This is a bit more tricky to deal with: essentially we need
+/// to form equivalence classes of subranges of the constructor range for which the behaviour
+/// of the matrix `P` and new pattern `p_{m + 1}` are the same. This is described in more
+/// detail in `split_grouped_constructors`.
+/// + If some constructors are missing from the matrix, it turns out we don't need to do
+/// anything special (because we know none of the integers are actually wildcards: i.e. we
+/// can't span wildcards using ranges).
+
use self::Constructor::*;
use self::Usefulness::*;
use self::WitnessPreference::*;
use rustc::hir::def_id::DefId;
use rustc::hir::RangeEnd;
use rustc::ty::{self, Ty, TyCtxt, TypeFoldable};
+use rustc::ty::layout::{Integer, IntegerExt};
use rustc::mir::Field;
use rustc::mir::interpret::ConstValue;
use rustc::util::common::ErrorReported;
+use syntax::attr::{SignedInt, UnsignedInt};
use syntax_pos::{Span, DUMMY_SP};
use arena::TypedArena;
-use std::cmp::{self, Ordering};
+use std::cmp::{self, Ordering, min, max};
use std::fmt;
-use std::iter::{FromIterator, IntoIterator, repeat};
+use std::iter::{FromIterator, IntoIterator};
+use std::ops::RangeInclusive;
pub fn expand_pattern<'a, 'tcx>(cx: &MatchCheckCtxt<'a, 'tcx>, pat: Pattern<'tcx>)
-> &'a Pattern<'tcx>
}).collect();
let total_width = column_widths.iter().cloned().sum::<usize>() + column_count * 3 + 1;
- let br = repeat('+').take(total_width).collect::<String>();
+ let br = "+".repeat(total_width);
write!(f, "{}\n", br)?;
for row in pretty_printed_matrix {
write!(f, "+")?;
}
}
-//NOTE: appears to be the only place other then InferCtxt to contain a ParamEnv
pub struct MatchCheckCtxt<'a, 'tcx: 'a> {
pub tcx: TyCtxt<'a, 'tcx, 'tcx>,
/// The module in which the match occurs. This is necessary for
tcx,
module,
pattern_arena: &pattern_arena,
- byte_array_map: FxHashMap(),
+ byte_array_map: FxHashMap::default(),
})
}
}
}
-#[derive(Clone)]
+#[derive(Clone, Debug)]
pub enum Usefulness<'tcx> {
Useful,
UsefulWithWitness(Vec<Witness<'tcx>>),
}
}
-#[derive(Copy, Clone)]
+#[derive(Copy, Clone, Debug)]
pub enum WitnessPreference {
ConstructWitness,
LeaveOutWitness
max_slice_length: u64,
}
-/// A stack of patterns in reverse order of construction
-#[derive(Clone)]
+/// A witness of non-exhaustiveness for error reporting, represented
+/// as a list of patterns (in reverse order of construction) with
+/// wildcards inside to represent elements that can take any inhabitant
+/// of the type as a value.
+///
+/// A witness against a list of patterns should have the same types
+/// and length as the pattern matched against. Because Rust `match`
+/// is always against a single pattern, at the end the witness will
+/// have length 1, but in the middle of the algorithm, it can contain
+/// multiple patterns.
+///
+/// For example, if we are constructing a witness for the match against
+/// ```
+/// struct Pair(Option<(u32, u32)>, bool);
+///
+/// match (p: Pair) {
+/// Pair(None, _) => {}
+/// Pair(_, false) => {}
+/// }
+/// ```
+///
+/// We'll perform the following steps:
+/// 1. Start with an empty witness
+/// `Witness(vec![])`
+/// 2. Push a witness `Some(_)` against the `None`
+/// `Witness(vec![Some(_)])`
+/// 3. Push a witness `true` against the `false`
+/// `Witness(vec![Some(_), true])`
+/// 4. Apply the `Pair` constructor to the witnesses
+/// `Witness(vec![Pair(Some(_), true)])`
+///
+/// The final `Pair(Some(_), true)` is then the resulting witness.
+#[derive(Clone, Debug)]
pub struct Witness<'tcx>(Vec<Pattern<'tcx>>);
impl<'tcx> Witness<'tcx> {
let arity = constructor_arity(cx, ctor, ty);
let pat = {
let len = self.0.len() as u64;
- let mut pats = self.0.drain((len-arity) as usize..).rev();
+ let mut pats = self.0.drain((len - arity) as usize..).rev();
match ty.sty {
ty::TyAdt(..) |
_ => {
match *ctor {
ConstantValue(value) => PatternKind::Constant { value },
+ ConstantRange(lo, hi, end) => PatternKind::Range { lo, hi, end },
_ => PatternKind::Wild,
}
}
/// but is instead bounded by the maximum fixed length of slice patterns in
/// the column of patterns being analyzed.
///
-/// This intentionally does not list ConstantValue specializations for
-/// non-booleans, because we currently assume that there is always a
-/// "non-standard constant" that matches. See issue #12483.
-///
/// We make sure to omit constructors that are statically impossible. eg for
/// Option<!> we do not include Some(_) in the returned list of constructors.
fn all_constructors<'a, 'tcx: 'a>(cx: &mut MatchCheckCtxt<'a, 'tcx>,
-> Vec<Constructor<'tcx>>
{
debug!("all_constructors({:?})", pcx.ty);
- match pcx.ty.sty {
+ let exhaustive_integer_patterns = cx.tcx.features().exhaustive_integer_patterns;
+ let ctors = match pcx.ty.sty {
ty::TyBool => {
[true, false].iter().map(|&b| {
ConstantValue(ty::Const::from_bool(cx.tcx, b))
.map(|v| Variant(v.did))
.collect()
}
+ ty::TyChar if exhaustive_integer_patterns => {
+ let endpoint = |c: char| {
+ let ty = ty::ParamEnv::empty().and(cx.tcx.types.char);
+ ty::Const::from_bits(cx.tcx, c as u128, ty)
+ };
+ vec![
+ // The valid Unicode Scalar Value ranges.
+ ConstantRange(endpoint('\u{0000}'), endpoint('\u{D7FF}'), RangeEnd::Included),
+ ConstantRange(endpoint('\u{E000}'), endpoint('\u{10FFFF}'), RangeEnd::Included),
+ ]
+ }
+ ty::TyInt(ity) if exhaustive_integer_patterns => {
+ // FIXME(49937): refactor these bit manipulations into interpret.
+ let bits = Integer::from_attr(cx.tcx, SignedInt(ity)).size().bits() as u128;
+ let min = 1u128 << (bits - 1);
+ let max = (1u128 << (bits - 1)) - 1;
+ let ty = ty::ParamEnv::empty().and(pcx.ty);
+ vec![ConstantRange(ty::Const::from_bits(cx.tcx, min as u128, ty),
+ ty::Const::from_bits(cx.tcx, max as u128, ty),
+ RangeEnd::Included)]
+ }
+ ty::TyUint(uty) if exhaustive_integer_patterns => {
+ // FIXME(49937): refactor these bit manipulations into interpret.
+ let bits = Integer::from_attr(cx.tcx, UnsignedInt(uty)).size().bits() as u128;
+ let max = !0u128 >> (128 - bits);
+ let ty = ty::ParamEnv::empty().and(pcx.ty);
+ vec![ConstantRange(ty::Const::from_bits(cx.tcx, 0, ty),
+ ty::Const::from_bits(cx.tcx, max, ty),
+ RangeEnd::Included)]
+ }
_ => {
if cx.is_uninhabited(pcx.ty) {
vec![]
vec![Single]
}
}
- }
+ };
+ ctors
}
fn max_slice_length<'p, 'a: 'p, 'tcx: 'a, I>(
// `[true, ..]`
// `[.., false]`
// Then any slice of length ≥1 that matches one of these two
- // patterns can be be trivially turned to a slice of any
+ // patterns can be trivially turned to a slice of any
// other length ≥1 that matches them and vice-versa - for
// but the slice from length 2 `[false, true]` that matches neither
// of these patterns can't be turned to a slice from length 1 that
cmp::max(max_fixed_len + 1, max_prefix_len + max_suffix_len)
}
+/// An inclusive interval, used for precise integer exhaustiveness checking.
+/// `IntRange`s always store a contiguous range. This means that values are
+/// encoded such that `0` encodes the minimum value for the integer,
+/// regardless of the signedness.
+/// For example, the pattern `-128...127i8` is encoded as `0..=255`.
+/// This makes comparisons and arithmetic on interval endpoints much more
+/// straightforward. See `signed_bias` for details.
+///
+/// `IntRange` is never used to encode an empty range or a "range" that wraps
+/// around the (offset) space: i.e. `range.lo <= range.hi`.
+struct IntRange<'tcx> {
+ pub range: RangeInclusive<u128>,
+ pub ty: Ty<'tcx>,
+}
+
+impl<'tcx> IntRange<'tcx> {
+ fn from_ctor(tcx: TyCtxt<'_, 'tcx, 'tcx>,
+ ctor: &Constructor<'tcx>)
+ -> Option<IntRange<'tcx>> {
+ match ctor {
+ ConstantRange(lo, hi, end) => {
+ assert_eq!(lo.ty, hi.ty);
+ let ty = lo.ty;
+ let env_ty = ty::ParamEnv::empty().and(ty);
+ if let Some(lo) = lo.assert_bits(tcx, env_ty) {
+ if let Some(hi) = hi.assert_bits(tcx, env_ty) {
+ // Perform a shift if the underlying types are signed,
+ // which makes the interval arithmetic simpler.
+ let bias = IntRange::signed_bias(tcx, ty);
+ let (lo, hi) = (lo ^ bias, hi ^ bias);
+ // Make sure the interval is well-formed.
+ return if lo > hi || lo == hi && *end == RangeEnd::Excluded {
+ None
+ } else {
+ let offset = (*end == RangeEnd::Excluded) as u128;
+ Some(IntRange { range: lo..=(hi - offset), ty })
+ };
+ }
+ }
+ None
+ }
+ ConstantValue(val) => {
+ let ty = val.ty;
+ if let Some(val) = val.assert_bits(tcx, ty::ParamEnv::empty().and(ty)) {
+ let bias = IntRange::signed_bias(tcx, ty);
+ let val = val ^ bias;
+ Some(IntRange { range: val..=val, ty })
+ } else {
+ None
+ }
+ }
+ Single | Variant(_) | Slice(_) => {
+ None
+ }
+ }
+ }
+
+ fn from_pat(tcx: TyCtxt<'_, 'tcx, 'tcx>,
+ pat: &Pattern<'tcx>)
+ -> Option<IntRange<'tcx>> {
+ Self::from_ctor(tcx, &match pat.kind {
+ box PatternKind::Constant { value } => ConstantValue(value),
+ box PatternKind::Range { lo, hi, end } => ConstantRange(lo, hi, end),
+ _ => return None,
+ })
+ }
+
+ // The return value of `signed_bias` should be XORed with an endpoint to encode/decode it.
+ fn signed_bias(tcx: TyCtxt<'_, 'tcx, 'tcx>, ty: Ty<'tcx>) -> u128 {
+ match ty.sty {
+ ty::TyInt(ity) => {
+ let bits = Integer::from_attr(tcx, SignedInt(ity)).size().bits() as u128;
+ 1u128 << (bits - 1)
+ }
+ _ => 0
+ }
+ }
+
+ /// Convert a `RangeInclusive` to a `ConstantValue` or inclusive `ConstantRange`.
+ fn range_to_ctor(
+ tcx: TyCtxt<'_, 'tcx, 'tcx>,
+ ty: Ty<'tcx>,
+ r: RangeInclusive<u128>,
+ ) -> Constructor<'tcx> {
+ let bias = IntRange::signed_bias(tcx, ty);
+ let ty = ty::ParamEnv::empty().and(ty);
+ let (lo, hi) = r.into_inner();
+ if lo == hi {
+ ConstantValue(ty::Const::from_bits(tcx, lo ^ bias, ty))
+ } else {
+ ConstantRange(ty::Const::from_bits(tcx, lo ^ bias, ty),
+ ty::Const::from_bits(tcx, hi ^ bias, ty),
+ RangeEnd::Included)
+ }
+ }
+
+ /// Return a collection of ranges that spans the values covered by `ranges`, subtracted
+ /// by the values covered by `self`: i.e. `ranges \ self` (in set notation).
+ fn subtract_from(self,
+ tcx: TyCtxt<'_, 'tcx, 'tcx>,
+ ranges: Vec<Constructor<'tcx>>)
+ -> Vec<Constructor<'tcx>> {
+ let ranges = ranges.into_iter().filter_map(|r| {
+ IntRange::from_ctor(tcx, &r).map(|i| i.range)
+ });
+ let mut remaining_ranges = vec![];
+ let ty = self.ty;
+ let (lo, hi) = self.range.into_inner();
+ for subrange in ranges {
+ let (subrange_lo, subrange_hi) = subrange.into_inner();
+ if lo > subrange_hi || subrange_lo > hi {
+ // The pattern doesn't intersect with the subrange at all,
+ // so the subrange remains untouched.
+ remaining_ranges.push(Self::range_to_ctor(tcx, ty, subrange_lo..=subrange_hi));
+ } else {
+ if lo > subrange_lo {
+ // The pattern intersects an upper section of the
+ // subrange, so a lower section will remain.
+ remaining_ranges.push(Self::range_to_ctor(tcx, ty, subrange_lo..=(lo - 1)));
+ }
+ if hi < subrange_hi {
+ // The pattern intersects a lower section of the
+ // subrange, so an upper section will remain.
+ remaining_ranges.push(Self::range_to_ctor(tcx, ty, (hi + 1)..=subrange_hi));
+ }
+ }
+ }
+ remaining_ranges
+ }
+
+ fn intersection(&self, other: &Self) -> Option<Self> {
+ let ty = self.ty;
+ let (lo, hi) = (*self.range.start(), *self.range.end());
+ let (other_lo, other_hi) = (*other.range.start(), *other.range.end());
+ if lo <= other_hi && other_lo <= hi {
+ Some(IntRange { range: max(lo, other_lo)..=min(hi, other_hi), ty })
+ } else {
+ None
+ }
+ }
+}
+
+// Return a set of constructors equivalent to `all_ctors \ used_ctors`.
+fn compute_missing_ctors<'a, 'tcx: 'a>(
+ tcx: TyCtxt<'a, 'tcx, 'tcx>,
+ all_ctors: &Vec<Constructor<'tcx>>,
+ used_ctors: &Vec<Constructor<'tcx>>,
+) -> Vec<Constructor<'tcx>> {
+ let mut missing_ctors = vec![];
+
+ for req_ctor in all_ctors {
+ let mut refined_ctors = vec![req_ctor.clone()];
+ for used_ctor in used_ctors {
+ if used_ctor == req_ctor {
+ // If a constructor appears in a `match` arm, we can
+ // eliminate it straight away.
+ refined_ctors = vec![]
+ } else if tcx.features().exhaustive_integer_patterns {
+ if let Some(interval) = IntRange::from_ctor(tcx, used_ctor) {
+ // Refine the required constructors for the type by subtracting
+ // the range defined by the current constructor pattern.
+ refined_ctors = interval.subtract_from(tcx, refined_ctors);
+ }
+ }
+
+ // If the constructor patterns that have been considered so far
+ // already cover the entire range of values, then we the
+ // constructor is not missing, and we can move on to the next one.
+ if refined_ctors.is_empty() {
+ break;
+ }
+ }
+ // If a constructor has not been matched, then it is missing.
+ // We add `refined_ctors` instead of `req_ctor`, because then we can
+ // provide more detailed error information about precisely which
+ // ranges have been omitted.
+ missing_ctors.extend(refined_ctors);
+ }
+
+ missing_ctors
+}
+
/// Algorithm from http://moscova.inria.fr/~maranget/papers/warn/index.html
/// The algorithm from the paper has been modified to correctly handle empty
/// types. The changes are:
// FIXME: this might lead to "unstable" behavior with macro hygiene
// introducing uninhabited patterns for inaccessible fields. We
// need to figure out how to model that.
- ty: rows.iter().map(|r| r[0].ty).find(|ty| !ty.references_error())
- .unwrap_or(v[0].ty),
+ ty: rows.iter().map(|r| r[0].ty).find(|ty| !ty.references_error()).unwrap_or(v[0].ty),
max_slice_length: max_slice_length(cx, rows.iter().map(|r| r[0]).chain(Some(v[0])))
};
if let Some(constructors) = pat_constructors(cx, v[0], pcx) {
debug!("is_useful - expanding constructors: {:#?}", constructors);
- constructors.into_iter().map(|c|
+ split_grouped_constructors(cx.tcx, constructors, matrix, pcx.ty).into_iter().map(|c|
is_useful_specialized(cx, matrix, v, c.clone(), pcx.ty, witness)
).find(|result| result.is_useful()).unwrap_or(NotUseful)
} else {
pat_constructors(cx, row[0], pcx).unwrap_or(vec![])
}).collect();
debug!("used_ctors = {:#?}", used_ctors);
+ // `all_ctors` are all the constructors for the given type, which
+ // should all be represented (or caught with the wild pattern `_`).
let all_ctors = all_constructors(cx, pcx);
debug!("all_ctors = {:#?}", all_ctors);
- let missing_ctors: Vec<Constructor> = all_ctors.iter().filter(|c| {
- !used_ctors.contains(*c)
- }).cloned().collect();
// `missing_ctors` is the set of constructors from the same type as the
// first column of `matrix` that are matched only by wildcard patterns
// feature flag is not present, so this is only
// needed for that case.
- let is_privately_empty =
- all_ctors.is_empty() && !cx.is_uninhabited(pcx.ty);
- let is_declared_nonexhaustive =
- cx.is_non_exhaustive_enum(pcx.ty) && !cx.is_local(pcx.ty);
+ // Find those constructors that are not matched by any non-wildcard patterns in the
+ // current column.
+ let missing_ctors = compute_missing_ctors(cx.tcx, &all_ctors, &used_ctors);
+
+ let is_privately_empty = all_ctors.is_empty() && !cx.is_uninhabited(pcx.ty);
+ let is_declared_nonexhaustive = cx.is_non_exhaustive_enum(pcx.ty) && !cx.is_local(pcx.ty);
debug!("missing_ctors={:#?} is_privately_empty={:#?} is_declared_nonexhaustive={:#?}",
missing_ctors, is_privately_empty, is_declared_nonexhaustive);
let is_non_exhaustive = is_privately_empty || is_declared_nonexhaustive;
if missing_ctors.is_empty() && !is_non_exhaustive {
- all_ctors.into_iter().map(|c| {
+ split_grouped_constructors(cx.tcx, all_ctors, matrix, pcx.ty).into_iter().map(|c| {
is_useful_specialized(cx, matrix, v, c.clone(), pcx.ty, witness)
}).find(|result| result.is_useful()).unwrap_or(NotUseful)
} else {
// `used_ctors` is empty.
let new_witnesses = if is_non_exhaustive || used_ctors.is_empty() {
// All constructors are unused. Add wild patterns
- // rather than each individual constructor
+ // rather than each individual constructor.
pats.into_iter().map(|mut witness| {
witness.0.push(Pattern {
ty: pcx.ty,
} else {
pats.into_iter().flat_map(|witness| {
missing_ctors.iter().map(move |ctor| {
+ // Extends the witness with a "wild" version of this
+ // constructor, that matches everything that can be built with
+ // it. For example, if `ctor` is a `Constructor::Variant` for
+ // `Option::Some`, this pushes the witness for `Some(_)`.
witness.clone().push_wild_constructor(cx, ctor, pcx.ty)
})
}).collect()
}
}
+/// A shorthand for the `U(S(c, P), S(c, q))` operation from the paper. I.e. `is_useful` applied
+/// to the specialised version of both the pattern matrix `P` and the new pattern `q`.
fn is_useful_specialized<'p, 'a:'p, 'tcx: 'a>(
cx: &mut MatchCheckCtxt<'a, 'tcx>,
&Matrix(ref m): &Matrix<'p, 'tcx>,
v: &[&'p Pattern<'tcx>],
ctor: Constructor<'tcx>,
lty: Ty<'tcx>,
- witness: WitnessPreference) -> Usefulness<'tcx>
-{
+ witness: WitnessPreference,
+) -> Usefulness<'tcx> {
debug!("is_useful_specialized({:#?}, {:#?}, {:?})", v, ctor, lty);
let sub_pat_tys = constructor_sub_pattern_tys(cx, &ctor, lty);
let wild_patterns_owned: Vec<_> = sub_pat_tys.iter().map(|ty| {
.collect()
),
result => result
- },
+ }
None => NotUseful
}
}
/// Slice patterns, however, can match slices of different lengths. For instance,
/// `[a, b, ..tail]` can match a slice of length 2, 3, 4 and so on.
///
-/// Returns None in case of a catch-all, which can't be specialized.
+/// Returns `None` in case of a catch-all, which can't be specialized.
fn pat_constructors<'tcx>(cx: &mut MatchCheckCtxt,
pat: &Pattern<'tcx>,
pcx: PatternContext)
-> Option<Vec<Constructor<'tcx>>>
{
match *pat.kind {
- PatternKind::Binding { .. } | PatternKind::Wild =>
- None,
- PatternKind::Leaf { .. } | PatternKind::Deref { .. } =>
- Some(vec![Single]),
- PatternKind::Variant { adt_def, variant_index, .. } =>
- Some(vec![Variant(adt_def.variants[variant_index].did)]),
- PatternKind::Constant { value } =>
- Some(vec![ConstantValue(value)]),
- PatternKind::Range { lo, hi, end } =>
- Some(vec![ConstantRange(lo, hi, end)]),
+ PatternKind::Binding { .. } | PatternKind::Wild => None,
+ PatternKind::Leaf { .. } | PatternKind::Deref { .. } => Some(vec![Single]),
+ PatternKind::Variant { adt_def, variant_index, .. } => {
+ Some(vec![Variant(adt_def.variants[variant_index].did)])
+ }
+ PatternKind::Constant { value } => Some(vec![ConstantValue(value)]),
+ PatternKind::Range { lo, hi, end } => Some(vec![ConstantRange(lo, hi, end)]),
PatternKind::Array { .. } => match pcx.ty.sty {
ty::TyArray(_, length) => Some(vec![
Slice(length.unwrap_usize(cx.tcx))
Ok(true)
}
+// Whether to evaluate a constructor using exhaustive integer matching. This is true if the
+// constructor is a range or constant with an integer type.
+fn should_treat_range_exhaustively(tcx: TyCtxt<'_, 'tcx, 'tcx>, ctor: &Constructor<'tcx>) -> bool {
+ if tcx.features().exhaustive_integer_patterns {
+ if let ConstantValue(value) | ConstantRange(value, _, _) = ctor {
+ if let ty::TyChar | ty::TyInt(_) | ty::TyUint(_) = value.ty.sty {
+ return true;
+ }
+ }
+ }
+ false
+}
+
+/// For exhaustive integer matching, some constructors are grouped within other constructors
+/// (namely integer typed values are grouped within ranges). However, when specialising these
+/// constructors, we want to be specialising for the underlying constructors (the integers), not
+/// the groups (the ranges). Thus we need to split the groups up. Splitting them up naïvely would
+/// mean creating a separate constructor for every single value in the range, which is clearly
+/// impractical. However, observe that for some ranges of integers, the specialisation will be
+/// identical across all values in that range (i.e. there are equivalence classes of ranges of
+/// constructors based on their `is_useful_specialised` outcome). These classes are grouped by
+/// the patterns that apply to them (both in the matrix `P` and in the new row `p_{m + 1}`). We
+/// can split the range whenever the patterns that apply to that range (specifically: the patterns
+/// that *intersect* with that range) change.
+/// Our solution, therefore, is to split the range constructor into subranges at every single point
+/// the group of intersecting patterns changes, which we can compute by converting each pattern to
+/// a range and recording its endpoints, then creating subranges between each consecutive pair of
+/// endpoints.
+/// And voilà! We're testing precisely those ranges that we need to, without any exhaustive matching
+/// on actual integers. The nice thing about this is that the number of subranges is linear in the
+/// number of rows in the matrix (i.e. the number of cases in the `match` statement), so we don't
+/// need to be worried about matching over gargantuan ranges.
+fn split_grouped_constructors<'p, 'a: 'p, 'tcx: 'a>(
+ tcx: TyCtxt<'a, 'tcx, 'tcx>,
+ ctors: Vec<Constructor<'tcx>>,
+ &Matrix(ref m): &Matrix<'p, 'tcx>,
+ ty: Ty<'tcx>,
+) -> Vec<Constructor<'tcx>> {
+ let mut split_ctors = Vec::with_capacity(ctors.len());
+
+ for ctor in ctors.into_iter() {
+ match ctor {
+ // For now, only ranges may denote groups of "subconstructors", so we only need to
+ // special-case constant ranges.
+ ConstantRange(..) if should_treat_range_exhaustively(tcx, &ctor) => {
+ // We only care about finding all the subranges within the range of the intersection
+ // of the new pattern `p_({m + 1},1)` (here `pat`) and the constructor range.
+ // Anything else is irrelevant, because it is guaranteed to result in `NotUseful`,
+ // which is the default case anyway, and can be ignored.
+ let ctor_range = IntRange::from_ctor(tcx, &ctor).unwrap();
+
+ // We're going to collect all the endpoints in the new pattern so we can create
+ // subranges between them.
+ // If there's a single point, we need to identify it as belonging
+ // to a length-1 range, so it can be treated as an individual
+ // constructor, rather than as an endpoint. To do this, we keep track of which
+ // endpoint a point corresponds to. Whenever a point corresponds to both a start
+ // and an end, then we create a unit range for it.
+ #[derive(PartialEq, Clone, Copy, Debug)]
+ enum Endpoint {
+ Start,
+ End,
+ Both,
+ };
+ let mut points = FxHashMap::default();
+ let add_endpoint = |points: &mut FxHashMap<_, _>, x, endpoint| {
+ points.entry(x).and_modify(|ex_x| {
+ if *ex_x != endpoint {
+ *ex_x = Endpoint::Both
+ }
+ }).or_insert(endpoint);
+ };
+ let add_endpoints = |points: &mut FxHashMap<_, _>, lo, hi| {
+ // Insert the endpoints, taking care to keep track of to
+ // which endpoints a point corresponds.
+ add_endpoint(points, lo, Endpoint::Start);
+ add_endpoint(points, hi, Endpoint::End);
+ };
+ let (lo, hi) = (*ctor_range.range.start(), *ctor_range.range.end());
+ add_endpoints(&mut points, lo, hi);
+ // We're going to iterate through every row pattern, adding endpoints in.
+ for row in m.iter() {
+ if let Some(r) = IntRange::from_pat(tcx, row[0]) {
+ // We're only interested in endpoints that lie (at least partially)
+ // within the subrange domain.
+ if let Some(r) = ctor_range.intersection(&r) {
+ let (r_lo, r_hi) = r.range.into_inner();
+ add_endpoints(&mut points, r_lo, r_hi);
+ }
+ }
+ }
+
+ // The patterns were iterated in an arbitrary order (i.e. in the order the user
+ // wrote them), so we need to make sure our endpoints are sorted.
+ let mut points: Vec<(u128, Endpoint)> = points.into_iter().collect();
+ points.sort_unstable_by_key(|(x, _)| *x);
+ let mut points = points.into_iter();
+ let mut a = points.next().unwrap();
+
+ // Iterate through pairs of points, adding the subranges to `split_ctors`.
+ // We have to be careful about the orientation of the points as endpoints, to make
+ // sure we're enumerating precisely the correct ranges. Too few and the matching is
+ // actually incorrect. Too many and our diagnostics are poorer. This involves some
+ // case analysis.
+ while let Some(b) = points.next() {
+ // a < b (strictly)
+ if let Endpoint::Both = a.1 {
+ split_ctors.push(IntRange::range_to_ctor(tcx, ty, a.0..=a.0));
+ }
+ let c = match a.1 {
+ Endpoint::Start => a.0,
+ Endpoint::End | Endpoint::Both => a.0 + 1,
+ };
+ let d = match b.1 {
+ Endpoint::Start | Endpoint::Both => b.0 - 1,
+ Endpoint::End => b.0,
+ };
+ // In some cases, we won't need an intermediate range between two ranges
+ // lie immediately adjacent to one another.
+ if c <= d {
+ split_ctors.push(IntRange::range_to_ctor(tcx, ty, c..=d));
+ }
+
+ a = b;
+ }
+ }
+ // Any other constructor can be used unchanged.
+ _ => split_ctors.push(ctor),
+ }
+ }
+
+ split_ctors
+}
+
+/// Check whether there exists any shared value in either `ctor` or `pat` by intersecting them.
+fn constructor_intersects_pattern<'p, 'a: 'p, 'tcx: 'a>(
+ tcx: TyCtxt<'a, 'tcx, 'tcx>,
+ ctor: &Constructor<'tcx>,
+ pat: &'p Pattern<'tcx>,
+) -> Option<Vec<&'p Pattern<'tcx>>> {
+ if should_treat_range_exhaustively(tcx, ctor) {
+ match (IntRange::from_ctor(tcx, ctor), IntRange::from_pat(tcx, pat)) {
+ (Some(ctor), Some(pat)) => ctor.intersection(&pat).map(|_| vec![]),
+ _ => None,
+ }
+ } else {
+ // Fallback for non-ranges and ranges that involve floating-point numbers, which are not
+ // conveniently handled by `IntRange`. For these cases, the constructor may not be a range
+ // so intersection actually devolves into being covered by the pattern.
+ match constructor_covered_by_range(tcx, ctor, pat) {
+ Ok(true) => Some(vec![]),
+ Ok(false) | Err(ErrorReported) => None,
+ }
+ }
+}
+
fn constructor_covered_by_range<'a, 'tcx>(
tcx: TyCtxt<'a, 'tcx, 'tcx>,
ctor: &Constructor<'tcx>,
- from: &'tcx ty::Const<'tcx>, to: &'tcx ty::Const<'tcx>,
- end: RangeEnd,
- ty: Ty<'tcx>,
+ pat: &Pattern<'tcx>,
) -> Result<bool, ErrorReported> {
+ let (from, to, end, ty) = match pat.kind {
+ box PatternKind::Constant { value } => (value, value, RangeEnd::Included, value.ty),
+ box PatternKind::Range { lo, hi, end } => (lo, hi, end, lo.ty),
+ _ => bug!("`constructor_covered_by_range` called with {:?}", pat),
+ };
trace!("constructor_covered_by_range {:#?}, {:#?}, {:#?}, {}", ctor, from, to, ty);
let cmp_from = |c_from| compare_const_vals(tcx, c_from, from, ty::ParamEnv::empty().and(ty))
.map(|res| res != Ordering::Less);
cx: &mut MatchCheckCtxt<'a, 'tcx>,
r: &[&'p Pattern<'tcx>],
constructor: &Constructor<'tcx>,
- wild_patterns: &[&'p Pattern<'tcx>])
- -> Option<Vec<&'p Pattern<'tcx>>>
-{
+ wild_patterns: &[&'p Pattern<'tcx>],
+) -> Option<Vec<&'p Pattern<'tcx>>> {
let pat = &r[0];
let head: Option<Vec<&Pattern>> = match *pat.kind {
PatternKind::Binding { .. } | PatternKind::Wild => {
Some(wild_patterns.to_owned())
- },
+ }
PatternKind::Variant { adt_def, variant_index, ref subpatterns, .. } => {
let ref variant = adt_def.variants[variant_index];
PatternKind::Leaf { ref subpatterns } => {
Some(patterns_for_variant(subpatterns, wild_patterns))
}
+
PatternKind::Deref { ref subpattern } => {
Some(vec![subpattern])
}
span_bug!(pat.span,
"unexpected const-val {:?} with ctor {:?}", value, constructor)
}
- },
+ }
_ => {
- match constructor_covered_by_range(
- cx.tcx,
- constructor, value, value, RangeEnd::Included,
- value.ty,
- ) {
- Ok(true) => Some(vec![]),
- Ok(false) => None,
- Err(ErrorReported) => None,
- }
+ // If the constructor is a single value, we add a row to the specialised matrix
+ // if the pattern is equal to the constructor. If the constructor is a range of
+ // values, we add a row to the specialised matrix if the pattern is contained
+ // within the constructor. These two cases (for a single value pattern) can be
+ // treated as intersection.
+ constructor_intersects_pattern(cx.tcx, constructor, pat)
}
}
}
- PatternKind::Range { lo, hi, ref end } => {
- match constructor_covered_by_range(
- cx.tcx,
- constructor, lo, hi, end.clone(), lo.ty,
- ) {
- Ok(true) => Some(vec![]),
- Ok(false) => None,
- Err(ErrorReported) => None,
- }
+ PatternKind::Range { .. } => {
+ // If the constructor is a single value, we add a row to the specialised matrix if the
+ // pattern contains the constructor. If the constructor is a range of values, we add a
+ // row to the specialised matrix if there exists any value that lies both within the
+ // pattern and the constructor. These two cases reduce to intersection.
+ constructor_intersects_pattern(cx.tcx, constructor, pat)
}
PatternKind::Array { ref prefix, ref slice, ref suffix } |
let pat_len = prefix.len() + suffix.len();
if let Some(slice_count) = wild_patterns.len().checked_sub(pat_len) {
if slice_count == 0 || slice.is_some() {
- Some(
- prefix.iter().chain(
- wild_patterns.iter().map(|p| *p)
- .skip(prefix.len())
- .take(slice_count)
- .chain(
- suffix.iter()
- )).collect())
+ Some(prefix.iter().chain(
+ wild_patterns.iter().map(|p| *p)
+ .skip(prefix.len())
+ .take(slice_count)
+ .chain(suffix.iter())
+ ).collect())
} else {
None
}