--- /dev/null
+# Contributing to `std::simd`
+
+Simple version:
+1. Fork it and `git clone` it
+2. Create your feature branch: `git checkout -b my-branch`
+3. Write your changes.
+4. Test it: `cargo test`. Remember to enable whatever SIMD features you intend to test by setting `RUSTFLAGS` (see the example after this list).
+5. Commit your changes: `git add ./path/to/changes && git commit -m 'Fix some bug'`
+6. Push the branch: `git push --set-upstream origin my-branch`
+7. Submit a pull request!
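+
+For step 4, here is a minimal sketch of enabling a SIMD feature while testing (assuming an x86-64 machine; substitute whichever target features your change actually exercises):
+
+```bash
+RUSTFLAGS="-C target-feature=+avx2" cargo test
+```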
+
+## Taking on an Issue
+
+SIMD can be quite complex, and even a "simple" issue can be huge. If an issue is organized like a tracking issue, with an itemized list of subtasks that don't have to be done in any particular order, please take the issue one item at a time. This lets work proceed in parallel on the rest of the issue. If it's a relatively small issue, feel free to announce your intention to solve it on the issue tracker and take it in one go!
+
+## CI
+
+We currently use GitHub Actions, which will automatically build and test your change in order to verify that `std::simd`'s portable API is, in fact, portable. If your change builds locally but does not build in CI, this is likely due to a platform-specific concern that your code has not addressed. Please consult the build logs and address the error, or ask for help if you need it.
+
+## Beyond stdsimd
+
+A large amount of the core SIMD implementation is found in the `rustc_codegen_*` crates in the [main rustc repo](https://github.com/rust-lang/rust). In addition, actual platform-specific functions are implemented in [stdarch]. Not all changes to `std::simd` require interacting with either of these, but if you're wondering where something is and it doesn't seem to be in this repository, those are good places to start looking.
+
+## Questions? Concerns? Need Help?
+
+Please feel free to ask in the [#project-portable-simd][zulip-portable-simd] stream on the [rust-lang Zulip][zulip] for help with making changes to `std::simd`!
+If your changes include directly modifying the compiler, it might also be useful to ask in [#t-compiler/help][zulip-compiler-help].
+
+[zulip-portable-simd]: https://rust-lang.zulipchat.com/#narrow/stream/257879-project-portable-simd
+[zulip-compiler-help]: https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp
+[zulip]: https://rust-lang.zulipchat.com
+[stdarch]: https://github.com/rust-lang/stdarch
--- /dev/null
+# The Rust standard library's portable SIMD API
+![Build Status](https://github.com/rust-lang/portable-simd/actions/workflows/ci.yml/badge.svg?branch=master)
+
+Code repository for the [Portable SIMD Project Group](https://github.com/rust-lang/project-portable-simd).
+Please refer to [CONTRIBUTING.md](./CONTRIBUTING.md) for our contributing guidelines.
+
+The docs for this crate are published from the main branch.
+You can [read them here][docs].
+
+If you have questions about SIMD, we have begun writing a [guide][simd-guide].
+We can also be found on [Zulip][zulip-project-portable-simd].
+
+If you are interested in support for a specific architecture, you may want [stdarch] instead.
+
+## Hello World
+
+Now we're gonna dip our toes into this world with a small SIMD "Hello, World!" example. Make sure your compiler is up to date and using `nightly`. We can do that by running
+
+```bash
+rustup update -- nightly
+```
+
+or by setting nightly as your default toolchain with `rustup default nightly`, or per-invocation with `cargo +nightly {build,test,run}`. After updating, run
+```bash
+cargo new hellosimd
+```
+to create a new crate. Edit `hellosimd/Cargo.toml` to be
+```toml
+[package]
+name = "hellosimd"
+version = "0.1.0"
+edition = "2018"
+[dependencies]
+core_simd = { git = "https://github.com/rust-lang/portable-simd" }
+```
+
+and finally write this in `src/main.rs`:
+```rust
+use core_simd::*;
+fn main() {
+ let a = f32x4::splat(10.0);
+ let b = f32x4::from_array([1.0, 2.0, 3.0, 4.0]);
+ println!("{:?}", a + b);
+}
+```
+
+Explanation: the first line imports all the bindings from the crate. Then, we construct our SIMD vectors with methods like `splat` or `from_array`. Finally, we can use operators like `+` on them, and the appropriate SIMD instructions will be carried out. When you run `cargo run`, you should see `[11.0, 12.0, 13.0, 14.0]`.
+
+## Code Organization
+
+Currently the crate is organized so that each element type gets its own file, which contains the 64-bit, 128-bit, 256-bit, and 512-bit vectors that use that type.
+
+All types are then exported as a single, flat module.
+
+Depending on the size of the primitive type, the number of lanes the vector will have varies. For example, 128-bit vectors have four `f32` lanes and two `f64` lanes, as the sketch below shows.
+
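+For instance, here is an illustrative sketch using the flat type aliases this crate exports:
+
+```rust
+use core_simd::*;
+fn main() {
+    // Both of these vectors are 128 bits wide:
+    let floats = f32x4::splat(1.0); // four 32-bit lanes
+    let doubles = f64x2::splat(1.0); // two 64-bit lanes
+    println!("{:?} {:?}", floats, doubles);
+}
+```
+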
+The supported element types are as follows:
+* **Floating Point:** `f32`, `f64`
+* **Signed Integers:** `i8`, `i16`, `i32`, `i64`, `i128`, `isize`
+* **Unsigned Integers:** `u8`, `u16`, `u32`, `u64`, `u128`, `usize`
+* **Masks:** `mask8`, `mask16`, `mask32`, `mask64`, `mask128`, `masksize`
+
+Floating point, signed integers, and unsigned integers are the [primitive types](https://doc.rust-lang.org/core/primitive/index.html) you're already used to.
+The `mask` types are "truthy" values, like `bool`, but each element uses the number of bits in its name instead of the single bit a normal `bool` uses.
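+
+For example, comparisons produce masks. Here is an illustrative sketch using this crate's `lanes_gt` comparison and the `any`/`all` mask reductions:
+
+```rust
+use core_simd::*;
+fn main() {
+    let a = i32x4::from_array([1, 2, 3, 4]);
+    let b = i32x4::splat(2);
+    let mask = a.lanes_gt(b); // mask32x4: [false, false, true, true]
+    assert!(mask.any());
+    assert!(!mask.all());
+}
+```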
+
+[simd-guide]: ./beginners-guide.md
+[zulip-project-portable-simd]: https://rust-lang.zulipchat.com/#narrow/stream/257879-project-portable-simd
+[stdarch]: https://github.com/rust-lang/stdarch
+[docs]: https://rust-lang.github.io/portable-simd/core_simd
--- /dev/null
+#![cfg_attr(feature = "std", feature(portable_simd))]
+
+/// An n-body simulation benchmark, taken from the `packed_simd` crate.
+/// Run it with `cargo test --example nbody`.
+#[cfg(feature = "std")]
+mod nbody {
+ use core_simd::*;
+
+ use std::f64::consts::PI;
+ const SOLAR_MASS: f64 = 4.0 * PI * PI;
+ const DAYS_PER_YEAR: f64 = 365.24;
+
+ #[derive(Debug, Clone, Copy)]
+ struct Body {
+ pub x: f64x4,
+ pub v: f64x4,
+ pub mass: f64,
+ }
+
+ const N_BODIES: usize = 5;
+ const BODIES: [Body; N_BODIES] = [
+ // sun:
+ Body {
+ x: f64x4::from_array([0., 0., 0., 0.]),
+ v: f64x4::from_array([0., 0., 0., 0.]),
+ mass: SOLAR_MASS,
+ },
+ // jupiter:
+ Body {
+ x: f64x4::from_array([
+ 4.84143144246472090e+00,
+ -1.16032004402742839e+00,
+ -1.03622044471123109e-01,
+ 0.,
+ ]),
+ v: f64x4::from_array([
+ 1.66007664274403694e-03 * DAYS_PER_YEAR,
+ 7.69901118419740425e-03 * DAYS_PER_YEAR,
+ -6.90460016972063023e-05 * DAYS_PER_YEAR,
+ 0.,
+ ]),
+ mass: 9.54791938424326609e-04 * SOLAR_MASS,
+ },
+ // saturn:
+ Body {
+ x: f64x4::from_array([
+ 8.34336671824457987e+00,
+ 4.12479856412430479e+00,
+ -4.03523417114321381e-01,
+ 0.,
+ ]),
+ v: f64x4::from_array([
+ -2.76742510726862411e-03 * DAYS_PER_YEAR,
+ 4.99852801234917238e-03 * DAYS_PER_YEAR,
+ 2.30417297573763929e-05 * DAYS_PER_YEAR,
+ 0.,
+ ]),
+ mass: 2.85885980666130812e-04 * SOLAR_MASS,
+ },
+ // uranus:
+ Body {
+ x: f64x4::from_array([
+ 1.28943695621391310e+01,
+ -1.51111514016986312e+01,
+ -2.23307578892655734e-01,
+ 0.,
+ ]),
+ v: f64x4::from_array([
+ 2.96460137564761618e-03 * DAYS_PER_YEAR,
+ 2.37847173959480950e-03 * DAYS_PER_YEAR,
+ -2.96589568540237556e-05 * DAYS_PER_YEAR,
+ 0.,
+ ]),
+ mass: 4.36624404335156298e-05 * SOLAR_MASS,
+ },
+ // neptune:
+ Body {
+ x: f64x4::from_array([
+ 1.53796971148509165e+01,
+ -2.59193146099879641e+01,
+ 1.79258772950371181e-01,
+ 0.,
+ ]),
+ v: f64x4::from_array([
+ 2.68067772490389322e-03 * DAYS_PER_YEAR,
+ 1.62824170038242295e-03 * DAYS_PER_YEAR,
+ -9.51592254519715870e-05 * DAYS_PER_YEAR,
+ 0.,
+ ]),
+ mass: 5.15138902046611451e-05 * SOLAR_MASS,
+ },
+ ];
+
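+    /// Offset the sun's velocity so that the system's total momentum is zero.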
+ fn offset_momentum(bodies: &mut [Body; N_BODIES]) {
+ let (sun, rest) = bodies.split_at_mut(1);
+ let sun = &mut sun[0];
+ for body in rest {
+ let m_ratio = body.mass / SOLAR_MASS;
+            sun.v -= body.v * Simd::splat(m_ratio);
+ }
+ }
+
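+    /// Compute the system's total energy: per-body kinetic energy minus pairwise gravitational potential.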
+ fn energy(bodies: &[Body; N_BODIES]) -> f64 {
+ let mut e = 0.;
+ for i in 0..N_BODIES {
+ let bi = &bodies[i];
+ e += bi.mass * (bi.v * bi.v).horizontal_sum() * 0.5;
+ for bj in bodies.iter().take(N_BODIES).skip(i + 1) {
+ let dx = bi.x - bj.x;
+ e -= bi.mass * bj.mass / (dx * dx).horizontal_sum().sqrt()
+ }
+ }
+ e
+ }
+
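+    /// Advance the simulation by one timestep `dt`, updating velocities and then positions.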
+ fn advance(bodies: &mut [Body; N_BODIES], dt: f64) {
+ const N: usize = N_BODIES * (N_BODIES - 1) / 2;
+
+ // compute distance between bodies:
+ let mut r = [f64x4::splat(0.); N];
+ {
+ let mut i = 0;
+ for j in 0..N_BODIES {
+ for k in j + 1..N_BODIES {
+ r[i] = bodies[j].x - bodies[k].x;
+ i += 1;
+ }
+ }
+ }
+
+ let mut mag = [0.0; N];
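+        // compute dt / d^3 for each pair of bodies, two pairs at a time using f64x2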
+ for i in (0..N).step_by(2) {
+ let d2s = f64x2::from_array([
+ (r[i] * r[i]).horizontal_sum(),
+ (r[i + 1] * r[i + 1]).horizontal_sum(),
+ ]);
+ let dmags = f64x2::splat(dt) / (d2s * d2s.sqrt());
+ mag[i] = dmags[0];
+ mag[i + 1] = dmags[1];
+ }
+
+ let mut i = 0;
+ for j in 0..N_BODIES {
+ for k in j + 1..N_BODIES {
+                let f = r[i] * Simd::splat(mag[i]);
+                bodies[j].v -= f * Simd::splat(bodies[k].mass);
+                bodies[k].v += f * Simd::splat(bodies[j].mass);
+ i += 1
+ }
+ }
+ for body in bodies {
+            body.x += Simd::splat(dt) * body.v
+ }
+ }
+
+ pub fn run(n: usize) -> (f64, f64) {
+ let mut bodies = BODIES;
+ offset_momentum(&mut bodies);
+ let energy_before = energy(&bodies);
+ for _ in 0..n {
+ advance(&mut bodies, 0.01);
+ }
+ let energy_after = energy(&bodies);
+
+ (energy_before, energy_after)
+ }
+}
+
+#[cfg(feature = "std")]
+#[cfg(test)]
+mod tests {
+ // Good enough for demonstration purposes, not going for strictness here.
+ fn approx_eq_f64(a: f64, b: f64) -> bool {
+ (a - b).abs() < 0.00001
+ }
+ #[test]
+ fn test() {
+ const OUTPUT: [f64; 2] = [-0.169075164, -0.169087605];
+ let (energy_before, energy_after) = super::nbody::run(1000);
+ assert!(approx_eq_f64(energy_before, OUTPUT[0]));
+ assert!(approx_eq_f64(energy_after, OUTPUT[1]));
+ }
+}
+
+fn main() {
+ #[cfg(feature = "std")]
+ {
+ let (energy_before, energy_after) = nbody::run(1000);
+ println!("Energy before: {}", energy_before);
+ println!("Energy after: {}", energy_after);
+ }
+}
--- /dev/null
+use crate::simd::intrinsics;
+use crate::simd::{LaneCount, Mask, Simd, SimdElement, SupportedLaneCount};
+
+impl<T, const LANES: usize> Simd<T, LANES>
+where
+ T: SimdElement + PartialEq,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ /// Test if each lane is equal to the corresponding lane in `other`.
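+    ///
+    /// For example (an illustrative sketch, assuming four `i32` lanes):
+    /// ```
+    /// # #![feature(portable_simd)]
+    /// # #[cfg(feature = "std")] use core_simd::Simd;
+    /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+    /// let a = Simd::from_array([1, 2, 3, 4]);
+    /// let b = Simd::from_array([1, 5, 3, 7]);
+    /// assert_eq!(a.lanes_eq(b).to_array(), [true, false, true, false]);
+    /// ```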
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn lanes_eq(self, other: Self) -> Mask<T::Mask, LANES> {
+ unsafe { Mask::from_int_unchecked(intrinsics::simd_eq(self, other)) }
+ }
+
+ /// Test if each lane is not equal to the corresponding lane in `other`.
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn lanes_ne(self, other: Self) -> Mask<T::Mask, LANES> {
+ unsafe { Mask::from_int_unchecked(intrinsics::simd_ne(self, other)) }
+ }
+}
+
+impl<T, const LANES: usize> Simd<T, LANES>
+where
+ T: SimdElement + PartialOrd,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ /// Test if each lane is less than the corresponding lane in `other`.
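+    ///
+    /// For example (an illustrative sketch, assuming four `i32` lanes):
+    /// ```
+    /// # #![feature(portable_simd)]
+    /// # #[cfg(feature = "std")] use core_simd::Simd;
+    /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+    /// let a = Simd::from_array([1, 2, 3, 4]);
+    /// let b = Simd::splat(3);
+    /// assert_eq!(a.lanes_lt(b).to_array(), [true, true, false, false]);
+    /// ```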
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn lanes_lt(self, other: Self) -> Mask<T::Mask, LANES> {
+ unsafe { Mask::from_int_unchecked(intrinsics::simd_lt(self, other)) }
+ }
+
+ /// Test if each lane is greater than the corresponding lane in `other`.
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn lanes_gt(self, other: Self) -> Mask<T::Mask, LANES> {
+ unsafe { Mask::from_int_unchecked(intrinsics::simd_gt(self, other)) }
+ }
+
+ /// Test if each lane is less than or equal to the corresponding lane in `other`.
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn lanes_le(self, other: Self) -> Mask<T::Mask, LANES> {
+ unsafe { Mask::from_int_unchecked(intrinsics::simd_le(self, other)) }
+ }
+
+ /// Test if each lane is greater than or equal to the corresponding lane in `other`.
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn lanes_ge(self, other: Self) -> Mask<T::Mask, LANES> {
+ unsafe { Mask::from_int_unchecked(intrinsics::simd_ge(self, other)) }
+ }
+}
--- /dev/null
+mod sealed {
+ pub trait Sealed {}
+}
+use sealed::Sealed;
+
+/// A type representing a vector lane count.
+pub struct LaneCount<const LANES: usize>;
+
+impl<const LANES: usize> LaneCount<LANES> {
+ /// The number of bytes in a bitmask with this many lanes.
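+    /// (That is, `LANES` divided by 8, rounded up.)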
+ pub const BITMASK_LEN: usize = (LANES + 7) / 8;
+}
+
+/// Helper trait for vector lane counts.
+pub trait SupportedLaneCount: Sealed {
+ #[doc(hidden)]
+ type BitMask: Copy + Default + AsRef<[u8]> + AsMut<[u8]>;
+}
+
+impl<const LANES: usize> Sealed for LaneCount<LANES> {}
+
+impl SupportedLaneCount for LaneCount<1> {
+ type BitMask = [u8; 1];
+}
+impl SupportedLaneCount for LaneCount<2> {
+ type BitMask = [u8; 1];
+}
+impl SupportedLaneCount for LaneCount<4> {
+ type BitMask = [u8; 1];
+}
+impl SupportedLaneCount for LaneCount<8> {
+ type BitMask = [u8; 1];
+}
+impl SupportedLaneCount for LaneCount<16> {
+ type BitMask = [u8; 2];
+}
+impl SupportedLaneCount for LaneCount<32> {
+ type BitMask = [u8; 4];
+}
+impl SupportedLaneCount for LaneCount<64> {
+    type BitMask = [u8; 8];
+}
--- /dev/null
+//! Types and traits associated with masking lanes of vectors.
+#![allow(non_camel_case_types)]
+
+#[cfg_attr(
+ not(all(target_arch = "x86_64", target_feature = "avx512f")),
+ path = "masks/full_masks.rs"
+)]
+#[cfg_attr(
+ all(target_arch = "x86_64", target_feature = "avx512f"),
+ path = "masks/bitmask.rs"
+)]
+mod mask_impl;
+
+use crate::simd::{LaneCount, Simd, SimdElement, SupportedLaneCount};
+use core::cmp::Ordering;
+use core::fmt;
+
+mod sealed {
+ use super::*;
+
+ /// Not only does this seal the `MaskElement` trait, but these functions prevent other traits
+ /// from bleeding into the parent bounds.
+ ///
+ /// For example, `eq` could be provided by requiring `MaskElement: PartialEq`, but that would
+ /// prevent us from ever removing that bound, or from implementing `MaskElement` on
+ /// non-`PartialEq` types in the future.
+ pub trait Sealed {
+ fn valid<const LANES: usize>(values: Simd<Self, LANES>) -> bool
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ Self: SimdElement;
+
+ fn eq(self, other: Self) -> bool;
+
+ const TRUE: Self;
+
+ const FALSE: Self;
+ }
+}
+use sealed::Sealed;
+
+/// Marker trait for types that may be used as SIMD mask elements.
+pub unsafe trait MaskElement: SimdElement + Sealed {}
+
+macro_rules! impl_element {
+ { $ty:ty } => {
+ impl Sealed for $ty {
+ fn valid<const LANES: usize>(value: Simd<Self, LANES>) -> bool
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ (value.lanes_eq(Simd::splat(0)) | value.lanes_eq(Simd::splat(-1))).all()
+ }
+
+ fn eq(self, other: Self) -> bool { self == other }
+
+ const TRUE: Self = -1;
+ const FALSE: Self = 0;
+ }
+
+ unsafe impl MaskElement for $ty {}
+ }
+}
+
+impl_element! { i8 }
+impl_element! { i16 }
+impl_element! { i32 }
+impl_element! { i64 }
+impl_element! { isize }
+
+/// A SIMD vector mask for `LANES` elements of width specified by `T`.
+///
+/// The layout of this type is unspecified.
+#[repr(transparent)]
+pub struct Mask<T, const LANES: usize>(mask_impl::Mask<T, LANES>)
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount;
+
+impl<T, const LANES: usize> Copy for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+}
+
+impl<T, const LANES: usize> Clone for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn clone(&self) -> Self {
+ *self
+ }
+}
+
+impl<T, const LANES: usize> Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ /// Construct a mask by setting all lanes to the given value.
+ pub fn splat(value: bool) -> Self {
+ Self(mask_impl::Mask::splat(value))
+ }
+
+    /// Converts an array of `bool`s to a mask.
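+    ///
+    /// For example (an illustrative sketch, assuming four `i32` mask elements):
+    /// ```
+    /// # #![feature(portable_simd)]
+    /// # #[cfg(feature = "std")] use core_simd::Mask;
+    /// # #[cfg(not(feature = "std"))] use core::simd::Mask;
+    /// let mask = Mask::<i32, 4>::from_array([true, false, true, false]);
+    /// assert!(mask.test(0));
+    /// assert!(!mask.test(1));
+    /// ```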
+ pub fn from_array(array: [bool; LANES]) -> Self {
+ let mut vector = Self::splat(false);
+ for (i, v) in array.iter().enumerate() {
+ vector.set(i, *v);
+ }
+ vector
+ }
+
+    /// Converts a mask to an array of `bool`s.
+ pub fn to_array(self) -> [bool; LANES] {
+ let mut array = [false; LANES];
+ for (i, v) in array.iter_mut().enumerate() {
+ *v = self.test(i);
+ }
+ array
+ }
+
+ /// Converts a vector of integers to a mask, where 0 represents `false` and -1
+ /// represents `true`.
+ ///
+ /// # Safety
+ /// All lanes must be either 0 or -1.
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub unsafe fn from_int_unchecked(value: Simd<T, LANES>) -> Self {
+ unsafe { Self(mask_impl::Mask::from_int_unchecked(value)) }
+ }
+
+ /// Converts a vector of integers to a mask, where 0 represents `false` and -1
+ /// represents `true`.
+ ///
+ /// # Panics
+ /// Panics if any lane is not 0 or -1.
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn from_int(value: Simd<T, LANES>) -> Self {
+ assert!(T::valid(value), "all values must be either 0 or -1",);
+ unsafe { Self::from_int_unchecked(value) }
+ }
+
+ /// Converts the mask to a vector of integers, where 0 represents `false` and -1
+ /// represents `true`.
+ #[inline]
+    #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn to_int(self) -> Simd<T, LANES> {
+ self.0.to_int()
+ }
+
+ /// Tests the value of the specified lane.
+ ///
+ /// # Safety
+ /// `lane` must be less than `LANES`.
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub unsafe fn test_unchecked(&self, lane: usize) -> bool {
+ unsafe { self.0.test_unchecked(lane) }
+ }
+
+ /// Tests the value of the specified lane.
+ ///
+ /// # Panics
+ /// Panics if `lane` is greater than or equal to the number of lanes in the vector.
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub fn test(&self, lane: usize) -> bool {
+ assert!(lane < LANES, "lane index out of range");
+ unsafe { self.test_unchecked(lane) }
+ }
+
+ /// Sets the value of the specified lane.
+ ///
+ /// # Safety
+ /// `lane` must be less than `LANES`.
+ #[inline]
+ pub unsafe fn set_unchecked(&mut self, lane: usize, value: bool) {
+ unsafe {
+ self.0.set_unchecked(lane, value);
+ }
+ }
+
+ /// Sets the value of the specified lane.
+ ///
+ /// # Panics
+ /// Panics if `lane` is greater than or equal to the number of lanes in the vector.
+ #[inline]
+ pub fn set(&mut self, lane: usize, value: bool) {
+ assert!(lane < LANES, "lane index out of range");
+ unsafe {
+ self.set_unchecked(lane, value);
+ }
+ }
+
+ /// Convert this mask to a bitmask, with one bit set per lane.
+ #[cfg(feature = "generic_const_exprs")]
+    #[inline]
+    #[must_use = "method returns a new array and does not mutate the original value"]
+ pub fn to_bitmask(self) -> [u8; LaneCount::<LANES>::BITMASK_LEN] {
+ self.0.to_bitmask()
+ }
+
+ /// Convert a bitmask to a mask.
+ #[cfg(feature = "generic_const_exprs")]
+    #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn from_bitmask(bitmask: [u8; LaneCount::<LANES>::BITMASK_LEN]) -> Self {
+ Self(mask_impl::Mask::from_bitmask(bitmask))
+ }
+
+ /// Returns true if any lane is set, or false otherwise.
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub fn any(self) -> bool {
+ self.0.any()
+ }
+
+ /// Returns true if all lanes are set, or false otherwise.
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub fn all(self) -> bool {
+ self.0.all()
+ }
+}
+
+// vector/array conversion
+impl<T, const LANES: usize> From<[bool; LANES]> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn from(array: [bool; LANES]) -> Self {
+ Self::from_array(array)
+ }
+}
+
+impl<T, const LANES: usize> From<Mask<T, LANES>> for [bool; LANES]
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn from(vector: Mask<T, LANES>) -> Self {
+ vector.to_array()
+ }
+}
+
+impl<T, const LANES: usize> Default for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+    #[must_use = "method returns a defaulted mask with all lanes set to false (0)"]
+ fn default() -> Self {
+ Self::splat(false)
+ }
+}
+
+impl<T, const LANES: usize> PartialEq for Mask<T, LANES>
+where
+ T: MaskElement + PartialEq,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ fn eq(&self, other: &Self) -> bool {
+ self.0 == other.0
+ }
+}
+
+impl<T, const LANES: usize> PartialOrd for Mask<T, LANES>
+where
+ T: MaskElement + PartialOrd,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+    #[must_use = "method returns a new Ordering and does not mutate the original value"]
+ fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
+ self.0.partial_cmp(&other.0)
+ }
+}
+
+impl<T, const LANES: usize> fmt::Debug for Mask<T, LANES>
+where
+ T: MaskElement + fmt::Debug,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+ f.debug_list()
+ .entries((0..LANES).map(|lane| self.test(lane)))
+ .finish()
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitAnd for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitand(self, rhs: Self) -> Self {
+ Self(self.0 & rhs.0)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitAnd<bool> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitand(self, rhs: bool) -> Self {
+ self & Self::splat(rhs)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitAnd<Mask<T, LANES>> for bool
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Mask<T, LANES>;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitand(self, rhs: Mask<T, LANES>) -> Mask<T, LANES> {
+ Mask::splat(self) & rhs
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitOr for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitor(self, rhs: Self) -> Self {
+ Self(self.0 | rhs.0)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitOr<bool> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitor(self, rhs: bool) -> Self {
+ self | Self::splat(rhs)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitOr<Mask<T, LANES>> for bool
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Mask<T, LANES>;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitor(self, rhs: Mask<T, LANES>) -> Mask<T, LANES> {
+ Mask::splat(self) | rhs
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitXor for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitxor(self, rhs: Self) -> Self::Output {
+ Self(self.0 ^ rhs.0)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitXor<bool> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitxor(self, rhs: bool) -> Self::Output {
+ self ^ Self::splat(rhs)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitXor<Mask<T, LANES>> for bool
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Mask<T, LANES>;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitxor(self, rhs: Mask<T, LANES>) -> Self::Output {
+ Mask::splat(self) ^ rhs
+ }
+}
+
+impl<T, const LANES: usize> core::ops::Not for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Mask<T, LANES>;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn not(self) -> Self::Output {
+ Self(!self.0)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitAndAssign for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+ fn bitand_assign(&mut self, rhs: Self) {
+ self.0 = self.0 & rhs.0;
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitAndAssign<bool> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+ fn bitand_assign(&mut self, rhs: bool) {
+ *self &= Self::splat(rhs);
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitOrAssign for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+ fn bitor_assign(&mut self, rhs: Self) {
+ self.0 = self.0 | rhs.0;
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitOrAssign<bool> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+ fn bitor_assign(&mut self, rhs: bool) {
+ *self |= Self::splat(rhs);
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitXorAssign for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+ fn bitxor_assign(&mut self, rhs: Self) {
+ self.0 = self.0 ^ rhs.0;
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitXorAssign<bool> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+ fn bitxor_assign(&mut self, rhs: bool) {
+ *self ^= Self::splat(rhs);
+ }
+}
+
+/// Vector of eight 8-bit masks
+pub type mask8x8 = Mask<i8, 8>;
+
+/// Vector of 16 8-bit masks
+pub type mask8x16 = Mask<i8, 16>;
+
+/// Vector of 32 8-bit masks
+pub type mask8x32 = Mask<i8, 32>;
+
+/// Vector of 64 8-bit masks
+pub type mask8x64 = Mask<i8, 64>;
+
+/// Vector of four 16-bit masks
+pub type mask16x4 = Mask<i16, 4>;
+
+/// Vector of eight 16-bit masks
+pub type mask16x8 = Mask<i16, 8>;
+
+/// Vector of 16 16-bit masks
+pub type mask16x16 = Mask<i16, 16>;
+
+/// Vector of 32 16-bit masks
+pub type mask16x32 = Mask<i16, 32>;
+
+/// Vector of two 32-bit masks
+pub type mask32x2 = Mask<i32, 2>;
+
+/// Vector of four 32-bit masks
+pub type mask32x4 = Mask<i32, 4>;
+
+/// Vector of eight 32-bit masks
+pub type mask32x8 = Mask<i32, 8>;
+
+/// Vector of 16 32-bit masks
+pub type mask32x16 = Mask<i32, 16>;
+
+/// Vector of two 64-bit masks
+pub type mask64x2 = Mask<i64, 2>;
+
+/// Vector of four 64-bit masks
+pub type mask64x4 = Mask<i64, 4>;
+
+/// Vector of eight 64-bit masks
+pub type mask64x8 = Mask<i64, 8>;
+
+/// Vector of two pointer-width masks
+pub type masksizex2 = Mask<isize, 2>;
+
+/// Vector of four pointer-width masks
+pub type masksizex4 = Mask<isize, 4>;
+
+/// Vector of eight pointer-width masks
+pub type masksizex8 = Mask<isize, 8>;
+
+macro_rules! impl_from {
+ { $from:ty => $($to:ty),* } => {
+ $(
+ impl<const LANES: usize> From<Mask<$from, LANES>> for Mask<$to, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ fn from(value: Mask<$from, LANES>) -> Self {
+ Self(value.0.convert())
+ }
+ }
+ )*
+ }
+}
+impl_from! { i8 => i16, i32, i64, isize }
+impl_from! { i16 => i32, i64, isize, i8 }
+impl_from! { i32 => i64, isize, i8, i16 }
+impl_from! { i64 => isize, i8, i16, i32 }
+impl_from! { isize => i8, i16, i32, i64 }
--- /dev/null
+#![allow(unused_imports)]
+use super::MaskElement;
+use crate::simd::intrinsics;
+use crate::simd::{LaneCount, Simd, SupportedLaneCount};
+use core::marker::PhantomData;
+
+/// A mask where each lane is represented by a single bit.
+#[repr(transparent)]
+pub struct Mask<T, const LANES: usize>(
+ <LaneCount<LANES> as SupportedLaneCount>::BitMask,
+ PhantomData<T>,
+)
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount;
+
+impl<T, const LANES: usize> Copy for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+}
+
+impl<T, const LANES: usize> Clone for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn clone(&self) -> Self {
+ *self
+ }
+}
+
+impl<T, const LANES: usize> PartialEq for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn eq(&self, other: &Self) -> bool {
+ self.0.as_ref() == other.0.as_ref()
+ }
+}
+
+impl<T, const LANES: usize> PartialOrd for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn partial_cmp(&self, other: &Self) -> Option<core::cmp::Ordering> {
+ self.0.as_ref().partial_cmp(other.0.as_ref())
+ }
+}
+
+impl<T, const LANES: usize> Eq for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+}
+
+impl<T, const LANES: usize> Ord for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn cmp(&self, other: &Self) -> core::cmp::Ordering {
+ self.0.as_ref().cmp(other.0.as_ref())
+ }
+}
+
+impl<T, const LANES: usize> Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn splat(value: bool) -> Self {
+ let mut mask = <LaneCount<LANES> as SupportedLaneCount>::BitMask::default();
+ if value {
+ mask.as_mut().fill(u8::MAX)
+ } else {
+ mask.as_mut().fill(u8::MIN)
+ }
+ if LANES % 8 > 0 {
+ *mask.as_mut().last_mut().unwrap() &= u8::MAX >> (8 - LANES % 8);
+ }
+ Self(mask, PhantomData)
+ }
+
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub unsafe fn test_unchecked(&self, lane: usize) -> bool {
+ (self.0.as_ref()[lane / 8] >> (lane % 8)) & 0x1 > 0
+ }
+
+ #[inline]
+ pub unsafe fn set_unchecked(&mut self, lane: usize, value: bool) {
+ unsafe {
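+            // flip the stored bit only if the new value differs from the current one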
+ self.0.as_mut()[lane / 8] ^= ((value ^ self.test_unchecked(lane)) as u8) << (lane % 8)
+ }
+ }
+
+ #[inline]
+    #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn to_int(self) -> Simd<T, LANES> {
+ unsafe {
+            intrinsics::simd_select_bitmask(
+                self.0,
+                Simd::splat(T::TRUE),
+                Simd::splat(T::FALSE),
+            )
+ }
+ }
+
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub unsafe fn from_int_unchecked(value: Simd<T, LANES>) -> Self {
+        unsafe { Self(intrinsics::simd_bitmask(value), PhantomData) }
+ }
+
+ #[cfg(feature = "generic_const_exprs")]
+ #[inline]
+    #[must_use = "method returns a new array and does not mutate the original value"]
+ pub fn to_bitmask(self) -> [u8; LaneCount::<LANES>::BITMASK_LEN] {
+ // Safety: these are the same type and we are laundering the generic
+ unsafe { core::mem::transmute_copy(&self.0) }
+ }
+
+ #[cfg(feature = "generic_const_exprs")]
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn from_bitmask(bitmask: [u8; LaneCount::<LANES>::BITMASK_LEN]) -> Self {
+ // Safety: these are the same type and we are laundering the generic
+ Self(unsafe { core::mem::transmute_copy(&bitmask) }, PhantomData)
+ }
+
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn convert<U>(self) -> Mask<U, LANES>
+ where
+ U: MaskElement,
+ {
+ unsafe { core::mem::transmute_copy(&self) }
+ }
+
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub fn any(self) -> bool {
+ self != Self::splat(false)
+ }
+
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub fn all(self) -> bool {
+ self == Self::splat(true)
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitAnd for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitand(mut self, rhs: Self) -> Self {
+ for (l, r) in self.0.as_mut().iter_mut().zip(rhs.0.as_ref().iter()) {
+ *l &= r;
+ }
+ self
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitOr for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitor(mut self, rhs: Self) -> Self {
+ for (l, r) in self.0.as_mut().iter_mut().zip(rhs.0.as_ref().iter()) {
+ *l |= r;
+ }
+ self
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitXor for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitxor(mut self, rhs: Self) -> Self::Output {
+ for (l, r) in self.0.as_mut().iter_mut().zip(rhs.0.as_ref().iter()) {
+ *l ^= r;
+ }
+ self
+ }
+}
+
+impl<T, const LANES: usize> core::ops::Not for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn not(mut self) -> Self::Output {
+ for x in self.0.as_mut() {
+ *x = !*x;
+ }
+ if LANES % 8 > 0 {
+ *self.0.as_mut().last_mut().unwrap() &= u8::MAX >> (8 - LANES % 8);
+ }
+ self
+ }
+}
--- /dev/null
+//! Masks that take up full SIMD vector registers.
+
+use super::MaskElement;
+use crate::simd::intrinsics;
+use crate::simd::{LaneCount, Simd, SupportedLaneCount};
+
+#[repr(transparent)]
+pub struct Mask<T, const LANES: usize>(Simd<T, LANES>)
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount;
+
+impl<T, const LANES: usize> Copy for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+}
+
+impl<T, const LANES: usize> Clone for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn clone(&self) -> Self {
+ *self
+ }
+}
+
+impl<T, const LANES: usize> PartialEq for Mask<T, LANES>
+where
+ T: MaskElement + PartialEq,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn eq(&self, other: &Self) -> bool {
+ self.0.eq(&other.0)
+ }
+}
+
+impl<T, const LANES: usize> PartialOrd for Mask<T, LANES>
+where
+ T: MaskElement + PartialOrd,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn partial_cmp(&self, other: &Self) -> Option<core::cmp::Ordering> {
+ self.0.partial_cmp(&other.0)
+ }
+}
+
+impl<T, const LANES: usize> Eq for Mask<T, LANES>
+where
+ T: MaskElement + Eq,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+}
+
+impl<T, const LANES: usize> Ord for Mask<T, LANES>
+where
+ T: MaskElement + Ord,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn cmp(&self, other: &Self) -> core::cmp::Ordering {
+ self.0.cmp(&other.0)
+ }
+}
+
+impl<T, const LANES: usize> Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+    #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn splat(value: bool) -> Self {
+ Self(Simd::splat(if value { T::TRUE } else { T::FALSE }))
+ }
+
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub unsafe fn test_unchecked(&self, lane: usize) -> bool {
+ T::eq(self.0[lane], T::TRUE)
+ }
+
+ #[inline]
+ pub unsafe fn set_unchecked(&mut self, lane: usize, value: bool) {
+ self.0[lane] = if value { T::TRUE } else { T::FALSE }
+ }
+
+ #[inline]
+    #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn to_int(self) -> Simd<T, LANES> {
+ self.0
+ }
+
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub unsafe fn from_int_unchecked(value: Simd<T, LANES>) -> Self {
+ Self(value)
+ }
+
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn convert<U>(self) -> Mask<U, LANES>
+ where
+ U: MaskElement,
+ {
+ unsafe { Mask(intrinsics::simd_cast(self.0)) }
+ }
+
+ #[cfg(feature = "generic_const_exprs")]
+ #[inline]
+    #[must_use = "method returns a new array and does not mutate the original value"]
+ pub fn to_bitmask(self) -> [u8; LaneCount::<LANES>::BITMASK_LEN] {
+ unsafe {
+ let mut bitmask: [u8; LaneCount::<LANES>::BITMASK_LEN] =
+                intrinsics::simd_bitmask(self.0);
+
+ // There is a bug where LLVM appears to implement this operation with the wrong
+ // bit order.
+ // TODO fix this in a better way
+ if cfg!(target_endian = "big") {
+ for x in bitmask.as_mut() {
+ *x = x.reverse_bits();
+ }
+ }
+
+ bitmask
+ }
+ }
+
+ #[cfg(feature = "generic_const_exprs")]
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn from_bitmask(mut bitmask: [u8; LaneCount::<LANES>::BITMASK_LEN]) -> Self {
+ unsafe {
+ // There is a bug where LLVM appears to implement this operation with the wrong
+ // bit order.
+ // TODO fix this in a better way
+ if cfg!(target_endian = "big") {
+ for x in bitmask.as_mut() {
+ *x = x.reverse_bits();
+ }
+ }
+
+            Self::from_int_unchecked(intrinsics::simd_select_bitmask(
+ bitmask,
+ Self::splat(true).to_int(),
+ Self::splat(false).to_int(),
+ ))
+ }
+ }
+
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub fn any(self) -> bool {
+ unsafe { intrinsics::simd_reduce_any(self.to_int()) }
+ }
+
+ #[inline]
+    #[must_use = "method returns a new bool and does not mutate the original value"]
+ pub fn all(self) -> bool {
+ unsafe { intrinsics::simd_reduce_all(self.to_int()) }
+ }
+}
+
+impl<T, const LANES: usize> core::convert::From<Mask<T, LANES>> for Simd<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ fn from(value: Mask<T, LANES>) -> Self {
+ value.0
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitAnd for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitand(self, rhs: Self) -> Self {
+ unsafe { Self(intrinsics::simd_and(self.0, rhs.0)) }
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitOr for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitor(self, rhs: Self) -> Self {
+ unsafe { Self(intrinsics::simd_or(self.0, rhs.0)) }
+ }
+}
+
+impl<T, const LANES: usize> core::ops::BitXor for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn bitxor(self, rhs: Self) -> Self {
+ unsafe { Self(intrinsics::simd_xor(self.0, rhs.0)) }
+ }
+}
+
+impl<T, const LANES: usize> core::ops::Not for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ type Output = Self;
+ #[inline]
+    #[must_use = "method returns a new mask and does not mutate the original value"]
+ fn not(self) -> Self::Output {
+ Self::splat(true) ^ self
+ }
+}
--- /dev/null
+use crate::simd::intrinsics::{simd_saturating_add, simd_saturating_sub};
+use crate::simd::{LaneCount, Simd, SupportedLaneCount};
+
+macro_rules! impl_uint_arith {
+ ($($ty:ty),+) => {
+ $( impl<const LANES: usize> Simd<$ty, LANES> where LaneCount<LANES>: SupportedLaneCount {
+
+ /// Lanewise saturating add.
+ ///
+ /// # Examples
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ #[doc = concat!("# use core::", stringify!($ty), "::MAX;")]
+ /// let x = Simd::from_array([2, 1, 0, MAX]);
+ /// let max = Simd::splat(MAX);
+ /// let unsat = x + max;
+ /// let sat = x.saturating_add(max);
+            /// assert_eq!(unsat, Simd::from_array([1, 0, MAX, MAX - 1]));
+ /// assert_eq!(sat, max);
+ /// ```
+ #[inline]
+ pub fn saturating_add(self, second: Self) -> Self {
+ unsafe { simd_saturating_add(self, second) }
+ }
+
+ /// Lanewise saturating subtract.
+ ///
+ /// # Examples
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ #[doc = concat!("# use core::", stringify!($ty), "::MAX;")]
+ /// let x = Simd::from_array([2, 1, 0, MAX]);
+ /// let max = Simd::splat(MAX);
+ /// let unsat = x - max;
+ /// let sat = x.saturating_sub(max);
+            /// assert_eq!(unsat, Simd::from_array([3, 2, 1, 0]));
+            /// assert_eq!(sat, Simd::splat(0));
+            /// ```
+ #[inline]
+ pub fn saturating_sub(self, second: Self) -> Self {
+ unsafe { simd_saturating_sub(self, second) }
+ }
+ })+
+ }
+}
+
+macro_rules! impl_int_arith {
+ ($($ty:ty),+) => {
+ $( impl<const LANES: usize> Simd<$ty, LANES> where LaneCount<LANES>: SupportedLaneCount {
+
+ /// Lanewise saturating add.
+ ///
+ /// # Examples
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ #[doc = concat!("# use core::", stringify!($ty), "::{MIN, MAX};")]
+ /// let x = Simd::from_array([MIN, 0, 1, MAX]);
+ /// let max = Simd::splat(MAX);
+ /// let unsat = x + max;
+ /// let sat = x.saturating_add(max);
+ /// assert_eq!(unsat, Simd::from_array([-1, MAX, MIN, -2]));
+ /// assert_eq!(sat, Simd::from_array([-1, MAX, MAX, MAX]));
+ /// ```
+ #[inline]
+ pub fn saturating_add(self, second: Self) -> Self {
+ unsafe { simd_saturating_add(self, second) }
+ }
+
+ /// Lanewise saturating subtract.
+ ///
+ /// # Examples
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ #[doc = concat!("# use core::", stringify!($ty), "::{MIN, MAX};")]
+ /// let x = Simd::from_array([MIN, -2, -1, MAX]);
+ /// let max = Simd::splat(MAX);
+ /// let unsat = x - max;
+ /// let sat = x.saturating_sub(max);
+ /// assert_eq!(unsat, Simd::from_array([1, MAX, MIN, 0]));
+            /// assert_eq!(sat, Simd::from_array([MIN, MIN, MIN, 0]));
+            /// ```
+ #[inline]
+ pub fn saturating_sub(self, second: Self) -> Self {
+ unsafe { simd_saturating_sub(self, second) }
+ }
+
+ /// Lanewise absolute value, implemented in Rust.
+ /// Every lane becomes its absolute value.
+ ///
+ /// # Examples
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ #[doc = concat!("# use core::", stringify!($ty), "::{MIN, MAX};")]
+            /// let xs = Simd::from_array([MIN, MIN + 1, -5, 0]);
+ /// assert_eq!(xs.abs(), Simd::from_array([MIN, MAX, 5, 0]));
+ /// ```
+ #[inline]
+ pub fn abs(self) -> Self {
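+                // arithmetic right shift: `m` is all ones for negative lanes and all zeros otherwise,
+                // so `(self ^ m) - m` negates exactly the negative lanes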
+ const SHR: $ty = <$ty>::BITS as $ty - 1;
+                let m = self >> Simd::splat(SHR);
+ (self^m) - m
+ }
+
+ /// Lanewise saturating absolute value, implemented in Rust.
+ /// As abs(), except the MIN value becomes MAX instead of itself.
+ ///
+ /// # Examples
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ #[doc = concat!("# use core::", stringify!($ty), "::{MIN, MAX};")]
+ /// let xs = Simd::from_array([MIN, -2, 0, 3]);
+ /// let unsat = xs.abs();
+ /// let sat = xs.saturating_abs();
+ /// assert_eq!(unsat, Simd::from_array([MIN, 2, 0, 3]));
+ /// assert_eq!(sat, Simd::from_array([MAX, 2, 0, 3]));
+ /// ```
+ #[inline]
+ pub fn saturating_abs(self) -> Self {
+ // arith shift for -1 or 0 mask based on sign bit, giving 2s complement
+ const SHR: $ty = <$ty>::BITS as $ty - 1;
+                let m = self >> Simd::splat(SHR);
+ (self^m).saturating_sub(m)
+ }
+
+ /// Lanewise saturating negation, implemented in Rust.
+ /// As neg(), except the MIN value becomes MAX instead of itself.
+ ///
+ /// # Examples
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ #[doc = concat!("# use core::", stringify!($ty), "::{MIN, MAX};")]
+ /// let x = Simd::from_array([MIN, -2, 3, MAX]);
+ /// let unsat = -x;
+ /// let sat = x.saturating_neg();
+ /// assert_eq!(unsat, Simd::from_array([MIN, 2, -3, MIN + 1]));
+ /// assert_eq!(sat, Simd::from_array([MAX, 2, -3, MIN + 1]));
+ /// ```
+ #[inline]
+ pub fn saturating_neg(self) -> Self {
+ Self::splat(0).saturating_sub(self)
+ }
+ })+
+ }
+}
+
+impl_uint_arith! { u8, u16, u32, u64, usize }
+impl_int_arith! { i8, i16, i32, i64, isize }
--- /dev/null
+use crate::simd::intrinsics;
+use crate::simd::{LaneCount, Simd, SimdElement, SupportedLaneCount};
+use core::ops::{Add, Mul};
+use core::ops::{BitAnd, BitOr, BitXor};
+use core::ops::{Div, Rem, Sub};
+use core::ops::{Shl, Shr};
+
+mod assign;
+mod deref;
+mod unary;
+
+impl<I, T, const LANES: usize> core::ops::Index<I> for Simd<T, LANES>
+where
+ T: SimdElement,
+ LaneCount<LANES>: SupportedLaneCount,
+ I: core::slice::SliceIndex<[T]>,
+{
+ type Output = I::Output;
+ fn index(&self, index: I) -> &Self::Output {
+ &self.as_array()[index]
+ }
+}
+
+impl<I, T, const LANES: usize> core::ops::IndexMut<I> for Simd<T, LANES>
+where
+ T: SimdElement,
+ LaneCount<LANES>: SupportedLaneCount,
+ I: core::slice::SliceIndex<[T]>,
+{
+ fn index_mut(&mut self, index: I) -> &mut Self::Output {
+ &mut self.as_mut_array()[index]
+ }
+}
+
+/// Checks if the right-hand side argument of a left- or right-shift would cause overflow.
+fn invalid_shift_rhs<T>(rhs: T) -> bool
+where
+ T: Default + PartialOrd + core::convert::TryFrom<usize>,
+ <T as core::convert::TryFrom<usize>>::Error: core::fmt::Debug,
+{
+ let bits_in_type = T::try_from(8 * core::mem::size_of::<T>()).unwrap();
+ rhs < T::default() || rhs >= bits_in_type
+}
+
+/// Automatically implements operators over references in addition to the provided operator.
+macro_rules! impl_ref_ops {
+ // binary op
+ {
+ impl<const $lanes:ident: usize> core::ops::$trait:ident<$rhs:ty> for $type:ty
+ where
+ LaneCount<$lanes2:ident>: SupportedLaneCount,
+ {
+ type Output = $output:ty;
+
+ $(#[$attrs:meta])*
+ fn $fn:ident($self_tok:ident, $rhs_arg:ident: $rhs_arg_ty:ty) -> Self::Output $body:tt
+ }
+ } => {
+ impl<const $lanes: usize> core::ops::$trait<$rhs> for $type
+ where
+ LaneCount<$lanes2>: SupportedLaneCount,
+ {
+ type Output = $output;
+
+ $(#[$attrs])*
+ fn $fn($self_tok, $rhs_arg: $rhs_arg_ty) -> Self::Output $body
+ }
+ };
+}
+
+/// Automatically implements operators over vectors and scalars for a particular vector.
+macro_rules! impl_op {
+ { impl Add for $scalar:ty } => {
+        impl_op! { @binary $scalar, Add::add, simd_add }
+ };
+ { impl Sub for $scalar:ty } => {
+        impl_op! { @binary $scalar, Sub::sub, simd_sub }
+ };
+ { impl Mul for $scalar:ty } => {
+        impl_op! { @binary $scalar, Mul::mul, simd_mul }
+ };
+ { impl Div for $scalar:ty } => {
+        impl_op! { @binary $scalar, Div::div, simd_div }
+ };
+ { impl Rem for $scalar:ty } => {
+        impl_op! { @binary $scalar, Rem::rem, simd_rem }
+ };
+ { impl Shl for $scalar:ty } => {
+        impl_op! { @binary $scalar, Shl::shl, simd_shl }
+ };
+ { impl Shr for $scalar:ty } => {
+        impl_op! { @binary $scalar, Shr::shr, simd_shr }
+ };
+ { impl BitAnd for $scalar:ty } => {
+        impl_op! { @binary $scalar, BitAnd::bitand, simd_and }
+ };
+ { impl BitOr for $scalar:ty } => {
+        impl_op! { @binary $scalar, BitOr::bitor, simd_or }
+ };
+ { impl BitXor for $scalar:ty } => {
+        impl_op! { @binary $scalar, BitXor::bitxor, simd_xor }
+ };
+
+    // generic binary op whose output is `Self` (assignment ops now live in the `assign` module)
+    { @binary $scalar:ty, $trait:ident :: $trait_fn:ident, $intrinsic:ident } => {
+ impl_ref_ops! {
+ impl<const LANES: usize> core::ops::$trait<Self> for Simd<$scalar, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ type Output = Self;
+
+ #[inline]
+ fn $trait_fn(self, rhs: Self) -> Self::Output {
+ unsafe {
+ intrinsics::$intrinsic(self, rhs)
+ }
+ }
+ }
+ }
+ };
+}
+
+/// Implements floating-point operators for the provided types.
+macro_rules! impl_float_ops {
+ { $($scalar:ty),* } => {
+ $(
+ impl_op! { impl Add for $scalar }
+ impl_op! { impl Sub for $scalar }
+ impl_op! { impl Mul for $scalar }
+ impl_op! { impl Div for $scalar }
+ impl_op! { impl Rem for $scalar }
+ )*
+ };
+}
+
+/// Implements unsigned integer operators for the provided types.
+macro_rules! impl_unsigned_int_ops {
+ { $($scalar:ty),* } => {
+ $(
+ impl_op! { impl Add for $scalar }
+ impl_op! { impl Sub for $scalar }
+ impl_op! { impl Mul for $scalar }
+ impl_op! { impl BitAnd for $scalar }
+ impl_op! { impl BitOr for $scalar }
+ impl_op! { impl BitXor for $scalar }
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::Div<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- type Output = Self;
-
- #[inline]
- fn div(self, rhs: $scalar) -> Self::Output {
- if rhs == 0 {
- panic!("attempt to divide by zero");
- }
- if <$scalar>::MIN != 0 &&
- self.as_array().iter().any(|x| *x == <$scalar>::MIN) &&
- rhs == -1 as _ {
- panic!("attempt to divide with overflow");
- }
- let rhs = Self::splat(rhs);
- unsafe { intrinsics::simd_div(self, rhs) }
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::Div<Simd<$scalar, LANES>> for $scalar
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- type Output = Simd<$scalar, LANES>;
-
- #[inline]
- fn div(self, rhs: Simd<$scalar, LANES>) -> Self::Output {
- Simd::splat(self) / rhs
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::DivAssign<Self> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn div_assign(&mut self, rhs: Self) {
- *self = *self / rhs;
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::DivAssign<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn div_assign(&mut self, rhs: $scalar) {
- *self = *self / rhs;
- }
- }
- }
-
+
+ // Integers panic on divide by 0
+ impl_ref_ops! {
+ impl<const LANES: usize> core::ops::Div<Self> for Simd<$scalar, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ type Output = Self;
+
+ #[inline]
+ fn div(self, rhs: Self) -> Self::Output {
+ if rhs.as_array()
+ .iter()
+ .any(|x| *x == 0)
+ {
+ panic!("attempt to divide by zero");
+ }
+
+ // Guard against div(MIN, -1);
+ // this check only applies to signed ints
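+ // (e.g. i32::MIN / -1 would need to produce i32::MAX + 1, which overflows)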
+ if <$scalar>::MIN != 0 && self.as_array().iter()
+ .zip(rhs.as_array().iter())
+ .any(|(x,y)| *x == <$scalar>::MIN && *y == -1 as _) {
+ panic!("attempt to divide with overflow");
+ }
+ unsafe { intrinsics::simd_div(self, rhs) }
+ }
+ }
+ }
+
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::Rem<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- type Output = Self;
-
- #[inline]
- fn rem(self, rhs: $scalar) -> Self::Output {
- if rhs == 0 {
- panic!("attempt to calculate the remainder with a divisor of zero");
- }
- if <$scalar>::MIN != 0 &&
- self.as_array().iter().any(|x| *x == <$scalar>::MIN) &&
- rhs == -1 as _ {
- panic!("attempt to calculate the remainder with overflow");
- }
- let rhs = Self::splat(rhs);
- unsafe { intrinsics::simd_rem(self, rhs) }
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::Rem<Simd<$scalar, LANES>> for $scalar
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- type Output = Simd<$scalar, LANES>;
-
- #[inline]
- fn rem(self, rhs: Simd<$scalar, LANES>) -> Self::Output {
- Simd::splat(self) % rhs
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::RemAssign<Self> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn rem_assign(&mut self, rhs: Self) {
- *self = *self % rhs;
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::RemAssign<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn rem_assign(&mut self, rhs: $scalar) {
- *self = *self % rhs;
- }
- }
- }
-
+ // remainder panics on zero divisor
+ impl_ref_ops! {
+ impl<const LANES: usize> core::ops::Rem<Self> for Simd<$scalar, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ type Output = Self;
+
+ #[inline]
+ fn rem(self, rhs: Self) -> Self::Output {
+ if rhs.as_array()
+ .iter()
+ .any(|x| *x == 0)
+ {
+ panic!("attempt to calculate the remainder with a divisor of zero");
+ }
+
+ // Guard against rem(MIN, -1);
+ // this check only applies to signed ints
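+ // (MIN % -1 likewise overflows in Rust, matching scalar semantics)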
+ if <$scalar>::MIN != 0 && self.as_array().iter()
+ .zip(rhs.as_array().iter())
+ .any(|(x,y)| *x == <$scalar>::MIN && *y == -1 as _) {
+ panic!("attempt to calculate the remainder with overflow");
+ }
+ unsafe { intrinsics::simd_rem(self, rhs) }
+ }
+ }
+ }
+
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::Shl<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- type Output = Self;
-
- #[inline]
- fn shl(self, rhs: $scalar) -> Self::Output {
- if invalid_shift_rhs(rhs) {
- panic!("attempt to shift left with overflow");
- }
- let rhs = Self::splat(rhs);
- unsafe { intrinsics::simd_shl(self, rhs) }
- }
- }
- }
-
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::ShlAssign<Self> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn shl_assign(&mut self, rhs: Self) {
- *self = *self << rhs;
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::ShlAssign<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn shl_assign(&mut self, rhs: $scalar) {
- *self = *self << rhs;
- }
- }
- }
-
+ // shifts panic on overflow
+ impl_ref_ops! {
+ impl<const LANES: usize> core::ops::Shl<Self> for Simd<$scalar, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ type Output = Self;
+
+ #[inline]
+ fn shl(self, rhs: Self) -> Self::Output {
+ // TODO there is probably a better way of doing this
+ if rhs.as_array()
+ .iter()
+ .copied()
+ .any(invalid_shift_rhs)
+ {
+ panic!("attempt to shift left with overflow");
+ }
+ unsafe { intrinsics::simd_shl(self, rhs) }
+ }
+ }
+ }
+
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::Shr<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- type Output = Self;
-
- #[inline]
- fn shr(self, rhs: $scalar) -> Self::Output {
- if invalid_shift_rhs(rhs) {
- panic!("attempt to shift with overflow");
- }
- let rhs = Self::splat(rhs);
- unsafe { intrinsics::simd_shr(self, rhs) }
- }
- }
- }
-
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::ShrAssign<Self> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn shr_assign(&mut self, rhs: Self) {
- *self = *self >> rhs;
- }
- }
- }
-
- impl_ref_ops! {
- impl<const LANES: usize> core::ops::ShrAssign<$scalar> for Simd<$scalar, LANES>
- where
- LaneCount<LANES>: SupportedLaneCount,
- {
- #[inline]
- fn shr_assign(&mut self, rhs: $scalar) {
- *self = *self >> rhs;
- }
- }
- }
+ impl_ref_ops! {
+ impl<const LANES: usize> core::ops::Shr<Self> for Simd<$scalar, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ type Output = Self;
+
+ #[inline]
+ fn shr(self, rhs: Self) -> Self::Output {
+ // TODO there is probably a better way of doing this
+ if rhs.as_array()
+ .iter()
+ .copied()
+ .any(invalid_shift_rhs)
+ {
+ panic!("attempt to shift with overflow");
+ }
+ unsafe { intrinsics::simd_shr(self, rhs) }
+ }
+ }
+ }
- $( // scalar
- impl_op! { impl Neg for $scalar }
- )*
+ )*
+ };
+}
+
+/// Implements signed integer operators for the provided types.
+macro_rules! impl_signed_int_ops {
+ { $($scalar:ty),* } => {
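+ // The generated code is identical; the signed-only checks it contains
+ // are gated on `<$scalar>::MIN != 0`.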
+ impl_unsigned_int_ops! { $($scalar),* }
+ };
+}
+
+impl_unsigned_int_ops! { u8, u16, u32, u64, usize }
+impl_signed_int_ops! { i8, i16, i32, i64, isize }
+impl_float_ops! { f32, f64 }
--- /dev/null
--- /dev/null
++//! Assignment operators
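++//!
++//! A small sketch of what these impls provide (lane values are illustrative):
++//! ```
++//! # #![feature(portable_simd)]
++//! # #[cfg(feature = "std")] use core_simd::Simd;
++//! # #[cfg(not(feature = "std"))] use core::simd::Simd;
++//! let mut v = Simd::from_array([1i32, 2, 3, 4]);
++//! v += Simd::splat(10); // `AddAssign` defers to the `Add` impl
++//! v >>= Simd::splat(1); // `ShrAssign` defers to `Shr`
++//! assert_eq!(v.to_array(), [5, 6, 6, 7]);
++//! ```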
++use super::*;
++use core::ops::{AddAssign, MulAssign}; // commutative binary op-assignment
++use core::ops::{BitAndAssign, BitOrAssign, BitXorAssign}; // commutative bit binary op-assignment
++use core::ops::{DivAssign, RemAssign, SubAssign}; // non-commutative binary op-assignment
++use core::ops::{ShlAssign, ShrAssign}; // non-commutative bit binary op-assignment
++
++// Arithmetic
++
++macro_rules! assign_ops {
++ ($(impl<T, U, const LANES: usize> $assignTrait:ident<U> for Simd<T, LANES>
++ where
++ Self: $trait:ident,
++ {
++ fn $assign_call:ident(rhs: U) {
++ $call:ident
++ }
++ })*) => {
++ $(impl<T, U, const LANES: usize> $assignTrait<U> for Simd<T, LANES>
++ where
++ Self: $trait<U, Output = Self>,
++ T: SimdElement,
++ LaneCount<LANES>: SupportedLaneCount,
++ {
++ #[inline]
++ fn $assign_call(&mut self, rhs: U) {
++ *self = self.$call(rhs);
++ }
++ })*
++ }
++}
++
++assign_ops! {
++ // Arithmetic
++ impl<T, U, const LANES: usize> AddAssign<U> for Simd<T, LANES>
++ where
++ Self: Add,
++ {
++ fn add_assign(rhs: U) {
++ add
++ }
++ }
++
++ impl<T, U, const LANES: usize> MulAssign<U> for Simd<T, LANES>
++ where
++ Self: Mul,
++ {
++ fn mul_assign(rhs: U) {
++ mul
++ }
++ }
++
++ impl<T, U, const LANES: usize> SubAssign<U> for Simd<T, LANES>
++ where
++ Self: Sub,
++ {
++ fn sub_assign(rhs: U) {
++ sub
++ }
++ }
++
++ impl<T, U, const LANES: usize> DivAssign<U> for Simd<T, LANES>
++ where
++ Self: Div,
++ {
++ fn div_assign(rhs: U) {
++ div
++ }
++ }
++ impl<T, U, const LANES: usize> RemAssign<U> for Simd<T, LANES>
++ where
++ Self: Rem,
++ {
++ fn rem_assign(rhs: U) {
++ rem
++ }
++ }
++
++ // Bitops
++ impl<T, U, const LANES: usize> BitAndAssign<U> for Simd<T, LANES>
++ where
++ Self: BitAnd,
++ {
++ fn bitand_assign(rhs: U) {
++ bitand
++ }
++ }
++
++ impl<T, U, const LANES: usize> BitOrAssign<U> for Simd<T, LANES>
++ where
++ Self: BitOr,
++ {
++ fn bitor_assign(rhs: U) {
++ bitor
++ }
++ }
++
++ impl<T, U, const LANES: usize> BitXorAssign<U> for Simd<T, LANES>
++ where
++ Self: BitXor,
++ {
++ fn bitxor_assign(rhs: U) {
++ bitxor
++ }
++ }
++
++ impl<T, U, const LANES: usize> ShlAssign<U> for Simd<T, LANES>
++ where
++ Self: Shl,
++ {
++ fn shl_assign(rhs: U) {
++ shl
++ }
++ }
++
++ impl<T, U, const LANES: usize> ShrAssign<U> for Simd<T, LANES>
++ where
++ Self: Shr,
++ {
++ fn shr_assign(rhs: U) {
++ shr
++ }
++ }
++}
--- /dev/null
--- /dev/null
++//! This module hacks in "implicit deref" for Simd's operators.
++//! Ideally, Rust would take care of this itself: method calls usually
++//! auto-deref their receiver, but arithmetic operators do not.
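++//!
++//! A quick sketch of what these impls enable:
++//! ```
++//! # #![feature(portable_simd)]
++//! # #[cfg(feature = "std")] use core_simd::f32x4;
++//! # #[cfg(not(feature = "std"))] use core::simd::f32x4;
++//! let (x, y) = (f32x4::splat(1.0), f32x4::splat(2.0));
++//! assert_eq!(&x + &y, x + y); // references on either side both work
++//! ```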
++use super::*;
++
++macro_rules! deref_lhs {
++ (impl<T, const LANES: usize> $trait:ident for $simd:ty {
++ fn $call:ident
++ }) => {
++ impl<T, const LANES: usize> $trait<$simd> for &$simd
++ where
++ T: SimdElement,
++ $simd: $trait<$simd, Output = $simd>,
++ LaneCount<LANES>: SupportedLaneCount,
++ {
++ type Output = Simd<T, LANES>;
++
++ #[inline]
++ #[must_use = "operator returns a new vector without mutating the inputs"]
++ fn $call(self, rhs: $simd) -> Self::Output {
++ (*self).$call(rhs)
++ }
++ }
++ };
++}
++
++macro_rules! deref_rhs {
++ (impl<T, const LANES: usize> $trait:ident for $simd:ty {
++ fn $call:ident
++ }) => {
++ impl<T, const LANES: usize> $trait<&$simd> for $simd
++ where
++ T: SimdElement,
++ $simd: $trait<$simd, Output = $simd>,
++ LaneCount<LANES>: SupportedLaneCount,
++ {
++ type Output = Simd<T, LANES>;
++
++ #[inline]
++ #[must_use = "operator returns a new vector without mutating the inputs"]
++ fn $call(self, rhs: &$simd) -> Self::Output {
++ self.$call(*rhs)
++ }
++ }
++ };
++}
++
++macro_rules! deref_ops {
++ ($(impl<T, const LANES: usize> $trait:ident for $simd:ty {
++ fn $call:ident
++ })*) => {
++ $(
++ deref_rhs! {
++ impl<T, const LANES: usize> $trait for $simd {
++ fn $call
++ }
++ }
++ deref_lhs! {
++ impl<T, const LANES: usize> $trait for $simd {
++ fn $call
++ }
++ }
++ impl<'lhs, 'rhs, T, const LANES: usize> $trait<&'rhs $simd> for &'lhs $simd
++ where
++ T: SimdElement,
++ $simd: $trait<$simd, Output = $simd>,
++ LaneCount<LANES>: SupportedLaneCount,
++ {
++ type Output = $simd;
++
++ #[inline]
++ #[must_use = "operator returns a new vector without mutating the inputs"]
++ fn $call(self, rhs: &$simd) -> Self::Output {
++ (*self).$call(*rhs)
++ }
++ }
++ )*
++ }
++}
++
++deref_ops! {
++ // Arithmetic
++ impl<T, const LANES: usize> Add for Simd<T, LANES> {
++ fn add
++ }
++
++ impl<T, const LANES: usize> Mul for Simd<T, LANES> {
++ fn mul
++ }
++
++ impl<T, const LANES: usize> Sub for Simd<T, LANES> {
++ fn sub
++ }
++
++ impl<T, const LANES: usize> Div for Simd<T, LANES> {
++ fn div
++ }
++
++ impl<T, const LANES: usize> Rem for Simd<T, LANES> {
++ fn rem
++ }
++
++ // Bitops
++ impl<T, const LANES: usize> BitAnd for Simd<T, LANES> {
++ fn bitand
++ }
++
++ impl<T, const LANES: usize> BitOr for Simd<T, LANES> {
++ fn bitor
++ }
++
++ impl<T, const LANES: usize> BitXor for Simd<T, LANES> {
++ fn bitxor
++ }
++
++ impl<T, const LANES: usize> Shl for Simd<T, LANES> {
++ fn shl
++ }
++
++ impl<T, const LANES: usize> Shr for Simd<T, LANES> {
++ fn shr
++ }
++}
--- /dev/null
--- /dev/null
++use crate::simd::intrinsics;
++use crate::simd::{LaneCount, Simd, SimdElement, SupportedLaneCount};
++use core::ops::{Neg, Not}; // unary ops
++
++macro_rules! neg {
++ ($(impl<const LANES: usize> Neg for Simd<$scalar:ty, LANES>)*) => {
++ $(impl<const LANES: usize> Neg for Simd<$scalar, LANES>
++ where
++ $scalar: SimdElement,
++ LaneCount<LANES>: SupportedLaneCount,
++ {
++ type Output = Self;
++
++ #[inline]
++ #[must_use = "operator returns a new vector without mutating the input"]
++ fn neg(self) -> Self::Output {
++ unsafe { intrinsics::simd_neg(self) }
++ }
++ })*
++ }
++}
++
++neg! {
++ impl<const LANES: usize> Neg for Simd<f32, LANES>
++
++ impl<const LANES: usize> Neg for Simd<f64, LANES>
++
++ impl<const LANES: usize> Neg for Simd<i8, LANES>
++
++ impl<const LANES: usize> Neg for Simd<i16, LANES>
++
++ impl<const LANES: usize> Neg for Simd<i32, LANES>
++
++ impl<const LANES: usize> Neg for Simd<i64, LANES>
++
++ impl<const LANES: usize> Neg for Simd<isize, LANES>
++}
++
++macro_rules! not {
++ ($(impl<const LANES: usize> Not for Simd<$scalar:ty, LANES>)*) => {
++ $(impl<const LANES: usize> Not for Simd<$scalar, LANES>
++ where
++ $scalar: SimdElement,
++ LaneCount<LANES>: SupportedLaneCount,
++ {
++ type Output = Self;
++
++ #[inline]
++ #[must_use = "operator returns a new vector without mutating the input"]
++ fn not(self) -> Self::Output {
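++ // XOR against an all-ones vector flips every bit, i.e. bitwise NOT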
++ self ^ (Simd::splat(!(0 as $scalar)))
++ }
++ })*
++ }
++}
++
++not! {
++ impl<const LANES: usize> Not for Simd<i8, LANES>
++
++ impl<const LANES: usize> Not for Simd<i16, LANES>
++
++ impl<const LANES: usize> Not for Simd<i32, LANES>
++
++ impl<const LANES: usize> Not for Simd<i64, LANES>
++
++ impl<const LANES: usize> Not for Simd<isize, LANES>
++
++ impl<const LANES: usize> Not for Simd<u8, LANES>
++
++ impl<const LANES: usize> Not for Simd<u16, LANES>
++
++ impl<const LANES: usize> Not for Simd<u32, LANES>
++
++ impl<const LANES: usize> Not for Simd<u64, LANES>
++
++ impl<const LANES: usize> Not for Simd<usize, LANES>
++}
--- /dev/null
- use crate::simd::{LaneCount, Simd, SupportedLaneCount};
+use crate::simd::intrinsics::{
+ simd_reduce_add_ordered, simd_reduce_and, simd_reduce_max, simd_reduce_min,
+ simd_reduce_mul_ordered, simd_reduce_or, simd_reduce_xor,
+};
- /// Horizontal bitwise "and". Returns the cumulative bitwise "and" across the lanes of
- /// the vector.
- #[inline]
- pub fn horizontal_and(self) -> $scalar {
- unsafe { simd_reduce_and(self) }
- }
-
- /// Horizontal bitwise "or". Returns the cumulative bitwise "or" across the lanes of
- /// the vector.
- #[inline]
- pub fn horizontal_or(self) -> $scalar {
- unsafe { simd_reduce_or(self) }
- }
-
- /// Horizontal bitwise "xor". Returns the cumulative bitwise "xor" across the lanes of
- /// the vector.
- #[inline]
- pub fn horizontal_xor(self) -> $scalar {
- unsafe { simd_reduce_xor(self) }
- }
-
++use crate::simd::{LaneCount, Simd, SimdElement, SupportedLaneCount};
++use core::ops::{BitAnd, BitOr, BitXor};
+
+macro_rules! impl_integer_reductions {
+ { $scalar:ty } => {
+ impl<const LANES: usize> Simd<$scalar, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ /// Horizontal wrapping add. Returns the sum of the lanes of the vector, with wrapping addition.
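+ ///
+ /// A small sketch; note the wrapping behavior (`u8` lanes chosen for illustration):
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let v = Simd::from_array([u8::MAX, 1, 2, 3]);
+ /// assert_eq!(v.horizontal_sum(), 5); // 255 + 1 wraps to 0, then + 2 + 3
+ /// ```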
+ #[inline]
+ pub fn horizontal_sum(self) -> $scalar {
+ unsafe { simd_reduce_add_ordered(self, 0) }
+ }
+
+ /// Horizontal wrapping multiply. Returns the product of the lanes of the vector, with wrapping multiplication.
+ #[inline]
+ pub fn horizontal_product(self) -> $scalar {
+ unsafe { simd_reduce_mul_ordered(self, 1) }
+ }
+
+ /// Horizontal maximum. Returns the maximum lane in the vector.
+ #[inline]
+ pub fn horizontal_max(self) -> $scalar {
+ unsafe { simd_reduce_max(self) }
+ }
+
+ /// Horizontal minimum. Returns the minimum lane in the vector.
+ #[inline]
+ pub fn horizontal_min(self) -> $scalar {
+ unsafe { simd_reduce_min(self) }
+ }
+ }
+ }
+}
+
+impl_integer_reductions! { i8 }
+impl_integer_reductions! { i16 }
+impl_integer_reductions! { i32 }
+impl_integer_reductions! { i64 }
+impl_integer_reductions! { isize }
+impl_integer_reductions! { u8 }
+impl_integer_reductions! { u16 }
+impl_integer_reductions! { u32 }
+impl_integer_reductions! { u64 }
+impl_integer_reductions! { usize }
+
+macro_rules! impl_float_reductions {
+ { $scalar:ty } => {
+ impl<const LANES: usize> Simd<$scalar, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+
+ /// Horizontal add. Returns the sum of the lanes of the vector.
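+ ///
+ /// For example (a sketch with four `f32` lanes):
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let v = Simd::from_array([1.0f32, 2.0, 3.0, 4.0]);
+ /// assert_eq!(v.horizontal_sum(), 10.0);
+ /// ```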
+ #[inline]
+ pub fn horizontal_sum(self) -> $scalar {
+ // LLVM sum is inaccurate on i586
+ if cfg!(all(target_arch = "x86", not(target_feature = "sse2"))) {
+ self.as_array().iter().sum()
+ } else {
+ unsafe { simd_reduce_add_ordered(self, 0.) }
+ }
+ }
+
+ /// Horizontal multiply. Returns the product of the lanes of the vector.
+ #[inline]
+ pub fn horizontal_product(self) -> $scalar {
+ // LLVM product is inaccurate on i586
+ if cfg!(all(target_arch = "x86", not(target_feature = "sse2"))) {
+ self.as_array().iter().product()
+ } else {
+ unsafe { simd_reduce_mul_ordered(self, 1.) }
+ }
+ }
+
+ /// Horizontal maximum. Returns the maximum lane in the vector.
+ ///
+ /// Returns values based on equality, so a vector containing both `0.` and `-0.` may
+ /// return either. This function will not return `NaN` unless all lanes are `NaN`.
+ #[inline]
+ pub fn horizontal_max(self) -> $scalar {
+ unsafe { simd_reduce_max(self) }
+ }
+
+ /// Horizontal minimum. Returns the minimum lane in the vector.
+ ///
+ /// Returns values based on equality, so a vector containing both `0.` and `-0.` may
+ /// return either. This function will not return `NaN` unless all lanes are `NaN`.
+ #[inline]
+ pub fn horizontal_min(self) -> $scalar {
+ unsafe { simd_reduce_min(self) }
+ }
+ }
+ }
+}
+
+impl_float_reductions! { f32 }
+impl_float_reductions! { f64 }
++
++impl<T, const LANES: usize> Simd<T, LANES>
++where
++ Self: BitAnd<Self, Output = Self>,
++ T: SimdElement + BitAnd<T, Output = T>,
++ LaneCount<LANES>: SupportedLaneCount,
++{
++ /// Horizontal bitwise "and". Returns the cumulative bitwise "and" across the lanes of
++ /// the vector.
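++ ///
++ /// A small sketch (`u8` lanes chosen for illustration):
++ /// ```
++ /// # #![feature(portable_simd)]
++ /// # #[cfg(feature = "std")] use core_simd::Simd;
++ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
++ /// let v = Simd::from_array([0b1111u8, 0b0111, 0b1101, 0b0101]);
++ /// assert_eq!(v.horizontal_and(), 0b0101);
++ /// ```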
++ #[inline]
++ pub fn horizontal_and(self) -> T {
++ unsafe { simd_reduce_and(self) }
++ }
++}
++
++impl<T, const LANES: usize> Simd<T, LANES>
++where
++ Self: BitOr<Self, Output = Self>,
++ T: SimdElement + BitOr<T, Output = T>,
++ LaneCount<LANES>: SupportedLaneCount,
++{
++ /// Horizontal bitwise "or". Returns the cumulative bitwise "or" across the lanes of
++ /// the vector.
++ #[inline]
++ pub fn horizontal_or(self) -> T {
++ unsafe { simd_reduce_or(self) }
++ }
++}
++
++impl<T, const LANES: usize> Simd<T, LANES>
++where
++ Self: BitXor<Self, Output = Self>,
++ T: SimdElement + BitXor<T, Output = T>,
++ LaneCount<LANES>: SupportedLaneCount,
++{
++ /// Horizontal bitwise "xor". Returns the cumulative bitwise "xor" across the lanes of
++ /// the vector.
++ #[inline]
++ pub fn horizontal_xor(self) -> T {
++ unsafe { simd_reduce_xor(self) }
++ }
++}
--- /dev/null
+use crate::simd::intrinsics;
+use crate::simd::{LaneCount, Mask, MaskElement, Simd, SimdElement, SupportedLaneCount};
+
+mod sealed {
+ pub trait Sealed<Mask> {
+ fn select(mask: Mask, true_values: Self, false_values: Self) -> Self;
+ }
+}
+use sealed::Sealed;
+
+/// Supporting trait for vector `select` function
+pub trait Select<Mask>: Sealed<Mask> {}
+
+impl<T, const LANES: usize> Sealed<Mask<T::Mask, LANES>> for Simd<T, LANES>
+where
+ T: SimdElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ fn select(mask: Mask<T::Mask, LANES>, true_values: Self, false_values: Self) -> Self {
+ unsafe { intrinsics::simd_select(mask.to_int(), true_values, false_values) }
+ }
+}
+
+impl<T, const LANES: usize> Select<Mask<T::Mask, LANES>> for Simd<T, LANES>
+where
+ T: SimdElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+}
+
+impl<T, const LANES: usize> Sealed<Self> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ fn select(mask: Self, true_values: Self, false_values: Self) -> Self {
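+ // Branchless blend: lanes set in `mask` come from `true_values`,
+ // cleared lanes come from `false_values`.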
+ mask & true_values | !mask & false_values
+ }
+}
+
+impl<T, const LANES: usize> Select<Self> for Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+}
+
+impl<T, const LANES: usize> Mask<T, LANES>
+where
+ T: MaskElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ /// Choose lanes from two vectors.
+ ///
+ /// For each lane in the mask, choose the corresponding lane from `true_values` if
+ /// that lane mask is true, and `false_values` if that lane mask is false.
+ ///
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::{Simd, Mask};
+ /// # #[cfg(not(feature = "std"))] use core::simd::{Simd, Mask};
+ /// let a = Simd::from_array([0, 1, 2, 3]);
+ /// let b = Simd::from_array([4, 5, 6, 7]);
+ /// let mask = Mask::from_array([true, false, false, true]);
+ /// let c = mask.select(a, b);
+ /// assert_eq!(c.to_array(), [0, 5, 6, 3]);
+ /// ```
+ ///
+ /// `select` can also be used on masks:
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Mask;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Mask;
+ /// let a = Mask::<i32, 4>::from_array([true, true, false, false]);
+ /// let b = Mask::<i32, 4>::from_array([false, false, true, true]);
+ /// let mask = Mask::<i32, 4>::from_array([true, false, false, true]);
+ /// let c = mask.select(a, b);
+ /// assert_eq!(c.to_array(), [true, false, true, false]);
+ /// ```
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ pub fn select<S: Select<Self>>(self, true_values: S, false_values: S) -> S {
+ S::select(self, true_values, false_values)
+ }
+}
--- /dev/null
+use crate::simd::intrinsics;
+use crate::simd::{LaneCount, Simd, SimdElement, SupportedLaneCount};
+
+/// Constructs a new vector by selecting values from the lanes of one or two source vectors.
+///
+/// When swizzling one vector, the indices of the result vector are indicated by a `const` array
+/// of `usize`, like [`Swizzle`].
+/// When swizzling two vectors, the indices are indicated by a `const` array of [`Which`], like
+/// [`Swizzle2`].
+///
+/// # Examples
+/// ## One source vector
+/// ```
+/// # #![feature(portable_simd)]
+/// # #[cfg(feature = "std")] use core_simd::{Simd, simd_swizzle};
+/// # #[cfg(not(feature = "std"))] use core::simd::{Simd, simd_swizzle};
+/// let v = Simd::<f32, 4>::from_array([0., 1., 2., 3.]);
+///
+/// // Keeping the same size
+/// let r = simd_swizzle!(v, [3, 0, 1, 2]);
+/// assert_eq!(r.to_array(), [3., 0., 1., 2.]);
+///
+/// // Changing the number of lanes
+/// let r = simd_swizzle!(v, [3, 1]);
+/// assert_eq!(r.to_array(), [3., 1.]);
+/// ```
+///
+/// ## Two source vectors
+/// ```
+/// # #![feature(portable_simd)]
+/// # #[cfg(feature = "std")] use core_simd::{Simd, simd_swizzle, Which};
+/// # #[cfg(not(feature = "std"))] use core::simd::{Simd, simd_swizzle, Which};
+/// use Which::*;
+/// let a = Simd::<f32, 4>::from_array([0., 1., 2., 3.]);
+/// let b = Simd::<f32, 4>::from_array([4., 5., 6., 7.]);
+///
+/// // Keeping the same size
+/// let r = simd_swizzle!(a, b, [First(0), First(1), Second(2), Second(3)]);
+/// assert_eq!(r.to_array(), [0., 1., 6., 7.]);
+///
+/// // Changing the number of lanes
+/// let r = simd_swizzle!(a, b, [First(0), Second(0)]);
+/// assert_eq!(r.to_array(), [0., 4.]);
+/// ```
+#[allow(unused_macros)]
+pub macro simd_swizzle {
+ (
+ $vector:expr, $index:expr $(,)?
+ ) => {
+ {
+ use $crate::simd::Swizzle;
+ struct Impl;
+ impl<const LANES: usize> Swizzle<LANES, {$index.len()}> for Impl {
+ const INDEX: [usize; {$index.len()}] = $index;
+ }
+ Impl::swizzle($vector)
+ }
+ },
+ (
+ $first:expr, $second:expr, $index:expr $(,)?
+ ) => {
+ {
+ use $crate::simd::{Which, Swizzle2};
+ struct Impl;
+ impl<const LANES: usize> Swizzle2<LANES, {$index.len()}> for Impl {
+ const INDEX: [Which; {$index.len()}] = $index;
+ }
+ Impl::swizzle2($first, $second)
+ }
+ }
+}
+
+/// An index into one of two vectors.
+#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
+pub enum Which {
+ /// Indexes the first vector.
+ First(usize),
+ /// Indexes the second vector.
+ Second(usize),
+}
+
+/// Create a vector from the elements of another vector.
+pub trait Swizzle<const INPUT_LANES: usize, const OUTPUT_LANES: usize> {
+ /// Map from the lanes of the input vector to the output vector.
+ const INDEX: [usize; OUTPUT_LANES];
+
+ /// Create a new vector from the lanes of `vector`.
+ ///
+ /// Lane `i` of the output is `vector[Self::INDEX[i]]`.
++ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ fn swizzle<T>(vector: Simd<T, INPUT_LANES>) -> Simd<T, OUTPUT_LANES>
+ where
+ T: SimdElement,
+ LaneCount<INPUT_LANES>: SupportedLaneCount,
+ LaneCount<OUTPUT_LANES>: SupportedLaneCount,
+ {
+ unsafe { intrinsics::simd_shuffle(vector, vector, Self::INDEX_IMPL) }
+ }
+}
+
+/// Create a vector from the elements of two other vectors.
+pub trait Swizzle2<const INPUT_LANES: usize, const OUTPUT_LANES: usize> {
+ /// Map from the lanes of the input vectors to the output vector
+ const INDEX: [Which; OUTPUT_LANES];
+
+ /// Create a new vector from the lanes of `first` and `second`.
+ ///
+ /// Lane `i` is `first[j]` when `Self::INDEX[i]` is `First(j)`, or `second[j]` when it is
+ /// `Second(j)`.
++ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ fn swizzle2<T>(
+ first: Simd<T, INPUT_LANES>,
+ second: Simd<T, INPUT_LANES>,
+ ) -> Simd<T, OUTPUT_LANES>
+ where
+ T: SimdElement,
+ LaneCount<INPUT_LANES>: SupportedLaneCount,
+ LaneCount<OUTPUT_LANES>: SupportedLaneCount,
+ {
+ unsafe { intrinsics::simd_shuffle(first, second, Self::INDEX_IMPL) }
+ }
+}
+
+/// The `simd_shuffle` intrinsic expects `u32`, so do error checking and conversion here.
+/// This trait hides `INDEX_IMPL` from the public API.
+trait SwizzleImpl<const INPUT_LANES: usize, const OUTPUT_LANES: usize> {
+ const INDEX_IMPL: [u32; OUTPUT_LANES];
+}
+
+impl<T, const INPUT_LANES: usize, const OUTPUT_LANES: usize> SwizzleImpl<INPUT_LANES, OUTPUT_LANES>
+ for T
+where
+ T: Swizzle<INPUT_LANES, OUTPUT_LANES> + ?Sized,
+{
+ const INDEX_IMPL: [u32; OUTPUT_LANES] = {
+ let mut output = [0; OUTPUT_LANES];
+ let mut i = 0;
+ while i < OUTPUT_LANES {
+ let index = Self::INDEX[i];
+ assert!(index as u32 as usize == index);
+ assert!(index < INPUT_LANES, "source lane exceeds input lane count",);
+ output[i] = index as u32;
+ i += 1;
+ }
+ output
+ };
+}
+
+/// The `simd_shuffle` intrinsic expects `u32`, so do error checking and conversion here.
+/// This trait hides `INDEX_IMPL` from the public API.
+trait Swizzle2Impl<const INPUT_LANES: usize, const OUTPUT_LANES: usize> {
+ const INDEX_IMPL: [u32; OUTPUT_LANES];
+}
+
+impl<T, const INPUT_LANES: usize, const OUTPUT_LANES: usize> Swizzle2Impl<INPUT_LANES, OUTPUT_LANES>
+ for T
+where
+ T: Swizzle2<INPUT_LANES, OUTPUT_LANES> + ?Sized,
+{
+ const INDEX_IMPL: [u32; OUTPUT_LANES] = {
+ let mut output = [0; OUTPUT_LANES];
+ let mut i = 0;
+ while i < OUTPUT_LANES {
+ let (offset, index) = match Self::INDEX[i] {
+ Which::First(index) => (false, index),
+ Which::Second(index) => (true, index),
+ };
+ assert!(index < INPUT_LANES, "source lane exceeds input lane count",);
+
+ // lanes are indexed by the first vector, then second vector
+ let index = if offset { index + INPUT_LANES } else { index };
+ assert!(index as u32 as usize == index);
+ output[i] = index as u32;
+ i += 1;
+ }
+ output
+ };
+}
+
+impl<T, const LANES: usize> Simd<T, LANES>
+where
+ T: SimdElement,
+ LaneCount<LANES>: SupportedLaneCount,
+{
+ /// Reverse the order of the lanes in the vector.
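+ ///
+ /// A quick sketch:
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let v = Simd::from_array([0, 1, 2, 3]);
+ /// assert_eq!(v.reverse().to_array(), [3, 2, 1, 0]);
+ /// ```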
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ pub fn reverse(self) -> Self {
+ const fn reverse_index<const LANES: usize>() -> [usize; LANES] {
+ let mut index = [0; LANES];
+ let mut i = 0;
+ while i < LANES {
+ index[i] = LANES - i - 1;
+ i += 1;
+ }
+ index
+ }
+
+ struct Reverse;
+
+ impl<const LANES: usize> Swizzle<LANES, LANES> for Reverse {
+ const INDEX: [usize; LANES] = reverse_index::<LANES>();
+ }
+
+ Reverse::swizzle(self)
+ }
+
+ /// Rotates the vector such that the first `OFFSET` elements of the vector move to the end
+ /// while the last `LANES - OFFSET` elements move to the front. After calling `rotate_lanes_left`,
+ /// the element previously in lane `OFFSET` will become the first element in the vector.
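+ ///
+ /// A sketch with `OFFSET = 1` (values chosen for illustration):
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let v = Simd::from_array([0, 1, 2, 3]);
+ /// assert_eq!(v.rotate_lanes_left::<1>().to_array(), [1, 2, 3, 0]);
+ /// ```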
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ pub fn rotate_lanes_left<const OFFSET: usize>(self) -> Self {
+ const fn rotate_index<const OFFSET: usize, const LANES: usize>() -> [usize; LANES] {
+ let offset = OFFSET % LANES;
+ let mut index = [0; LANES];
+ let mut i = 0;
+ while i < LANES {
+ index[i] = (i + offset) % LANES;
+ i += 1;
+ }
+ index
+ }
+
+ struct Rotate<const OFFSET: usize>;
+
+ impl<const OFFSET: usize, const LANES: usize> Swizzle<LANES, LANES> for Rotate<OFFSET> {
+ const INDEX: [usize; LANES] = rotate_index::<OFFSET, LANES>();
+ }
+
+ Rotate::<OFFSET>::swizzle(self)
+ }
+
+ /// Rotates the vector such that the first `LANES - OFFSET` elements of the vector move to
+ /// the end while the last `OFFSET` elements move to the front. After calling `rotate_lanes_right`,
+ /// the element previously at index `LANES - OFFSET` will become the first element in the vector.
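+ ///
+ /// A sketch with `OFFSET = 1`:
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let v = Simd::from_array([0, 1, 2, 3]);
+ /// assert_eq!(v.rotate_lanes_right::<1>().to_array(), [3, 0, 1, 2]);
+ /// ```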
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ pub fn rotate_lanes_right<const OFFSET: usize>(self) -> Self {
+ const fn rotate_index<const OFFSET: usize, const LANES: usize>() -> [usize; LANES] {
+ let offset = LANES - OFFSET % LANES;
+ let mut index = [0; LANES];
+ let mut i = 0;
+ while i < LANES {
+ index[i] = (i + offset) % LANES;
+ i += 1;
+ }
+ index
+ }
+
+ struct Rotate<const OFFSET: usize>;
+
+ impl<const OFFSET: usize, const LANES: usize> Swizzle<LANES, LANES> for Rotate<OFFSET> {
+ const INDEX: [usize; LANES] = rotate_index::<OFFSET, LANES>();
+ }
+
+ Rotate::<OFFSET>::swizzle(self)
+ }
+
+ /// Interleave two vectors.
+ ///
+ /// Produces two vectors with lanes taken alternately from `self` and `other`.
+ ///
+ /// The first result contains the first `LANES / 2` lanes from `self` and `other`,
+ /// alternating, starting with the first lane of `self`.
+ ///
+ /// The second result contains the last `LANES / 2` lanes from `self` and `other`,
+ /// alternating, starting with the lane `LANES / 2` from the start of `self`.
+ ///
+ /// ```
+ /// #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let a = Simd::from_array([0, 1, 2, 3]);
+ /// let b = Simd::from_array([4, 5, 6, 7]);
+ /// let (x, y) = a.interleave(b);
+ /// assert_eq!(x.to_array(), [0, 4, 1, 5]);
+ /// assert_eq!(y.to_array(), [2, 6, 3, 7]);
+ /// ```
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ pub fn interleave(self, other: Self) -> (Self, Self) {
+ const fn lo<const LANES: usize>() -> [Which; LANES] {
+ let mut idx = [Which::First(0); LANES];
+ let mut i = 0;
+ while i < LANES {
+ let offset = i / 2;
+ idx[i] = if i % 2 == 0 {
+ Which::First(offset)
+ } else {
+ Which::Second(offset)
+ };
+ i += 1;
+ }
+ idx
+ }
+ const fn hi<const LANES: usize>() -> [Which; LANES] {
+ let mut idx = [Which::First(0); LANES];
+ let mut i = 0;
+ while i < LANES {
+ let offset = (LANES + i) / 2;
+ idx[i] = if i % 2 == 0 {
+ Which::First(offset)
+ } else {
+ Which::Second(offset)
+ };
+ i += 1;
+ }
+ idx
+ }
+
+ struct Lo;
+ struct Hi;
+
+ impl<const LANES: usize> Swizzle2<LANES, LANES> for Lo {
+ const INDEX: [Which; LANES] = lo::<LANES>();
+ }
+
+ impl<const LANES: usize> Swizzle2<LANES, LANES> for Hi {
+ const INDEX: [Which; LANES] = hi::<LANES>();
+ }
+
+ (Lo::swizzle2(self, other), Hi::swizzle2(self, other))
+ }
+
+ /// Deinterleave two vectors.
+ ///
+ /// The first result takes every other lane of `self` and then `other`, starting with
+ /// the first lane.
+ ///
+ /// The second result takes every other lane of `self` and then `other`, starting with
+ /// the second lane.
+ ///
+ /// ```
+ /// #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let a = Simd::from_array([0, 4, 1, 5]);
+ /// let b = Simd::from_array([2, 6, 3, 7]);
+ /// let (x, y) = a.deinterleave(b);
+ /// assert_eq!(x.to_array(), [0, 1, 2, 3]);
+ /// assert_eq!(y.to_array(), [4, 5, 6, 7]);
+ /// ```
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original inputs"]
+ pub fn deinterleave(self, other: Self) -> (Self, Self) {
+ const fn even<const LANES: usize>() -> [Which; LANES] {
+ let mut idx = [Which::First(0); LANES];
+ let mut i = 0;
+ while i < LANES / 2 {
+ idx[i] = Which::First(2 * i);
+ idx[i + LANES / 2] = Which::Second(2 * i);
+ i += 1;
+ }
+ idx
+ }
+ const fn odd<const LANES: usize>() -> [Which; LANES] {
+ let mut idx = [Which::First(0); LANES];
+ let mut i = 0;
+ while i < LANES / 2 {
+ idx[i] = Which::First(2 * i + 1);
+ idx[i + LANES / 2] = Which::Second(2 * i + 1);
+ i += 1;
+ }
+ idx
+ }
+
+ struct Even;
+ struct Odd;
+
+ impl<const LANES: usize> Swizzle2<LANES, LANES> for Even {
+ const INDEX: [Which; LANES] = even::<LANES>();
+ }
+
+ impl<const LANES: usize> Swizzle2<LANES, LANES> for Odd {
+ const INDEX: [Which; LANES] = odd::<LANES>();
+ }
+
+ (Even::swizzle2(self, other), Odd::swizzle2(self, other))
+ }
+}
--- /dev/null
+#![allow(non_camel_case_types)]
+
+use crate::simd::intrinsics;
+use crate::simd::{LaneCount, Mask, Simd, SupportedLaneCount};
+
+/// Implements inherent methods for a float vector of element type `$type`,
+/// which uses `$bits_ty` as its binary representation and `$mask_ty` as the
+/// element type of its comparison masks.
+macro_rules! impl_float_vector {
+ { $type:ty, $bits_ty:ty, $mask_ty:ty } => {
+ impl<const LANES: usize> Simd<$type, LANES>
+ where
+ LaneCount<LANES>: SupportedLaneCount,
+ {
+ /// Raw transmutation to an unsigned integer vector type with the
+ /// same size and number of lanes.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn to_bits(self) -> Simd<$bits_ty, LANES> {
+ assert_eq!(core::mem::size_of::<Self>(), core::mem::size_of::<Simd<$bits_ty, LANES>>());
+ unsafe { core::mem::transmute_copy(&self) }
+ }
+
+ /// Raw transmutation from an unsigned integer vector type with the
+ /// same size and number of lanes.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn from_bits(bits: Simd<$bits_ty, LANES>) -> Self {
+ assert_eq!(core::mem::size_of::<Self>(), core::mem::size_of::<Simd<$bits_ty, LANES>>());
+ unsafe { core::mem::transmute_copy(&bits) }
+ }
+
+ /// Produces a vector where every lane has the absolute value of the
+ /// equivalently-indexed lane in `self`.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn abs(self) -> Self {
+ unsafe { intrinsics::simd_fabs(self) }
+ }
+
+ /// Fused multiply-add. Computes `(self * a) + b` with only one rounding error,
+ /// yielding a more accurate result than an unfused multiply-add.
+ ///
+ /// Using `mul_add` *may* be more performant than an unfused multiply-add if the target
+ /// architecture has a dedicated `fma` CPU instruction. However, this is not always
+ /// true, and depends heavily on designing the algorithm with the specific target
+ /// hardware in mind.
+ #[cfg(feature = "std")]
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn mul_add(self, a: Self, b: Self) -> Self {
+ unsafe { intrinsics::simd_fma(self, a, b) }
+ }
+
+ /// Produces a vector where every lane has the square root value
+ /// of the equivalently-indexed lane in `self`
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ #[cfg(feature = "std")]
+ pub fn sqrt(self) -> Self {
+ unsafe { intrinsics::simd_fsqrt(self) }
+ }
+
+ /// Takes the reciprocal (inverse) of each lane, `1/x`.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn recip(self) -> Self {
+ Self::splat(1.0) / self
+ }
+
+ /// Converts each lane from radians to degrees.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn to_degrees(self) -> Self {
+ // to_degrees uses a special constant for better precision, so extract that constant
+ self * Self::splat(<$type>::to_degrees(1.))
+ }
+
+ /// Converts each lane from degrees to radians.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn to_radians(self) -> Self {
+ self * Self::splat(<$type>::to_radians(1.))
+ }
+
+ /// Returns true for each lane if it has a positive sign, including
+ /// `+0.0`, `NaN`s with positive sign bit, and positive infinity.
+ #[inline]
++ #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn is_sign_positive(self) -> Mask<$mask_ty, LANES> {
+ !self.is_sign_negative()
+ }
+
+ /// Returns true for each lane if it has a negative sign, including
+ /// `-0.0`, `NaN`s with negative sign bit, and negative infinity.
+ #[inline]
++ #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn is_sign_negative(self) -> Mask<$mask_ty, LANES> {
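+ // `(!0 >> 1) + 1` is the mask for the high (sign) bit of the bits type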
+ let sign_bits = self.to_bits() & Simd::splat((!0 >> 1) + 1);
+ sign_bits.lanes_gt(Simd::splat(0))
+ }
+
+ /// Returns true for each lane if its value is `NaN`.
+ #[inline]
++ #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn is_nan(self) -> Mask<$mask_ty, LANES> {
+ self.lanes_ne(self)
+ }
+
+ /// Returns true for each lane if its value is positive infinity or negative infinity.
+ #[inline]
++ #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn is_infinite(self) -> Mask<$mask_ty, LANES> {
+ self.abs().lanes_eq(Self::splat(<$type>::INFINITY))
+ }
+
+ /// Returns true for each lane if its value is neither infinite nor `NaN`.
+ #[inline]
++ #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn is_finite(self) -> Mask<$mask_ty, LANES> {
+ self.abs().lanes_lt(Self::splat(<$type>::INFINITY))
+ }
+
+ /// Returns true for each lane if its value is subnormal.
+ #[inline]
++ #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn is_subnormal(self) -> Mask<$mask_ty, LANES> {
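+ // Subnormals are nonzero values whose exponent bits (the bits set
+ // in the INFINITY pattern) are all zero.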
+ self.abs().lanes_ne(Self::splat(0.0)) & (self.to_bits() & Self::splat(<$type>::INFINITY).to_bits()).lanes_eq(Simd::splat(0))
+ }
+
+ /// Returns true for each lane if its value is neither zero, infinite,
+ /// subnormal, nor `NaN`.
+ #[inline]
++ #[must_use = "method returns a new mask and does not mutate the original value"]
+ pub fn is_normal(self) -> Mask<$mask_ty, LANES> {
+ !(self.abs().lanes_eq(Self::splat(0.0)) | self.is_nan() | self.is_subnormal() | self.is_infinite())
+ }
+
+ /// Replaces each lane with a number that represents its sign.
+ ///
+ /// * `1.0` if the number is positive, `+0.0`, or `INFINITY`
+ /// * `-1.0` if the number is negative, `-0.0`, or `NEG_INFINITY`
+ /// * `NAN` if the number is `NAN`
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn signum(self) -> Self {
+ self.is_nan().select(Self::splat(<$type>::NAN), Self::splat(1.0).copysign(self))
+ }
+
+ /// Returns each lane with the magnitude of `self` and the sign of `sign`.
+ ///
+ /// If any lane is a `NAN`, then a `NAN` with the sign of `sign` is returned.
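+ ///
+ /// A small sketch (shown for `f32`; `f64` behaves the same):
+ /// ```
+ /// # #![feature(portable_simd)]
+ /// # #[cfg(feature = "std")] use core_simd::Simd;
+ /// # #[cfg(not(feature = "std"))] use core::simd::Simd;
+ /// let v = Simd::from_array([1.0f32, -2.0, 3.0, -4.0]);
+ /// assert_eq!(v.copysign(Simd::splat(-1.0)).to_array(), [-1.0, -2.0, -3.0, -4.0]);
+ /// ```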
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn copysign(self, sign: Self) -> Self {
+ let sign_bit = sign.to_bits() & Self::splat(-0.).to_bits();
+ let magnitude = self.to_bits() & !Self::splat(-0.).to_bits();
+ Self::from_bits(sign_bit | magnitude)
+ }
+
+ /// Returns the minimum of each lane.
+ ///
+ /// If one of the values is `NAN`, then the other value is returned.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn min(self, other: Self) -> Self {
+ // TODO consider using an intrinsic
+ self.is_nan().select(
+ other,
+ self.lanes_ge(other).select(other, self)
+ )
+ }
+
+ /// Returns the maximum of each lane.
+ ///
+ /// If one of the values is `NAN`, then the other value is returned.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn max(self, other: Self) -> Self {
+ // TODO consider using an intrinsic
+ self.is_nan().select(
+ other,
+ self.lanes_le(other).select(other, self)
+ )
+ }
+
+ /// Restrict each lane to a certain interval unless it is NaN.
+ ///
+ /// For each lane in `self`, returns the corresponding lane in `max` if the lane is
+ /// greater than `max`, and the corresponding lane in `min` if the lane is less
+ /// than `min`. Otherwise returns the lane in `self`.
+ #[inline]
++ #[must_use = "method returns a new vector and does not mutate the original value"]
+ pub fn clamp(self, min: Self, max: Self) -> Self {
+ assert!(
+ min.lanes_le(max).all(),
+ "each lane in `min` must be less than or equal to the corresponding lane in `max`",
+ );
+ let mut x = self;
+ x = x.lanes_lt(min).select(min, x);
+ x = x.lanes_gt(max).select(max, x);
+ x
+ }
+ }
+ };
+}
+
+impl_float_vector! { f32, u32, i32 }
+impl_float_vector! { f64, u64, i64 }
+
+/// Vector of two `f32` values
+pub type f32x2 = Simd<f32, 2>;
+
+/// Vector of four `f32` values
+pub type f32x4 = Simd<f32, 4>;
+
+/// Vector of eight `f32` values
+pub type f32x8 = Simd<f32, 8>;
+
+/// Vector of 16 `f32` values
+pub type f32x16 = Simd<f32, 16>;
+
+/// Vector of two `f64` values
+pub type f64x2 = Simd<f64, 2>;
+
+/// Vector of four `f64` values
+pub type f64x4 = Simd<f64, 4>;
+
+/// Vector of eight `f64` values
+pub type f64x8 = Simd<f64, 8>;
--- /dev/null
- mem::transmute_copy(&{ x + (addend * mem::size_of::<T>()) })
+//! Private implementation details of public gather/scatter APIs.
+use crate::simd::{LaneCount, Simd, SupportedLaneCount};
+use core::mem;
+
+/// A vector of `*const T`.
+#[derive(Debug, Copy, Clone)]
+#[repr(simd)]
+pub(crate) struct SimdConstPtr<T, const LANES: usize>([*const T; LANES]);
+
+impl<T, const LANES: usize> SimdConstPtr<T, LANES>
+where
+ LaneCount<LANES>: SupportedLaneCount,
+ T: Sized,
+{
+ #[inline]
+ #[must_use]
+ pub fn splat(ptr: *const T) -> Self {
+ Self([ptr; LANES])
+ }
+
+ #[inline]
+ #[must_use]
+ pub fn wrapping_add(self, addend: Simd<usize, LANES>) -> Self {
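+ // Reinterpret the pointer lanes as `usize` and add the element
+ // offsets scaled to a byte count.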
+ unsafe {
+ let x: Simd<usize, LANES> = mem::transmute_copy(&self);
- mem::transmute_copy(&{ x + (addend * mem::size_of::<T>()) })
++ mem::transmute_copy(&{ x + (addend * Simd::splat(mem::size_of::<T>())) })
+ }
+ }
+}
+
+/// A vector of `*mut T`. Be very careful around potential aliasing.
+#[derive(Debug, Copy, Clone)]
+#[repr(simd)]
+pub(crate) struct SimdMutPtr<T, const LANES: usize>([*mut T; LANES]);
+
+impl<T, const LANES: usize> SimdMutPtr<T, LANES>
+where
+ LaneCount<LANES>: SupportedLaneCount,
+ T: Sized,
+{
+ #[inline]
+ #[must_use]
+ pub fn splat(ptr: *mut T) -> Self {
+ Self([ptr; LANES])
+ }
+
+ #[inline]
+ #[must_use]
+ pub fn wrapping_add(self, addend: Simd<usize, LANES>) -> Self {
+ unsafe {
+ let x: Simd<usize, LANES> = mem::transmute_copy(&self);
++ mem::transmute_copy(&{ x + (addend * Simd::splat(mem::size_of::<T>())) })
+ }
+ }
+}
--- /dev/null
- //from_transmute! { unsafe u8x64 => __m512i }
+use crate::simd::*;
+
+#[cfg(target_arch = "x86")]
+use core::arch::x86::*;
+
+#[cfg(target_arch = "x86_64")]
+use core::arch::x86_64::*;
+
+from_transmute! { unsafe u8x16 => __m128i }
+from_transmute! { unsafe u8x32 => __m256i }
- //from_transmute! { unsafe i8x64 => __m512i }
++from_transmute! { unsafe u8x64 => __m512i }
+from_transmute! { unsafe i8x16 => __m128i }
+from_transmute! { unsafe i8x32 => __m256i }
++from_transmute! { unsafe i8x64 => __m512i }
+
+from_transmute! { unsafe u16x8 => __m128i }
+from_transmute! { unsafe u16x16 => __m256i }
+from_transmute! { unsafe u16x32 => __m512i }
+from_transmute! { unsafe i16x8 => __m128i }
+from_transmute! { unsafe i16x16 => __m256i }
+from_transmute! { unsafe i16x32 => __m512i }
+
+from_transmute! { unsafe u32x4 => __m128i }
+from_transmute! { unsafe u32x8 => __m256i }
+from_transmute! { unsafe u32x16 => __m512i }
+from_transmute! { unsafe i32x4 => __m128i }
+from_transmute! { unsafe i32x8 => __m256i }
+from_transmute! { unsafe i32x16 => __m512i }
+from_transmute! { unsafe f32x4 => __m128 }
+from_transmute! { unsafe f32x8 => __m256 }
+from_transmute! { unsafe f32x16 => __m512 }
+
+from_transmute! { unsafe u64x2 => __m128i }
+from_transmute! { unsafe u64x4 => __m256i }
+from_transmute! { unsafe u64x8 => __m512i }
+from_transmute! { unsafe i64x2 => __m128i }
+from_transmute! { unsafe i64x4 => __m256i }
+from_transmute! { unsafe i64x8 => __m512i }
+from_transmute! { unsafe f64x2 => __m128d }
+from_transmute! { unsafe f64x4 => __m256d }
+from_transmute! { unsafe f64x8 => __m512d }
+
+#[cfg(target_pointer_width = "32")]
+mod p32 {
+ use super::*;
+ from_transmute! { unsafe usizex4 => __m128i }
+ from_transmute! { unsafe usizex8 => __m256i }
+ from_transmute! { unsafe Simd<usize, 16> => __m512i }
+ from_transmute! { unsafe isizex4 => __m128i }
+ from_transmute! { unsafe isizex8 => __m256i }
+ from_transmute! { unsafe Simd<isize, 16> => __m512i }
+}
+
+#[cfg(target_pointer_width = "64")]
+mod p64 {
+ use super::*;
+ from_transmute! { unsafe usizex2 => __m128i }
+ from_transmute! { unsafe usizex4 => __m256i }
+ from_transmute! { unsafe usizex8 => __m512i }
+ from_transmute! { unsafe isizex2 => __m128i }
+ from_transmute! { unsafe isizex4 => __m256i }
+ from_transmute! { unsafe isizex8 => __m512i }
+}
--- /dev/null
--- /dev/null
++// Test that we handle all our "auto-deref" cases correctly.
++#![feature(portable_simd)]
++use core_simd::f32x4;
++
++#[cfg(target_arch = "wasm32")]
++use wasm_bindgen_test::*;
++
++#[cfg(target_arch = "wasm32")]
++wasm_bindgen_test_configure!(run_in_browser);
++
++#[test]
++#[cfg_attr(target_arch = "wasm32", wasm_bindgen_test)]
++fn deref() {
++ let x = f32x4::splat(1.0);
++ let y = f32x4::splat(2.0);
++ let a = &x;
++ let b = &y;
++ assert_eq!(f32x4::splat(3.0), x + y);
++ assert_eq!(f32x4::splat(3.0), x + b);
++ assert_eq!(f32x4::splat(3.0), a + y);
++ assert_eq!(f32x4::splat(3.0), a + b);
++}
--- /dev/null
- fn scalar_rhs<const LANES: usize>() {
- test_helpers::test_binary_scalar_rhs_elementwise(
- &<Simd<$scalar, LANES> as core::ops::$trait<$scalar>>::$fn,
- &$scalar_fn,
- &|_, _| true,
- );
- }
-
- fn scalar_lhs<const LANES: usize>() {
- test_helpers::test_binary_scalar_lhs_elementwise(
- &<$scalar as core::ops::$trait<Simd<$scalar, LANES>>>::$fn,
- &$scalar_fn,
- &|_, _| true,
- );
- }
-
+/// Implements a test on a unary operation using proptest.
+///
+/// Compares the vector operation to the equivalent scalar operation.
+#[macro_export]
+macro_rules! impl_unary_op_test {
+ { $scalar:ty, $trait:ident :: $fn:ident, $scalar_fn:expr } => {
+ test_helpers::test_lanes! {
+ fn $fn<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &<core_simd::Simd<$scalar, LANES> as core::ops::$trait>::$fn,
+ &$scalar_fn,
+ &|_| true,
+ );
+ }
+ }
+ };
+ { $scalar:ty, $trait:ident :: $fn:ident } => {
+ impl_unary_op_test! { $scalar, $trait::$fn, <$scalar as core::ops::$trait>::$fn }
+ };
+}
+
+/// Implements a test on a binary operation using proptest.
+///
+/// Compares the vector operation to the equivalent scalar operation.
+#[macro_export]
+macro_rules! impl_binary_op_test {
+ { $scalar:ty, $trait:ident :: $fn:ident, $trait_assign:ident :: $fn_assign:ident, $scalar_fn:expr } => {
+ mod $fn {
+ use super::*;
+ use core_simd::Simd;
+
+ test_helpers::test_lanes! {
+ fn normal<const LANES: usize>() {
+ test_helpers::test_binary_elementwise(
+ &<Simd<$scalar, LANES> as core::ops::$trait>::$fn,
+ &$scalar_fn,
+ &|_, _| true,
+ );
+ }
+
-
- fn assign_scalar_rhs<const LANES: usize>() {
- test_helpers::test_binary_scalar_rhs_elementwise(
- &|mut a, b| { <Simd<$scalar, LANES> as core::ops::$trait_assign<$scalar>>::$fn_assign(&mut a, b); a },
- &$scalar_fn,
- &|_, _| true,
- );
- }
+ fn assign<const LANES: usize>() {
+ test_helpers::test_binary_elementwise(
+ &|mut a, b| { <Simd<$scalar, LANES> as core::ops::$trait_assign>::$fn_assign(&mut a, b); a },
+ &$scalar_fn,
+ &|_, _| true,
+ );
+ }
- fn scalar_rhs<const LANES: usize>() {
- test_helpers::test_binary_scalar_rhs_elementwise(
- &<Simd<$scalar, LANES> as core::ops::$trait<$scalar>>::$fn,
- &$scalar_fn,
- &|x, y| x.iter().all(|x| $check_fn(*x, y)),
- );
- }
-
- fn scalar_lhs<const LANES: usize>() {
- test_helpers::test_binary_scalar_lhs_elementwise(
- &<$scalar as core::ops::$trait<Simd<$scalar, LANES>>>::$fn,
- &$scalar_fn,
- &|x, y| y.iter().all(|y| $check_fn(x, *y)),
- );
- }
-
+ }
+ }
+ };
+ { $scalar:ty, $trait:ident :: $fn:ident, $trait_assign:ident :: $fn_assign:ident } => {
+ impl_binary_op_test! { $scalar, $trait::$fn, $trait_assign::$fn_assign, <$scalar as core::ops::$trait>::$fn }
+ };
+}
+
+/// Implements a test on a binary operation using proptest.
+///
+/// Like `impl_binary_op_test`, but allows providing a function for rejecting particular inputs
+/// (like the `prop_assume!` macro).
+///
+/// Compares the vector operation to the equivalent scalar operation.
+#[macro_export]
+macro_rules! impl_binary_checked_op_test {
+ { $scalar:ty, $trait:ident :: $fn:ident, $trait_assign:ident :: $fn_assign:ident, $scalar_fn:expr, $check_fn:expr } => {
+ mod $fn {
+ use super::*;
+ use core_simd::Simd;
+
+ test_helpers::test_lanes! {
+ fn normal<const LANES: usize>() {
+ test_helpers::test_binary_elementwise(
+ &<Simd<$scalar, LANES> as core::ops::$trait>::$fn,
+ &$scalar_fn,
+ &|x, y| x.iter().zip(y.iter()).all(|(x, y)| $check_fn(*x, *y)),
+ );
+ }
+
-
- fn assign_scalar_rhs<const LANES: usize>() {
- test_helpers::test_binary_scalar_rhs_elementwise(
- &|mut a, b| { <Simd<$scalar, LANES> as core::ops::$trait_assign<$scalar>>::$fn_assign(&mut a, b); a },
- &$scalar_fn,
- &|x, y| x.iter().all(|x| $check_fn(*x, y)),
- )
- }
+ fn assign<const LANES: usize>() {
+ test_helpers::test_binary_elementwise(
+ &|mut a, b| { <Simd<$scalar, LANES> as core::ops::$trait_assign>::$fn_assign(&mut a, b); a },
+ &$scalar_fn,
+ &|x, y| x.iter().zip(y.iter()).all(|(x, y)| $check_fn(*x, *y)),
+ )
+ }
+ }
+ }
+ };
+ { $scalar:ty, $trait:ident :: $fn:ident, $trait_assign:ident :: $fn_assign:ident, $check_fn:expr } => {
+ impl_binary_checked_op_test! { $scalar, $trait::$fn, $trait_assign::$fn_assign, <$scalar as core::ops::$trait>::$fn, $check_fn }
+ };
+}
+
+#[macro_export]
+macro_rules! impl_common_integer_tests {
+ { $vector:ident, $scalar:ident } => {
+ test_helpers::test_lanes! {
+ fn horizontal_sum<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq! (
+ $vector::<LANES>::from_array(x).horizontal_sum(),
+ x.iter().copied().fold(0 as $scalar, $scalar::wrapping_add),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_product<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq! (
+ $vector::<LANES>::from_array(x).horizontal_product(),
+ x.iter().copied().fold(1 as $scalar, $scalar::wrapping_mul),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_and<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq! (
+ $vector::<LANES>::from_array(x).horizontal_and(),
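+ // `-1i8 as $scalar` sign-extends to all ones, the identity for AND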
+ x.iter().copied().fold(-1i8 as $scalar, <$scalar as core::ops::BitAnd>::bitand),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_or<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq! (
+ $vector::<LANES>::from_array(x).horizontal_or(),
+ x.iter().copied().fold(0 as $scalar, <$scalar as core::ops::BitOr>::bitor),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_xor<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq! (
+ $vector::<LANES>::from_array(x).horizontal_xor(),
+ x.iter().copied().fold(0 as $scalar, <$scalar as core::ops::BitXor>::bitxor),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_max<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq! (
+ $vector::<LANES>::from_array(x).horizontal_max(),
+ x.iter().copied().max().unwrap(),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_min<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq! (
+ $vector::<LANES>::from_array(x).horizontal_min(),
+ x.iter().copied().min().unwrap(),
+ );
+ Ok(())
+ });
+ }
+ }
+ }
+}
+
+/// Implement tests for signed integers.
+#[macro_export]
+macro_rules! impl_signed_tests {
+ { $scalar:tt } => {
+ mod $scalar {
+ type Vector<const LANES: usize> = core_simd::Simd<Scalar, LANES>;
+ type Scalar = $scalar;
+
+ impl_common_integer_tests! { Vector, Scalar }
+
+ test_helpers::test_lanes! {
+ fn neg<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &<Vector::<LANES> as core::ops::Neg>::neg,
+ &<Scalar as core::ops::Neg>::neg,
+ &|x| !x.contains(&Scalar::MIN),
+ );
+ }
+
+ fn is_positive<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_positive,
+ &Scalar::is_positive,
+ &|_| true,
+ );
+ }
+
+ fn is_negative<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_negative,
+ &Scalar::is_negative,
+ &|_| true,
+ );
+ }
+
+ fn signum<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &Vector::<LANES>::signum,
+ &Scalar::signum,
+ &|_| true,
+ )
+ }
+
+ }
+
+ test_helpers::test_lanes_panic! {
+ fn div_min_overflow_panics<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(Scalar::MIN);
+ let b = Vector::<LANES>::splat(-1);
+ let _ = a / b;
+ }
+
+ fn div_by_all_zeros_panics<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(42);
+ let b = Vector::<LANES>::splat(0);
+ let _ = a / b;
+ }
+
+ fn div_by_one_zero_panics<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(42);
+ let mut b = Vector::<LANES>::splat(21);
+ b[0] = 0 as _;
+ let _ = a / b;
+ }
+
+ fn rem_min_overflow_panic<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(Scalar::MIN);
+ let b = Vector::<LANES>::splat(-1);
+ let _ = a % b;
+ }
+
+ fn rem_zero_panic<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(42);
+ let b = Vector::<LANES>::splat(0);
+ let _ = a % b;
+ }
+ }
+
+ test_helpers::test_lanes! {
+ fn div_neg_one_no_panic<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(42);
+ let b = Vector::<LANES>::splat(-1);
+ let _ = a / b;
+ }
+
+ fn rem_neg_one_no_panic<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(42);
+ let b = Vector::<LANES>::splat(-1);
+ let _ = a % b;
+ }
+ }
+
+ impl_binary_op_test!(Scalar, Add::add, AddAssign::add_assign, Scalar::wrapping_add);
+ impl_binary_op_test!(Scalar, Sub::sub, SubAssign::sub_assign, Scalar::wrapping_sub);
+ impl_binary_op_test!(Scalar, Mul::mul, MulAssign::mul_assign, Scalar::wrapping_mul);
+
+ // Exclude Div and Rem panicking cases
+ impl_binary_checked_op_test!(Scalar, Div::div, DivAssign::div_assign, Scalar::wrapping_div, |x, y| y != 0 && !(x == Scalar::MIN && y == -1));
+ impl_binary_checked_op_test!(Scalar, Rem::rem, RemAssign::rem_assign, Scalar::wrapping_rem, |x, y| y != 0 && !(x == Scalar::MIN && y == -1));
+
+ impl_unary_op_test!(Scalar, Not::not);
+ impl_binary_op_test!(Scalar, BitAnd::bitand, BitAndAssign::bitand_assign);
+ impl_binary_op_test!(Scalar, BitOr::bitor, BitOrAssign::bitor_assign);
+ impl_binary_op_test!(Scalar, BitXor::bitxor, BitXorAssign::bitxor_assign);
+ }
+ }
+}
+
+/// Implement tests for unsigned integers.
+#[macro_export]
+macro_rules! impl_unsigned_tests {
+ { $scalar:tt } => {
+ mod $scalar {
+ type Vector<const LANES: usize> = core_simd::Simd<Scalar, LANES>;
+ type Scalar = $scalar;
+
+ impl_common_integer_tests! { Vector, Scalar }
+
+ test_helpers::test_lanes_panic! {
+ fn rem_zero_panics<const LANES: usize>() {
+ let a = Vector::<LANES>::splat(42);
+ let b = Vector::<LANES>::splat(0);
+ let _ = a % b;
+ }
+ }
+
+ impl_binary_op_test!(Scalar, Add::add, AddAssign::add_assign, Scalar::wrapping_add);
+ impl_binary_op_test!(Scalar, Sub::sub, SubAssign::sub_assign, Scalar::wrapping_sub);
+ impl_binary_op_test!(Scalar, Mul::mul, MulAssign::mul_assign, Scalar::wrapping_mul);
+
+ // Exclude Div and Rem panicking cases
+ impl_binary_checked_op_test!(Scalar, Div::div, DivAssign::div_assign, Scalar::wrapping_div, |_, y| y != 0);
+ impl_binary_checked_op_test!(Scalar, Rem::rem, RemAssign::rem_assign, Scalar::wrapping_rem, |_, y| y != 0);
+
+ impl_unary_op_test!(Scalar, Not::not);
+ impl_binary_op_test!(Scalar, BitAnd::bitand, BitAndAssign::bitand_assign);
+ impl_binary_op_test!(Scalar, BitOr::bitor, BitOrAssign::bitor_assign);
+ impl_binary_op_test!(Scalar, BitXor::bitxor, BitXorAssign::bitxor_assign);
+ }
+ }
+}
+
+/// Implement tests for floating point numbers.
+#[macro_export]
+macro_rules! impl_float_tests {
+ { $scalar:tt, $int_scalar:tt } => {
+ mod $scalar {
+ type Vector<const LANES: usize> = core_simd::Simd<Scalar, LANES>;
+ type Scalar = $scalar;
+
+ impl_unary_op_test!(Scalar, Neg::neg);
+ impl_binary_op_test!(Scalar, Add::add, AddAssign::add_assign);
+ impl_binary_op_test!(Scalar, Sub::sub, SubAssign::sub_assign);
+ impl_binary_op_test!(Scalar, Mul::mul, MulAssign::mul_assign);
+ impl_binary_op_test!(Scalar, Div::div, DivAssign::div_assign);
+ impl_binary_op_test!(Scalar, Rem::rem, RemAssign::rem_assign);
+
+ test_helpers::test_lanes! {
+ fn is_sign_positive<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_sign_positive,
+ &Scalar::is_sign_positive,
+ &|_| true,
+ );
+ }
+
+ fn is_sign_negative<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_sign_negative,
+ &Scalar::is_sign_negative,
+ &|_| true,
+ );
+ }
+
+ fn is_finite<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_finite,
+ &Scalar::is_finite,
+ &|_| true,
+ );
+ }
+
+ fn is_infinite<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_infinite,
+ &Scalar::is_infinite,
+ &|_| true,
+ );
+ }
+
+ fn is_nan<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_nan,
+ &Scalar::is_nan,
+ &|_| true,
+ );
+ }
+
+ fn is_normal<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_normal,
+ &Scalar::is_normal,
+ &|_| true,
+ );
+ }
+
+ fn is_subnormal<const LANES: usize>() {
+ test_helpers::test_unary_mask_elementwise(
+ &Vector::<LANES>::is_subnormal,
+ &Scalar::is_subnormal,
+ &|_| true,
+ );
+ }
+
+ fn abs<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &Vector::<LANES>::abs,
+ &Scalar::abs,
+ &|_| true,
+ )
+ }
+
+ fn recip<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &Vector::<LANES>::recip,
+ &Scalar::recip,
+ &|_| true,
+ )
+ }
+
+ fn to_degrees<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &Vector::<LANES>::to_degrees,
+ &Scalar::to_degrees,
+ &|_| true,
+ )
+ }
+
+ fn to_radians<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &Vector::<LANES>::to_radians,
+ &Scalar::to_radians,
+ &|_| true,
+ )
+ }
+
+ fn signum<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &Vector::<LANES>::signum,
+ &Scalar::signum,
+ &|_| true,
+ )
+ }
+
+ fn copysign<const LANES: usize>() {
+ test_helpers::test_binary_elementwise(
+ &Vector::<LANES>::copysign,
+ &Scalar::copysign,
+ &|_, _| true,
+ )
+ }
+
+ fn min<const LANES: usize>() {
+ // Regular cases (excluding pairs of zeros with opposite signs)
+ test_helpers::test_binary_elementwise(
+ &Vector::<LANES>::min,
+ &Scalar::min,
+ // Reject the case where both values are zero with different signs
+ &|a, b| {
+ for (a, b) in a.iter().zip(b.iter()) {
+ if *a == 0. && *b == 0. && a.signum() != b.signum() {
+ return false;
+ }
+ }
+ true
+ }
+ );
+
+ // Special case where both values are zero
+ let p_zero = Vector::<LANES>::splat(0.);
+ let n_zero = Vector::<LANES>::splat(-0.);
+ assert!(p_zero.min(n_zero).to_array().iter().all(|x| *x == 0.));
+ assert!(n_zero.min(p_zero).to_array().iter().all(|x| *x == 0.));
+ }
+
+ fn max<const LANES: usize>() {
+ // Regular cases (excluding pairs of zeros with opposite signs)
+ test_helpers::test_binary_elementwise(
+ &Vector::<LANES>::max,
+ &Scalar::max,
+ // Reject the case where both values are zero with different signs
+ &|a, b| {
+ for (a, b) in a.iter().zip(b.iter()) {
+ if *a == 0. && *b == 0. && a.signum() != b.signum() {
+ return false;
+ }
+ }
+ true
+ }
+ );
+
+ // Special case where both values are zero
+ let p_zero = Vector::<LANES>::splat(0.);
+ let n_zero = Vector::<LANES>::splat(-0.);
+ assert!(p_zero.max(n_zero).to_array().iter().all(|x| *x == 0.));
+ assert!(n_zero.max(p_zero).to_array().iter().all(|x| *x == 0.));
+ }
+
+ fn clamp<const LANES: usize>() {
+ test_helpers::test_3(&|value: [Scalar; LANES], mut min: [Scalar; LANES], mut max: [Scalar; LANES]| {
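+ // Scalar::clamp panics if min > max or if either bound is NaN, so normalize the generated bounds first.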
+ for (min, max) in min.iter_mut().zip(max.iter_mut()) {
+ if max < min {
+ core::mem::swap(min, max);
+ }
+ if min.is_nan() {
+ *min = Scalar::NEG_INFINITY;
+ }
+ if max.is_nan() {
+ *max = Scalar::INFINITY;
+ }
+ }
+
+ let mut result_scalar = [Scalar::default(); LANES];
+ for i in 0..LANES {
+ result_scalar[i] = value[i].clamp(min[i], max[i]);
+ }
+ let result_vector = Vector::from_array(value).clamp(min.into(), max.into()).to_array();
+ test_helpers::prop_assert_biteq!(result_scalar, result_vector);
+ Ok(())
+ })
+ }
+
+ fn horizontal_sum<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq!(
+ Vector::<LANES>::from_array(x).horizontal_sum(),
+ x.iter().sum(),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_product<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ test_helpers::prop_assert_biteq!(
+ Vector::<LANES>::from_array(x).horizontal_product(),
+ x.iter().product(),
+ );
+ Ok(())
+ });
+ }
+
+ fn horizontal_max<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ let vmax = Vector::<LANES>::from_array(x).horizontal_max();
+ let smax = x.iter().copied().fold(Scalar::NAN, Scalar::max);
+ // Either zero may be returned when +0.0 and -0.0 are both present, and they differ bitwise, so skip that case
+ if !(x.contains(&0.) && x.contains(&-0.) && vmax.abs() == 0. && smax.abs() == 0.) {
+ test_helpers::prop_assert_biteq!(vmax, smax);
+ }
+ Ok(())
+ });
+ }
+
+ fn horizontal_min<const LANES: usize>() {
+ test_helpers::test_1(&|x| {
+ let vmin = Vector::<LANES>::from_array(x).horizontal_min();
+ let smin = x.iter().copied().fold(Scalar::NAN, Scalar::min);
+ // Either zero may be returned when +0.0 and -0.0 are both present, and they differ bitwise, so skip that case
+ if !(x.contains(&0.) && x.contains(&-0.) && vmin.abs() == 0. && smin.abs() == 0.) {
+ test_helpers::prop_assert_biteq!(vmin, smin);
+ }
+ Ok(())
+ });
+ }
+ }
+
+ #[cfg(feature = "std")]
+ mod std {
+ use super::*;
+ test_helpers::test_lanes! {
+ fn sqrt<const LANES: usize>() {
+ test_helpers::test_unary_elementwise(
+ &Vector::<LANES>::sqrt,
+ &Scalar::sqrt,
+ &|_| true,
+ )
+ }
+
+ fn mul_add<const LANES: usize>() {
+ test_helpers::test_ternary_elementwise(
+ &Vector::<LANES>::mul_add,
+ &Scalar::mul_add,
+ &|_, _, _| true,
+ )
+ }
+ }
+ }
+ }
+ }
+}
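+
+// Usage sketch (these exact invocations are illustrative; the real call sites
+// live in the crate's integration tests): each macro is instantiated once per
+// element type, and `impl_float_tests!` additionally takes what is presumably
+// the integer type of matching width, e.g.:
+//
+//     impl_signed_tests! { i32 }
+//     impl_unsigned_tests! { u32 }
+//     impl_float_tests! { f32, i32 }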
--- /dev/null
+pub mod array;
+
+#[cfg(target_arch = "wasm32")]
+pub mod wasm;
+
+#[macro_use]
+pub mod biteq;
+
+/// Specifies the default strategy for testing a type.
+///
+/// This strategy should be what "makes sense" to test.
+pub trait DefaultStrategy {
+ type Strategy: proptest::strategy::Strategy<Value = Self>;
+ fn default_strategy() -> Self::Strategy;
+}
+
+macro_rules! impl_num {
+ { $type:tt } => {
+ impl DefaultStrategy for $type {
+ type Strategy = proptest::num::$type::Any;
+ fn default_strategy() -> Self::Strategy {
+ proptest::num::$type::ANY
+ }
+ }
+ }
+}
+
+impl_num! { i8 }
+impl_num! { i16 }
+impl_num! { i32 }
+impl_num! { i64 }
+impl_num! { isize }
+impl_num! { u8 }
+impl_num! { u16 }
+impl_num! { u32 }
+impl_num! { u64 }
+impl_num! { usize }
+impl_num! { f32 }
+impl_num! { f64 }
+
+#[cfg(not(target_arch = "wasm32"))]
+impl DefaultStrategy for u128 {
+ type Strategy = proptest::num::u128::Any;
+ fn default_strategy() -> Self::Strategy {
+ proptest::num::u128::ANY
+ }
+}
+
+#[cfg(not(target_arch = "wasm32"))]
+impl DefaultStrategy for i128 {
+ type Strategy = proptest::num::i128::Any;
+ fn default_strategy() -> Self::Strategy {
+ proptest::num::i128::ANY
+ }
+}
+
+#[cfg(target_arch = "wasm32")]
+impl DefaultStrategy for u128 {
+ type Strategy = crate::wasm::u128::Any;
+ fn default_strategy() -> Self::Strategy {
+ crate::wasm::u128::ANY
+ }
+}
+
+#[cfg(target_arch = "wasm32")]
+impl DefaultStrategy for i128 {
+ type Strategy = crate::wasm::i128::Any;
+ fn default_strategy() -> Self::Strategy {
+ crate::wasm::i128::ANY
+ }
+}
+
+impl<T: core::fmt::Debug + DefaultStrategy, const LANES: usize> DefaultStrategy for [T; LANES] {
+ type Strategy = crate::array::UniformArrayStrategy<T::Strategy, Self>;
+ fn default_strategy() -> Self::Strategy {
+ Self::Strategy::new(T::default_strategy())
+ }
+}
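+
+// A sketch of how the pieces compose (illustrative only): the blanket impl
+// above lets proptest generate whole arrays, one element strategy at a time.
+//
+//     use proptest::test_runner::TestRunner;
+//     let mut runner = TestRunner::default();
+//     runner
+//         .run(&<[f32; 4] as DefaultStrategy>::default_strategy(), |x| {
+//             // `x` is a `[f32; 4]` drawn elementwise from `f32::ANY`.
+//             Ok(())
+//         })
+//         .unwrap();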
+
+/// Test a function that takes a single value.
+pub fn test_1<A: core::fmt::Debug + DefaultStrategy>(
+ f: &dyn Fn(A) -> proptest::test_runner::TestCaseResult,
+) {
+ let mut runner = proptest::test_runner::TestRunner::default();
+ runner.run(&A::default_strategy(), f).unwrap();
+}
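+
+// Usage sketch (mirroring the calls in the macros above): the closure
+// receives one generated value and returns a `TestCaseResult`.
+//
+//     test_1(&|x: [i32; 4]| {
+//         proptest::prop_assert_eq!(x, x);
+//         Ok(())
+//     });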
+
+/// Test a function that takes two values.
+pub fn test_2<A: core::fmt::Debug + DefaultStrategy, B: core::fmt::Debug + DefaultStrategy>(
+ f: &dyn Fn(A, B) -> proptest::test_runner::TestCaseResult,
+) {
+ let mut runner = proptest::test_runner::TestRunner::default();
+ runner
+ .run(&(A::default_strategy(), B::default_strategy()), |(a, b)| {
+ f(a, b)
+ })
+ .unwrap();
+}
+
+/// Test a function that takes three values.
+pub fn test_3<
+ A: core::fmt::Debug + DefaultStrategy,
+ B: core::fmt::Debug + DefaultStrategy,
+ C: core::fmt::Debug + DefaultStrategy,
+>(
+ f: &dyn Fn(A, B, C) -> proptest::test_runner::TestCaseResult,
+) {
+ let mut runner = proptest::test_runner::TestRunner::default();
+ runner
+ .run(
+ &(
+ A::default_strategy(),
+ B::default_strategy(),
+ C::default_strategy(),
+ ),
+ |(a, b, c)| f(a, b, c),
+ )
+ .unwrap();
+}
+
+/// Test a unary vector function against a unary scalar function, applied elementwise.
+///
+/// Inputs for which `check` returns `false` are rejected before the comparison runs.
+#[inline(never)]
+pub fn test_unary_elementwise<Scalar, ScalarResult, Vector, VectorResult, const LANES: usize>(
+ fv: &dyn Fn(Vector) -> VectorResult,
+ fs: &dyn Fn(Scalar) -> ScalarResult,
+ check: &dyn Fn([Scalar; LANES]) -> bool,
+) where
+ Scalar: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ ScalarResult: Copy + Default + biteq::BitEq + core::fmt::Debug + DefaultStrategy,
+ Vector: Into<[Scalar; LANES]> + From<[Scalar; LANES]> + Copy,
+ VectorResult: Into<[ScalarResult; LANES]> + From<[ScalarResult; LANES]> + Copy,
+{
+ test_1(&|x: [Scalar; LANES]| {
+ proptest::prop_assume!(check(x));
+ let result_1: [ScalarResult; LANES] = fv(x.into()).into();
+ let result_2: [ScalarResult; LANES] = {
+ let mut result = [ScalarResult::default(); LANES];
+ for (i, o) in x.iter().zip(result.iter_mut()) {
+ *o = fs(*i);
+ }
+ result
+ };
+ crate::prop_assert_biteq!(result_1, result_2);
+ Ok(())
+ });
+}
+
+/// Test a unary vector function that returns a mask against a unary scalar function that returns a boolean, applied elementwise.
+#[inline(never)]
+pub fn test_unary_mask_elementwise<Scalar, Vector, Mask, const LANES: usize>(
+ fv: &dyn Fn(Vector) -> Mask,
+ fs: &dyn Fn(Scalar) -> bool,
+ check: &dyn Fn([Scalar; LANES]) -> bool,
+) where
+ Scalar: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ Vector: Into<[Scalar; LANES]> + From<[Scalar; LANES]> + Copy,
+ Mask: Into<[bool; LANES]> + From<[bool; LANES]> + Copy,
+{
+ test_1(&|x: [Scalar; LANES]| {
+ proptest::prop_assume!(check(x));
+ let result_1: [bool; LANES] = fv(x.into()).into();
+ let result_2: [bool; LANES] = {
+ let mut result = [false; LANES];
+ for (i, o) in x.iter().zip(result.iter_mut()) {
+ *o = fs(*i);
+ }
+ result
+ };
+ crate::prop_assert_biteq!(result_1, result_2);
+ Ok(())
+ });
+}
+
+/// Test a binary vector function against a binary scalar function, applied elementwise.
+#[inline(never)]
+pub fn test_binary_elementwise<
+ Scalar1,
+ Scalar2,
+ ScalarResult,
+ Vector1,
+ Vector2,
+ VectorResult,
+ const LANES: usize,
+>(
+ fv: &dyn Fn(Vector1, Vector2) -> VectorResult,
+ fs: &dyn Fn(Scalar1, Scalar2) -> ScalarResult,
+ check: &dyn Fn([Scalar1; LANES], [Scalar2; LANES]) -> bool,
+) where
+ Scalar1: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ Scalar2: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ ScalarResult: Copy + Default + biteq::BitEq + core::fmt::Debug + DefaultStrategy,
+ Vector1: Into<[Scalar1; LANES]> + From<[Scalar1; LANES]> + Copy,
+ Vector2: Into<[Scalar2; LANES]> + From<[Scalar2; LANES]> + Copy,
+ VectorResult: Into<[ScalarResult; LANES]> + From<[ScalarResult; LANES]> + Copy,
+{
+ test_2(&|x: [Scalar1; LANES], y: [Scalar2; LANES]| {
+ proptest::prop_assume!(check(x, y));
+ let result_1: [ScalarResult; LANES] = fv(x.into(), y.into()).into();
+ let result_2: [ScalarResult; LANES] = {
+ let mut result = [ScalarResult::default(); LANES];
+ for ((i1, i2), o) in x.iter().zip(y.iter()).zip(result.iter_mut()) {
+ *o = fs(*i1, *i2);
+ }
+ result
+ };
+ crate::prop_assert_biteq!(result_1, result_2);
+ Ok(())
+ });
+}
+
+/// Test a binary vector-scalar function against a binary scalar function, applied elementwise.
+#[inline(never)]
+pub fn test_binary_scalar_rhs_elementwise<
+ Scalar1,
+ Scalar2,
+ ScalarResult,
+ Vector,
+ VectorResult,
+ const LANES: usize,
+>(
+ fv: &dyn Fn(Vector, Scalar2) -> VectorResult,
+ fs: &dyn Fn(Scalar1, Scalar2) -> ScalarResult,
+ check: &dyn Fn([Scalar1; LANES], Scalar2) -> bool,
+) where
+ Scalar1: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ Scalar2: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ ScalarResult: Copy + Default + biteq::BitEq + core::fmt::Debug + DefaultStrategy,
+ Vector: Into<[Scalar1; LANES]> + From<[Scalar1; LANES]> + Copy,
+ VectorResult: Into<[ScalarResult; LANES]> + From<[ScalarResult; LANES]> + Copy,
+{
+ test_2(&|x: [Scalar1; LANES], y: Scalar2| {
+ proptest::prop_assume!(check(x, y));
+ let result_1: [ScalarResult; LANES] = fv(x.into(), y).into();
+ let result_2: [ScalarResult; LANES] = {
+ let mut result = [ScalarResult::default(); LANES];
+ for (i, o) in x.iter().zip(result.iter_mut()) {
+ *o = fs(*i, y);
+ }
+ result
+ };
+ crate::prop_assert_biteq!(result_1, result_2);
+ Ok(())
+ });
+}
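+
+// Usage sketch (hypothetical operands): an op like `vector >> scalar` would
+// pass the vector op as `fv`, the scalar op as `fs`, and use `check` to
+// reject, e.g., shift counts of the element's bit width or more.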
+
+/// Test a binary scalar-vector function against a binary scalar function, applied elementwise.
+#[inline(never)]
+pub fn test_binary_scalar_lhs_elementwise<
+ Scalar1,
+ Scalar2,
+ ScalarResult,
+ Vector,
+ VectorResult,
+ const LANES: usize,
+>(
+ fv: &dyn Fn(Scalar1, Vector) -> VectorResult,
+ fs: &dyn Fn(Scalar1, Scalar2) -> ScalarResult,
+ check: &dyn Fn(Scalar1, [Scalar2; LANES]) -> bool,
+) where
+ Scalar1: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ Scalar2: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ ScalarResult: Copy + Default + biteq::BitEq + core::fmt::Debug + DefaultStrategy,
+ Vector: Into<[Scalar2; LANES]> + From<[Scalar2; LANES]> + Copy,
+ VectorResult: Into<[ScalarResult; LANES]> + From<[ScalarResult; LANES]> + Copy,
+{
+ test_2(&|x: Scalar1, y: [Scalar2; LANES]| {
+ proptest::prop_assume!(check(x, y));
+ let result_1: [ScalarResult; LANES] = fv(x, y.into()).into();
+ let result_2: [ScalarResult; LANES] = {
+ let mut result = [ScalarResult::default(); LANES];
+ for (i, o) in y.iter().zip(result.iter_mut()) {
+ *o = fs(x, *i);
+ }
+ result
+ };
+ crate::prop_assert_biteq!(result_1, result_2);
+ Ok(())
+ });
+}
+
+/// Test a ternary vector function against a ternary scalar function, applied elementwise.
+#[inline(never)]
+pub fn test_ternary_elementwise<
+ Scalar1,
+ Scalar2,
+ Scalar3,
+ ScalarResult,
+ Vector1,
+ Vector2,
+ Vector3,
+ VectorResult,
+ const LANES: usize,
+>(
+ fv: &dyn Fn(Vector1, Vector2, Vector3) -> VectorResult,
+ fs: &dyn Fn(Scalar1, Scalar2, Scalar3) -> ScalarResult,
+ check: &dyn Fn([Scalar1; LANES], [Scalar2; LANES], [Scalar3; LANES]) -> bool,
+) where
+ Scalar1: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ Scalar2: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ Scalar3: Copy + Default + core::fmt::Debug + DefaultStrategy,
+ ScalarResult: Copy + Default + biteq::BitEq + core::fmt::Debug + DefaultStrategy,
+ Vector1: Into<[Scalar1; LANES]> + From<[Scalar1; LANES]> + Copy,
+ Vector2: Into<[Scalar2; LANES]> + From<[Scalar2; LANES]> + Copy,
+ Vector3: Into<[Scalar3; LANES]> + From<[Scalar3; LANES]> + Copy,
+ VectorResult: Into<[ScalarResult; LANES]> + From<[ScalarResult; LANES]> + Copy,
+{
+ test_3(
+ &|x: [Scalar1; LANES], y: [Scalar2; LANES], z: [Scalar3; LANES]| {
+ proptest::prop_assume!(check(x, y, z));
+ let result_1: [ScalarResult; LANES] = fv(x.into(), y.into(), z.into()).into();
+ let result_2: [ScalarResult; LANES] = {
+ let mut result = [ScalarResult::default(); LANES];
+ for ((i1, (i2, i3)), o) in
+ x.iter().zip(y.iter().zip(z.iter())).zip(result.iter_mut())
+ {
+ *o = fs(*i1, *i2, *i3);
+ }
+ result
+ };
+ crate::prop_assert_biteq!(result_1, result_2);
+ Ok(())
+ },
+ );
+}
+
+/// Expand a const-generic test into separate tests for each possible lane count.
+#[macro_export]
+macro_rules! test_lanes {
+ {
+ $(fn $test:ident<const $lanes:ident: usize>() $body:tt)*
+ } => {
+ $(
+ mod $test {
+ use super::*;
+
+ fn implementation<const $lanes: usize>()
+ where
+ core_simd::LaneCount<$lanes>: core_simd::SupportedLaneCount,
+ $body
+
+ #[cfg(target_arch = "wasm32")]
+ wasm_bindgen_test::wasm_bindgen_test_configure!(run_in_browser);
+
+ #[test]
+ #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test::wasm_bindgen_test)]
+ fn lanes_1() {
+ implementation::<1>();
+ }
+
+ #[test]
+ #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test::wasm_bindgen_test)]
+ fn lanes_2() {
+ implementation::<2>();
+ }
+
+ #[test]
+ #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test::wasm_bindgen_test)]
+ fn lanes_4() {
+ implementation::<4>();
+ }
+
+ #[test]
+ #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test::wasm_bindgen_test)]
+ fn lanes_8() {
+ implementation::<8>();
+ }
+
+ #[test]
+ #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test::wasm_bindgen_test)]
+ fn lanes_16() {
+ implementation::<16>();
+ }
+
+ #[test]
+ #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test::wasm_bindgen_test)]
+ fn lanes_32() {
+ implementation::<32>();
+ }
++
++ #[test]
++ #[cfg_attr(target_arch = "wasm32", wasm_bindgen_test::wasm_bindgen_test)]
++ fn lanes_64() {
++ implementation::<64>();
++ }
+ }
+ )*
+ }
+}
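+
+// Expansion sketch: a call like
+//
+//     test_lanes! {
+//         fn example<const LANES: usize>() { /* body */ }
+//     }
+//
+// produces `mod example` containing the generic `implementation` plus one
+// `#[test]` wrapper per supported lane count, each calling
+// `implementation::<N>()`.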
+
+/// Expand a const-generic `#[should_panic]` test into separate tests for each possible lane count.
+#[macro_export]
+macro_rules! test_lanes_panic {
+ {
+ $(fn $test:ident<const $lanes:ident: usize>() $body:tt)*
+ } => {
+ $(
+ mod $test {
+ use super::*;
+
+ fn implementation<const $lanes: usize>()
+ where
+ core_simd::LaneCount<$lanes>: core_simd::SupportedLaneCount,
+ $body
+
+ #[test]
+ #[should_panic]
+ fn lanes_1() {
+ implementation::<1>();
+ }
+
+ #[test]
+ #[should_panic]
+ fn lanes_2() {
+ implementation::<2>();
+ }
+
+ #[test]
+ #[should_panic]
+ fn lanes_4() {
+ implementation::<4>();
+ }
+
+ #[test]
+ #[should_panic]
+ fn lanes_8() {
+ implementation::<8>();
+ }
+
+ #[test]
+ #[should_panic]
+ fn lanes_16() {
+ implementation::<16>();
+ }
+
+ #[test]
+ #[should_panic]
+ fn lanes_32() {
+ implementation::<32>();
+ }
++
++ #[test]
++ #[should_panic]
++ fn lanes_64() {
++ implementation::<64>();
++ }
+ }
+ )*
+ }
+}
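+
+// Note: unlike `test_lanes!`, no wasm test configuration is emitted here,
+// presumably because `#[should_panic]` is not supported by
+// `wasm_bindgen_test`.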