Data-Parallel Math

This library implements a feature-complete version of the C++ data-parallel math library TS with optional extensions.

SIMD intrinsics support

DPM currently supports the following architectures for SIMD vectorization:

x86
- SSE
- SSE2
- SSE3
- SSSE3
- SSE4.1
- SSE4.2
- AVX
- AVX2
- FMA
- AVX512 (see notes)

On architectures without SIMD support, vectorization is emulated via scalar operations.

Library options

#define macro	CMake option	Default value	Description
DPM_INLINE_EXTENSIONS	-DDPM_INLINE_EXTENSIONS	ON	Toggles inlining of the library extension namespace (see notes)
DPM_HANDLE_ERRORS	-DDPM_HANDLE_ERRORS	ON	Toggles detection of math errors & reporting via math_errhandling (see notes)
DPM_PROPAGATE_NAN	-DDPM_PROPAGATE_NAN	ON	Toggles guaranteed propagation of NaN (see notes)
DPM_USE_SVML	-DDPM_USE_SVML	OFF	Enables use of math functions provided by SVML (see notes)
N/A	-DDPM_BUILD_SHARED	ON	Toggles build of shared library target
N/A	-DDPM_BUILD_STATIC	ON	Toggles build of static library target
N/A	-DDPM_USE_IPO	ON	Toggles support for inter-procedural optimization
N/A	-DDPM_TESTS	OFF	Enables unit test target

Note that the default value applies only to CMake options.

Build & usage instructions

In order to build the library using CMake, run the following commands:

mkdir build
cd build
cmake ..
cmake --build .

Build artifacts will be found in build/bin and build/lib. Minimum required CMake version is 3.20

In order to use the library as a CMake link dependency, you must link to one of the following targets:

dpm-interface - interface headers of the library.
dpm - static or shared library target, depending on the value of BUILD_SHARED_LIBS.

Notes

Extensions

DPM provides the following utilities and extensions to the standard API:

ABI tags
- struct aligned_vector
- using packed_buffer = implementation-defined
- using common = implementation-defined
- x86
  - using sse = implementation-defined
  - using avx = implementation-defined
Storage traits & accessors
- struct native_data_type
- struct native_data_size
- std::span to_native_data(simd<T, Abi> &)
- std::span to_native_data(const simd<T, Abi> &)
- std::span to_native_data(simd_mask<T, Abi> &)
- std::span to_native_data(const simd_mask<T, Abi> &)
Shuffle functions
- simd<T, Abi> shuffle<Is...>(const simd<T, Abi> &)
- simd_mask<T, Abi> shuffle<Is...>(const simd_mask<T, Abi> &)
- simd_mask<T, Abi> shuffle<Is...>(const V &)
Constant-N bit shifts
- simd<T, Abi> lsl<N>(const simd<T, Abi> &)
- simd<T, Abi> lsr<N>(const simd<T, Abi> &)
- simd<T, Abi> asl<N>(const simd<T, Abi> &)
- simd<T, Abi> asr<N>(const simd<T, Abi> &)
Reductions
- T hadd(const simd<T, Abi> &)
- T hmul(const simd<T, Abi> &)
- T hand(const simd<T, Abi> &)
- T hxor(const simd<T, Abi> &)
- T hor(const simd<T, Abi> &)
Basic math functions
- simd<T, Abi> remquo(const simd<T, Abi> &, const simd<T, Abi> &, simd<T, Abi> &)
- simd<T, Abi> nan<T, Abi>(const char *)
Power math functions
- simd<T, Abi> rcp(const simd<T, Abi> &)
- simd<T, Abi> rsqrt(const simd<T, Abi> &)
Nearest integer functions
- rebind_simd_t<I, simd<T, Abi>> iround<I>(const simd<T, Abi> &)
- rebind_simd_t<I, simd<T, Abi>> irint<I>(const simd<T, Abi> &)
- rebind_simd_t<I, simd<T, Abi>> itrunc<I>(const simd<T, Abi> &)
- rebind_simd_t<long, simd<T, Abi>> ltrunc(const simd<T, Abi> &)
- rebind_simd_t<long long, simd<T, Abi>> lltrunc(const simd<T, Abi> &)
Floating-point manipulation functions
- simd<T, Abi> frexp(const simd<T, Abi> &x, simd<int, Abi> &)
- simd<T, Abi> modf(const simd<T, Abi> &x, simd<T, Abi> &)
Optimization hints
- #define DPM_UNREACHABLE()
- #define DPM_NEVER_INLINE
- #define DPM_FORCEINLINE
- #define DPM_ASSUME(cnd)
Other utilities
- class cpuid

Additionally, versions of some operators and math functions accepting a scalar as one of the arguments are provided.

All extensions are available from the dpm::ext and dpm::simd_abi::ext namespaces. If DPM_INLINE_EXTENSIONS option is enabled, the ext namespaces are declared as inline.

Error handling & NaN

The standard specifies correct behavior for math functions such as sin, cos, etc. for invalid (ex. outside of domain) inputs. When DPM_HANDLE_ERRORS option is enabled, DPM will preform explicit runtime checks for such erroneous inputs as specified by the standard. If DPM_HANDLE_ERRORS is disabled, results for erroneous inputs are undefined. When DPM_PROPAGATE_NAN option is enabled, NaN arguments are guaranteed to result in NaN results (unless otherwise specified), regardless of whether error handling is enabled or not. Note that signalling NaNs may lose their value.

DPM_HANDLE_ERRORS does not guarantee that correct FP exceptions will be raised.

Examples of DPM_HANDLE_ERRORS and DPM_PROPAGATE_NAN configuration:

Expression	DPM_HANDLE_ERRORS	DPM_PROPAGATE_NAN	None
sin(0)	0	0	0
sin(Pi/2)	1	1	1
sin(inf)	NaN	undefined	undefined
sin(NaN)	NaN	NaN	undefined
asin(0)	0	0	0
asin(1)	Pi/2	Pi/2	Pi/2
asin(-2)	NaN	undefined	undefined
asin(inf)	NaN	undefined	undefined
asin(NaN)	NaN	NaN	undefined

AVX512

While DPM does utilize AVX512 instructions for 128- and 256-bit operations, there is no support for 512-wide vector data types. The main reasons being the increased complexity of implementation due to both the fracturing of AVX512 standard, and complexity of most 512-bit wide instructions; as well as relative inefficiency of 512-bit wide registers (for general-purpose use cases) on certain CPUs. See the following articles for details:

SVML

When DPM_USE_SVML is enabled, DPM will use mathematical functions provided by SVML for trigonometric, hyperbolic, exponential, nearest-integer and error functions instead of the built-in implementation. Note that if DPM_USE_SVML is enabled, NaN propagation and error handling options are ignored for affected functions, any error handling is left to SVML.

Name		Name	Last commit message	Last commit date
Latest commit History 259 Commits
dpm		dpm
test		test
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Data-Parallel Math

SIMD intrinsics support

Library options

Build & usage instructions

Notes

Extensions

Error handling & NaN

AVX512

SVML

About

Uh oh!

Releases

Packages

Languages

License

switch-blade-stuff/dpm

Folders and files

Latest commit

History

Repository files navigation

Data-Parallel Math

SIMD intrinsics support

Library options

Build & usage instructions

Notes

Extensions

Error handling & NaN

AVX512

SVML

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages