Compatibility note: This MR rewrites large parts of IPPL to require C++20. Do not mark as ready until a meaningful number of target machines have been updated to CUDA 12 or otherwise support this standard.
Closes #148.
- Replaces all rank-dependent `parallel_for` kernels with templated lambdas
- Implements a functor wrapper and convenience function for reduction kernels and replaces all rank-dependent `parallel_reduce` kernels with wrapped, templated lambdas
- Implements a wrapper for all rank-independent kernels
- Generalizes IPPL algorithms (ORB, field operations, PIC) to work with problems in any number of dimensions
- Adds some tests for the new features
- Expands unit testing to check functionality for all supported dimensionalities
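
As a rough illustration of what a rank-independent kernel with a C++20 templated lambda can look like, here is a minimal serial sketch. It does not use Kokkos or IPPL: `for_each_index` and `sum_of_index_sums` are hypothetical names invented for this example, and the loop runs on the host only. The point is just that one lambda body, written with a parameter pack, serves every value of `Dim`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not an IPPL or Kokkos API): serially visits every index
// tuple in the box [0, extents[0]) x ... x [0, extents[Dim-1]) and calls the
// kernel with Dim separate integer arguments.
template <std::size_t Dim, typename Kernel>
void for_each_index(const std::array<int, Dim>& extents, Kernel&& kernel) {
    for (int e : extents) {
        assert(e >= 0);
        if (e == 0) return;  // empty range: nothing to visit
    }
    std::array<int, Dim> idx{};
    while (true) {
        // C++20 templated lambda: expand the index array into Dim arguments.
        [&]<std::size_t... Is>(std::index_sequence<Is...>) {
            kernel(idx[Is]...);
        }(std::make_index_sequence<Dim>{});

        // Advance the indices like an odometer, first dimension fastest.
        std::size_t d = 0;
        while (d < Dim) {
            if (++idx[d] < extents[d]) break;
            idx[d] = 0;
            ++d;
        }
        if (d == Dim) return;  // every dimension wrapped around: done
    }
}

// A rank-independent "reduction": the same lambda body works for any Dim
// because the indices arrive as a parameter pack.
template <std::size_t Dim>
long sum_of_index_sums(const std::array<int, Dim>& extents) {
    long total = 0;
    for_each_index<Dim>(extents, [&]<typename... Idx>(Idx... i) {
        total += (static_cast<long>(i) + ... + 0L);
    });
    return total;
}
```

Calling `sum_of_index_sums(std::array<int, 2>{2, 3})` and `sum_of_index_sums(std::array<int, 3>{2, 2, 2})` instantiates the same kernel for two and three dimensions without any rank-specific code.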
Note that FFT still depends on heFFTe: the source is written to support any number of dimensions, but `Dim <= 3` is still enforced because heFFTe doesn't support higher dimensionality.
This MR also introduces a few other quality-of-life changes.
- Makes it easier to convert between index and coordinate spaces with `NDIndex` objects by using vector expressions
- Implements ranged iteration (`for (auto x : v)`) for IPPL's