Buffer Preallocation and Other Optimizations
The `buffer-factory` branch mainly improves performance by reducing the number of memory allocations performed during MPI communication, which is especially important for GPU performance. It also adds new tests and makes various other optimizations.
- Added a globally accessible interface for requesting buffers to be used for MPI communication
- Overallocate memory for communication buffers the first time and reuse the same buffers to avoid reallocation calls
- Particle deletion re-implemented as a partitioning algorithm
- Field layouts and particle regions now use the same mesh [1]
- `resize` calls replaced with `realloc`, which does not preserve old memory contents [2]
- Unnecessary barriers and if-conditions removed
- Added a non-blocking receive function to the `Communicate` class, but it is not currently used
- Reduced host-space memory reallocations by allocating vectors of MPI requests just once
- Introduced plasma mini-apps
- Updated the `Solvers` module to point to the branch with the FFT solver
- Redundant test programs deleted and replaced with equivalent unit tests where appropriate
- Various type changes introduced to ensure successful compilation on Piz Daint
- Introduced a new header file with type aliases
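
The buffer reuse described in the list above (overallocate once, then hand out the same buffer on later requests) can be sketched roughly as follows. This is a minimal host-only illustration, not the branch's actual interface: `Buffer`, `BufferPool`, `reallocDiscard`, `globalPool`, and the overallocation factor of 2.0 are all hypothetical names and values chosen for the example.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical buffer type: owns raw bytes used to pack MPI messages.
class Buffer {
public:
    explicit Buffer(std::size_t bytes) : data_(bytes) {}

    std::size_t capacity() const { return data_.size(); }
    char* data() { return data_.data(); }

    // Replace the storage without copying the old contents,
    // mirroring a realloc that does not preserve old memory.
    void reallocDiscard(std::size_t bytes) { data_ = std::vector<char>(bytes); }

private:
    std::vector<char> data_;
};

// Hypothetical pool: one buffer per integer id, grown only when too small.
class BufferPool {
public:
    explicit BufferPool(double overallocation = 2.0) : factor_(overallocation) {}

    Buffer& get(int id, std::size_t bytes) {
        auto it = buffers_.find(id);
        if (it == buffers_.end()) {
            // First request: allocate more than needed so that later,
            // slightly larger messages still fit.
            ++allocations_;
            auto padded = static_cast<std::size_t>(bytes * factor_);
            it = buffers_.emplace(id, std::make_unique<Buffer>(padded)).first;
        } else if (it->second->capacity() < bytes) {
            // Existing buffer is too small: grow it, again with padding.
            ++allocations_;
            it->second->reallocDiscard(static_cast<std::size_t>(bytes * factor_));
        }
        return *it->second;
    }

    // Number of real allocations performed so far (for illustration only).
    std::size_t allocations() const { return allocations_; }

private:
    double factor_;
    std::size_t allocations_ = 0;
    std::map<int, std::unique_ptr<Buffer>> buffers_;
};

// Globally accessible entry point (function-local static, created on first use).
inline BufferPool& globalPool() {
    static BufferPool pool;
    return pool;
}
```

With this shape, repeated calls such as `globalPool().get(id, bytes)` in a communication loop only allocate when a message outgrows the padded capacity; every other call returns the existing buffer.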
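
One way to read "particle deletion as a partitioning algorithm" is sketched below. The description does not spell out the exact scheme, so this is an assumption: surviving particles from the tail of an attribute array are moved into the holes left by deleted particles in the front, and the logical particle count shrinks without any reallocation. The function name `compactByPartition` and the `invalid` flag vector are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Moves surviving entries into the first `keep` slots of `attr` and returns
// `keep`. Entries at index >= keep are left in a moved-from or stale state.
// `invalid[i] == true` marks particle i for deletion; the flags are not
// modified, so the same flags can be applied to every attribute array.
template <typename T>
std::size_t compactByPartition(std::vector<T>& attr,
                               const std::vector<bool>& invalid) {
    const std::size_t n = attr.size();
    const std::size_t deleted = static_cast<std::size_t>(
        std::count(invalid.begin(), invalid.end(), true));
    const std::size_t keep = n - deleted;

    // The number of holes in [0, keep) equals the number of survivors in
    // [keep, n), so `src` never runs past the end of the array.
    std::size_t src = keep;
    for (std::size_t hole = 0; hole < keep; ++hole) {
        if (!invalid[hole]) {
            continue;  // already a survivor in the right region
        }
        while (invalid[src]) {
            ++src;  // skip deleted particles in the tail
        }
        attr[hole] = std::move(attr[src]);
        ++src;
    }
    return keep;
}
```

Only the holes in the front region are written, so the number of moves is bounded by the number of deleted particles. Particle order is not preserved.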
Possible additional changes to be made:
- Some more cleanup in FFT files and test programs
- Expand use of new type aliases
Also closes #79.
[1] Introduces charge conservation errors in the Penning trap test. Fixed in the `periodic-bcs-for-scatter-gather` branch.
[2] The `unpack` call still resizes to preserve old particle data.