DKSBase module
The base part of the DKS library provides the basic functions needed to communicate with hardware accelerators. The base class is implemented using CUDA, OpenCL and OpenMP, allowing DKS to target various hardware accelerators. Using the base class, the host application can query the available devices, retrieve device information, and select the device to use. The base class also lets the host application perform memory management and data transfers, including gathering and scattering data when multiple CPU cores share the same GPU memory region. For GPU devices, the base class also handles stream creation and provides synchronization functions, which allow memory transfers to overlap with GPU kernel execution.
Function | Description |
---|---|
setDevice | Set device used by DKS |
getDevices | Get information about available devices |
getDeviceCount | Get number of available devices |
pushData | Allocate memory and transfer data to device |
pullData | Transfer data from device and free memory |
allocateMemory | Allocate memory on the device |
registerHostMemory | Page lock allocated host memory |
unregisterHostMemory | Unregister page locked memory |
writeDataAsync | Write data to the device |
readDataAsync | Read data from the device |
gather3DDataAsync | Gather 3D data from multiple MPI processes to one memory region |
scatter3DDataAsync | Scatter 3D data to multiple MPI processes from one device memory region |
freeMemory | Free memory allocated on device |
sendPointer | Send pointer to device memory from one MPI process to another |
receivePointer | Receive pointer to device memory from another MPI process |
closeHandle | Close handle to device memory created by receivePointer |
createStream | Create stream for asynchronous kernel execution and data transfer on GPUs |
syncDevice | Wait until all tasks running on the device have completed |
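
The relationship between the memory functions in the table can be illustrated with a small host-only sketch. This is not the DKS API: `MockBase`, its `Handle` type, `liveAllocations`, and every signature below are assumptions made for illustration, and "device" memory is simply a byte buffer on the host. The method names are borrowed from the table, with synchronous `writeData` and `readData` standing in for the asynchronous variants. The only point it demonstrates is the one the table states: `pushData` combines allocation with a transfer to the device, and `pullData` combines a transfer from the device with freeing the memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical host-only stand-in for the base class. NOT the DKS API:
// names follow the table above, but all signatures are assumptions.
class MockBase {
public:
    using Handle = int;  // assumed opaque handle to a "device" allocation

    // allocateMemory: reserve nbytes of mock device memory.
    Handle allocateMemory(std::size_t nbytes) {
        Handle h = next_++;
        buffers_[h] = std::vector<unsigned char>(nbytes);
        return h;
    }

    // writeData: copy nbytes from host memory into the allocation.
    void writeData(Handle h, const void* src, std::size_t nbytes) {
        std::vector<unsigned char>& buf = buffers_.at(h);
        assert(nbytes <= buf.size());
        std::memcpy(buf.data(), src, nbytes);
    }

    // readData: copy nbytes from the allocation back to host memory.
    void readData(Handle h, void* dst, std::size_t nbytes) const {
        const std::vector<unsigned char>& buf = buffers_.at(h);
        assert(nbytes <= buf.size());
        std::memcpy(dst, buf.data(), nbytes);
    }

    // freeMemory: release the allocation.
    void freeMemory(Handle h) { buffers_.erase(h); }

    // pushData: allocate memory and transfer data to the device.
    Handle pushData(const void* src, std::size_t nbytes) {
        Handle h = allocateMemory(nbytes);
        writeData(h, src, nbytes);
        return h;
    }

    // pullData: transfer data from the device and free the memory.
    void pullData(Handle h, void* dst, std::size_t nbytes) {
        readData(h, dst, nbytes);
        freeMemory(h);
    }

    // Number of allocations that have not been freed yet.
    std::size_t liveAllocations() const { return buffers_.size(); }

private:
    std::map<Handle, std::vector<unsigned char>> buffers_;
    Handle next_ = 0;
};
```

In this sketch, a caller that only needs a single round trip would use `pushData` followed by `pullData`, while a caller that reuses the same buffer across many kernel launches would call `allocateMemory` once, transfer data repeatedly, and call `freeMemory` at the end.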