Offloading speedup with different parameters
Measure how the speedup changes across mesh sizes (for example 64^3, 128^3 and, if possible, 256^3) and particles per cell (1, 5 and 10). Intuitively, increasing the particles per cell should increase the speedup, since only the GPUs get more work, whereas increasing the number of mesh points should decrease it, since (I think) the CPUs then have more work than the GPUs.
Goal: verify the speedup from CPU offloading relative to GPU-only runs under these different conditions.
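
A minimal sketch of how the sweep could be organised, assuming speedup is defined as GPU-only wall-clock time divided by offloaded wall-clock time, and assuming one cell per mesh point when counting particles. The callables run_gpu_only and run_offload are hypothetical placeholders for whatever launches the actual runs and returns a time in seconds; they are not part of any existing code.

```python
from itertools import product

MESH_SIZES = (64, 128, 256)        # points per dimension; 256^3 only if it fits
PARTICLES_PER_CELL = (1, 5, 10)


def sweep_configs(mesh_sizes=MESH_SIZES, ppc_values=PARTICLES_PER_CELL):
    """Return every (mesh size, particles per cell) pair to benchmark."""
    return list(product(mesh_sizes, ppc_values))


def total_particles(n, ppc):
    """Particle count for an n^3 mesh, assuming one cell per mesh point."""
    return n ** 3 * ppc


def speedup(t_gpu_only, t_offload):
    """Assumed definition: GPU-only time over offloaded time (>1 means offloading helps)."""
    if t_gpu_only <= 0 or t_offload <= 0:
        raise ValueError("wall-clock times must be positive")
    return t_gpu_only / t_offload


def speedup_table(run_gpu_only, run_offload, configs=None):
    """Map each (n, ppc) to its measured speedup.

    run_gpu_only and run_offload are hypothetical callables supplied by the
    caller; each takes (n, ppc) and returns a wall-clock time in seconds.
    """
    if configs is None:
        configs = sweep_configs()
    return {
        (n, ppc): speedup(run_gpu_only(n, ppc), run_offload(n, ppc))
        for n, ppc in configs
    }
```

Reading the resulting table along fixed mesh size (varying particles per cell) and along fixed particles per cell (varying mesh size) would test the two intuitions above separately.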