Aerodynamics of a cow

2 years ago

This video may not change your life, but my FluidX3D software will, if you do research in CFD. It's 100-200x faster than super expensive commercial FVM solvers, on the same hardware. FluidX3D source code on GitHub: https://github.com/ProjectPhysX/FluidX3D

This 10s video shows 10s of simulated flow in real time, at 1m/s wind speed. 476×952×476 #LBM grid (215 million voxels), 28k time steps, 23 minutes for compute+rendering on my PC with a Titan Xp GPU.
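As a rough sanity check, the figures above can be turned into a node count, a simulated time per step, and an average throughput. This is only a sketch: it assumes "28k" means exactly 28,000 steps, and because the 23 minutes include rendering, the throughput it prints understates the pure compute rate. All names in the code are made up for illustration.

```python
# Back-of-the-envelope numbers from the run described above.
# Assumption: "28k time steps" is taken as exactly 28,000.
NX, NY, NZ = 476, 952, 476      # grid resolution
TIME_STEPS = 28_000             # LBM time steps (rounded in the text)
WALL_CLOCK_S = 23 * 60          # compute + rendering, in seconds
SIMULATED_S = 10.0              # simulated time shown in the video


def node_count(nx: int, ny: int, nz: int) -> int:
    """Total number of lattice nodes (voxels) in the box."""
    return nx * ny * nz


def mean_updates_per_second(nodes: int, steps: int, seconds: float) -> float:
    """Average node updates per wall-clock second over the whole run."""
    return nodes * steps / seconds


nodes = node_count(NX, NY, NZ)
mlups = mean_updates_per_second(nodes, TIME_STEPS, WALL_CLOCK_S) / 1e6
dt_ms = SIMULATED_S / TIME_STEPS * 1e3

print(f"nodes:               {nodes:,}")
print(f"simulated time/step: {dt_ms:.3f} ms")
print(f"mean throughput:     {mlups:.0f} million node updates/s (incl. rendering)")
```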

How is it possible to squeeze 215 million grid points into only 12GB?
I'm using two techniques here, which together form the holy grail of lattice Boltzmann, cutting memory demand down to only 55 Bytes/node for D3Q19 LBM, or 1/3 of conventional codes:

1. In-place streaming with Esoteric-Pull. This almost cuts memory demand in half and slightly increases performance due to implicit bounce-back boundaries.
Paper: https://doi.org/10.3390/computation10...

2. Decoupled arithmetic precision (FP32) and memory precision (FP16): all arithmetic is done in FP32, but LBM density distribution functions in memory are compressed to FP16. This almost cuts memory demand in half and almost doubles performance, without impacting overall accuracy for most setups.
Paper: https://www.researchgate.net/publicat...
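The 55 Bytes/node figure can be reproduced with simple accounting. The breakdown below is an assumption, not taken from this text: 19 density distribution functions (DDFs) per node, plus density and three velocity components in FP32, plus one flag byte. A conventional code is assumed to keep two FP32 copies of the DDFs. The second part shows an FP32 value round-tripped through 2 bytes, using plain IEEE half precision from Python's struct module as a stand-in; the storage format FluidX3D actually uses may differ.

```python
import struct

Q = 19  # velocity directions in D3Q19


def bytes_per_node(ddf_copies: int, ddf_bytes: int, q: int = Q) -> int:
    """Memory per lattice node under the assumed layout."""
    ddfs = q * ddf_copies * ddf_bytes   # density distribution functions
    macroscopic = 4 * 4                 # density + 3 velocity components, FP32 (assumed)
    flags = 1                           # one flag byte per node (assumed)
    return ddfs + macroscopic + flags


conventional = bytes_per_node(ddf_copies=2, ddf_bytes=4)  # two FP32 copies
compact = bytes_per_node(ddf_copies=1, ddf_bytes=2)       # in-place + FP16

nodes = 476 * 952 * 476
total_gb = compact * nodes / 1e9

print(f"conventional: {conventional} Bytes/node")
print(f"compact:      {compact} Bytes/node ({compact / conventional:.2f} of conventional)")
print(f"whole grid:   {total_gb:.2f} GB")


def store_as_fp16(value: float) -> bytes:
    """Compress a value to 2 bytes for storage (IEEE half, a stand-in)."""
    return struct.pack("<e", value)


def load_from_fp16(raw: bytes) -> float:
    """Expand 2 stored bytes back to a float for arithmetic."""
    return struct.unpack("<e", raw)[0]


original = 1.0 / 3.0
restored = load_from_fp16(store_as_fp16(original))
rel_error = abs(restored - original) / original
print(f"stored bytes: {len(store_as_fp16(original))}, relative error: {rel_error:.1e}")
```

Arithmetic always happens on the expanded value; only the copy in memory is 16-bit, which is why the saving applies to memory and bandwidth rather than to the math itself.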

Graphics are done directly in FluidX3D with OpenCL, with the raw simulation data already residing in ultra-fast video memory. No volumetric data (1 frame of the velocity field is 2.5GB!) ever has to be copied to the CPU or hard drive; only the rendered 1080p frames (8MB each) are. Once a frame is on the CPU side, a copy of it is made in memory and a thread is detached to handle the slow .png compression, all while the simulation is already continuing.
Paper: https://www.researchgate.net/publicat...
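The frame hand-off described above can be sketched as follows. This is not FluidX3D's code: the function names are hypothetical, 4 bytes per pixel and FP32 velocity components are assumed for the size estimates, and zlib's DEFLATE compression stands in for a real .png encoder. Python threads cannot be detached, so the sketch returns the thread and joins it at the end instead.

```python
import threading
import zlib

BYTES_PER_PIXEL = 4  # assumption: 32-bit pixels


def frame_bytes(width: int = 1920, height: int = 1080) -> int:
    """Size of one rendered frame in bytes."""
    return width * height * BYTES_PER_PIXEL


def velocity_field_bytes(nodes: int) -> int:
    """Size of one velocity field: 3 FP32 components per node (assumed)."""
    return nodes * 3 * 4


def export_frame_async(frame: bytearray, index: int, results: dict) -> threading.Thread:
    """Copy the frame, then compress the copy on a background thread."""
    snapshot = bytes(frame)  # private copy, so the caller may reuse `frame`

    def work() -> None:
        results[index] = zlib.compress(snapshot)  # stand-in for .png encoding

    worker = threading.Thread(target=work)
    worker.start()
    return worker


print(f"1080p frame:    {frame_bytes() / 1e6:.1f} MB")
print(f"velocity field: {velocity_field_bytes(476 * 952 * 476) / 1e9:.2f} GB")

# Small dummy frame so the demo runs quickly.
frame = bytearray(range(256)) * 36
expected = bytes(frame)
results = {}

worker = export_frame_async(frame, 0, results)
frame[:] = bytes(len(frame))  # "simulation" overwrites the buffer right away
worker.join()                 # wait only at the very end

print("frame intact after overwrite:", zlib.decompress(results[0]) == expected)
```

The copy is what makes the overlap safe: the render buffer can be overwritten by the next frame while the slow compression still works on the snapshot.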

#CFD #GPU #FluidX3D #OpenCL
