Abstract: Computed Tomography (CT) is a widely used technology that requires compute-intense algorithms for image reconstruction. We propose a novel back-projection algorithm that reduces the projection computation cost to 1/6 of the standard algorithm. We also propose an efficient implementation that takes advantage of the heterogeneity of GPU-accelerated systems by overlapping the filtering and back-projection stages on CPUs and GPUs, respectively. Finally, we propose a distributed framework for high-resolution image reconstruction on state-of-the-art GPU-accelerated supercomputers. The framework relies on an elaborate interleave of MPI collective communication steps to achieve scalable communication. Evaluation on a single Tesla V100 GPU demonstrates that our back-projection kernel performs up to 1.6 times faster than the standard FDK implementation. We also demonstrate the scalability and instantaneous CT capability of the distributed framework by using up to 2,048 V100 GPUs to solve a 4K and 8K problems within 30 seconds and 2 minutes, respectively (including I/O).
Back to Technical Papers Archive Listing