Poster 102: Fast Training of an AI Radiologist: Leveraging Data Pipelining to Efficiently Utilize GPUs
Time: Thursday, 21 November 2019, 8:30am - 5pm
Description: In distributed deep learning training, it can be challenging to develop a high-throughput model on accelerators such as GPUs. If the accelerators are not utilized effectively, time to solution increases and the model's throughput drops. To use accelerators effectively across multiple nodes, we need a data pipelining mechanism that scales gracefully, so that the GPUs' parallelism can be fully exploited. We study the effect of the optimized pipelining mechanism used by the official TensorFlow models versus a naive pipelining mechanism that scales poorly, on two image classification models. With the optimized data pipeline, both models demonstrate effectively linear scaling as GPUs are added. We also show that converting the input data to TFRecords is not always necessary.
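The core idea behind the optimized pipeline (e.g. `prefetch` in TensorFlow's `tf.data` API) is to overlap data loading with training steps so the accelerator never sits idle waiting for input. The sketch below is not the poster's actual code; it is a minimal, hypothetical illustration of the principle using only the standard library, with `time.sleep` standing in for I/O and GPU work.

```python
import queue
import threading
import time

def naive_pipeline(batches, load_time, train_time):
    """Load each batch, then train on it: the 'GPU' idles during loading."""
    start = time.perf_counter()
    for _ in range(batches):
        time.sleep(load_time)   # simulated data loading (CPU/disk)
        time.sleep(train_time)  # simulated training step (GPU)
    return time.perf_counter() - start

def prefetch_pipeline(batches, load_time, train_time, buffer_size=2):
    """Load batches in a background thread via a bounded queue, so loading
    overlaps training -- analogous to tf.data's prefetch()."""
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        for i in range(batches):
            time.sleep(load_time)  # simulated data loading
            buf.put(i)
        buf.put(None)  # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()
    start = time.perf_counter()
    while buf.get() is not None:
        time.sleep(train_time)  # simulated training step
    return time.perf_counter() - start

if __name__ == "__main__":
    naive = naive_pipeline(10, 0.01, 0.01)
    overlapped = prefetch_pipeline(10, 0.01, 0.01)
    print(f"naive: {naive:.3f}s, prefetched: {overlapped:.3f}s")
```

With overlap, per-batch cost approaches max(load, train) rather than their sum, which is why a well-designed input pipeline keeps GPUs busy as nodes are added.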