Supervisor: Ioan Raicu (Illinois Institute of Technology)
Abstract: As the core counts of computing platforms continue to rise, parallel runtime systems that support very fine-grained tasks become increasingly necessary to fully utilize the available resources. A critical feature of such task-based parallel runtime systems is the ability to balance work evenly and quickly across the available cores. We highlight this by studying XTask, a custom parallel runtime system based on XQueue, a novel lock-less concurrent queuing system with relaxed ordering semantics designed to scale to hundreds of concurrent threads. We demonstrate that the original XQueue design lacks adequate load balancing and present several solutions for improving it. We evaluate the resulting performance improvements on two sample workloads: computation of Fibonacci numbers and Cholesky factorization. Finally, we compare the performance of several versions of XTask against several implementations of the popular OpenMP runtime system.
ACM-SRC Semi-Finalist: no
Poster Summary: PDF