Presentation
Analytical Cache Modeling and Tilesize Optimization for Tensor Contractions
Event Type
Paper
TP
Algorithms
Benchmarks
Graph Algorithms
Parallel Application Frameworks
Performance
Scalable Computing
Time: Thursday, 21 November 2019, 2:30pm - 3pm
Location: 301-302-303
Description: Data movement between the processor and the memory hierarchy is a fundamental bottleneck that limits the performance of many applications on modern computer architectures. Tiling and loop permutation are key techniques for improving data locality. However, selecting effective tile sizes and loop permutations is particularly challenging for tensor contractions because of the large number of loops involved. Even state-of-the-art compilers usually produce sub-optimal tile sizes and loop permutations, as they rely on naïve cost models. In this paper, we provide an analytical-model-based approach to multi-level tile size optimization and permutation selection for tensor contractions. Our experimental results show that this approach achieves comparable or better performance than state-of-the-art frameworks and libraries for tensor contractions.
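As a rough illustration of what tiling and loop permutation mean for a tensor contraction, the sketch below tiles the four loops of a small contraction, C[i][j][k] = sum over l of A[i][l][k] * B[l][j], and places the tile loops in one particular order. The contraction itself, the function names, the tile sizes, and the loop order are all assumptions chosen for illustration; none of them is taken from the paper, and the sketch does not implement the paper's analytical cache model, which is what would actually select the tile sizes and permutation.

```python
def contract_naive(A, B, I, J, K, L):
    """Untiled reference: C[i][j][k] = sum over l of A[i][l][k] * B[l][j]."""
    C = [[[0] * K for _ in range(J)] for _ in range(I)]
    for i in range(I):
        for j in range(J):
            for k in range(K):
                for l in range(L):
                    C[i][j][k] += A[i][l][k] * B[l][j]
    return C


def contract_tiled(A, B, I, J, K, L, tile=(2, 2, 2, 2)):
    """Same contraction with every loop tiled.

    `tile` holds hypothetical tile sizes (Ti, Tj, Tk, Tl). The order of the
    outer tile loops (i, j, l, k) and of the inner point loops is one
    arbitrary permutation among many valid ones.
    """
    Ti, Tj, Tk, Tl = tile
    C = [[[0] * K for _ in range(J)] for _ in range(I)]
    # Outer loops step over tiles; inner loops walk the points of one tile.
    for i0 in range(0, I, Ti):
        for j0 in range(0, J, Tj):
            for l0 in range(0, L, Tl):
                for k0 in range(0, K, Tk):
                    for i in range(i0, min(i0 + Ti, I)):
                        for j in range(j0, min(j0 + Tj, J)):
                            for l in range(l0, min(l0 + Tl, L)):
                                b = B[l][j]  # reused across the whole k tile
                                for k in range(k0, min(k0 + Tk, K)):
                                    C[i][j][k] += A[i][l][k] * b
    return C
```

Both functions compute the same result for any positive tile sizes, including sizes that do not divide the loop extents. What changes is the order in which elements of A, B, and C are touched, and that access order is the quantity a cache model would reason about.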