AMULET: Adaptive Matrix-Multiplication-Like Tasks. (arXiv:2305.08872v1 [cs.PL])

Many useful tasks in data science and machine learning applications can be
written as simple variations of matrix multiplication. However, users have
difficulty performing such tasks because existing matrix/vector libraries
support only a limited class of computations, each hand-tuned for a specific
hardware platform. Users can alternatively write the task as a simple nested
loop, but current compilers are not sophisticated enough to generate fast code
for a task written this way. To address these issues, we extend an open-source
compiler to recognize and optimize these matrix multiplication-like tasks. Our
framework, called Amulet, uses both database-style and compiler optimization
techniques to generate fast code tailored to its execution environment. We show
through experiments that Amulet achieves speedups on a variety of matrix
multiplication-like tasks compared to existing compilers. For large matrices,
Amulet typically performs within 15% of hand-tuned matrix multiplication
libraries, while handling a much broader class of computations.
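
To make "matrix multiplication-like" concrete, here is a minimal Python sketch of the nested-loop form the abstract mentions. It is an illustration only, not Amulet's input language or implementation: the helper names (matmul_like, standard_product, min_plus_product) and the choice of the min-plus product as the example variation are assumptions introduced here.

```python
def matmul_like(a, b, combine, reduce, init):
    """Triple nested loop with the shape of matrix multiplication.

    Hypothetical helper: `combine` replaces the elementwise multiply,
    `reduce` replaces the running sum, and `init` is the starting
    value of the accumulator.
    """
    n, k, m = len(a), len(b), len(b[0])
    # Inner dimensions must agree, exactly as in ordinary matmul.
    assert all(len(row) == k for row in a)
    out = [[init] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = init
            for p in range(k):
                acc = reduce(acc, combine(a[i][p], b[p][j]))
            out[i][j] = acc
    return out


def standard_product(a, b):
    # Ordinary matrix multiplication: multiply, then sum.
    return matmul_like(a, b, lambda x, y: x * y, lambda s, t: s + t, 0)


def min_plus_product(a, b):
    # One possible variation: add, then take the minimum (used in
    # shortest-path computations). Chosen here only as an example.
    return matmul_like(a, b, lambda x, y: x + y, min, float("inf"))
```

Both functions share the same loop structure and differ only in the two scalar operations, which is the kind of variation a library tuned solely for ordinary multiplication would not typically cover.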


