MIT MLSys Discussion Group

Monarch: Expressive Structured Matrices for Efficient and Accurate Training.

Summary. This paper proposes replacing dense weight matrices with products of block-diagonal matrices (interspersed with permutation matrices) that 1) have fewer parameters than their dense counterparts and 2) can run faster than dense models. It is an important episode in the recent development of butterfly-matrix-inspired sparsity patterns, which aim to accelerate training through sparsity, something that used to be impossible without degrading accuracy.
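To make the factorization concrete, here is a minimal NumPy sketch of a structured matrix-vector product of this kind. It assumes the square case n = m^2 with two block-diagonal factors L and R (m blocks of size m x m each) and takes the permutation P to be "reshape to an m x m grid and transpose"; the function names and the exact ordering P L P R are illustrative assumptions for this note, not the paper's reference implementation.

```python
import numpy as np

def block_diag_matvec(blocks, x):
    """Multiply a block-diagonal matrix by x.

    blocks has shape (m, b, b): m diagonal blocks, each b x b.
    x has shape (m * b,).
    """
    m, b, _ = blocks.shape
    # Each block only touches its own length-b slice of x.
    return np.einsum("kij,kj->ki", blocks, x.reshape(m, b)).reshape(m * b)

def transpose_perm(x, m):
    """Permutation P (assumed): view x as an m x m grid and transpose it."""
    return x.reshape(m, m).T.reshape(m * m)

def monarch_matvec(L, R, x):
    """Compute P L P R x without ever forming the dense n x n matrix."""
    m = L.shape[0]
    y = block_diag_matvec(R, x)
    y = transpose_perm(y, m)
    y = block_diag_matvec(L, y)
    return transpose_perm(y, m)

def dense_equivalent(L, R):
    """Materialize the equivalent dense matrix (for checking only)."""
    n = L.shape[0] ** 2
    # Applying the structured product to basis vector e_i yields column i.
    return np.stack([monarch_matvec(L, R, e) for e in np.eye(n)], axis=1)

m = 4
n = m * m
rng = np.random.default_rng(0)
L = rng.standard_normal((m, m, m))
R = rng.standard_normal((m, m, m))
x = rng.standard_normal(n)

M = dense_equivalent(L, R)
structured_params = L.size + R.size  # 2 * m**3, i.e. 2 * n**1.5
dense_params = n * n                 # n**2
```

Under these assumptions the two factors hold 2 * n^1.5 parameters versus n^2 for a dense matrix, and the product costs two batched small matrix multiplies plus two reshapes, which is where both the parameter savings and the potential speedup come from.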