Sparsity, the Key Ingredient from HPC to Efficient LLMs

A Workshop Co-Located with MICRO 2025

Seoul, Korea
October 18, 2025

This workshop aims to bring together the systems, HPC, and machine learning communities to explore the growing role of sparsity as a foundational tool for scaling efficiency across modern computing workloads, from scientific computing and HPC to LLMs. By fostering collaboration among researchers, practitioners, and industry experts, the workshop will focus on (a) developing novel architectures and system techniques that exploit sparsity at multiple levels and (b) deploying sparsity-aware models effectively in real-world scientific and AI applications. Topics of interest include unstructured sparsity, quantization, MoE architectures, and other innovations that drive computational efficiency, scalability, and sustainability across the full system stack.

Call For Papers

Sparsity has become a defining feature of modern computing workloads, from scientific simulations on HPC platforms to inference and training in cutting-edge LLMs. It appears across all layers of the stack: bit-level computations, sparse data structures, irregular memory access patterns, and high-level architectural designs such as MoEs and dynamic routing. Although sparsity offers enormous potential to improve computing efficiency, reduce energy consumption, and enable scalability, integrating it into modern systems introduces significant architectural, algorithmic, and programming challenges.

We invite submissions that address any aspect of sparsity in computing systems. Topics of interest include, but are not limited to:

We welcome complete papers, early-stage work, and position papers that inspire discussion and foster community building. We target a soft limit of 4 pages in double-column format, similar to the main MICRO submission. If you have any questions, please feel free to reach out to Bahar Asgari [bahar at umd dot edu] or Ramyad Hadidi [rhadidi at d-matrix dot ai].

Important Info:

Organizers: