North Carolina State University

TensorWorld studies sparse and irregular computation across HPC, systems, and AI.

TensorWorld is a research group led by Dr. Jiajia Li at North Carolina State University. We build parallel algorithms, software, and systems for sparse and irregular tensor workloads, performance analysis, heterogeneous computing, and large-scale scientific and AI applications.

Research Themes

Sparse and irregular tensor methods
GPU and heterogeneous systems
Performance profiling and analysis
Software infrastructure for AI and science

What We Build

Projects

TensorSuite

A real and synthetic sparse tensor suite: curated real-world sparse tensors with element-wise and block-wise patterns, plus synthetic tensor generators for benchmarking under configurable size, density, and memory constraints.
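To illustrate what a synthetic generator with configurable size and density might look like, here is a minimal sketch in Python. It is not TensorSuite's actual code or interface: the function name `generate_sparse_coo`, its parameters, and the coordinate (COO) output layout are assumptions made for this example, and it does not model block-wise patterns or memory constraints.

```python
import random
from math import prod


def generate_sparse_coo(shape, density, seed=0):
    """Hypothetical helper: random sparse tensor as (indices, values) in COO form."""
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be in [0, 1]")
    total = prod(shape)
    nnz = round(total * density)  # number of nonzeros to place
    rng = random.Random(seed)     # seeded so that runs are reproducible

    # Sample distinct linearized positions, then unravel each one
    # into a multi-index using row-major order (last mode varies fastest).
    flat_positions = rng.sample(range(total), nnz)
    indices = []
    for flat in sorted(flat_positions):
        idx = []
        for dim in reversed(shape):
            flat, remainder = divmod(flat, dim)
            idx.append(remainder)
        indices.append(tuple(reversed(idx)))

    values = [rng.random() for _ in indices]
    return indices, values


if __name__ == "__main__":
    idx, vals = generate_sparse_coo((4, 5, 6), density=0.1, seed=1)
    print(len(idx), "nonzeros out of", 4 * 5 * 6)
```

Sampling linearized positions without replacement guarantees that no coordinate is generated twice, so the requested density is met exactly up to rounding.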

CROSS

Cross-layer coordination and optimization for scalable and sparse tensor networks. TensorWorld leads NSF-supported work on abstractions, systems, and performance portability for tensor network workloads.

NSF PPoSS LARGE, 2023-2028

Ragged Tensor Operators

Hierarchical optimization of ragged (irregularly shaped) tensor operators for deep learning workloads, spanning representations, scheduling, and code generation across the software stack.

NSF SHF, 2026-2029
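As a small illustration of what "ragged" means here, the sketch below stores rows of different lengths as one flat value list plus row offsets, then applies a row-wise reduction. This is a common generic layout and not the project's representation or API; the names `to_ragged` and `ragged_row_sums` are hypothetical.

```python
def to_ragged(rows):
    """Flatten variable-length rows into (values, offsets).

    Row i occupies values[offsets[i]:offsets[i + 1]].
    """
    values, offsets = [], [0]
    for row in rows:
        values.extend(row)
        offsets.append(len(values))
    return values, offsets


def ragged_row_sums(values, offsets):
    """Sum each row of a ragged tensor without padding rows to equal length."""
    return [sum(values[start:end]) for start, end in zip(offsets, offsets[1:])]


if __name__ == "__main__":
    values, offsets = to_ragged([[1, 2, 3], [], [4, 5]])
    print(values, offsets)                   # [1, 2, 3, 4, 5] [0, 3, 3, 5]
    print(ragged_row_sums(values, offsets))  # [6, 0, 9]
```

The offsets array lets an operator visit only the elements that exist, which is the property that padding-based dense layouts give up.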

CS2

Formally verified and performance-optimized tensor contraction sequences for quantum many-body computations, combining formal methods with high-performance code generation.

NSF CS2, 2026-2029

Sparse Tensor Systems

We design algorithms, data structures, and runtime support for sparse tensor contraction, decomposition, and related kernels on CPUs, GPUs, and emerging architectures.

Performance Profiling

Our group builds profiling and analysis tools for modern ML workloads, with an emphasis on data types, memory behavior, GPU execution, and framework-level performance bottlenecks.

Irregular Applications

We study performance optimization for irregular algorithms in graph, hypergraph, tensor, and scientific computing workloads, including auto-tuning and architecture-aware implementation strategies.

New Results

Recent Publications

TIDES: Tiered Block-Sparse Tensor Contraction with Streaming Transpose on GPUs

Zecheng Li, Sri Harshavardhan Reddy Deverapalli, Yanbo Zhao, Frank Mueller, Karl Pierce, Jiajia Li

SC 2026 (Accepted)

TypeCraft: A Lightweight Data Type Profiler with High Resolution

Zecheng Li, Xu Liu, Namhyung Kim, Blake Jones, Alexey Alexandrov, Jiajia Li

OSDI 2026 (Accepted)

Leveraging AI Ecosystem for Portable and Sustainable GPU Kernels in HPC

Yanbo Zhao, Zhaonan Meng, Sai Krishna Teja Varma Manthena, Xu Liu, Ajay Panyala, Jiajia Li

ARRAY 2026, co-located with PLDI 2026

SmartDispatch: Dynamic Substitution of NumPy-style APIs on Heterogenous CPU-GPU Systems

Jinku Cui, Yueming Hao, Shuyin Jiao, Jiajia Li, Xu Liu

FSE 2026 (Accepted)

STTID: High-Performance Sparse Tensor-Train Interpolative Decomposition

Zhaonan Meng, Miles Stoudenmire, Karl Pierce, Frank Mueller, Jiajia Li

IPDPS 2026

RedSan: A Redundant Memory Instruction Sanitizer for GPU Programs

Yanbo Zhao, Yueming Hao, Zecheng Li, Shuyin Jiao, Xu Liu, Jiajia Li

SC 2025

DeepContext: A Context-aware, Cross-platform, and Cross-framework Tool for Performance Profiling and Analysis of Deep Learning Workloads

Qidong Zhao, Hao Wu, Yueming Hao, Zilingfeng Ye, Jiajia Li, Xu Liu, Keren Zhou

ASPLOS 2025

SymProp: Scaling Sparse Symmetric Tucker Decomposition via Symmetry Propagation

Zecheng Li, Shruti Shivakumar, Jiajia Li, Ramakrishnan Kannan

IPDPS 2025

PINE: Efficient Yet Effective Piecewise Linear Trees

Zecheng Li, Jiajia Li

SC 2024 (Poster)

FASTEN: Fast GPU-accelerated Segmented Matrix Multiplication for Heterogenous Graph Neural Networks

Keren Zhou, Karthik Ganapathi Subramanian, Po-Hsun Lin, Matthias Fey, Binqian Yin, Jiajia Li

ICS 2024

Open Source

People

Faculty

Jiajia Li, PI, Department of Computer Science, NC State University

Ph.D. Students

Current

Feiyang Zheng, started 2025
Devadatta Mandaogane, started 2025
Rahmy Salman, started 2024
Zhaonan Meng, started 2024
Sai Krishna Teja Varma Manthena, started 2024
Zecheng Li, started 2023
Sogolsadat Mansouri, started 2022
Yanbo Zhao, co-advised, started 2022
Yi Wang, co-advised, started 2021
Jinku Cui, co-advised, started 2020

Alumni

Qidong Zhao, co-advised, graduated 2025, Google

Master's Students

Current

Zizhong Wang, started 2025

Alumni

Sri Harshavardhan Reddy Deverapalli, graduated 2026, NCSU
Mushtaq Ahmed Shaikh, graduated 2025
Ahmed Taimoor, graduated 2025, NCSU
Devadatta Mandaogane, graduated 2025, NCSU
Swarnamalya Mohan, graduated 2024
Sounder Rajendran, graduated 2024, AMD
Sai Krishna Teja Varma Manthena, graduated 2024, NCSU
Karthik Ganapathi Subramanian, graduated 2024
Po-Hsun Lin, graduated 2024

Software

STTID

RedSan

SymProp

HiParTI

PASTA

ParTI

AdaTM

InTensLi

SMAT
