CROSS
Cross-layer coordination and optimization for scalable and sparse tensor networks. TensorWorld leads NSF-supported work on abstractions, systems, and performance portability for tensor network workloads.
North Carolina State University
TensorWorld is a research group led by Dr. Jiajia Li at North Carolina State University. We build parallel algorithms, software, and systems for sparse and irregular tensor workloads, with work spanning performance analysis, heterogeneous computing, and large-scale scientific and AI applications.
What We Build
We design algorithms, data structures, and runtime support for sparse tensor contraction, decomposition, and related kernels on CPUs, GPUs, and emerging architectures.
Our group builds profiling and analysis tools for modern ML workloads, with an emphasis on data types, memory behavior, GPU execution, and framework-level performance bottlenecks.
We study performance optimization for irregular algorithms in graph, hypergraph, tensor, and scientific computing workloads, including auto-tuning and architecture-aware implementation strategies.
New Results
Zecheng Li, Xu Liu, Namhyung Kim, Blake Jones, Alexey Alexandrov, Jiajia Li
Jinku Cui, Yueming Hao, Shuyin Jiao, Jiajia Li, Xu Liu
Ran Ran, Zhaoting Gong, Zhaowei Li, Xianting Lu, Jiajia Li, Wujie Wen
Zhaonan Meng, Miles Stoudenmire, Karl Pierce, Frank Mueller, Jiajia Li
Yanbo Zhao, Yueming Hao, Zecheng Li, Shuyin Jiao, Xu Liu, Jiajia Li
Qidong Zhao, Hao Wu, Yueming Hao, Zilingfeng Ye, Jiajia Li, Xu Liu, Keren Zhou
Keren Zhou, Karthik Ganapathi Subramanian, Po-Hsun Lin, Matthias Fey, Binqian Yin, Jiajia Li
Open Source
STTID: High-Performance Sparse Tensor-Train Interpolative Decomposition.
RedSan: A Redundant Memory Instruction Sanitizer for GPU Programs.
SymProp: Scaling Sparse Symmetric Tucker Decomposition via Symmetry Propagation.
A hierarchical parallel tensor infrastructure for sparse and structured tensor computing.
A parallel sparse tensor algorithm benchmark suite spanning CPUs and GPUs.
A parallel tensor infrastructure for data analysis and tensor kernels.
Adaptive tensor memoization support for CP decomposition.
Input-adaptive and in-place dense tensor-times-matrix multiplication.
A sparse matrix-vector multiplication auto-tuner.
Team