SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published about 1 month ago • 55 • 5
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95 • 7
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28, 2025 • 118 • 4
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14, 2025 • 76 • 5
SageAttention2++: A More Efficient Implementation of SageAttention2 Paper • 2505.21136 • Published May 27, 2025 • 45 • 3
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published May 16, 2025 • 75 • 8
Identifying Sensitive Weights via Post-quantization Integral Paper • 2503.01901 • Published Feb 28, 2025 • 8 • 2