
Rosswill

Kutches

AI & ML interests

Recent Activity

updated a model about 8 hours ago

Kutches/Anim4

updated a model about 11 hours ago

Kutches/Top-ModelsV2

updated a model about 20 hours ago

Kutches/ImageZV2


Organizations

None yet

upvoted a paper 2 days ago

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning

Paper • 2602.13515 • Published 9 days ago • 40

upvoted a paper 3 days ago

SLA2: Sparse-Linear Attention with Learnable Routing and QAT

Paper • 2602.12675 • Published 10 days ago • 49

upvoted an article 5 days ago

Qwen3.5: Nobody Agrees on Attention Anymore

Article • 5 days ago

upvoted a paper 10 days ago

GENIUS: Generative Fluid Intelligence Evaluation Suite

Paper • 2602.11144 • Published 11 days ago • 53

upvoted a paper 15 days ago

Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis

Paper • 2602.03139 • Published 20 days ago • 41

upvoted an article 16 days ago

Community Evals: Because we're done trusting black-box leaderboards over the community

Article • 19 days ago

upvoted a paper 18 days ago

FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space

Paper • 2602.02092 • Published 20 days ago • 18

upvoted a collection about 1 month ago

Qwen3-TTS

Collection

7 items • Updated Jan 22 • 301

upvoted 2 papers about 1 month ago

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 193

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 226

upvoted 4 papers about 2 months ago

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published Jan 5 • 109

upvoted 2 articles about 2 months ago

The Optimal Architecture for Small Language Models

Article • Dec 26, 2025 • 118

Geometric Manifold Walking: Stable High-Accuracy Multi-Encoder Fusion Without Backbone Training

Article • Dec 25, 2025

upvoted a collection 2 months ago

Qwen3 4b Zimage Clip Candidates

Collection

The quest to find the best Clip (text encoder) models for use with Zimage • 47 items • Updated Dec 21, 2025 • 6

upvoted 2 papers 2 months ago

Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model

Paper • 2512.13507 • Published Dec 15, 2025 • 40

MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published Dec 16, 2025 • 119

upvoted a paper 3 months ago

PretrainZero: Reinforcement Active Pretraining

Paper • 2512.03442 • Published Dec 3, 2025 • 48
