This is the final HypeNet-2B checkpoint from the paper Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts. It was distilled from Qwen3-1.7B using the HALO pipeline proposed in the paper. For more information, please refer to our GitHub repo.