BinghengWu

wubingheng

https://github.com/wubingheng111

AI & ML interests

I like to fine-tune the small models of the Doge series.

Organizations

Articles 1

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

Papers 3

arxiv:2508.02124

arxiv:2505.19716

arxiv:2412.11834

models 3

datasets 15

wubingheng/MixtureOfThoughts-Chinese-tryrun

Viewer • Updated Jul 18, 2025 • 10 • 83

wubingheng/Mixture-of-Thoughts-zh-try-run

Viewer • Updated Jul 17, 2025 • 10 • 74

wubingheng/Budget-aware-2048

Viewer • Updated Apr 29, 2025 • 25k • 4

wubingheng/Budget-aware-2048-in

Viewer • Updated Apr 29, 2025 • 25k • 4

wubingheng/Budget-aware-2048-in-try-run

Viewer • Updated Apr 29, 2025 • 2 • 4

wubingheng/Budget-aware-2048-try-run

Viewer • Updated Apr 29, 2025 • 2 • 4

wubingheng/L1-2048

Viewer • Updated Apr 28, 2025 • 25k • 3

wubingheng/L1-1024

Viewer • Updated Apr 28, 2025 • 25k • 4

wubingheng/compressed-openthoughts-50

Viewer • Updated Apr 28, 2025 • 25k • 13

wubingheng/compressed-openthoughts-90

Viewer • Updated Apr 28, 2025 • 25k • 6

