Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing • Paper • 2602.03845 • Published 3 days ago • 24
Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration • Paper • 2602.03647 • Published 3 days ago • 7
Horizon-LM: A RAM-Centric Architecture for LLM Training • Paper • 2602.04816 • Published 2 days ago • 16
EPAS: Efficient Training with Progressive Activation Sharing • Paper • 2601.19089 • Published 11 days ago • 1
Expanding the Capabilities of Reinforcement Learning via Text Feedback • Paper • 2602.02482 • Published 4 days ago • 2
Beyond Output Critique: Self-Correction via Task Distillation • Paper • 2602.00871 • Published 6 days ago • 1
Chronicals: A High-Performance Framework for LLM Fine-Tuning with 3.51x Speedup over Unsloth • Paper • 2601.02609 • Published Jan 6 • 1
No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data • Paper • 2602.04442 • Published 3 days ago • 4
Falcon-H1-Tiny • Collection • A series of extremely small, yet powerful language models redefining capabilities at small scale • 22 items • Updated 22 days ago • 34
Reinforcement Learning from Meta-Evaluation: Aligning Language Models Without Ground-Truth Labels • Paper • 2601.21268 • Published 9 days ago • 3
FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning • Paper • 2601.19001 • Published 11 days ago • 4
Mechanistic Data Attribution: Tracing the Training Origins of Interpretable LLM Units • Paper • 2601.21996 • Published 8 days ago • 4
ECO: Quantized Training without Full-Precision Master Weights • Paper • 2601.22101 • Published 8 days ago • 6
KromHC: Manifold-Constrained Hyper-Connections with Kronecker-Product Residual Matrices • Paper • 2601.21579 • Published 9 days ago • 6
Beyond Imitation: Reinforcement Learning for Active Latent Planning • Paper • 2601.21598 • Published 9 days ago • 9
Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts • Paper • 2601.22156 • Published 8 days ago • 10
Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening • Paper • 2601.21590 • Published 9 days ago • 12
Self-Improving Pretraining: using post-trained models to pretrain better models • Paper • 2601.21343 • Published 9 days ago • 15
Language-based Trial and Error Falls Behind in the Era of Experience • Paper • 2601.21754 • Published 9 days ago • 16