OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models Paper • 2602.04804 • Published 5 days ago • 44
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published 6 days ago • 41
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers Paper • 2602.03510 • Published 6 days ago • 26
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 5 days ago • 30
Training Data Efficiency in Multimodal Process Reward Models Paper • 2602.04145 • Published 6 days ago • 74
ProRAG Collection The models of the paper "ProRAG: Process-Supervised Reinforcement Learning for Retrieval-Augmented Generation" • 2 items • Updated 6 days ago • 2
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published 12 days ago • 21
Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning Paper • 2601.20209 • Published 13 days ago • 22
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published 14 days ago • 78
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published 12 days ago • 116
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security Paper • 2601.18491 • Published 14 days ago • 122
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning Paper • 2601.18631 • Published 14 days ago • 47
AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking Paper • 2601.17645 • Published 16 days ago • 23
FABLE: Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval for Multi-Document Reasoning Paper • 2601.18116 • Published 15 days ago • 12
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective 14 days ago • 51
Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow Paper • 2601.14243 • Published 20 days ago • 21