Breaking the SFT Plateau: Multimodal Structured Reinforcement Learning for Chart-to-Code Generation
DocTron
AI & ML interests: None yet
Recent Activity
Upvoted a paper about 20 hours ago: Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR
Upvoted a paper 8 days ago: OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
Updated a model 8 days ago: DocTron/OCRVerse