emre/DeepSeek-R1-Qwen-14B-tr-ORPO · Training metrics