One epoch of SFT on UltraChat200K
Zizhuo Zhang PRO
resistz
AI & ML interests: None yet
models: 10
resistz/GT-GRPO_Llama-3.2-3B-Instruct_NQ-HotpotQA • Updated
resistz/sft_Llama-3.2-1B_ultra200k • Text Generation • 0.3B • Updated • 1
resistz/sft_Qwen3-8B-Base_ultra200k_merged • 8B • Updated
resistz/sft_Qwen3-8B-Base_ultra200k_lora32 • Text Generation • Updated
resistz/sft_Qwen3-4B-Base_ultra200k • Text Generation • 1B • Updated • 1
resistz/sft_Qwen3-1.7B-Base_ultra200k • Text Generation • 0.4B • Updated • 1
resistz/sft_Qwen3-0.6B-Base_ultra200k • Text Generation • 0.8B • Updated • 1
resistz/sft_Llama-3.2-3B_ultra200k • Text Generation • 0.8B • Updated • 1
resistz/sft_Llama-3.1-8B_ultra200k_merged • 8B • Updated
resistz/sft_Llama-3.1-8B_ultra200k_lora • Text Generation • Updated • 1
datasets: 0 (none public yet)