z-lab/LLaMA3.1-8B-Instruct-DFlash-UltraChat • Text Generation • 1B • Updated • 61 • 2
Efficient AI
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference