# Distil-AI-Slop-Detector-Gemma
A fine-tuned Gemma 3 270M model for detecting AI-generated text ("slop"). Trained using knowledge distillation from GPT OSS 120B, this compact 270M parameter model delivers strong AI detection performance while being lightweight enough for browser deployment.
## Performance Results
| Metric | GPT OSS 120B (Teacher) | Gemma 3 270M (Base) | This Model |
|---|---|---|---|
| Test Accuracy | 100% | ~40% | 100% |
| Precision | 100% | ~55% | 100% |
Achieves 100% accuracy with only 270M parameters, matching the 120B teacher while being over 400x smaller.
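For reference, the accuracy and precision figures above follow the standard definitions. A minimal sketch (the exact evaluation harness is not published with this card, so the helper functions below are illustrative):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive="ai_generated"):
    """Of everything flagged as `positive`, the fraction that truly is."""
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if (tp + fp) else 0.0

gold = ["ai_generated", "human_written", "ai_generated", "human_written"]
pred = ["ai_generated", "human_written", "ai_generated", "ai_generated"]
print(accuracy(gold, pred))   # 0.75
print(precision(gold, pred))  # 0.666... (2 true positives, 1 false positive)
```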
## Quantized Version
| Format | Size | Accuracy | Use Case |
|---|---|---|---|
| Full precision (this model) | ~512 MB | 100% | Server deployment |
| Q4_K_M (quantized) | ~242 MB | ~95% | Browser extension |
## Quick Start
### Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-ai-slop-detector-gemma")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-ai-slop-detector-gemma")

text_to_analyze = "We recognize the value of your feedback and remain committed to continuous improvement."

messages = [
    {
        "role": "system",
        "content": """You are a problem solving model working on task_description XML block:
<task_description>Classify user-generated text content to detect whether it was likely generated by AI or written by a human.
ai_generated: Content that shows signs of AI generation: overly formal or generic language, repetitive patterns, lack of personal voice, artificial enthusiasm, typical AI writing markers like 'delve', 'tapestry', 'landscape', 'paradigm shift', excessive politeness, corporate-speak in informal contexts, perfectly structured responses without natural flow, or absence of casual mistakes.
human_written: Content that appears genuinely human-written: natural conversational flow, authentic personal voice, casual mistakes or typos, informal language, slang, abbreviations (lol, wtf, ngl), specific personal details, genuine emotion, creative expression, internet culture references, or casual grammar that lacks typical AI patterns.</task_description>
You will be given a single task in the question XML block
Solve only the task in question block.
Generate only the answer, do not generate anything else"""
    },
    {
        "role": "user",
        "content": f"""Now for the real task, solve the task in question block.
Generate only the solution, do not generate anything else
<question>{text_to_analyze}</question>"""
    }
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=False)  # greedy decoding
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
# Output: ai_generated
```
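Generated text can carry stray whitespace or extra tokens around the label, so it is worth normalizing the raw output before acting on it. A hypothetical post-processing helper (`parse_label` is not part of the model's API, just a sketch):

```python
LABELS = ("ai_generated", "human_written")

def parse_label(raw: str, default: str = "human_written") -> str:
    """Return the first known label found in the raw model output."""
    text = raw.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return default  # fall back if the model produced neither label

print(parse_label("  ai_generated\n"))           # ai_generated
print(parse_label("Answer: human_written"))      # human_written
```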
## Model Details
| Property | Value |
|---|---|
| Base Model | google/gemma-3-270m-it |
| Parameters | 270 million |
| Architecture | Gemma3ForCausalLM |
| Context Length | 32,768 tokens |
| Precision | bfloat16 |
| Training Data | ~10,000 synthetic examples |
| Teacher Model | GPT OSS 120B |
## Training Details
- Seed Data: ~50 hand-validated examples covering obvious AI markers, subtle corporate-speak, casual human text, and edge cases
- Synthetic Generation: Expanded to ~10,000 examples using knowledge distillation from GPT OSS 120B via Distil Labs
- Fine-tuning: 4 epochs using LoRA
- Evaluation: Test accuracy and precision metrics
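The card states 4 epochs of LoRA fine-tuning but does not publish the adapter hyperparameters, so the values below are assumptions for illustration only, using the `peft` library's `LoraConfig`:

```python
from peft import LoraConfig

# Illustrative only: rank, alpha, and target modules are assumptions,
# not the values actually used to train this model.
lora_config = LoraConfig(
    r=16,                     # assumed adapter rank
    lora_alpha=32,            # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```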
## Task Format
```json
{
  "input": "lmao no way that actually worked you're a genius thanks so much!!!",
  "output": "human_written"
}
```

```json
{
  "input": "We recognize the value of your feedback and remain committed to continuous improvement. Your satisfaction is our top priority.",
  "output": "ai_generated"
}
```
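If you prepare your own examples in this format (e.g. as JSONL), a quick stdlib sketch can catch schema mistakes before training. The `validate_examples` helper is hypothetical, not shipped with the model:

```python
import json

VALID_OUTPUTS = {"ai_generated", "human_written"}

def validate_examples(lines):
    """Yield parsed examples, raising ValueError on any schema violation."""
    for i, line in enumerate(lines, start=1):
        ex = json.loads(line)
        if set(ex) != {"input", "output"}:
            raise ValueError(f"line {i}: expected keys 'input' and 'output'")
        if ex["output"] not in VALID_OUTPUTS:
            raise ValueError(f"line {i}: unknown label {ex['output']!r}")
        yield ex

data = [
    '{"input": "lmao no way that actually worked", "output": "human_written"}',
    '{"input": "Your satisfaction is our top priority.", "output": "ai_generated"}',
]
print([ex["output"] for ex in validate_examples(data)])
# ['human_written', 'ai_generated']
```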
## Real-World Validation
| Content Type | Sample Size | Accuracy |
|---|---|---|
| Reddit comments | 100+ | ~92% |
| ChatGPT outputs | 50 | 98% |
| Human tweets | 50 | 94% |
| Formal emails | 30 | 88% |
The model struggles most with formal human writing (business emails, academic text), which sometimes triggers false positives because such writing shares stylistic patterns with AI output.
## Use Cases
- Browser extensions for local AI detection
- Content moderation pipelines
- Academic integrity tools
- Social media analysis
- Edge deployment where privacy matters
## Limitations
- Achieves ~95% accuracy after quantization (Q4_K_M)
- May flag formal human writing as AI-generated
- Trained on English text only
- Best for short-to-medium text snippets
## Browser Extension
This model powers the AI Slop Detector Chrome extension, which runs entirely locally with no data sent to external servers.
## License
Gemma License - see LICENSE file for details.
## Links
- AI Slop Detector Extension
- Distil Labs Homepage
- Distil Labs Documentation
- Distil Labs on HuggingFace
Built with Distil Labs: turn a prompt and a few examples into production-ready small language models.