Distil-AI-Slop-Detector-Gemma

A fine-tuned Gemma 3 270M model for detecting AI-generated text ("slop"). Trained via knowledge distillation from GPT OSS 120B, this compact 270M-parameter model delivers strong detection performance while being lightweight enough for browser deployment.

Performance Results

| Metric        | GPT OSS 120B (Teacher) | Gemma 3 270M (Base) | This Model |
|---------------|------------------------|---------------------|------------|
| Test Accuracy | 100%                   | ~40%                | 100%       |
| Precision     | 100%                   | ~55%                | 100%       |

Achieves 100% accuracy with only 270M parameters - matching the 120B teacher while being over 400x smaller.

Quantized Version

| Format                      | Size    | Accuracy | Use Case          |
|-----------------------------|---------|----------|-------------------|
| Full precision (this model) | ~512 MB | 100%     | Server deployment |
| Q4_K_M (quantized)          | ~242 MB | ~95%     | Browser extension |
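
The quantized build is intended for fully local, CPU-friendly use. The snippet below is a hedged sketch of loading a Q4_K_M GGUF export with llama-cpp-python; the repository id and filename pattern are assumptions rather than confirmed release names, and a shortened system prompt is used purely for illustration (in practice, reuse the full prompt from the Quick Start below).

# Hedged sketch: the repo_id and filename below are assumed, not confirmed release names.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="distil-labs/distil-ai-slop-detector-gemma-GGUF",  # assumed companion GGUF repo
    filename="*Q4_K_M.gguf",                                   # glob for the assumed quant file
    n_ctx=2048,
)

result = llm.create_chat_completion(
    messages=[
        # Shortened prompt for illustration; reuse the full system prompt from Quick Start.
        {"role": "system", "content": "Classify the text in the question block as ai_generated or human_written. Generate only the answer."},
        {"role": "user", "content": "<question>We recognize the value of your feedback and remain committed to continuous improvement.</question>"},
    ],
    max_tokens=10,
    temperature=0,
)
print(result["choices"][0]["message"]["content"])  # expected: ai_generated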

Quick Start

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-ai-slop-detector-gemma")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-ai-slop-detector-gemma")

text_to_analyze = "We recognize the value of your feedback and remain committed to continuous improvement."

messages = [
    {
        "role": "system",
        "content": """You are a problem solving model working on task_description XML block:
<task_description>Classify user-generated text content to detect whether it was likely generated by AI or written by a human.

ai_generated: Content that shows signs of AI generation: overly formal or generic language, repetitive patterns, lack of personal voice, artificial enthusiasm, typical AI writing markers like 'delve', 'tapestry', 'landscape', 'paradigm shift', excessive politeness, corporate-speak in informal contexts, perfectly structured responses without natural flow, or absence of casual mistakes.

human_written: Content that appears genuinely human-written: natural conversational flow, authentic personal voice, casual mistakes or typos, informal language, slang, abbreviations (lol, wtf, ngl), specific personal details, genuine emotion, creative expression, internet culture references, or casual grammar that lacks typical AI patterns.</task_description>
You will be given a single task in the question XML block
Solve only the task in question block.
Generate only the answer, do not generate anything else"""
    },
    {
        "role": "user",
        "content": f"""Now for the real task, solve the task in question block.
Generate only the solution, do not generate anything else
<question>{text_to_analyze}</question>"""
    }
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=False)  # greedy decoding
# Decode only the newly generated tokens (skip the prompt)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
# Output: ai_generated
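
For classifying many snippets, it can help to wrap the prompt construction and decoding in a small helper. The function below is an illustrative sketch, not part of a released API; it reuses the model, tokenizer, and the system prompt string already defined above.

def detect_slop(text, model, tokenizer, system_prompt):
    """Return the model's label ("ai_generated" or "human_written") for a text snippet."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": (
            "Now for the real task, solve the task in question block.\n"
            "Generate only the solution, do not generate anything else\n"
            f"<question>{text}</question>"
        )},
    ]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=10, do_sample=False)
    # Strip the prompt tokens and return only the generated label
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True).strip()

print(detect_slop("ngl that was kinda wild lol", model, tokenizer, messages[0]["content"]))
# Expected label for this casual snippet: human_written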

Model Details

| Property       | Value                      |
|----------------|----------------------------|
| Base Model     | google/gemma-3-270m-it     |
| Parameters     | 270 million                |
| Architecture   | Gemma3ForCausalLM          |
| Context Length | 32,768 tokens              |
| Precision      | bfloat16                   |
| Training Data  | ~10,000 synthetic examples |
| Teacher Model  | GPT OSS 120B               |

Training Details

  1. Seed Data: ~50 hand-validated examples covering obvious AI markers, subtle corporate-speak, casual human text, and edge cases
  2. Synthetic Generation: Expanded to ~10,000 examples using knowledge distillation from GPT OSS 120B via Distil Labs
  3. Fine-tuning: 4 epochs using LoRA (a hedged configuration sketch follows this list)
  4. Evaluation: Test accuracy and precision metrics
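
The exact fine-tuning hyperparameters are not published beyond the epoch count, so the block below is only a hedged sketch of what a comparable LoRA setup with the peft library could look like; the rank, alpha, dropout, and target modules are assumptions rather than the actual training recipe.

# Hedged LoRA sketch; r, lora_alpha, lora_dropout, and target_modules are assumed values.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m-it")
lora_config = LoraConfig(
    r=16,                      # assumed rank
    lora_alpha=32,             # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # only the adapter weights are trained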

Task Format

{
  "input": "lmao no way that actually worked you're a genius thanks so much!!!",
  "output": "human_written"
}
{
  "input": "We recognize the value of your feedback and remain committed to continuous improvement. Your satisfaction is our top priority.",
  "output": "ai_generated"
}
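
Evaluation (step 4 above) reduces to comparing predicted labels against these gold outputs. The snippet below is a hedged sketch that computes accuracy and precision over a JSONL file of examples in this format; the test.jsonl filename is hypothetical, and it assumes a detect_slop helper like the one sketched in the Quick Start section.

# Hedged evaluation sketch; "test.jsonl" is a hypothetical file of {"input", "output"} lines.
import json

tp = fp = correct = total = 0
with open("test.jsonl") as f:
    for line in f:
        example = json.loads(line)
        pred = detect_slop(example["input"], model, tokenizer, messages[0]["content"])
        total += 1
        correct += pred == example["output"]
        if pred == "ai_generated":
            tp += example["output"] == "ai_generated"
            fp += example["output"] != "ai_generated"

print(f"accuracy:  {correct / total:.2%}")
print(f"precision: {tp / (tp + fp):.2%}" if tp + fp else "precision: undefined (no positive predictions)")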

Real-World Validation

| Content Type    | Sample Size | Accuracy |
|-----------------|-------------|----------|
| Reddit comments | 100+        | ~92%     |
| ChatGPT outputs | 50          | 98%      |
| Human tweets    | 50          | 94%      |
| Formal emails   | 30          | 88%      |

The model struggles most with formal human writing (business emails, academic text) - these sometimes trigger false positives because they share stylistic patterns with AI output.

Use Cases

  • Browser extensions for local AI detection
  • Content moderation pipelines
  • Academic integrity tools
  • Social media analysis
  • Edge deployment where privacy matters

Limitations

  • Accuracy drops to ~95% with the Q4_K_M quantized build
  • May flag formal human writing as AI-generated
  • Trained on English text only
  • Best for short-to-medium text snippets

Browser Extension

This model powers the AI Slop Detector Chrome extension, which runs entirely locally with no data sent to external servers.

License

Gemma License - see LICENSE file for details.

Built with Distil Labs - turn a prompt and a few examples into production-ready small language models.
