# Distil-AI-Slop-Detector-Gemma
A fine-tuned Gemma 3 270M model for detecting AI-generated text ("slop"). Trained using knowledge distillation from GPT OSS 120B, this compact 270M parameter model delivers strong AI detection performance while being lightweight enough for browser deployment.
## Performance Results
| Metric | GPT OSS 120B (Teacher) | Gemma 3 270M (Base) | This Model |
|---|---|---|---|
| Test Accuracy | 100% | ~40% | 100% |
| Precision | 100% | ~55% | 100% |
Achieves 100% accuracy with only 270M parameters, matching the 120B teacher while being over 400x smaller.
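For reference, the accuracy and precision figures above follow the standard definitions. A minimal sketch (the exact evaluation harness is not published with this card, so the helper functions below are illustrative):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred, positive="ai_generated"):
    """Of everything flagged as `positive`, the fraction that truly is."""
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if (tp + fp) else 0.0

gold = ["ai_generated", "human_written", "ai_generated", "human_written"]
pred = ["ai_generated", "human_written", "ai_generated", "ai_generated"]
print(accuracy(gold, pred))   # 0.75
print(precision(gold, pred))  # 0.666... (2 true positives, 1 false positive)
```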
## Quantized Version
| Format | Size | Accuracy | Use Case |
|---|---|---|---|
| Full precision (this model) | ~512 MB | 100% | Server deployment |
| Q4_K_M (quantized) | ~242 MB | ~95% | Browser extension |
## Quick Start
### Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-ai-slop-detector-gemma")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-ai-slop-detector-gemma")

text_to_analyze = "We recognize the value of your feedback and remain committed to continuous improvement."

messages = [
    {
        "role": "system",
        "content": """You are a problem solving model working on task_description XML block:
<task_description>Classify user-generated text content to detect whether it was likely generated by AI or written by a human.
ai_generated: Content that shows signs of AI generation: overly formal or generic language, repetitive patterns, lack of personal voice, artificial enthusiasm, typical AI writing markers like 'delve', 'tapestry', 'landscape', 'paradigm shift', excessive politeness, corporate-speak in informal contexts, perfectly structured responses without natural flow, or absence of casual mistakes.
human_written: Content that appears genuinely human-written: natural conversational flow, authentic personal voice, casual mistakes or typos, informal language, slang, abbreviations (lol, wtf, ngl), specific personal details, genuine emotion, creative expression, internet culture references, or casual grammar that lacks typical AI patterns.</task_description>
You will be given a single task in the question XML block
Solve only the task in question block.
Generate only the answer, do not generate anything else"""
    },
    {
        "role": "user",
        "content": f"""Now for the real task, solve the task in question block.
Generate only the solution, do not generate anything else
<question>{text_to_analyze}</question>"""
    }
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=False)  # greedy decoding
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
# Output: ai_generated
```
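Generated text can carry stray whitespace or extra tokens around the label, so it is worth normalizing the raw output before acting on it. A hypothetical post-processing helper (`parse_label` is not part of the model's API, just a sketch):

```python
LABELS = ("ai_generated", "human_written")

def parse_label(raw: str, default: str = "human_written") -> str:
    """Return the first known label found in the raw model output."""
    text = raw.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return default  # fall back if the model produced neither label

print(parse_label("  ai_generated\n"))           # ai_generated
print(parse_label("Answer: human_written"))      # human_written
```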
## Model Details
| Property | Value |
|---|---|
| Base Model | google/gemma-3-270m-it |
| Parameters | 270 million |
| Architecture | Gemma3ForCausalLM |
| Context Length | 32,768 tokens |
| Precision | bfloat16 |
| Training Data | ~10,000 synthetic examples |
| Teacher Model | GPT OSS 120B |
## Training Details
- Seed Data: ~50 hand-validated examples covering obvious AI markers, subtle corporate-speak, casual human text, and edge cases
- Synthetic Generation: Expanded to ~10,000 examples using knowledge distillation from GPT OSS 120B via Distil Labs
- Fine-tuning: 4 epochs using LoRA
- Evaluation: Test accuracy and precision metrics
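The card states 4 epochs of LoRA fine-tuning but does not publish the adapter hyperparameters, so the values below are assumptions for illustration only, using the `peft` library's `LoraConfig`:

```python
from peft import LoraConfig

# Illustrative only: rank, alpha, and target modules are assumptions,
# not the values actually used to train this model.
lora_config = LoraConfig(
    r=16,                     # assumed adapter rank
    lora_alpha=32,            # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```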
## Task Format
```json
{
  "input": "lmao no way that actually worked you're a genius thanks so much!!!",
  "output": "human_written"
}
```

```json
{
  "input": "We recognize the value of your feedback and remain committed to continuous improvement. Your satisfaction is our top priority.",
  "output": "ai_generated"
}
```
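If you prepare your own examples in this format (e.g. as JSONL), a quick stdlib sketch can catch schema mistakes before training. The `validate_examples` helper is hypothetical, not shipped with the model:

```python
import json

VALID_OUTPUTS = {"ai_generated", "human_written"}

def validate_examples(lines):
    """Yield parsed examples, raising ValueError on any schema violation."""
    for i, line in enumerate(lines, start=1):
        ex = json.loads(line)
        if set(ex) != {"input", "output"}:
            raise ValueError(f"line {i}: expected keys 'input' and 'output'")
        if ex["output"] not in VALID_OUTPUTS:
            raise ValueError(f"line {i}: unknown label {ex['output']!r}")
        yield ex

data = [
    '{"input": "lmao no way that actually worked", "output": "human_written"}',
    '{"input": "Your satisfaction is our top priority.", "output": "ai_generated"}',
]
print([ex["output"] for ex in validate_examples(data)])
# ['human_written', 'ai_generated']
```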
## Real-World Validation
| Content Type | Sample Size | Accuracy |
|---|---|---|
| Reddit comments | 100+ | ~92% |
| ChatGPT outputs | 50 | 98% |
| Human tweets | 50 | 94% |
| Formal emails | 30 | 88% |
The model struggles most with formal human writing (business emails, academic text), which sometimes triggers false positives because such writing shares stylistic patterns with AI output.
## Use Cases
- Browser extensions for local AI detection
- Content moderation pipelines
- Academic integrity tools
- Social media analysis
- Edge deployment where privacy matters
## Limitations
- Achieves ~95% accuracy after quantization (Q4_K_M)
- May flag formal human writing as AI-generated
- Trained on English text only
- Best for short-to-medium text snippets
## Browser Extension
This model powers the AI Slop Detector Chrome extension, which runs entirely locally with no data sent to external servers.
## License
Gemma License - see LICENSE file for details.
## Links
- AI Slop Detector Extension
- Distil Labs Homepage
- Distil Labs Documentation
- Distil Labs on HuggingFace
Built with Distil Labs: turn a prompt and a few examples into production-ready small language models.