

Alexey Dontsov

therem

·

somvy

AI & ML interests

None yet

Recent Activity

authored a paper about 6 hours ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

upvoted a paper about 8 hours ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

submitted a paper about 8 hours ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?


Organizations

authored a paper about 6 hours ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

Paper • 2602.14111 • Published 3 days ago • 49

submitted a paper to Daily Papers about 8 hours ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

Paper • 2602.14111 • Published 3 days ago • 49

authored a paper 4 months ago

OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features

Paper • 2509.22033 • Published Sep 26, 2025 • 19

authored a paper 5 months ago

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Paper • 2509.22067 • Published Sep 26, 2025 • 28

authored a paper 11 months ago

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24, 2025 • 119

