Ilya Pereverzin

NodeLinker · PlyMxt

AI & ML interests

Isn't it amazing that we let a computer think like a human?

Recent Activity

upvoted an article about 17 hours ago

GGML and llama.cpp join HF to ensure the long-term progress of Local AI

Article • 1 day ago • 201

upvoted 2 papers 3 days ago

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published 10 days ago • 219

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published 12 days ago • 46

upvoted a collection 3 days ago

onevision-encoder

2 items • Updated 11 days ago • 5

upvoted a paper 3 days ago

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Paper • 2602.14041 • Published 6 days ago • 42

upvoted a paper 5 days ago

OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Paper • 2602.05400 • Published 16 days ago • 320

upvoted a collection 5 days ago

Qwen3.5

2 items • Updated 3 days ago • 176

upvoted 2 papers 6 days ago

DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

Paper • 2602.12205 • Published 9 days ago • 78

Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception

Paper • 2602.11858 • Published 9 days ago • 58

upvoted a collection 6 days ago

Zooming-without-Zooming

6 items • Updated 7 days ago • 5

upvoted a collection 10 days ago

Ming-V2

Ming is the multi-modal series of any-to-any models developed by Ant Ling team. • 14 items • Updated 7 days ago • 34

upvoted 2 papers 10 days ago

Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation

Paper • 2510.24821 • Published Oct 28, 2025 • 41

Ming-Omni: A Unified Multimodal Model for Perception and Generation

Paper • 2506.09344 • Published Jun 11, 2025 • 31

upvoted a collection 11 days ago

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 502

upvoted 3 papers 11 days ago

DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 297

Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names

Paper • 2408.00298 • Published Aug 1, 2024 • 11

The Manga Whisperer: Automatically Generating Transcriptions for Comics

Paper • 2401.10224 • Published Jan 18, 2024 • 3

upvoted a paper 12 days ago

PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing

Paper • 2601.21957 • Published 23 days ago • 19

upvoted 2 papers 13 days ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 333

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4, 2025 • 272
