OpenGVLab

community

https://github.com/opengvlab



AI & ML interests

Computer Vision

Recent Activity

LiruiZhao authored a paper about 4 hours ago

$χ_{0}$: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies

LiruiZhao authored a paper about 4 hours ago

UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation

ganlinyang updated a collection 1 day ago


Papers

InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision

VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs


LiruiZhao

authored 2 papers about 4 hours ago

$χ_{0}$: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies

Paper • 2602.09021 • Published 8 days ago • 25

UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation

Paper • 2506.17202 • Published Jun 20, 2025 • 10

prithivMLmods

posted an update 1 day ago

Post

1321

Dropping the Qwen3 VL Series of Unredacted MAX-VL models. These models have undergone multi-stage training to minimize refusal rates through continuous abliterated optimization. You can find the models in BF16, FP8-Dynamic, and GGUF formats at the links below.🔥🚀

Unredacted MAX - VL:
➜ prithivMLmods/Qwen3-VL-4B-Instruct-Unredacted-MAX
➜ prithivMLmods/Qwen3-VL-4B-Thinking-Unredacted-MAX
➜ prithivMLmods/Qwen3-VL-8B-Instruct-Unredacted-MAX
➜ prithivMLmods/Qwen3-VL-8B-Thinking-Unredacted-MAX

Unredacted MAX - VL [FP8]
➜ prithivMLmods/Qwen3-VL-4B-Instruct-Unredacted-MAX-FP8
➜ prithivMLmods/Qwen3-VL-4B-Thinking-Unredacted-MAX-FP8
➜ prithivMLmods/Qwen3-VL-8B-Instruct-Unredacted-MAX-FP8
➜ prithivMLmods/Qwen3-VL-8B-Thinking-Unredacted-MAX-FP8

Unredacted MAX - VL [GGUF]
➜ prithivMLmods/Qwen3-VL-4B-Instruct-Unredacted-MAX-GGUF
➜ prithivMLmods/Qwen3-VL-4B-Thinking-Unredacted-MAX-GGUF
➜ prithivMLmods/Qwen3-VL-8B-Instruct-Unredacted-MAX-GGUF
➜ prithivMLmods/Qwen3-VL-8B-Thinking-Unredacted-MAX-GGUF

Unredacted MAX - VL [Collection]
➜ https://huggingface.co/collections/prithivMLmods/unredacted-max-vl-fp8
➜ https://huggingface.co/collections/prithivMLmods/unredacted-max-vl
➜ https://huggingface.co/collections/prithivMLmods/unredacted-max-vl-gguf

To learn more, visit the app page or the respective model pages.

ganlinyang

updated a collection 1 day ago

Vlaser

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning • 6 items • Updated 1 day ago • 4

heroding77

authored 2 papers 8 days ago

TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents

Paper • 2602.02196 • Published 15 days ago • 33

OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

Paper • 2602.05843 • Published 12 days ago • 57

prithivMLmods

posted an update 9 days ago

Post

2869

Introducing FLUX.2-Klein-LoRA-Studio, a demo for image editing using specialized LoRA adapters built for the FLUX.2-Klein-Distilled model. It features an edit-style gallery for multi-style image editing, including de-light, face swap, mannequin, and more. Try the demo below.

🤗Demo: prithivMLmods/FLUX.2-Klein-LoRA-Studio
🤗Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
🤗GitHub: https://github.com/PRITHIVSAKTHIUR/FLUX.2-Klein-LoRA-Studio

To learn more, visit the app page or the respective model pages.

Xrenya

in OpenGVLab/InternVideo2-Stage2_1B-224p-f4 11 days ago

Error when using model

#2 opened 11 days ago by

yangxue

submitted a paper to Daily Papers 12 days ago

RISE-Video: Can Video Generators Decode Implicit World Rules?

Paper • 2602.05986 • Published 12 days ago • 26

prithivMLmods

posted an update 12 days ago

Post

834

GLM OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. It delivers high accuracy and strong generalization with a blazing-fast inference pipeline. The demo is live. Try it now. 🤗🚀

✨ Demo: prithivMLmods/GLM-OCR-Demo
✨ Multimodal Implementations: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ GitHub: https://github.com/PRITHIVSAKTHIUR/GLM-OCR-Demo

Xrenya

in OpenGVLab/InternVL2_5-1B 13 days ago

Unable to load model on Google Colab notebook

#8 opened 14 days ago by

prithivMLmods

posted an update 14 days ago

Post

2142

Introducing the Qwen-Image-Edit-3D-Lighting-Control app, featuring 8× horizontal and 3× elevational lighting positions for precise 3D lighting control. It enables studio-level lighting using fast Qwen Image Edit inference, paired with Multi-Angle-Lighting adapters. 🔦

🔥 Space: prithivMLmods/Qwen-Image-Edit-3D-Lighting-Control
✅ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
📂 GitHub: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-3D-Lighting-Control

ynhe

in OpenGVLab/VideoChat-Flash-Qwen2_5-7B_InternVideo2-1B 19 days ago

Enable on CPU

#1 opened 19 days ago by

prithivMLmods

posted an update 19 days ago

Post

3628

Daggr UI version of the Qwen3-TTS demo, with nodes for custom voice, voice design, qwen3-asr, and voice cloning. 🔥
No remote Spaces are used for API inference; all functions run in-app.
Powered by t4-m and built with daggr@0.5.2 and gradio@6.

👉Demo: prithivMLmods/Qwen3-TTS-Daggr-UI
⭐Github: https://github.com/PRITHIVSAKTHIUR/Qwen3-TTS-Daggr-UI


kpzhang996

submitted a paper to Daily Papers 21 days ago

World Craft: Agentic Framework to Create Visualizable Worlds via Text

Paper • 2601.09150 • Published Jan 14 • 20

prithivMLmods

posted an update 22 days ago

Post

2684

Qwen-Image-Edit-Object-Manipulator Space is now featured in Hugging Face Space of the Week. It enables object manipulation such as extracting objects, adding designs, and removing objects or designs from the red highlighted area using specialized adapters.

🔥Do enjoy the demo! ~ prithivMLmods/Qwen-Image-Edit-Object-Manipulator

Collections:
🧨Adapters-1: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-exps
🧨Adapters-2: https://huggingface.co/collections/prithivMLmods/qie-jan-23-26
🧨Adapters-3: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-object-manipulator

⭐Github: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-Object-Manipulator

To learn more, visit the app page or the respective model pages.


kpzhang996

submitted a paper to Daily Papers 23 days ago

MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences

Paper • 2601.07251 • Published Jan 12 • 11

prithivMLmods

posted an update 25 days ago

Post

3041

Introducing QIE-2511-Zoom-Master for highlight-guided area zoom-in, enabling lossless zooming within a drawn square area, and QIE-2511-Object-Remover-v2 for precise object or highlight-guided area cleanup. These experimental adapters are built on QIE-2511. Find the adapters below.

🕹️QIE-2511-Zoom-Master: prithivMLmods/QIE-2511-Zoom-Master
🕹️QIE-2511-Object-Remover-v2: prithivMLmods/QIE-2511-Object-Remover-v2

🤗Demo: prithivMLmods/Qwen-Image-Edit-Object-Manipulator

📂Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-exps

To learn more, visit the app page or the respective model pages.


Eurayka

authored a paper 28 days ago

LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning

Paper • 2601.10129 • Published Jan 15 • 11
