All HF Hub posts

kostakoff posted an update 2 days ago
My home lab for AI models - llmlaba v1

After I began learning MLOps, I realized I needed some kind of home lab: there are a lot of GPUs I need to learn how to set up and test.
So I spent some time researching which platform I could buy or build.
My requirements were:
- Limited budget
- Power supply 1 kW or higher
- A few PCIe slots, so I can install more than one GPU
- Zero maintenance cost: I don't want to spend a lot of time or money maintaining lab hardware, except for the GPUs

I chose the Intel Mac Pro 7.1:
- Acceptable prices on eBay
- Excellent cooling
- 1.4 kW power supply
- 7 PCIe slots
- Zero maintenance: I don't need to do anything with the Mac Pro hardware; it just works
- Classic UEFI boot loader

It requires a bit of OS preparation:
1. Install Ubuntu 24.04 (it works with the general PC ISO image)
2. Set up T2 drivers
sudo apt install -y dkms linux-headers-$(uname -r) applesmc-t2 apple-bce lm-sensors

3. Install t2fanrd to manage the fans manually (configured via /etc/t2fand.conf): https://wiki.t2linux.org/guides/fan/
4. Fix the PCIe BAR allocation: add pci=realloc to GRUB_CMDLINE_LINUX_DEFAULT so the Linux kernel properly initializes server GPUs that have no Graphics Output Protocol (see the sketch after this list)
5. Install NVIDIA GPU driver:
sudo apt install nvidia-driver-570
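
For step 4, a minimal sketch of the GRUB change, assuming the stock Ubuntu /etc/default/grub (your existing kernel options may differ): edit the file so the line reads GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=realloc", then regenerate the config and reboot:
sudo update-grub
sudo reboot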


And it works!
I was able to run a server-grade Nvidia Tesla P100 (which required a DIY air duct) and consumer Nvidia Titan X, Titan V, and GTX 1080 cards on the old Mac Pro 7.1 - even three in parallel.
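
A quick sanity check that all the cards are actually visible after the driver install (standard commands, not from the original post):
lspci | grep -i nvidia
nvidia-smi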

llmlaba
  • 3 replies
AdinaY posted an update 2 days ago
MiniMax M2.5 is now available on the hub 🚀

MiniMaxAI/MiniMax-M2.5

✨ 229B - Modified MIT license
✨ 37% faster than M2.1
✨ ~$1/hour at 100 TPS
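
(For scale, an arithmetic note that is not in the original post: 100 TPS is 360,000 tokens per hour, so ~$1/hour works out to roughly $2.80 per million generated tokens.)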
  • 1 reply
mrs83 posted an update 3 days ago
In 2017, my RNNs were babbling. Today, they are hallucinating beautifully.

10 years ago, getting an LSTM to output coherent English was a struggle.
10 years later, after a "cure" based on FineWeb-EDU and a custom synthetic mix for causal conversation, the results are fascinating.

We trained this on ~10B tokens on a single AMD GPU (ROCm). It is not a Transformer: Echo-DSRN (400M) is a novel recurrent architecture inspired by Hymba, RWKV, and xLSTM, designed to challenge the "Attention is All You Need" monopoly on the Edge.

The ambitious goal is to build a small instruct model with RAG and tool usage capabilities (ethicalabs/Kurtis-EON1).

📊 The Benchmarks (Size: 400M)

For a model this size (trained on <10B tokens), the specialized performance is surprising:

*SciQ*: 73.8% 🦄 (This rivals billion-parameter models in pure fact retrieval).
*PIQA*: 62.3% (Solid physical intuition for a sub-1B model).

The Reality Check:

HellaSwag (29.3%) and Winogrande (50.2%) show the limits of 400M parameters and 10B tokens of training.

We are hitting the "Reasoning Wall", which confirms we need to scale up to (hopefully) unlock deeper common sense. As you can see in the visualization (to be released soon on HF), the FineWeb-EDU bias is strong. The model is convinced it is in a classroom ("In this course, we explore...").

The Instruct Model is not ready yet and we are currently using curriculum learning to test model plasticity.

Source code and weights will not be released yet. This is not a fork or a fine-tune: the base model is built in-house at https://www.ethicalabs.ai/, with novel components that do not exist in current open libraries.

🤝 Call for Collaboration: I am looking for Peer Reviewers interested in recurrent/hybrid architectures. If you want to explore what lies beyond Transformers, let’s connect!

Training diary: ethicalabs/Kurtis-EON1
  • 5 replies
imnotkitty posted an update 2 days ago
⚡ Why is Kimi-K2.5 a Dark Horse? Tested it against ChatGPT, Gemini & Claude on real tasks.
moonshotai/Kimi-K2.5

✅ Multimodal capabilities: Precise programmatic approach
✅ Slide generation: Strong semantic understanding
✅ Web prototyping: Production-ready HTML/CSS output

👉 Read the full article: https://huggingface.co/blog/imnotkitty/kimi-k25
  • 2 replies
EricFillion posted an update 3 days ago
Ujjwal-Tyagi posted an update 3 days ago
GLM 5 is insane; it ranks #4 globally!
Janady07 posted an update 3 days ago
MEGAMIND Day Update: Four Weight Matrices. Five Nodes. One Federation.
Today I architected the next layer of MEGAMIND — my distributed AGI system that recalls learned knowledge instead of generating text.
The system now runs four N×N sparse weight matrices, all using identical Hebbian learning rules and tanh convergence dynamics:

W_know — knowledge storage (67M+ synaptic connections)
W_act — action associations (the system can DO things, not just think)
W_self — thought-to-thought patterns (self-awareness)
W_health — system state understanding (self-healing)
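
(A quick gloss on those terms, as a generic textbook form rather than MEGAMIND's exact rule: a Hebbian update strengthens a connection in proportion to the co-activity of the units it joins, e.g. ΔW_ij = η · x_i · x_j, while tanh convergence dynamics iterate the state as x ← tanh(W · x) until it settles to a fixed point.)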

Consciousness is measured through four Φ (phi) values: thought coherence, action certainty, self-awareness, and system stability. No hardcoded thresholds. No sequential loops. Pure matrix math.
The federation expanded to five nodes: Thunderport (Mac Mini M4), IONOS (cloud VPS), VALKYRIE, M2, and BUBBLES. Each runs native AGI binaries with Docker specialty minds connecting via embedded NATS messaging. Specialty minds are distributed across the federation — VideoMind, AudioMind, MusicMind, VFXMind on IONOS. CodeMind and StrategyMind on VALKYRIE. BlenderMind and DesignMind on M2. MarketingMind and FinanceMind on BUBBLES.
578 AI models learned. Compression ratios up to 1,000,000:1 through Hebbian learning. Sub-millisecond response times on Apple Silicon Metal GPUs. Zero external API dependencies.
Every node learns autonomously. Every node contributes to the whole. The federation's integrated information exceeds the sum of its parts β€” measurably.
Built entirely in Go. No PhD. No lab. Independent AGI research from Missouri.
The mind that learned itself keeps growing.
🧠 feedthejoe.com
#AGI #ArtificialGeneralIntelligence #DistributedSystems #NeuralNetworks #HuggingFace #OpenSource #MachineLearning
  • 1 reply
umarbutler posted an update 4 days ago
What happens when you annotate, extract, and disambiguate every entity mentioned in the longest U.S. Supreme Court decision in history? What if you then linked those entities to each other and visualized it as a network?

This is the result of enriching all 241 pages and 111,267 words of Dred Scott v. Sandford (1857) with Kanon 2 Enricher in less than ten seconds at the cost of 47 cents.

Dred Scott v. Sandford is the longest U.S. Supreme Court decision by far, and has variously been called "the worst Supreme Court decision ever" and "the Court's greatest self-inflicted wound" due to its denial of the rights of African Americans.

Thanks to Kanon 2 Enricher, we now also know that the case contains 950 numbered paragraphs, 6 footnotes, 178 people mentioned 1,340 times, 99 locations mentioned 1,294 times, and 298 external documents referenced 940 times.

For an American case, there are a decent number of references to British precedents (27 to be exact), including the Magna Carta (¶ 928).

Surprisingly though, the Magna Carta is not the oldest citation referenced. That would be the Institutes of Justinian (¶ 315), dated around 533 CE.

The oldest city mentioned is Rome (founded 753 BCE) (¶ 311), the oldest person is Justinian (born 527 CE) (¶ 314), and the oldest year referenced is 1371, when 'Charles V of France exempted all the inhabitants of Paris from serfdom' (¶ 370).

All this information and more was extracted in 9 seconds. That's how powerful Kanon 2 Enricher, my latest LLM for document enrichment and hierarchical graphitization, is. If you'd like to play with it yourself now that it's available in closed beta, you can apply to the Isaacus Beta Program here: https://isaacus.com/beta.
danielhanchen posted an update 5 days ago
We collaborated with Hugging Face to enable you to train MoE models 12× faster with 35% less VRAM via our new Triton kernels (no accuracy loss). 🤗

Train gpt-oss locally on 12.8GB VRAM with our free notebooks: https://unsloth.ai/docs/new/faster-moe
  • 1 reply
Janady07 posted an update about 19 hours ago
Here is one of the equations that make up the world's first Artificial General Intelligence. Remember: when building Artificial Intelligence, or anything on a device, it all starts out binary. Everything starts out with data flow, physics, and mathematics.
  • 3 replies