FineData


AI & ML interests

We release large pre-training datasets to accelerate open LLM development. We are part of the Hugging Face Science team (hf.co/science).

Recent Activity

cfahlgren1 submitted a paper 8 days ago
How AI Impacts Skill Formation
hynky updated a Space 27 days ago
HuggingFaceFW/README
hynky updated a collection 28 days ago
📄 FinePDFs

HuggingFaceFW's collections (7)

📚 FineWeb-Edu
FineWeb-Edu datasets, classifier and ablation model
🧪 FineWeb v1 data experiments
Ablation models trained for our data experiments.
📀 Dataset comparison models
1.8B models trained on 350BT (350 billion tokens) to compare different pretraining datasets