Date: 2024-07-17

Hugging Face today unveiled SmolLM, a new family of compact language models that surpass similar offerings from Microsoft, Meta, and Alibaba’s Qwen in performance. These models bring advanced AI capabilities to personal devices without sacrificing performance or privacy.

The SmolLM lineup features three sizes (135 million, 360 million, and 1.7 billion parameters) designed to accommodate various computational resources. Despite their small footprint, these models have demonstrated superior results on benchmarks testing common sense reasoning and world knowledge.

Loubna Ben Allal, lead ML engineer on SmolLM at Hugging Face, emphasized the efficacy of targeted, compact models in an interview with VentureBeat. “We don’t need big foundational models for every task, just like we don’t need a wrecking ball to drill a hole in a wall,” she said. “Small models designed for specific tasks can accomplish a lot.”

The smallest model, SmolLM-135M, outperforms Meta’s MobileLM-125M despite training on fewer tokens. SmolLM-360M surpasses all models under 500 million parameters, including offerings from Meta and Qwen. The flagship SmolLM-1.7B model beats Microsoft’s Phi-1.5, Meta’s MobileLM-1.5B, and Qwen2-1.5B across multiple benchmarks.

Hugging Face distinguishes itself by making the entire development process open-source, from data curation to training steps. This transparency aligns with the company’s commitment to open-source values and reproducible research.

The models owe their impressive performance to meticulously curated training data. SmolLM builds on the Cosmo-Corpus, which includes Cosmopedia v2 (synthetic textbooks and stories), Python-Edu (educational Python samples), and FineWeb-Edu (curated educational web content).

