
New open-source math model Light-R1-32B surpasses equivalent DeepSeek performance with only $1000 in training costs



Researchers have introduced Light-R1-32B, a new open-source AI model optimized to solve advanced math problems. It is now available on Hugging Face under a permissive Apache 2.0 license — free for enterprises and researchers to take, deploy, fine-tune or modify as they wish, even for commercial purposes.
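Because the weights are on Hugging Face, the model can be loaded with the standard transformers API. The sketch below is illustrative rather than an official quickstart; the repository ID is an assumption and should be checked against the actual model card.

```python
# Minimal sketch: load Light-R1-32B with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qihoo360/Light-R1-32B"  # assumed repo ID; verify on the model card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # shard layers across available GPUs
)

messages = [{"role": "user", "content": "Find the remainder when 2^100 is divided by 7."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=2048)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```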

The 32-billion-parameter model (parameters are the model's internal settings) surpasses similarly sized, and even larger, open-source models such as DeepSeek-R1-Distill-Llama-70B and DeepSeek-R1-Distill-Qwen-32B on the third-party American Invitational Mathematics Examination (AIME) benchmark, which contains 15 math problems designed for extremely advanced students and allots a three-hour time limit.

Developed by Liang Wen, Fenrui Xiao, Xin He, Yunke Cai, Qi An, Zhenyu Duan, Yimin Du, Junchen Liu, Lifu Tang, Xiaowei Lv, Haosheng Zou, Yongchao Deng, Shousheng Jia and Xiangzheng Zhang, the model surpasses previous open-source alternatives on competitive math benchmarks.

Incredibly, the researchers completed the model’s training in fewer than six hours on 12 Nvidia H800 GPUs at an estimated total cost of $1,000. That speed and cost make Light-R1-32B one of the most accessible and practical approaches yet for developing high-performing, math-specialized AI models. However, it’s important to remember that the model was trained from a variant of Alibaba’s open-source Qwen2.5-32B-Instruct, which itself is presumed to have had much higher upfront training costs.

Alongside the model, the team has released its training datasets, training scripts and evaluation tools, providing a transparent and accessible framework for building math-focused AI models.

The arrival of Light-R1-32B follows similar efforts from rivals, such as Microsoft’s Orca-Math.

A new math king emerges

To help Light-R1-32B tackle complex mathematical reasoning, the researchers started from a base model that wasn’t equipped with long chain-of-thought (CoT) reasoning. They applied curriculum-based supervised fine-tuning (SFT) and direct preference optimization (DPO) to refine its problem-solving capabilities.
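The team’s released scripts define the actual pipeline; purely as a rough illustration of what a DPO stage looks like in code, here is a minimal sketch using Hugging Face’s trl library. The checkpoint name, dataset and hyperparameters below are placeholders, not the team’s settings.

```python
# Illustrative DPO stage with trl; not the Light-R1 team's actual script.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "qihoo360/Light-R1-SFT"  # hypothetical name for the SFT checkpoint

model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# DPO trains on preference pairs: a prompt plus a preferred and a rejected answer.
train_dataset = Dataset.from_dict({
    "prompt":   ["What is 7^2 - 3?"],
    "chosen":   ["7^2 = 49, and 49 - 3 = 46. The answer is 46."],
    "rejected": ["The answer is 44."],
})

args = DPOConfig(output_dir="light-r1-dpo", beta=0.1)  # illustrative hyperparameters
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older trl versions take tokenizer= instead
)
trainer.train()
```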

When evaluated, Light-R1-32B achieved 76.6 on AIME24 and 64.6 on AIME25, surpassing DeepSeek-R1-Distill-Qwen-32B, which scored 72.6 and 54.9, respectively.

This improvement suggests that the curriculum-based training approach effectively enhances mathematical reasoning, even when starting from models that initially lack long CoT.

Fair benchmarking

To ensure fair benchmarking, the researchers decontaminated training data against common reasoning benchmarks, including AIME24/25, MATH-500 and GPQA Diamond, preventing data leakage.
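The article doesn’t spell out the matching procedure, but a common way to decontaminate is n-gram overlap: drop any training example that shares a long enough word sequence with a benchmark question. A minimal sketch of that idea:

```python
# Hedged sketch of n-gram decontamination; the team's exact method may differ.
def ngrams(text: str, n: int = 8) -> set:
    """Word-level n-grams of a lowercased text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def decontaminate(train_examples: list, benchmark_questions: list, n: int = 8) -> list:
    """Drop training examples sharing any n-gram with a benchmark question."""
    benchmark_grams = set()
    for question in benchmark_questions:
        benchmark_grams |= ngrams(question, n)
    return [ex for ex in train_examples if not (ngrams(ex, n) & benchmark_grams)]
```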

They also implemented difficulty-based response filtering using DeepScaleR-1.5B-preview, ultimately forming a 76,000-example dataset for the first stage of supervised fine-tuning. A second, more challenging dataset of 3,000 examples further improved performance.
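One plausible way to implement such a filter is to sample several answers per problem from the small model and keep only the problems it rarely solves. In this sketch, `sample_answer` and `is_correct` are hypothetical helpers standing in for generation and answer checking:

```python
# Hedged sketch of difficulty-based filtering via a small model's pass rate.
def keep_hard_problems(problems, sample_answer, is_correct,
                       k: int = 8, max_pass_rate: float = 0.5):
    """Keep (problem, reference) pairs the small model solves under max_pass_rate of the time."""
    hard = []
    for problem, reference in problems:
        passes = sum(is_correct(sample_answer(problem), reference) for _ in range(k))
        if passes / k < max_pass_rate:
            hard.append((problem, reference))
    return hard
```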

After training, the team merged multiple trained versions of Light-R1-32B, leading to additional gains. Notably, the model maintains strong generalization abilities on scientific reasoning tasks (GPQA), despite being math-specialized.
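The article doesn’t detail the merging recipe. A simple, widely used option is element-wise weight averaging across checkpoints (a “model soup”); the sketch below assumes all checkpoints share one architecture and is only one way the merge could be done:

```python
# Hedged sketch: merge checkpoints by averaging their parameters element-wise.
from transformers import AutoModelForCausalLM

def average_checkpoints(checkpoint_ids):
    """Element-wise average of parameters across same-architecture checkpoints."""
    merged = AutoModelForCausalLM.from_pretrained(checkpoint_ids[0])
    dtypes = {k: v.dtype for k, v in merged.state_dict().items()}
    avg = {k: v.float().clone() for k, v in merged.state_dict().items()}
    for ckpt_id in checkpoint_ids[1:]:
        other = AutoModelForCausalLM.from_pretrained(ckpt_id).state_dict()
        for k in avg:
            avg[k] += other[k].float()
    n = len(checkpoint_ids)
    merged.load_state_dict({k: (v / n).to(dtypes[k]) for k, v in avg.items()})
    return merged
```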

How enterprises can benefit

Light-R1-32B is released under the Apache License 2.0, a permissive open-source license that allows free use, modification and commercial deployment without requiring derivative works to be open-sourced. This makes it an attractive option for enterprises, AI developers and software engineers looking to integrate or customize the model for proprietary applications.

The license also includes a royalty-free, worldwide patent grant, reducing legal risks for businesses while discouraging patent disputes. Companies can freely deploy Light-R1-32B in commercial products, maintaining full control over their innovations while benefiting from an open and transparent AI ecosystem.

For CEOs, CTOs and IT leaders, Apache 2.0 ensures cost efficiency and vendor independence, eliminating licensing fees and restrictive dependencies on proprietary AI solutions. AI developers and engineers gain the flexibility to fine-tune, integrate and extend the model without limitations, making it ideal for specialized math reasoning, research and enterprise AI applications.

However, as the license provides no warranty or liability coverage, organizations should conduct their own security, compliance and performance assessments before deploying Light-R1-32B in critical environments.

Transparency in low-cost training and optimization for math problem solving

The researchers emphasize that Light-R1-32B provides a validated, cost-effective way to train strong long CoT models in specialized domains.

By sharing their methodology, training data and code, they aim to lower cost barriers for high-performance AI development. Looking ahead, they plan to explore reinforcement learning (RL) to further enhance the model’s reasoning capabilities.


