Articles

DeepSeek claims AI model trained for just $294,000, challenging U.S. rivals

Chinese AI developer DeepSeek has disclosed that its reasoning-focused R1 model cost just $294,000 to train—dramatically below the hundreds of millions reportedly spent by U.S. leaders such as OpenAI. The figure, revealed in a Nature article co-authored by founder Liang Wenfeng, is the company’s first public estimate of training costs and is likely to reignite debate over China’s position in the global AI race.

According to the paper, R1 was trained on a cluster of 512 Nvidia H800 chips over 80 hours. DeepSeek acknowledged for the first time that it also owns Nvidia A100 GPUs, which were used in preparatory phases before training shifted to the China-specific H800s. The H800 was designed to comply with U.S. export restrictions that bar Nvidia from selling its more powerful H100 and A100 chips to China.

The cost revelation is striking: OpenAI CEO Sam Altman has said foundational models cost “much more” than $100 million to train, though OpenAI has never published detailed figures. DeepSeek’s claim of drastically lower costs fueled January’s investor selloff in global tech stocks, amid fears it could disrupt the market dominance of Nvidia and other AI giants.

Skepticism remains. U.S. officials have suggested DeepSeek may have obtained H100 chips despite restrictions, while U.S. companies have questioned whether its development relied on model distillation—a technique where one AI model learns from another. DeepSeek has admitted using Meta’s open-source Llama models and said its training data may have included content generated by OpenAI systems, though it insists this was incidental.

DeepSeek defends distillation as an efficient way to cut costs and expand access to AI by reducing the enormous energy and resource demands of large-scale training. Analysts note this could accelerate the spread of competitive AI models outside the U.S., though questions about intellectual property and national security will remain central to the debate.
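Model distillation, in its classic formulation, trains a smaller "student" model to reproduce the softened output distribution of a larger "teacher," which is far cheaper than training the student from scratch on raw data. The sketch below is a minimal, generic illustration of that soft-target loss (temperature-scaled KL divergence); it is not DeepSeek's actual training pipeline, and the temperature value is an arbitrary choice for illustration:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T yields a softer distribution."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL divergence from the teacher's softened distribution to the
    student's, scaled by T**2 so gradients stay comparable across T."""
    p = softmax(teacher_logits, T)  # teacher "soft targets"
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    return (T ** 2) * kl.mean()

# A student that already matches the teacher incurs (near-)zero loss;
# any mismatch in the predicted distributions produces a positive loss.
teacher = np.array([[2.0, 0.5, -1.0]])
student = np.array([[0.0, 0.0, 0.0]])
print(distillation_loss(teacher, teacher))  # ~0.0
print(distillation_loss(teacher, student))  # > 0
```

In practice the student minimizes this loss (often mixed with a standard cross-entropy term on ground-truth labels) over the teacher's outputs on a large corpus, which is why distillation cuts both compute and data-labeling costs.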

OpenAI Launches Open-Weight Reasoning Models Optimized for Laptop Use

OpenAI announced on Tuesday the release of two open-weight language models designed for advanced reasoning tasks and optimized to run efficiently on laptops, delivering performance comparable to its smaller proprietary reasoning models. Unlike fully open-source models, open-weight models provide publicly accessible trained parameters (weights) but do not include full source code or training data, allowing developers to run and fine-tune them locally or behind their own firewalls.

OpenAI co-founder Greg Brockman highlighted that the ability to operate these models locally offers users greater control over security and infrastructure. The two models, gpt-oss-120b and gpt-oss-20b, differ in size: the larger model runs on a single GPU, while the smaller one can run directly on personal computers. Both excel at coding, competitive mathematics, and health-related questions, having been trained on text-focused datasets with an emphasis on science and math.

Separately, Amazon Web Services (AWS) announced that OpenAI’s open-weight models are now available on its Bedrock generative AI marketplace—a first for OpenAI on the platform. Bedrock director Atul Deo praised the models as strong open-weight options for AWS customers.

This launch marks OpenAI’s first release of open models since GPT-2 in 2019, entering a competitive landscape that includes Meta’s Llama series and China’s DeepSeek-R1, both of which have influenced open-weight and open-source AI development trajectories this year.

OpenAI, backed by Microsoft and valued at around $300 billion, is currently seeking to raise up to $40 billion in a funding round led by SoftBank Group.

U.S. Senators Call for Probe into Data Security Risks of Chinese AI Model DeepSeek

A group of seven Republican U.S. senators led by Ted Budd urged the Commerce Department on Tuesday to investigate potential data security risks associated with Chinese open-source AI models such as DeepSeek.

The senators—including Jon Husted, Todd Young, John Cornyn, John Curtis, Bill Cassidy, and Marsha Blackburn—requested an assessment of whether applications built on DeepSeek collect user data and transmit it to servers in China, and whether these AI models share American personal or corporate information with China's military or military-linked companies.

Their letter also asked whether Chinese open-source models had gained improper access to export-controlled semiconductors, or had breached the usage terms of U.S. AI models in order to advance Chinese AI capabilities.

Bipartisan legislation has been proposed to ban DeepSeek’s use on federal government devices and networks, as well as prohibit its use by federal contractors in government projects.

Commerce Secretary Howard Lutnick stated in January that DeepSeek appeared to have misappropriated U.S. AI technology and promised to enforce restrictions. The Commerce Department did not immediately respond to requests for comment.

In June, Reuters reported that DeepSeek was assisting China’s military and intelligence services and was attempting to use Southeast Asian shell companies to obtain advanced semiconductors barred from shipment to China under U.S. export rules.

These developments underscore growing skepticism in Washington over DeepSeek’s rapid rise, with officials suggesting the Chinese firm’s AI prowess heavily depends on U.S. technology.

Based in Hangzhou, DeepSeek shocked the tech world in January by claiming its AI reasoning models matched or outperformed leading U.S. models at a fraction of the cost.