ChatGPT Maker Suspects China’s Dirt Cheap DeepSeek AI Models Were Built Using OpenAI Data — and the Irony Is Not Lost on the Internet

by David, Mar 04, 2025

OpenAI suspects that China's DeepSeek AI models, which are significantly cheaper than Western alternatives, were developed using OpenAI's data. The suspicion comes as DeepSeek's rapid rise in popularity triggered a stock market downturn for major AI players. Nvidia, a key supplier of GPUs for AI, suffered the largest single-day loss of market value in Wall Street history, shedding almost $600 billion. Other companies, including Microsoft, Meta, Alphabet, and Dell, also saw significant drops.

DeepSeek's R1 model, built on the open-source DeepSeek-V3, boasts significantly lower training costs (estimated at $6 million) and computational requirements than Western counterparts such as the models behind ChatGPT. Although some dispute this claim, it has fueled investor concerns about the massive sums American tech companies are investing in AI.

OpenAI and Microsoft are investigating whether DeepSeek violated OpenAI's terms of service by employing "distillation," a technique in which a smaller model is trained on the outputs of a larger one. OpenAI said it is aware of such attempts by Chinese and other companies to leverage leading US AI technology. The company emphasized its commitment to protecting its intellectual property and said it is collaborating with the US government to safeguard advanced AI models.

David Sacks, President Trump's AI czar, has backed the claim that data was extracted from OpenAI's models, and predicts that leading AI companies will implement preventative measures against distillation in the coming months.

The situation highlights a significant irony: OpenAI, itself accused of using copyrighted internet data to train ChatGPT, is now accusing DeepSeek of similar practices. The apparent hypocrisy has been widely noted on social media, particularly given OpenAI's statement in January 2024 that creating AI models like ChatGPT without copyrighted material is impossible. OpenAI made that statement in a submission to the UK's House of Lords, echoing its defense against copyright infringement lawsuits brought by the New York Times and by 17 authors. These lawsuits, along with a 2018 US Copyright Office ruling that AI-generated art is not copyrightable, underscore the ongoing legal and ethical debates surrounding AI training data.