RALEIGH - When DeepSeek, a Chinese artificial intelligence company, released its new R1 model on January 20, it took some time for the rest of the industry to register the impact. While many tech giants were proudly sharing their multi-billion-dollar investments in AI model development and infrastructure, China had quietly built a new reasoning model rivaling OpenAI's offerings for a measly $5.567 million. The model is also notable for requiring far less energy to use, a concern that has grown as models to this point have demanded ever-higher levels of compute.
How did they do it?
One of the most surprising aspects of the news was that China was able to achieve this level of model sophistication while facing a chip ban by the U.S. In an attempt to restrict or delay Chinese development in artificial intelligence and military applications, the U.S. has banned the sale of certain types of chips to more than 100 Chinese companies. However, in this case, the ban seems to have motivated the Chinese to find unique workarounds.
DeepSeek's newest model uses a combination of innovative solutions. The first is a "mixture of experts," or MoE, an architecture that divides a model's knowledge into specialized areas, or "experts," only some of which are needed for any single engagement. A model may then activate just one or two "experts" for a given prompt, narrowing the compute required and the model's overhead. The new reasoning model also uses multi-head latent attention, or MLA, a major innovation in model architecture that improves how the data required for large context windows is managed.
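To make the routing idea concrete, here is a minimal, purely illustrative sketch of top-k mixture-of-experts routing in Python. The expert count, dimensions, and random weights are invented for the example and do not reflect DeepSeek's actual architecture; the point is only that the gate selects a small subset of experts per token, so most of the layer's parameters are never evaluated.

```python
# Toy top-k "mixture of experts" routing sketch (illustrative only; all
# weights are random placeholders, not any real model's parameters).
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # total experts in the layer
TOP_K = 2         # experts actually run per token
DIM = 4           # toy hidden dimension

# Each "expert" is just a random linear map for illustration.
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(xs):
    mx = max(xs)
    exps = [math.exp(x - mx) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token):
    # 1. Gate scores: how relevant each expert looks for this token.
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate]
    # 2. Keep only the top-k experts; the rest are never evaluated,
    #    which is where the compute savings come from.
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    weights = softmax([scores[i] for i in top])
    # 3. Output is the weighted sum of just those experts' outputs.
    out = [0.0] * DIM
    for w, i in zip(weights, top):
        y = matvec(experts[i], token)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top

output, used = moe_forward([0.5, -1.0, 0.3, 0.8])
print(f"ran {len(used)}/{NUM_EXPERTS} experts: {sorted(used)}")
```

In a full model, this routing happens independently at every layer and for every token, so the savings compound: a model can hold many experts' worth of parameters while paying the runtime cost of only a couple.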
Chris Hazard, the CTO and co-founder of Raleigh's Howso, talked about how these improvements have changed the "shape" of the models we work with. By removing the broad details that may not be needed and focusing instead on "experts," we build a narrower model that is ideal for specific areas of expertise. This matches much of what businesses are clamoring for: reliable, innovative models built for their use case. Up to this point, however, the price tag looked much higher.
I asked Hazard what he thought the impact of DeepSeek's model might be for the companies that had been pouring money into the status quo solutions. He suggested there was still plenty of time for companies to "pivot" in their AI strategies.
"What does it make obsolete? Maybe [there's] a little bit of egg on your face if you're building this, you shouldn't keep doing that," Hazard said. "So it's going to change some plans. 'Maybe we lose a little bit of our investment here, but if we pivot it towards building this way more efficiently, what else can we do with that?'"
Hazard did note that the release of DeepSeek also has an impact on the market. Other companies, such as OpenAI, are selling AI services and reasoning models, while DeepSeek's models are freely available to the general public. Everyone from investors to end users may now wonder: what are these companies selling, and what is it really worth?
Foreign impact
DeepSeek is owned by High-Flyer, a Chinese hedge fund, which should also be a consideration. As Hazard pointed out, High-Flyer has very different motivations than a traditional AI firm, which may have informed when R1 was released, among other things.
"Is the product really just to make vibrations and movements in the market so that they can capitalize on it? Because that's what their firm is really good at. It's a different business model than selling AI as a service."
There are also good questions to ask about the impact of using a Chinese-owned model. The U.S. Supreme Court clearly stated that TikTok's availability was a risk to national security. It seems likely the use of DeepSeek models would also be regulated or restricted.
Beyond that, DeepSeek has released R1 as an "open source" model, meaning it shares access to how the model was developed and the "weights" that control it. While that could conceivably be modified to remove Chinese influence, Hazard confirmed that there would be no way to be absolutely sure that a model wouldn't share data back to its original creators.
Continuing development
On January 21, leaders of several major companies, including OpenAI, Oracle, and SoftBank, stood alongside President Trump to announce the creation of Stargate, a new company planning to pour $100 billion into AI infrastructure, with up to $500 billion invested over the next four years. While the federal government is reportedly not directly funding the new company, it is supporting the initiative through policy and infrastructure facilitation. Though the announcement came a day after the release of DeepSeek's R1, it appears to be unrelated.
Hazard's co-founder at Howso, Mike Capps, was part of the Senate AI Forum last year, and I asked Hazard if he had heard anything about the new administration's plans for AI or how the infrastructure might be developed. At this point, it seems to be only rumors and speculation.
"There's probably a small handful of people who have an idea, and that's it," Hazard said.
While there are still many artificial intelligence companies out there, it's hard to deny that power is already centralized in the hands of a few companies and their models: OpenAI, Meta, Anthropic, and Google. Putting so much power to develop infrastructure in the hands of OpenAI and Sam Altman may lead to additional advantages for the industry leader.
But it's just that kind of power that makes the emergence of DeepSeek more impressive. DeepSeek took advantage of its limitations to think more creatively and delivered a more innovative, efficient solution. In a perfect world, Hazard would like to see federal funding go to a broad range of AI-related investments, including research and start-up support. He feels it's critical to support an open market in AI.
"How do we keep competition alive and open? And then also, how do we keep people in control of their data?" Hazard asked. "I think there's a lot of pieces to be solved around all of this, and I think that we should be investing in all of those."