RALEIGH - When DeepSeek, a Chinese artificial intelligence company, released its new R1 model on January 20, it took some time for the rest of the industry to register the impact. While many tech giants were proudly sharing their multi-billion-dollar investments in AI model development and infrastructure, China had quietly built a new reasoning model rivaling OpenAI's offerings for a measly $5.567 million. The model is also notable for requiring far less energy to use, a concern that has grown as models to this point have demanded ever-higher levels of compute.
How did they do it?
One of the most surprising aspects of the news was that China was able to achieve this level of model sophistication while facing a chip ban by the U.S. In an attempt to restrict or delay Chinese development in artificial intelligence and military applications, the U.S. has banned the sale of certain types of chips to more than 100 Chinese companies. However, in this case, the ban seems to have motivated the Chinese to find unique workarounds.
DeepSeek's newest model uses a combination of innovative solutions. The first is a "mixture of experts," or MoE, an architecture that divides a model's knowledge into specialized areas, or "experts," only some of which are needed for any single engagement. A model may then activate just one or two "experts" for a given prompt, narrowing the compute required and the model's overhead. The new reasoning model also uses multi-head latent attention, or MLA, a major innovation in model architecture that improves how the data required for large context windows is managed.
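To make the routing idea concrete, here is a minimal, purely illustrative sketch of top-k mixture-of-experts routing in Python. The expert count, dimensions, and random weights are invented for the example and do not reflect DeepSeek's actual architecture; the point is only that the gate selects a small subset of experts per token, so most of the layer's parameters are never evaluated.

```python
# Toy top-k "mixture of experts" routing sketch (illustrative only; all
# weights are random placeholders, not any real model's parameters).
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # total experts in the layer
TOP_K = 2         # experts actually run per token
DIM = 4           # toy hidden dimension

# Each "expert" is just a random linear map for illustration.
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(xs):
    mx = max(xs)
    exps = [math.exp(x - mx) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token):
    # 1. Gate scores: how relevant each expert looks for this token.
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate]
    # 2. Keep only the top-k experts; the rest are never evaluated,
    #    which is where the compute savings come from.
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    weights = softmax([scores[i] for i in top])
    # 3. Output is the weighted sum of just those experts' outputs.
    out = [0.0] * DIM
    for w, i in zip(weights, top):
        y = matvec(experts[i], token)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top

output, used = moe_forward([0.5, -1.0, 0.3, 0.8])
print(f"ran {len(used)}/{NUM_EXPERTS} experts: {sorted(used)}")
```

In a full model, this routing happens independently at every layer and for every token, so the savings compound: a model can hold many experts' worth of parameters while paying the runtime cost of only a couple.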
Chris Hazard, the CTO and co-founder of Raleigh's Howso, talked about how these improvements have changed the "shape" of the models we work with. By removing the broad details that may not be needed and focusing instead on "experts," we build a narrower model that is ideal for specific areas of expertise. This matches much of what businesses are clamoring for: reliable, innovative models built for their use case. Up to this point, however, the price tag looked much higher.
I asked Hazard what he thought the impact of DeepSeek's model might be for the companies that had been pouring money into the status quo solutions. He suggested there was still plenty of time for companies to "pivot" in their AI strategies.
"What does it make obsolete? Maybe [there's] a little bit of egg on your face if you're building this, you shouldn't keep doing that," Hazard said. "So it's going to change some plans. 'Maybe we lose a little bit of our investment here, but if we pivot it towards building this way more efficiently, what else can we do with that?'"
Hazard did note that the release of DeepSeek also has an impact on the market. Other companies, such as OpenAI, are selling AI services and reasoning models, while DeepSeek's models are freely available to the general public. Everyone from investors to end users may now wonder: what are these companies selling, and what is it really worth?
Foreign impact
DeepSeek is owned by High-Flyer, a Chinese hedge fund, which should also be a consideration. As Hazard pointed out, High-Flyer has very different motivations than a traditional AI firm, which may have informed when R1 was released, among other things.
"Is the product really just to make vibrations and movements in the market so that they can capitalize on it? Because that's what their firm is really good at. It's a different business model than selling AI as a service."
There are also good questions to ask about the impact of using a Chinese-owned model. The U.S. Supreme Court clearly stated that TikTok's availability was a risk to national security. It seems likely the use of DeepSeek models would also be regulated or restricted.
Beyond that, DeepSeek has released R1 as an "open source" model, meaning it shares access to how the model was developed and the "weights" that control it. While that could conceivably be modified to remove Chinese influence, Hazard confirmed that there would be no way to be absolutely sure that a model wouldn't share data back to its original creators.
Continuing development
On January 21, leaders of several major companies, including OpenAI, Oracle, and SoftBank, stood alongside President Trump to announce the creation of Stargate, a new company planning to pour $100 billion into AI infrastructure, with up to $500 billion invested over the next four years. While the federal government is reportedly not directly funding the new company, it is supporting the initiative through policy and infrastructure facilitation. Though the announcement came a day after the release of DeepSeek's R1, it appears to be unrelated.
Hazard's co-founder at Howso, Mike Capps, was part of the Senate AI Forum last year, and I asked Hazard if he had heard anything about the new administration's plans for AI or how the infrastructure might be developed. At this point, it seems to be only rumors and speculation.
"There's probably a small handful of people who have an idea, and that's it," Hazard said.
While there are still many artificial intelligence companies out there, it's hard to deny that power is already centralized in the hands of a few companies and their models: OpenAI, Meta, Anthropic, and Google. Putting so much power to develop infrastructure in the hands of OpenAI and Sam Altman may lead to additional advantages for the industry leader.
But it's just that kind of power that makes the emergence of DeepSeek more impressive. DeepSeek took advantage of its limitations to think more creatively and delivered a more innovative, efficient solution. In a perfect world, Hazard would like to see federal funding go to a broad range of AI-related investments, including research and start-up support. He feels it's critical to support an open market in AI.
"How do we keep competition alive and open? And then also, how do we keep people in control of their data?" Hazard asked. "I think there's a lot of pieces to be solved around all of this, and I think that we should be investing in all of those."