
    DeepSeek Publishes New AI Training Method to Scale LLMs More Easily

    January 1, 2026

    DeepSeek got the year rolling with a new idea for training AI. And analysts say it could have a massive impact on the industry.

    The Chinese AI startup published a research paper on Wednesday, describing a method to train large language models that could shape “the evolution of foundational models,” it said.

    The paper, co-authored by its founder Liang Wenfeng, introduces what DeepSeek calls “Manifold-Constrained Hyper-Connections,” or mHC, a training approach designed to scale models without them becoming unstable or breaking altogether.

    As language models grow, researchers often try to improve performance by allowing different parts of a model to share more information internally. However, this increases the risk of the information becoming unstable, the paper said.

    DeepSeek’s latest research enables models to share richer internal communication in a constrained manner, preserving training stability and computational efficiency even as models scale, it added.

    Chong Ming Lee, Junior News Reporter at Business Insider's Singapore bureau.

    DeepSeek’s new method is a ‘striking breakthrough’

    Wei Sun, the principal analyst for AI at Counterpoint Research, told Business Insider on Friday that the approach is a “striking breakthrough.”

    DeepSeek combined various techniques to minimize the extra cost of training a model, Sun said. She added that even with a slight increase in cost, the new training method could yield much higher performance.

    Sun said the paper reads as a statement of DeepSeek’s internal capabilities. By redesigning the training stack end-to-end, the company is signaling that it can pair “rapid experimentation with highly unconventional research ideas.”

    DeepSeek can “once again, bypass compute bottlenecks and unlock leaps in intelligence,” she said, referring to its “Sputnik moment” in January 2025, when the company unveiled its R1 reasoning model.

    The launch shook the tech industry and the US stock market, showing that the R1 model could match top competitors, such as OpenAI’s o1, at a fraction of the cost.

    Lian Jye Su, the chief analyst at Omdia, a technology research and consulting firm, told Business Insider on Friday that the published research could have a ripple effect across the industry, with rival AI labs developing their own versions of the approach.

    “The willingness to share important findings with the industry while continuing to deliver unique value through new models showcases a newfound confidence in the Chinese AI industry,” Su said of DeepSeek’s paper. Openness is embraced as “a strategic advantage and key differentiator,” he added.

    Is the next DeepSeek model on the horizon?

    The paper comes as DeepSeek is reportedly working toward the release of its next flagship model, R2, following an earlier postponement.

    R2, which had been expected in mid-2025, was delayed after Liang expressed dissatisfaction with the model’s performance, according to a June report by The Information. The report said the launch was also complicated by shortages of advanced AI chips, a constraint that has increasingly shaped how Chinese labs train and deploy frontier models.

    While the paper does not mention R2, its timing has raised eyebrows. DeepSeek previously published foundational training research ahead of its R1 model launch.

    Su said DeepSeek’s track record suggests the new architecture will “definitely be implemented in their new model.”

    Sun, on the other hand, is more cautious. “There is most likely no standalone R2 coming,” Sun said. Since DeepSeek has already integrated earlier R1 updates in its V3 model, the technique could form the backbone of DeepSeek’s V4 model, she added.

    Business Insider’s Alistair Barr wrote in June that DeepSeek’s updates to its R1 model failed to generate much traction in the tech industry. Barr argued that distribution matters, and DeepSeek still lacks the broad reach enjoyed by leading AI labs — such as OpenAI and Google — particularly in Western markets.
