DeepSeek pitches new route to scale AI, but researchers call for more testing
Published 4 days ago
Source: scmp.com

DeepSeek’s proposed “mHC” architecture could transform the training of large language models (LLMs) – the technology behind artificial intelligence chatbots – as developers look for ways to scale models without simply adding more computing power.
However, experts cautioned that while the approach could be far-reaching, it may be difficult to put into practice.
In a technical paper released last week, co-authored by DeepSeek founder and CEO Liang Wenfeng, the company proposed...
