DeepSeek pitches new route to scale AI, but researchers call for more testing
Published 4 days ago
Source: scmp.com

DeepSeek’s proposed “mHC” architecture could transform the training of large language models (LLMs) – the technology behind artificial intelligence chatbots – as developers look for ways to scale models without simply adding more computing power.
However, experts cautioned that while the approach could be far-reaching, it may be difficult to put into practice.
In a technical paper released last week, co-authored by DeepSeek founder and CEO Liang Wenfeng, the company proposed...
