Articles

Russia’s Sberbank to Launch Advanced Reasoning Large Language Model

Russia’s largest bank, Sberbank, is preparing to release an upgraded version of its large language model (LLM), GigaChat, featuring reasoning capabilities that support scientific research and complex problem-solving, according to First Deputy CEO Alexander Vedyakhin. He revealed that he is currently testing a beta version of the new model.

The enhanced GigaChat aims to handle sophisticated tasks in areas such as science, coding, and mathematics, similar to advanced LLMs launched by global leaders like OpenAI. Despite trailing U.S. and Chinese AI developers by six to nine months, Sberbank’s use of domestic cloud infrastructure and localized language adaptation makes GigaChat especially attractive to Russian corporate users.

Currently, about 15,000 Russian companies employ Sberbank’s GigaChat. Meanwhile, Yandex, a key domestic AI competitor, recently announced reasoning capabilities in its search engine, highlighting the competitive AI landscape in Russia.

Meta Delays Launch of Flagship ‘Behemoth’ AI Model Over Performance Concerns

Meta Platforms (META.O) is delaying the release of its much-anticipated “Behemoth” AI model, the company’s most powerful large language model (LLM) to date, amid internal doubts about its performance and readiness, according to a report by the Wall Street Journal.

Originally slated for release in April to coincide with Meta’s inaugural developer AI conference, the internal launch target was later shifted to June. Now, the launch has been postponed to fall or later, people familiar with the matter said.

Reasons for Delay:

  • Engineers at Meta are reportedly struggling to make meaningful improvements in Behemoth’s performance compared to earlier models.

  • Staff have raised questions about whether the upgrades justify a public release, suggesting the model may not yet offer a significant leap over predecessors such as Llama 3 or the already-released Llama 4 variants.

Meta has not yet commented publicly on the delay, and the Behemoth model remains unreleased as of mid-May.

Development Context:

  • Meta had previously described Behemoth as “one of the smartest LLMs in the world”, intended to act as a teacher model for training smaller, faster models.

  • In April, Meta released other variants in its LLM family, including Llama 4 Scout and Llama 4 Maverick, but did not follow through with Behemoth’s public debut.

Industry Implications:

  • The delay highlights the growing technical challenges in scaling LLMs meaningfully, especially as performance gains become harder to achieve beyond a certain model size.

  • It comes at a time when AI competitors like OpenAI, Google, and Anthropic are releasing increasingly powerful models and tools, raising competitive pressure in the LLM arms race.

Meta’s pivot may reflect a more cautious release strategy, likely aimed at avoiding backlash over underwhelming capabilities or potential AI safety concerns.

Foxconn Launches ‘FoxBrain’, Its First Large Language Model

Foxconn, the Taiwanese tech giant known for its role as the world’s largest contract electronics manufacturer, has unveiled its first large language model, named “FoxBrain.” The company announced on Monday that the model is designed to enhance its manufacturing processes and streamline supply chain management. This move marks a significant step for Foxconn in integrating artificial intelligence (AI) into its operations, potentially reshaping how the company handles everything from production workflows to data analysis.

FoxBrain was trained on 120 of Nvidia’s H100 GPUs, completing training in just four weeks. The rapid development showcases the capabilities of both Foxconn’s infrastructure and Nvidia’s hardware. The model is built on Meta’s Llama 3.1 architecture, which provides robust natural language processing features. It is Taiwan’s first large language model with reasoning capabilities, and it has been specifically optimized for traditional Chinese and Taiwanese language styles, addressing a critical gap for local businesses and users.

While Foxconn acknowledged a slight performance gap compared with a distilled model from China’s DeepSeek, the company emphasized that FoxBrain’s performance remains close to world-class standards. This positions FoxBrain as a competitive entrant in the rapidly growing field of large language models and demonstrates that Foxconn can develop AI technology with global relevance. The company is clearly looking to position FoxBrain as a versatile tool that can assist not only in internal operations but also in broader AI applications.

Initially, FoxBrain will be applied for internal purposes, focusing on areas like data analysis, decision support, document collaboration, mathematics, reasoning, problem-solving, and even code generation. This wide range of applications reflects the model’s versatility and its potential to drive efficiencies across different sectors of Foxconn’s business. With its reasoning capabilities, FoxBrain could play a key role in automating decision-making processes and improving the overall productivity of Foxconn’s vast manufacturing ecosystem.