On June 29, Palo Alto-based Inflection AI announced the completion of a $1.3 billion funding round led by Microsoft, Reid Hoffman, Bill Gates, Eric Schmidt and NVIDIA. Part of the new capital will be used to build a cluster of 22,000 NVIDIA H100 Tensor Core GPUs, which the company claims is the largest in the world. The GPUs will be used to develop large-scale AI models. The developers wrote:
“We estimate that if we entered our cluster in the recent TOP500 list of supercomputers, it would rank second, close to the top entry, despite being optimized for AI rather than scientific applications.”
Inflection AI is also developing its own personal AI assistant called “Pi”. The firm describes Pi as a “teacher, coach, confidant, creative partner and sounding board” that can be accessed directly via social media or WhatsApp. The company’s total funding has reached $1.525 billion since its founding in early 2022.
Despite growing investment in large AI models, experts warn that the effectiveness of their training may be severely constrained by current technology. In one example, researchers at Singapore-based venture capital firm Foresight considered a large AI model with 175 billion parameters occupying 700 GB of data:
“Assuming we have 100 compute nodes and each node has to update all parameters at every step, each step will require about 70 TB of data to be transferred (700 GB × 100). If we optimistically assume that each step takes 1 second, then 70 TB of data would have to be transmitted every second. This demand for bandwidth far exceeds the capacity of most networks.”
Continuing with the example above, Foresight also warned that “due to communication latency and network congestion, data transfer times can be well in excess of 1 s,” meaning that compute nodes may spend most of their time waiting for data transfers instead of performing actual computation. Given these limitations, Foresight’s analysts concluded that the solution lies in smaller AI models, which are “easier to deploy and manage.”
“In many application scenarios, users or companies don’t need the more versatile reasoning capabilities of large language models; they focus only on a very precise prediction target.”
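To make Foresight’s back-of-the-envelope arithmetic concrete, here is a minimal Python sketch of the same calculation. The node count and one-second step time come directly from the quoted example; the 4-bytes-per-parameter (FP32) figure is our assumption, chosen because it reproduces the 700 GB model size cited above.

```python
# Sketch of Foresight's bandwidth example (assumptions noted below).

PARAMS = 175e9           # model parameters, per the quoted example
BYTES_PER_PARAM = 4      # assumption: FP32 weights -> ~700 GB total
NODES = 100              # compute nodes, per the quoted example
STEP_TIME_S = 1.0        # optimistic one-second training step

model_bytes = PARAMS * BYTES_PER_PARAM            # ~700 GB of weights
per_step_bytes = model_bytes * NODES              # ~70 TB moved per step
bandwidth_bps = per_step_bytes * 8 / STEP_TIME_S  # required bits per second

print(f"Model size:         {model_bytes / 1e9:,.0f} GB")
print(f"Data per step:      {per_step_bytes / 1e12:,.0f} TB")
print(f"Required bandwidth: {bandwidth_bps / 1e12:,.0f} Tbit/s")
```

Under these assumptions the cluster would need roughly 560 Tbit/s of aggregate bandwidth, orders of magnitude beyond the 100–400 Gbit/s links common in data center networks, which is the mismatch Foresight’s warning points to.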