The international BigScience team has launched the training of an open-source artificial intelligence language model with 176 billion parameters.
BigScience main training just started💥 A large language model created as a tool for research🔬
Model: 176 billion parameters
Data: 46 languages
Cluster: 416 GPUs – low-carbon energy
Follow it live👇
— BigScience Research Workshop (@BigscienceW) March 15, 2022
The model is being trained on these 46 languages on the Jean Zay supercomputer of IDRIS, the French Institute for Development and Resources in Intensive Scientific Computing. The machine is built on Nvidia V100 and A100 GPUs, and its peak performance exceeds 28 petaflops.
Douwe Kiela, head of research at Hugging Face, said the training process will take three to four months.
According to the developers, the project is intended for research purposes. Proprietary language models from companies such as OpenAI, Google, and Microsoft exhibit problematic behavior, generating toxic speech, bias, and misinformation, the engineers say. An open-source model, they added, will help researchers understand and fix these problems.
“If we care about democratizing research progress and want to make sure that the whole world can use this technology, we must find a solution for this. This is exactly what BigScience should do,” Kiela said.
The open BigScience project brings together about a thousand developers from around the world who create and maintain large datasets for training language models.
Recall that in January, OpenAI announced the creation of a less toxic version of GPT-3.
In December 2021, DeepMind introduced a language model with 280 billion parameters.
In October, Microsoft and Nvidia developed a language model three times the size of GPT-3.