Training has begun on an open-source artificial intelligence language model with 176 billion parameters, developed by the international BigScience team.
BigScience main training just started💥 A large language model created as a tool for research🔬
Model: 176 billion parameters
Data: 46 languages
Cluster: 416 GPU – low carbon energy
Follow it live👇
— BigScience Research Workshop (@BigscienceW) March 15, 2022
The model is being trained on data in these 46 languages. Training runs on the Jean Zay supercomputer at IDRIS, the French Institute for Development and Resources in Intensive Scientific Computing. The machine is built on Nvidia V100 and A100 GPUs, with a peak performance exceeding 28 petaflops.
According to Douwe Kiela, head of research at Hugging Face, training is expected to take three to four months.
The developers created the project for research purposes. Engineers say that proprietary language models from companies such as OpenAI, Google, and Microsoft exhibit problematic behavior, generating toxic speech, bias, and misinformation. An open-source model, they added, will help researchers understand and fix these problems.
“If we care about democratizing research progress and want to make sure that the whole world can use this technology, we must find a solution for this. This is exactly what BigScience should do,” Kiela said.
The open-source BigScience project brings together more than a thousand researchers and developers from around the world who create and maintain large datasets for training language models.