AI pioneer Cerebras opens up generative AI where OpenAI goes dark


Cerebras’ Andromeda supercomputer was used to train seven language programs similar to OpenAI’s ChatGPT. Cerebras Systems
The world of artificial intelligence, especially its wildly popular branch known as “generative AI” (programs that automatically create text and images), is in danger of closing its horizons because of the chilling effect of companies deciding not to publish the details of their research.
But the move to secrecy may have prompted some participants in the AI world to step in and fill the disclosure gap.
On Tuesday, AI pioneer Cerebras, the maker of dedicated AI computers and the world’s largest computer chip, published as open source several versions of generative AI programs for unrestricted use.
The programs are “trained” by Cerebras, meaning they were brought to optimal performance using the company’s powerful supercomputer, sparing outside researchers some of that work.
“Companies are making different decisions than they made a year or two ago, and we disagree with those decisions,” Cerebras co-founder and CEO Andrew Feldman said in an interview with ZDNET, alluding to the decision by OpenAI, the creator of ChatGPT, not to disclose technical details when it unveiled its latest generative AI program, GPT-4, this month, a move that was widely criticized in AI research circles.
Also: With GPT-4, OpenAI opts for secrecy versus disclosure
“We believe in a vibrant, open community, not just of researchers, and not just of three or four or five or eight LLMs, but a vibrant community in which startups, mid-size companies, and enterprises are training large language models; that’s good for us, and it’s good for others,” Feldman said.
The term large language model refers to AI programs based on machine-learning principles, in which a neural network captures the statistical distribution of words in sample data. That process lets a large language model predict the next word in a sequence, the capability that underpins popular generative AI programs such as ChatGPT.
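That next-word mechanism can be seen directly in code. Here is a minimal sketch, not Cerebras’ own code, using the Hugging Face transformers library and the small public GPT-2 checkpoint to show how a GPT-style model yields a probability distribution over its vocabulary for the next token:

```python
# Minimal sketch of next-word prediction with a GPT-style model.
# Uses the public GPT-2 checkpoint purely for illustration; any
# causal language model on Hugging Face works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The chip maker released its models as open"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The distribution over the vocabulary for the next word comes from
# the logits at the last position in the sequence.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)

for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(token_id)!r}: {prob:.3f}")
```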
The same kind of machine-learning approach applies to generative AI in other domains, such as OpenAI’s DALL-E, which creates images based on a suggested phrase.
Also: Best AI art generator: DALL-E 2 and other fun alternatives to try
Cerebras posted seven large language models in the same style as OpenAI’s GPT program, which kicked off the generative AI craze in 2018. The code is available on the website of AI startup Hugging Face and on GitHub.
The programs range in size from 111 million parameters, or neural weights, to 13 billion. More parameters generally make an AI program more capable, so the Cerebras models offer a range of performance.
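For readers who want to try one, a sketch of pulling the smallest checkpoint from Hugging Face follows. The identifier cerebras/Cerebras-GPT-111M is an assumption based on the announced naming scheme, so check the Hugging Face listing for the exact model names:

```python
# Sketch: loading one of the released checkpoints and generating text.
# The identifier "cerebras/Cerebras-GPT-111M" is assumed from the
# announced naming; verify it against the Hugging Face hub listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-111M"  # smallest of the seven sizes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Generative AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)  # greedy decoding
print(tokenizer.decode(outputs[0]))
```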
The company has posted not only the programs’ source code, in Python and TensorFlow formats, under the open-source Apache 2.0 license, but also the details of the training regimen by which the programs were brought to a developed state of functionality.
That disclosure allows researchers to examine and reproduce Cerebras’ work.
Feldman said the Cerebras release is the first time a GPT-style program has been made public “using the most advanced training efficiency techniques.”
Other published AI training work either concealed technical data, as with OpenAI’s GPT-4, or the programs were not optimized in their development, meaning the amount of data fed to the program was not tuned to the program’s size, as explained in a Cerebras technical blog post.
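Cerebras’ post frames that tuning as compute-optimal training. The widely cited “Chinchilla” heuristic from DeepMind, which is an assumption here rather than anything Cerebras confirms, puts the optimum at roughly 20 training tokens per model parameter. A back-of-the-envelope sketch:

```python
# Back-of-the-envelope sizing of training data to model size, assuming
# DeepMind's "Chinchilla" heuristic of ~20 tokens per parameter.
# The exact ratio Cerebras used is described in its own blog post.
TOKENS_PER_PARAM = 20

for params in (111e6, 13e9):  # smallest and largest released sizes
    tokens = params * TOKENS_PER_PARAM
    print(f"{params / 1e9:6.3f}B params -> ~{tokens / 1e9:.0f}B training tokens")
```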
Such large language models are notoriously compute-intensive. Tuesday’s Cerebras work was developed on a cluster of 16 of its CS-2 computers, machines the size of dormitory refrigerators that are tuned specially for AI-style programs. The cluster, previously disclosed by the company and dubbed the Andromeda supercomputer, can dramatically cut the work of training LLMs compared with thousands of Nvidia GPU chips.
Also: AI pioneer Bengio says ChatGPT’s success could drive a secrecy trend in AI
As part of Tuesday’s release, Cerebras rolled out what it says is the first open-source scaling law, a benchmark rule of thumb for how the accuracy of such programs increases with model size, based on open-source data. The dataset used is The Pile, an open-source, 825-gigabyte collection of mostly professional and academic texts introduced in 2020 by the nonprofit lab Eleuther.
Previous scaling rules from OpenAI and Google’s DeepMind used training data that was not open source.
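Scaling laws of this kind are typically expressed as a power law relating model size to loss. Below is a minimal sketch of fitting such a curve; the data points and fitted constants are invented for illustration and are not Cerebras’ published results:

```python
# Illustrative sketch of fitting a power-law scaling curve
# loss(N) = a * N**(-alpha) + c, where N is the parameter count.
# The data points below are invented for illustration only; they are
# not Cerebras' published measurements.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c):
    return a * n ** (-alpha) + c

# Hypothetical (parameter count, validation loss) pairs.
n_params = np.array([111e6, 256e6, 590e6, 1.3e9, 2.7e9, 6.7e9, 13e9])
losses = np.array([3.6, 3.3, 3.1, 2.9, 2.75, 2.6, 2.5])

(a, alpha, c), _ = curve_fit(power_law, n_params, losses, p0=[150.0, 0.25, 2.0])
print(f"loss(N) ~= {a:.2f} * N^(-{alpha:.3f}) + {c:.2f}")
```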
Cerebras has previously demonstrated the efficiency advantages of its systems. Feldman says the ability to train demanding natural-language programs efficiently goes to the heart of the problem of open publishing.
“If you can achieve efficiency, you can afford to bring everything into the open source community,” Feldman said. “Efficiency allows us to do this quickly and easily and share our contributions with the community.”
A major reason OpenAI and others have begun to close off their work from the rest of the world, he said, is that they must protect their profits against the rising cost of training AI.
Also: GPT-4: A new capacity for offering illicit advice and displaying ’risky emergent behaviors’
“It’s so expensive, they decided it was a strategic asset, and they decided to withhold it from the community because it’s strategic to them,” he said. “And I think it’s a very sensible strategy.
“It’s a sensible strategy if a company wants to invest a lot of time, effort and money and not share the results with the rest of the world,” Feldman added.
However, “we think that makes for a less interesting ecosystem, and in the long run it limits the rising tide of research,” he said.
Feldman observed that companies can keep resources such as datasets or model expertise to themselves by hoarding them.
“The question is how these resources are strategically used in context,” he said. “We believe we can help by coming up with open models, using data that everyone can see.”
Asked what might come of the open-source release, Feldman commented: “Hundreds of separate organizations may do work with these GPT models that might otherwise not have been possible, and solve problems that might otherwise have been set aside.”