Mckay Wrigley, a 23-year-old computer programmer from Salt Lake City, was one of the few invited to tinker with GPT-3, a system that uses everything it has learned from a vast sea of digital text to generate new language on its own. Wrigley wondered if it could imitate public figures: write like them, perhaps even chat like them.
One of his experiments involved a pop psychologist, Scott Barry Kaufman. The system took in Kaufman’s name and a topic for discussion: creativity. Then, when asked “How do we become more creative?” GPT-3 responded instantly:
“I think creative expression is a natural by-product of growing up in a diverse world. The more diverse the world is, the more you get exposed to different people, to different opportunities, to different places and to different challenges.”
Later, when Wrigley posted the paragraph on Twitter, somebody looped in the real Scott Barry Kaufman. He was stunned. “It definitely sounds like something I would say,” the real Kaufman tweeted, later adding, “Crazy accurate A.I.”
In the weeks since its arrival, GPT-3 has spawned dozens of other experiments that raise eyebrows in much the same way. It generates tweets, pens poetry, summarises emails, answers trivia questions, translates languages and even writes its own computer programs, all with very little prompting. Some of these skills caught even the experts off guard.
For many artificial intelligence researchers, it is an unexpected step toward machines that can understand the vagaries of human language, and perhaps even tackle other human skills. “It is surprising to me, and to a lot of people,” said Melanie Mitchell, an A.I. researcher at the Santa Fe Institute, an independent lab in New Mexico, who is among those experimenting with the system.
GPT-3 is what artificial intelligence researchers call a neural network, a mathematical system loosely modelled on the web of neurons in the brain. This is the same technology that identifies faces in the photos you post to Facebook and recognises the commands you bark into your iPhone. A neural network learns such skills by pinpointing patterns in vast amounts of digital data. By analysing thousands of cat photos, for instance, it can learn to recognise a cat.
About three years ago, researchers at Google and top labs like OpenAI started designing neural networks that learned from enormous amounts of prose, including unpublished books and Wikipedia articles by the thousands. These universal language models could be applied not just to one task, like translation, but to many. GPT-3 analysed digital prose on an unprecedented scale, spending months looking for patterns in huge amounts of text posted to the internet. In this way, it learned to predict the next word in a sequence. If you type a few words into GPT-3, it will keep going, completing your thought with entire paragraphs of text.
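For readers who want a feel for what "predicting the next word" means, here is a deliberately tiny sketch in Python. It is not how GPT-3 works: GPT-3 is a large neural network, while this toy simply counts which word most often follows another in a made-up sample text. The function names, the sample text and the word-counting approach are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def complete(model, prompt, max_new_words=5):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the last word never appeared mid-text, so stop here
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical sample text; a real system learns from vastly more prose.
corpus = "the cat sat on the mat . the cat sat on the sofa . the cat ate ."
model = train_bigram_model(corpus)
print(complete(model, "the cat", max_new_words=3))
```

Given the prompt "the cat", the toy keeps going with the words it has most often seen next, which is the same basic idea of continuing a thought, at a scale many orders of magnitude smaller.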
But in acquiring this specific skill, it learned much more. During its months of training, GPT-3 identified more than 175 billion parameters — mathematical representations of patterns — in that sea of books, Wikipedia articles and other online texts. These patterns amount to a map of human language: a mathematical description of the way we piece characters together, whether we are writing blogs or coding software programs. Using this map, GPT-3 can perform all sorts of tasks it was not built to do.
Cade Metz is a technology correspondent for The New York Times.
©2020 The New York Times