Self-generative AI might be heading for an apocalypse

Published in News

by Nick Farrell on 14 June 2023

Eating its own tail

A new study suggests that the enthusiasm for self-generative AI might dwindle as models start training on their own output instead of books and databases created by humans.

Large language models (LLMs) and other transformer models underpinning products such as ChatGPT, Stable Diffusion and Midjourney are initially trained on human sources -- books, articles, and photographs that were created without the help of artificial intelligence. But as more people use AI to produce and publish content, that content will gradually pollute the internet, and AI models will begin to train on it.

Writing in a paper posted to the open-access preprint server arXiv, a team of boffins from Cambridge University and the University of Edinburgh found that using model-generated content in training causes irreversible defects in the resulting models.

Looking specifically at probability distributions for text-to-text and image-to-image generative AI models, the researchers concluded that "learning from data produced by other models causes model collapse -- a degenerative process whereby, over time, models forget the true underlying data distribution... this process is inevitable, even for cases with almost ideal conditions for long-term learning."

In fact, the report says model collapse can set in quickly, because models can rapidly forget most of the original data from which they initially learned.
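
For readers who want to see the effect in miniature, the forgetting described above can be sketched with a toy experiment. This is not the researchers' code or setup; it assumes the simplest possible "model" (a single Gaussian fitted to its training data), and the function name, sample size and generation count are all illustrative choices. Each generation is fitted only to samples drawn from the previous generation's fit, and the estimated spread tends to shrink towards zero as the tails of the original distribution are lost.

```python
import random
import statistics


def simulate_collapse(generations=1000, sample_size=10, seed=0):
    """Toy illustration: repeatedly fit a Gaussian to data sampled
    from the previous fit and record (mean, std) per generation.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Generation 0 trains on "human" data drawn from the true N(0, 1).
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    history = []
    for _ in range(generations):
        # "Train" the model: estimate mean and standard deviation.
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        history.append((mu, sigma))
        # The next generation sees only model-generated samples.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return history


history = simulate_collapse()
for gen in range(0, len(history), 200):
    mu, sigma = history[gen]
    print(f"generation {gen:4d}: mean={mu:+.4f} std={sigma:.3e}")
```

The small sample size is deliberate: it exaggerates the estimation error that accumulates from one generation to the next, so the narrowing shows up within a few hundred rounds. Larger samples slow the drift in this toy setting but do not remove it.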

One of the paper's authors, Ross Anderson, professor of security engineering at Cambridge University and the University of Edinburgh, warned that humanity was about to fill the internet with blah in the same way it put plastic into the oceans.

"This will make it harder to train newer models by scraping the web, giving an advantage to firms which already did that, or which control access to human interfaces at scale. Indeed, we already see AI startups hammering the Internet Archive for training data," he said.

Last modified on 14 June 2023
