What happens when ChatGPT starts to feed on its own writing?


AI chatbots won’t destroy human originality. But they may homogenize our lives and flatten our reality.

A few years ago, when Gmail rolled out its autocomplete feature, the big worry was that having a bot finish our sentences would homogenize our emails.

We were so damn cute back then.

That worry looks almost trivial now that we’ve got “generative AI,” a suite of tools ranging from ChatGPT and GPT-4 to DALL-E 2 and Stable Diffusion. These AI models don’t just finish our sentences; they can write an entire essay or create a whole portfolio of art in seconds. And they amplify the old worry about homogenization by orders of magnitude.

I’m not just talking about concerns that AI will put writers or artists out of work. Nowadays, if you peer underneath the very real fears of “what if AI robs us humans of our jobs?” you can find a deeper anxiety: What if AI robs us humans of a capacity that’s core to our very humanness — our originality?

Here’s how some worry this might happen: Generative models like ChatGPT are trained on gobs and gobs of text from the internet — most of which, up until now, has been created by human beings. But if we fill the internet with more content created by ChatGPT, and then ChatGPT and its successors learn from that content, and so on and so on, will the narratives that frame how we see the world become a closed loop — ChatGPT all the way down — characterized by infinite regression to the mean? Will that homogenize our writing, our thinking, and ultimately our ways of being? Will it spell “the end of originality”?

Many philosophers have believed that our capacity for original thought is an essential part of human agency and dignity. “It is not by wearing down into uniformity all that is individual in themselves, but by cultivating it and calling it forth…that human beings become a noble and beautiful object of contemplation,” wrote the 19th-century British philosopher John Stuart Mill. He argued for the importance of “giving full freedom to human nature to expand itself in innumerable and conflicting directions.”

We know that new technologies can expand or constrict human nature, that they can literally change our brains. Generative AI models seem poised to constrict it, in part because derivativeness is at the core of how they work, relying as they do on past data to predict which words plausibly come next in whatever you’re writing. They use the past to construct the future.
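(For the technically inclined, here’s a deliberately tiny sketch of that idea in Python. Real LLMs are neural networks trained on vast corpora, not word-pair counters, and the training sentence below is invented for illustration, but the basic move is the same: the words that came before determine the word that comes next.)

```python
from collections import Counter, defaultdict

# A toy "language model": count which word has followed which in past text,
# then always predict the most frequent continuation.
past_text = "the sky is blue the sky is clear the sea is blue".split()

following = defaultdict(Counter)
for current, nxt in zip(past_text, past_text[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the training text."""
    candidates = following[word]
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("is"))  # -> "blue": the past decides the future
```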

This isn’t entirely new. The recommendation algorithms behind popular services like Spotify and Netflix use the same trick: You liked this, so you might also like that. Many critics suspect — and some research supports the idea — that this homogenizes our consumption and production of culture over time. Music starts to sound the same; Hollywood worships reboots and sequels. We all cook the same Epicurious recipes and, more worryingly, read the same articles — which tend to be whatever plays well with the Google algorithm, not what’s been buried at the bottom of the search results.

Generative AI could have a similar homogenizing effect, but on a far greater scale. If most self-expression, from text to art to video, is made by AI based on AI’s determination of what appealed before to people on average, we might have a harder time thinking radically different thoughts or conceiving of radically different ways of living.

“I get the intuition that, yes, there would be some uniformization,” Raphaël Millière, an expert in philosophy of AI at Columbia University, told me. “I do worry about that.”

As a novelist as well as a journalist, I’ve felt some of this worry, too. But I’ve also wondered if the whole underlying premise is wrong. Are we humans ever truly original? Or are we always doing derivative and combinatorial work, mixing and matching ideas we’ve already seen before, just like ChatGPT?

The real risk is not exactly about “originality.” It’s more about “diversity.”

Nowadays, we worship the idea of originality — or at least we like to think we do. It’s considered a key ingredient of creativity. In fact, the current consensus definition in philosophy and psychology holds that creativity is the ability to generate ideas that are both original and valuable.

But originality wasn’t always and everywhere considered so central. When traditional Chinese artists learned their craft, they did it by copying earlier masters, and later they proudly painted in the style of their artistic predecessors. When Shakespeare penned romantic comedies, he was rejiggering much older stories about star-crossed lovers — and he seemed to suspect as much, writing, “there be nothing new, but that which is hath been before” (which was itself a rejiggered quote from the Bible).

It was only in the 18th century that originality became such a preeminent value. The Romantics were very big on the notion that the individual self can spontaneously create new ideas and generate its own authoritative meaning. (According to some scholars, people needed to believe that in order to cope with the loss of traditional structures of meaning — a loss ushered in by the Enlightenment.) Western culture has inherited this Romantic notion of originality.

Contemporary neuroscience tells a different story. The latest research suggests that pure originality is, alas, not a thing. Instead, when you’re writing a poem or making a painting, you’re drawing on an interplay between your brain’s memory and control systems: memory, because you have to pull up words, people, or events you’ve encountered before; and control, because you have to flexibly recombine them in new and meaningful ways. Coming up with a unicorn, say, involves remembering the idea of a horse and combining it with the idea of a horn.

If our minds were always already working within a finite loop, the concept of “originality” may be a bit of a red herring, confusing our discussion of generative AI. Instead of worrying about the loss of an originality that perhaps we never possessed, we should talk about the risk of this technology eroding “diversity” or “flexibility” of thought — and replacing that with homogenization or, as the New Yorker’s Kyle Chayka puts it, “Average Garbage Forever.”

And that risk is real. In fact, there are multiple senses in which generative AI could homogenize human expression, thought, and life.

The many ways generative AI could homogenize our lives

Stylistically, large language models (LLMs) like ChatGPT might push our writing to become more sanitized. As you’ve probably noticed, they have a tendency to talk in a bland, conformist, Wikipedia-esque way (unless you prompt them otherwise — more on that in a bit).

“If you interact with these models on a daily basis,” Millière told me, “you might end up with your writing impacted by the generic, vanilla outputs of these models.”

ChatGPT also privileges a “proper” English that erases other vernaculars or languages, and the ways of seeing the world that they encode. By default, it’s not writing in African American English (long stigmatized as “incorrect” or “unprofessional”), and it’s certainly not writing in, say, the Māori language. It trains on the internet, where most content is still in English, in part because there’s still a striking global disparity in who has internet connectivity.

“I worry about Anglocentrism, as most generative models with high visibility perform best in English,” said Irene Solaiman, an AI expert and policy director at Hugging Face who previously worked at OpenAI.

Culturally, ChatGPT might reinforce a Western perspective. Research has shown that richer countries enjoy richer representations in LLMs. Content from or about poorer countries occurs less frequently in the training data, so the models don’t make great predictions about them, and sometimes flat-out erase them.

Rishi Bommasani, an AI researcher at Stanford, offered a simple example. “If you use the models to suggest breakfast foods,” he told me, “they will overwhelmingly suggest Western breakfasts.”

To test that out, I asked the GPT-4-powered Bing to write me a story about “a kid who cooks breakfast.” Bing wrote me a perfectly cogent story … about a boy (male) named Lucas (probably white), whose mom is a chef at a fancy restaurant (probably expensive). Oh, and yes, the kid whips up pancakes, eggs, bacon, and toast (very much Western).

This is worrisome when you think about the cultural effects at scale — and AI is all about scale. Solaiman told me that government representatives from developing countries have already come to her concerned about a new algorithmically powered wave of Westernization, one that could dwarf the homogenizing effects that globalization has already imposed.

It’s not like the language we see deterministically limits the thoughts we’re able to think or the people we’re able to be. When the philosopher Ludwig Wittgenstein said “the limits of my language mean the limits of my world,” that was a bit of an overstatement. But language does shape how we think and, by extension, the lives we dare to imagine for ourselves; it’s the reason there’s such a big push to portray diverse characters in STEM fields in children’s books. As adults, our imaginations are also conditioned by what we read, watch, and consume.

Bommasani and his colleagues also worry about algorithmic monoculture leading to “outcome homogenization.” AI’s advantage and disadvantage is in its sheer scale. If it makes a mistake, it’s not like one hiring manager or one bank officer making a mistake; it goes all the way down the line. If many decision-makers incorporate the same popular AI models into their workflow, the biases of the models will trickle into all the downstream tasks. That could lead to a situation where certain people or groups experience negative outcomes from all decision-makers. Their applications for a job or a loan are rejected not just by one company or bank, but by every company or bank they try! Not exactly a recipe for diversity, equity, and inclusion.
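(A back-of-the-envelope illustration, with invented numbers: suppose five lenders would each wrongly reject a given applicant 30 percent of the time. If they decide independently, being rejected everywhere requires five separate mistakes. If they all defer to the same model, one mistake is repeated five times.)

```python
# Illustrative numbers only; not drawn from the research discussed above.
p_wrongful_reject = 0.30
n_lenders = 5

# Independent decision-makers: rejected everywhere only if all five err.
p_all_reject_independent = p_wrongful_reject ** n_lenders

# Algorithmic monoculture: one shared model's error applies everywhere.
p_all_reject_shared = p_wrongful_reject

print(f"Independent lenders: {p_all_reject_independent:.2%}")  # ~0.24%
print(f"One shared model:    {p_all_reject_shared:.0%}")       # 30%
```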

But the risks of homogenization don’t end there. There are also potential epistemic effects — how generative AI may push us toward certain modes of thinking. “In terms of the way in which you formulate your reasoning, and perhaps eventually the way in which you think, that’s definitely a concern,” Millière said.

Maybe we get used to providing only a starting prompt for a text, which the AI then completes. Or maybe we grow accustomed to providing the outline or skeleton and expecting the AI to put meat on the bones. Sure, we can then make tweaks — but are we cheating ourselves out of something important if we jump straight to that editing stage?

The writer Rob Horning recently expressed this anxiety:

I am imagining a scenario in the near future when I will be working on writing something in some productivity suite or other, and as I type in the main document, my words will also appear in a smaller window to the side, wherein a large language model completes several more paragraphs of whatever I am trying to write for me, well before I have the chance to conceive of it. In every moment in which I pause to gather my thoughts and think about what I am trying to say, the AI assistant will be thinking for me, showing me what it calculates to be what I should be saying…

Maybe I will use its output as a gauge of exactly what I must not say, in which case it is still dictating what I say to a degree. Or maybe I’ll just import its language into my main document and tinker with it slightly, taking some kind of ownership over it, adapting my thinking to accommodate its ideas so that I can pretend to myself I would have eventually thought them too. I am wondering what I will have to pay to get that window, or worse, what I’ll have to pay to make it disappear.

There’s a palpable fear here about relinquishing the role of creator for the role of curator, about letting originality become contaminated by some outside influence. Again, since pure originality is probably a fantasy, arguably we’re all already curators, and we’re always under the influence of others (sorry, Romantics!).

Still, skipping over the idea-generation phase by immediately turning to LLMs for help seems like a bad idea for two interrelated reasons.

First, we may become overreliant on the tech, so much so that some of our imaginative or cognitive “muscles” gradually become weaker for lack of use. If you think that’s implausible, ask yourself how many of your friends’ phone numbers you remember, or how much mental math you can do, now that you walk around with a smartphone on you at all times.

Such concerns aren’t new. The ancient Greek philosopher Socrates, who operated in a largely oral culture, worried that the invention of writing “will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory.” Contemporary research actually bears out the philosopher’s fears, showing that “when people expect to have future access to information, they have lower rates of recall of the information itself.”

Which doesn’t mean we should all give up writing, without which civilization as we know it would essentially be impossible! But it does mean we should think about which skills each new technology may reshape or diminish — especially if we’re not mindful about how we use it — and ask ourselves whether we’re fine with that.

OpenAI itself highlights overreliance as a potential problem with GPT-4. The model’s system card notes, “As users become more comfortable with the system, dependency on the model may hinder the development of new skills or even lead to the loss of important skills.”

Second, asking LLMs for help at the earliest stages of our creative process will yield a certain answer that inevitably primes us to think in a certain direction. There will be thought paths we’re less likely to go down because ChatGPT has already got certain (majority) voices whispering in our ears. Other (minority) voices will get left out — potentially leaving our writing, and our thinking, impoverished as a result.

Usually, we’re in a position to be able to dial up or down the degree to which other voices are whispering in our ears. When I was writing my debut novel, and suffering from what the literary critic Harold Bloom called “the anxiety of influence,” I actually decided to bar myself from reading fiction for a while because I realized the sentences I was writing were starting to sound like Jonathan Franzen, whose novels I’d just been reading. I didn’t want another writer’s voice to overly influence mine, so I put the books down.

But if we become overreliant on a technology, we become, definitionally, less likely to put it down. Sure, we still have some agency. But the ease of turning to ChatGPT, coupled with the magical-feeling instant gratification it provides (just put in your incantation and the oracle replies!), can make it harder to exercise that agency.

What can AI companies — and the rest of us — do to counter homogenization?

So far, we’ve been unpacking worries about what happens when we have not just a machine producing the content that informs our imagination, but machines trained on machines, forever and ever. Yet there’s an obvious question here. If you’re a company building an AI model, can you just put AI-generated data off limits for training, and therefore stop the model from eating its own tail?

“Maybe you can do better than chance — you can do something — but I don’t think you can do it well at scale,” Bommasani said. “It would be pretty hard to guarantee that your training data for the next model includes no machine-generated data from the previous model.”

Millière agreed. “It’s probably hard already, and in the future it’ll be even harder to quantify how much contamination there is in your data.”

Even though researchers are working on detection models to spot AI-generated outputs and ways to watermark them, and even though there are stronger and weaker methods for detecting contamination (OpenAI’s method could use some work), this remains a very tricky problem. That’s because the whole point of LLMs is to crank out text indistinguishable from what humans would produce.
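(To see why, here’s a rough sketch with invented numbers. Even a detector that catches 90 percent of machine-generated documents, and only rarely flags human ones by mistake, still leaves a meaningful slice of contamination once you’re filtering a scraped corpus of a million documents.)

```python
# Hypothetical figures for illustration; real detector performance varies widely.
corpus_size = 1_000_000       # documents scraped for the next training run
machine_share = 0.30          # assumed fraction that is AI-generated
detector_recall = 0.90        # assumed share of AI-generated docs the detector catches
false_positive_rate = 0.05    # assumed share of human docs wrongly flagged

machine_docs = corpus_size * machine_share
human_docs = corpus_size - machine_docs

machine_kept = machine_docs * (1 - detector_recall)   # AI docs the filter misses
human_kept = human_docs * (1 - false_positive_rate)   # human docs that survive

contamination = machine_kept / (machine_kept + human_kept)
print(f"Contamination left in the training set: {contamination:.1%}")  # ~4.3%
```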

Beyond trying to prevent contamination, something companies can do is pay careful attention to how they’re designing the interface for these models. When I first got early access to Bing in mid-February, I gave it simple prompts, like asking it to write me a song. It was just an “it” — one single mode to choose from. But by the last week of that month, Bing featured three “conversation styles,” and I had to choose between them: precise, balanced, or creative. When I chose the creative style, it answered in more off-the-wall, less predictable ways.

When you’re trying to write something factual, you don’t want to dial up unpredictable deviations, as those who’ve been using generative AI for research are learning. But in creative work, it could help to lean into the unpredictable — or, as AI researchers might put it, to “increase hallucinations” or “increase the temperature.” That makes the model less deterministic, so instead of choosing the next word with the highest probability of occurring, it can choose words with much lower probabilities. It’s the difference between typing in “The sky is” and getting back “The sky is blue” versus getting back “The sky is clear, the water smooth, and it’s an unimaginably long way to go before the dolphins decide to give up their vertical quest.”
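(Concretely, “temperature” rescales the model’s word scores before one is picked. Here’s a minimal sketch in Python, using an invented table of next-word scores for “The sky is”: at a low temperature the sampler almost always picks the top word, while at a high temperature it starts wandering into the unlikely ones.)

```python
import math
import random

# Invented next-word scores ("logits") for the prompt "The sky is".
logits = {"blue": 4.0, "clear": 2.5, "falling": 0.5, "a long way from the dolphins": 0.1}

def sample_next_word(logits, temperature=1.0):
    """Sample a word from temperature-scaled softmax probabilities."""
    scaled = {word: score / temperature for word, score in logits.items()}
    max_score = max(scaled.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - max_score) for word, score in scaled.items()}
    total = sum(exps.values())
    probs = {word: e / total for word, e in exps.items()}
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

random.seed(0)
print(sample_next_word(logits, temperature=0.2))  # almost always "blue"
print(sample_next_word(logits, temperature=2.0))  # unlikely words surface far more often
```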

Getting a model to diverge more from the data it learned during training can be somewhat helpful (though probably not sufficient) for addressing the homogenization concern. But how much it’ll help depends in part on how much the interface nudges people to get creative themselves rather than relying on the default, and on how individuals choose to use the model.

To be clear, the onus should be mostly on the companies, not on you, the user. That said, Millière thinks the models could enrich your creative process if you go the extra mile to prompt them in certain ways. He imagined an author who wants to attempt the challenging task of writing across difference — for example, an author who has never been to rural Texas trying to create characters from rural Texas, complete with naturalistic dialogue.

“I could see this augmenting your creativity, because it’ll lead you to abstract away from your own personal perspective and biases to explore a different linguistic realm that you don’t necessarily have access to yourself,” Millière told me.

I’ve been experimenting with LLMs since 2019, when I used GPT-2 to help me with my next novel. It’s about two little girls who discover an ancient hotel that contains a black hole, complete with wormholes. Prompted with the idea of wormholes, GPT-2 gave me a bunch of questions seeking to pin down exactly how the wormholes work. Those questions were really helpful for world-building!

But I turned to the LLM only once I had a complete draft of the novel and felt stuck on how to improve it. At that point, GPT-2 worked great as a creative prosthesis to help me out of my rut. I would not turn to it in the early stage, when I’m staring down a blank page (though that’s precisely how it’s being marketed). I don’t want it to weaken my writerly muscles through overreliance, or take my imagination down specific paths before I’ve had a chance to survey as many paths as I want.

What is AI for? What is humanity for?

Can we tweak AI models with human feedback to get them to be more surprising or variable in their outputs? Yes. Is that the same as human beings struggling against a convention of thought to push a new idea or vision into the world because it gives voice to something unspoken that’s happening in us? No.

“The only way that AI will be compatible with human flourishing is if it empowers that,” said Shannon Vallor, a philosopher of technology at the University of Edinburgh, where she directs the Centre for Technomoral Futures. “If it makes it easier and more rewarding for us to use our recorded past as a place to push off from, rather than to revolve around. But that’s not what today’s commercial AI systems are built for.”

OpenAI says its mission is to ensure that AI “benefits all of humanity.” But who gets to define what that means? And are we all willing to diminish a core human capacity in the quest to optimize for a definition decided on by Silicon Valley? As the philosopher Atoosa Kasirzadeh has written, “the promise that AI technologies will benefit all of humanity is empty so long as we lack a nuanced understanding of what humanity is supposed to be.”

As generative AI models proliferate, we all face a question: Will we put in the work to counteract their homogenizing effects? Maybe we’ll answer with a collective meh, proving that we don’t actually care about originality or diversity of thought as much as we thought we did. But then we have to face another question: If we don’t really value the capacity for original thought and its role in human agency, then what do we value? What do we think it is to lead a meaningful life? What is a human life for?

And these are not questions ChatGPT can answer.

