Google’s Jigsaw Develops AI Tools to Rank Online Content Based on Civility and Nuance

In the 1990s and early 2000s, technologists made the world a grand promise: new communications technologies would strengthen democracy, undermine authoritarianism, and lead to a new era of human flourishing. Today, however, few people would agree that the internet has lived up to that lofty goal.

For many years now, content on social media platforms has tended to be ranked based on how much engagement it receives. Over the last two decades, politics, media, and culture have all been reshaped to meet a single, overriding incentive: posts that provoke an emotional response often rise to the top.

Efforts to improve the health of online spaces have long focused on content moderation, the practice of detecting and removing harmful content. Tech companies hired workers and built AI to identify hate speech, incitement of violence, and harassment. That worked imperfectly, but it stopped the worst toxicity from flooding our feeds.

There was one problem: while these AIs helped remove the bad content, they didn’t elevate the good content. “Do you see an internet that is working, where we are having conversations that are healthy or productive?” asks Yasmin Green, the CEO of Google’s Jigsaw unit, which was founded in 2010 with a remit to address threats to open societies. “No. You see an internet that is further and further apart.”

What if there was another way?

Jigsaw believes it has found one. On Monday, the Google subsidiary announced new AI tools, or classifiers, that can score posts based on the likelihood that they contain good content: Is a post nuanced? Does it contain evidence-based reasoning? Does it share a personal story, or foster human compassion? By returning a numerical score (from 0 to 1) representing the likelihood of a post containing each of those virtues and others, these new AI tools could allow the designers of online spaces to rank posts in a new way. Instead of posts that receive the most likes or comments rising to the top, platforms could, in an effort to foster a better community, choose to put the most nuanced comments, or the most compassionate ones, first.
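The ranking idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Jigsaw's implementation: the `Post` class, the attribute names, and every score and upvote count below are hypothetical, standing in for whatever 0-to-1 scores a classifier would return for each post.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """A comment plus hypothetical, precomputed classifier scores."""
    text: str
    upvotes: int
    # Maps an attribute name (e.g. "reasoning") to a score between 0 and 1.
    scores: dict[str, float] = field(default_factory=dict)


def rank_by_upvotes(posts: list[Post]) -> list[Post]:
    """Engagement-style ranking: most upvoted posts first."""
    return sorted(posts, key=lambda p: p.upvotes, reverse=True)


def rank_by_attribute(posts: list[Post], attribute: str) -> list[Post]:
    """Rank by one attribute score, highest first; a missing score counts as 0."""
    return sorted(posts, key=lambda p: p.scores.get(attribute, 0.0), reverse=True)


# Illustrative data only: the texts and numbers are invented for this sketch.
posts = [
    Post("Title only.", upvotes=950,
         scores={"reasoning": 0.08, "personal_story": 0.03}),
    Post("Why the film works, with examples.", upvotes=120,
         scores={"reasoning": 0.91, "personal_story": 0.22}),
    Post("How a film reminded me of my sister.", upvotes=40,
         scores={"reasoning": 0.35, "personal_story": 0.95}),
]

by_upvotes = [p.text for p in rank_by_upvotes(posts)]
by_reasoning = [p.text for p in rank_by_attribute(posts, "reasoning")]
by_story = [p.text for p in rank_by_attribute(posts, "personal_story")]
```

In this toy data, the short, heavily upvoted post leads the engagement ranking, while sorting on the assumed "reasoning" or "personal_story" score moves the more detailed posts to the top. A real platform would likely need to blend several signals rather than sort on a single score.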

The breakthrough was made possible by recent advances in large language models (LLMs), the type of AI that underpins chatbots like ChatGPT. In the past, even training an AI to detect simple forms of toxicity, like whether a post was racist, required millions of labeled examples. Those older forms of AI were often brittle and ineffectual, not to mention expensive to develop. However, the new generation of LLMs can identify even complex linguistic concepts out of the box, and calibrating them to perform specific tasks is far cheaper than it used to be. Jigsaw’s new classifiers can identify “attributes” like whether a post contains a personal story, curiosity, nuance, compassion, reasoning, affinity, or respect. “It’s starting to become feasible to talk about something like building a classifier for compassion, or curiosity, or nuance,” says Jonathan Stray, a senior scientist at the Berkeley Center for Human-Compatible AI. “These fuzzy, contextual, know-it-when-I-see-it kind of concepts, we’re getting much better at detecting those.”

This new ability could be a watershed for the internet. Green, and a growing chorus of academics who study the effects of social media on public discourse, argue that content moderation is “necessary but not sufficient” to make the internet a better place. Finding a way to boost positive content, they say, could have cascading positive effects both at the personal level, in our relationships with each other, and at the scale of society. “By changing the way that content is ranked, if you can do it in a broad enough way, you might be able to change the media economics of the entire system,” says Stray, who did not work on the Jigsaw project. “If enough of the algorithmic distribution channels disfavored divisive rhetoric, it just wouldn’t be worth it to produce it any more.”

One morning in late March, Tin Acosta joins a video call from Jigsaw’s offices in New York City. On the conference room wall behind her, there is a large photograph from the 2003 Rose Revolution in Georgia, when peaceful protestors toppled the country’s Soviet-era government. Other rooms have similar photos of people in Syria, Iran, Cuba and North Korea “using tech and their voices to secure their freedom,” Jigsaw’s press officer, who is also in the room, tells me. The photos are intended as a reminder of Jigsaw’s mission to use technology as a force for good, and its duty to serve people in both democracies and repressive societies.

On her laptop, Acosta fires up a demonstration of Jigsaw’s new classifiers. Using a database of 380 comments from a recent Reddit thread, the Jigsaw senior product manager begins to demonstrate how ranking the posts using different classifiers would change the sorts of comments that rise to the top. The thread’s original poster had asked for life-affirming movie recommendations. Sorted by the default ranking on Reddit—posts that have received the most upvotes—the top comments are short, and contain little beyond the titles of popular movies. Then Acosta clicks a drop-down menu, and selects Jigsaw’s reasoning classifier. The posts reshuffle. Now, the top comments are more detailed. “You start to see people being really thoughtful about their responses,” Acosta says. “Here’s somebody talking about School of Rock—not just the content of the plot, but also the ways in which the movie has changed his life and made him fall in love with music.” (TIME agreed not to quote directly from the comments, which Jigsaw said were used for demonstrative purposes only and had not been used to train its AI models.)

Acosta chooses another classifier, one of her favorites: whether a post contains a personal story. The top comment is now from a user describing how, under both a heavy blanket and the influence of drugs, they had ugly-cried so hard at Ke Huy Quan’s monologue in Everything Everywhere All at Once that they’d had to pause the movie multiple times. Another top comment describes how a movie trailer had inspired them to quit a job they were miserable with. Another tells the story of how a movie reminded them of their sister, who had died 10 years earlier. “This is a really great way to look through a conversation and understand it a little better than [ranking by] engagement or recency,” Acosta says.

For the classifiers to have an impact on the wider internet, they would require buy-in from the biggest tech companies, which are all locked in a zero-sum competition for our attention. Even though they were developed inside Google, the tech giant has no plans to start using them to help rank its YouTube comments, Green says. Instead, Jigsaw is making the tools freely available for independent developers, in the hopes that smaller online spaces, like message boards and newspaper comment sections, will build up an evidence base that the new forms of ranking are popular with users.

There are some reasons to be skeptical. For all its flaws, ranking by engagement is egalitarian. Popular posts get amplified regardless of their content, and in this way social media has allowed marginalized groups to gain a voice long denied to them by traditional media. Introducing AI into the mix could threaten this state of affairs. A wide body of research shows that LLMs have plenty of ingrained biases; if applied too hastily, Jigsaw’s classifiers might end up boosting voices that are already prominent online, thus further marginalizing those that aren’t. The classifiers could also exacerbate the problem of AI-generated content flooding the internet, by providing spammers with an easy recipe for AI-generated content that’s likely to get amplified. Even if Jigsaw evades those problems, tinkering with online speech has become a political minefield. Both conservatives and liberals are convinced their posts are being censored; meanwhile, tech companies are under fire for making unaccountable decisions that affect the global public square. Jigsaw argues that its new tools may allow tech platforms to rely less on the controversial practice of content moderation. But there’s no getting away from the fact that changing what kind of speech gets rewarded online will always have political opponents.

Still, academics argue that if done carefully and accountably, ranking posts to elevate nuanced, compassionate contributions could counteract some of social media’s most toxic effects without compromising free expression. Only time will tell whether Jigsaw has found a viable way forward.