The Grim Conclusions of the Largest Ever Study of Fake News

(ABC TODAY) “Falsehood flies, and the Truth comes limping after it,” Jonathan Swift once wrote.

It was hyperbole three centuries ago. But it is a factual description of social media, according to an ambitious and first-of-its-kind study published Thursday in Science.

The massive new study analyzes every major contested news story in English across the span of Twitter’s existence (some 126,000 stories, tweeted by 3 million users, over more than 10 years) and finds that the truth simply cannot compete with hoax and rumor. By every common metric, falsehood consistently dominates the truth on Twitter, the study finds: Fake news and false rumors reach more people, penetrate deeper into the social network, and spread much faster than accurate stories.

“It seems to be pretty clear [from our study] that false information outperforms true information,” said Soroush Vosoughi, a data scientist at MIT who has studied fake news since 2013 and who led this study. “And that is not just because of bots. It might have something to do with human nature.”

The study has already prompted alarm from social scientists. “We must redesign our information ecosystem in the 21st century,” write a group of 16 political scientists and legal scholars in an essay also published Thursday in Science. They call for a new drive of interdisciplinary research “to reduce the spread of fake news and to address the underlying pathologies it has revealed.”

“How can we create a news ecosystem … that values and promotes truth?” they ask.

The new study suggests that it will not be easy. Though Vosoughi and his colleagues only focus on Twitter—the study was conducted using exclusive data that the company made available to MIT—their work has implications for Facebook, YouTube, and every major social network. Any platform that regularly amplifies engaging or provocative content runs the risk of amplifying fake news along with it.

Though the study is written in the clinical language of statistics, it offers a methodical indictment of the accuracy of information that spreads on these platforms. A false story is much more likely to go viral than a real story, the authors find. A false story reaches 1,500 people six times quicker, on average, than a true story does. And while false stories outperform the truth on every subject (including business, terrorism and war, science and technology, and entertainment), fake news about politics regularly does best.

Twitter users seem almost to prefer sharing falsehoods. Even when the researchers controlled for every difference between the accounts originating rumors (like whether that person had more followers or was verified), falsehoods were still 70 percent more likely to get retweeted than accurate news.

And blame for this problem cannot be laid with our robotic brethren. From 2006 to 2016, Twitter bots amplified true stories as much as they amplified false ones, the study found. Fake news prospers, the authors write, “because humans, not robots, are more likely to spread it.”

Political scientists and social-media researchers largely praised the study, saying it gave the broadest and most rigorous look so far into the scale of the fake-news problem on social networks, though some disputed its findings about bots and questioned its definition of news.

“This is a really interesting and impressive study, and the results around how demonstrably untrue assertions spread faster and wider than demonstrable true ones do, within the sample, seem very robust, consistent, and well supported,” said Rasmus Kleis Nielsen, a professor of political communication at the University of Oxford, in an email.

“I think it’s very careful, important work,” Brendan Nyhan, a professor of government at Dartmouth College, told me. “It’s excellent research of the sort that we need more of.”

“In short, I don’t think there’s any reason to doubt the study’s results,” said Rebekah Tromble, a professor of political science at Leiden University in the Netherlands, in an email.

What makes this study different? In the past, researchers have looked into the problem of falsehoods spreading online. They’ve often focused on rumors around singular events, like the speculation that preceded the discovery of the Higgs boson in 2012 or the rumors that followed the Haiti earthquake in 2010.

This new paper takes a far grander scale, looking at nearly the entire lifespan of Twitter: every piece of controversial news that propagated on the service from September 2006 to December 2016. But to do that, Vosoughi and his colleagues had to answer a more preliminary question first: What is truth? And how do we know?

It’s a question that can have life-or-death consequences.

“[Fake news] has become a white-hot political and, really, cultural topic, but the trigger for us was personal events that hit Boston five years ago,” said Deb Roy, a media scientist at MIT and one of the authors of the new study.

On April 15, 2013, two bombs exploded near the route of the Boston Marathon, killing three people and injuring hundreds more. Almost immediately, wild conspiracy theories about the bombings took over Twitter and other social-media platforms. The mess of information only grew more intense on April 19, when the governor of Massachusetts asked millions of people to remain in their homes as police conducted a huge manhunt.

“I was on lockdown with my wife and kids in our house in Belmont for two days, and Soroush was on lockdown in Cambridge,” Roy told me. With both men stuck inside, Twitter became their lifeline to the outside world. “We heard a lot of things that were not true, and we heard a lot of things that did turn out to be true” using the service, he said.

The ordeal soon ended. But when the two men reunited on campus, they agreed it seemed silly for Vosoughi, then a Ph.D. student focused on social media, to research anything but what they had just lived through. Roy, his adviser, blessed the project.

Vosoughi made a truth machine: an algorithm that could sort through torrents of tweets and pull out the facts most likely to be accurate. It focused on three attributes of a given tweet: the properties of its author (were they verified?), the kind of language it used (was it sophisticated?), and how a given tweet propagated through the network.

“The model that Soroush developed was able to predict accuracy with a far-above-chance performance,” said Roy. Vosoughi earned his Ph.D. in 2015.

After that, the two men—and Sinan Aral, a professor of management at MIT—turned to examining how falsehoods move across Twitter as a whole. But they were back not only at the “what is truth?” question, but its more pertinent twin: How does the computer know what truth is?

They opted to turn to the ultimate arbiter of fact online: the third-party fact-checking sites. By scraping and analyzing six different fact-checking sites, including Snopes, Politifact, and FactCheck.org, they generated a list of tens of thousands of online rumors that had spread between 2006 and 2016 on Twitter. Then they searched Twitter for these rumors, using Gnip, a proprietary search engine owned by the social network.

Ultimately, they found about 126,000 tweets, which, together, had been retweeted more than 4.5 million times. Some linked to “fake” stories hosted on other websites. Some started rumors themselves, either in the text of a tweet or in an attached image. (The team used a special program that could search for words contained within static tweet images.) And some contained true information or linked to it elsewhere.

Then they ran a series of analyses, comparing the popularity of the fake rumors with the popularity of the real news. What they found astounded them.

Speaking from MIT this week, Vosoughi gave me an example: There are lots of ways for a tweet to get 10,000 retweets, he said. If a celebrity sends Tweet A, and they have a couple million followers, maybe 10,000 people will see Tweet A in their timeline and decide to retweet it. Tweet A was broadcast, creating a big but shallow pattern.

Meanwhile, someone without many followers sends Tweet B. It goes out to their 20 followers—but one of those people sees it, and retweets it, and then one of their followers sees it and retweets it too, on and on until tens of thousands of people have seen and shared Tweet B.

Tweet A and Tweet B both have the same size audience, but Tweet B has more “depth,” to use Vosoughi’s term. It chained together retweets, going viral in a way that Tweet A never did. “It could reach 1,000 retweets, but it has a very different shape,” he said.
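The difference between the two shapes can be made concrete with a toy calculation. The sketch below is only an illustration, not the researchers' code: the function name `cascade_stats`, the user names, and the six-retweet examples are all hypothetical, and it assumes each cascade is a simple tree in which every user retweets at most once. It counts how many retweets a tweet gathered (its size) and how long its longest chain of retweets is (its depth).

```python
from collections import defaultdict


def cascade_stats(edges, root):
    """Return (size, depth) for one retweet cascade.

    edges: (parent, child) pairs, meaning `child` retweeted `parent`.
    size:  number of retweets (the original poster is not counted).
    depth: length of the longest retweet chain starting at `root`.
    Assumes the cascade is a tree (hypothetical simplification).
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    size, depth = 0, 0
    stack = [(root, 0)]  # (user, hops away from the original tweet)
    while stack:
        user, level = stack.pop()
        depth = max(depth, level)
        for child in children[user]:
            size += 1
            stack.append((child, level + 1))
    return size, depth


# Tweet A: a broadcast, where six followers retweet the celebrity directly.
broadcast = [("celebrity", f"fan{i}") for i in range(6)]

# Tweet B: a chain, where each retweet comes from the previous retweeter.
chain = [(f"user{i}", f"user{i + 1}") for i in range(6)]

print(cascade_stats(broadcast, "celebrity"))  # (6, 1)
print(cascade_stats(chain, "user0"))          # (6, 6)
```

Both toy cascades have the same size, six retweets, but the broadcast has a depth of 1 while the chain has a depth of 6, which is the distinction Vosoughi is drawing.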

Here’s the thing: Fake news dominates according to both metrics. It consistently reaches a larger audience, and it tunnels much deeper into social networks than real news does. The authors found that accurate news wasn’t able to chain together more than 10 retweets. Fake news could put together a retweet chain 19 links long and do it 10 times as fast as accurate news put together its measly 10 retweets.

These results proved robust even when they were checked by humans, not bots. Separate from the study, a group of undergraduate students fact-checked a random selection of roughly 13,000 English-language tweets from the same period. They found that false information outperformed true information in ways “nearly ide