Automated Journalism – AI Applications at New York Times, Reuters, and other Media Giants


Artificial intelligence is being used in news media in new ways, from speeding up research to accumulating and cross-referencing data.

This article discusses several examples in which AI is being integrated into the newsroom, and we’ll aim to tackle the following three questions for our business and media industry readers:

  • What new journalism tasks are made possible by AI?
  • Which AI applications are playing a role in augmenting the journalistic process, and which are actually replacing journalists?
  • How are newsrooms using these applications to improve the quality of news media, and how will they affect the future of journalism?

The following examples help to flesh out the directions that AI is taking in journalism and the opportunities its application makes available. With AI use-case examples from eight reputable publications (including The New York Times and The Washington Post), we’ll aim to paint a picture of how journalism is changing, and we hope it’ll help you imagine the future of journalism over the next five years (and how your organization might adopt these tools).

Before examining applications in each specific publication, we’ll start with a high-level overview of the findings from this research.

The New York Times – Semantic Discovery, Comment Monitoring

In 2015 The New York Times implemented its experimental AI project known as Editor. The aim of the project was to simplify the journalistic process. When writing an article, a journalist can use tags to highlight a phrase, a headline, or the main points of the text.

Over time, the computer learns to recognize these semantic tags and to identify the most salient parts of an article. By searching through data in real time and extracting information based on requested categories, such as events, people, locations, and dates, Editor can make information more accessible, simplifying the research process and providing fast and accurate fact checking.

The New York Times is also using AI in a unique approach to moderate reader comments, encourage constructive discussion and eliminate harassment and abuse. Known as a hearty and often stimulating forum, the Times’ comment section is currently moderated by a team of 14 people who are responsible for manually reviewing over 11,000 comments daily. Such a labor-intensive process limits commenting to a mere 10 percent of all the Times’ articles.

Perspective API allows users to sort media comments based on sentiment (the tool refers to negative sentiment as “toxicity”).

But the Times is experimenting with an AI solution that could transform comment moderation and extend the comment feature to more articles, hopefully allowing for cost savings for the Times and more engaging conversation for its readers.

The Perspective API tool developed by Jigsaw (part of Google’s parent company Alphabet) organizes readers’ comments interactively so that viewers can quickly see which ones they may find “toxic” and which may be more illuminating. Viewers can read comments by sliding a bar across the top of the page from left to right. The closer the bar gets to the right, the more toxic the comments become. It’s a useful way for users to read and interact with the comments they are interested in while avoiding more aggressive ones.
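The slider behavior described above can be sketched as a threshold filter over comments that have already been scored. This is a minimal illustration rather than the Times’ or Jigsaw’s implementation: the `Comment` class, the `comments_up_to` function, and the sample scores are all hypothetical, and toxicity is assumed to be a value between 0 and 1 that a model has assigned beforehand.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    toxicity: float  # assumed score in [0, 1]; higher means more toxic


def comments_up_to(comments, slider):
    """Return comments whose toxicity is at or below the slider position,
    ordered from least to most toxic."""
    visible = [c for c in comments if c.toxicity <= slider]
    return sorted(visible, key=lambda c: c.toxicity)


# Hypothetical, hand-assigned scores for illustration only.
sample = [
    Comment("You are an idiot.", 0.92),
    Comment("Thoughtful point about the budget.", 0.05),
    Comment("I disagree, and here is why.", 0.20),
]

for c in comments_up_to(sample, slider=0.5):
    print(f"{c.toxicity:.2f}  {c.text}")
```

Moving the slider to the right corresponds to raising the `slider` argument, which lets progressively more toxic comments into the visible list.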

BBC News Labs – Semantic Discovery

A screenshot of BBC’s Juicer tool.

The BBC is a repository for a vast amount of data, including daily news stories, features, and video, not to mention the archives.

Then there is also data from other news sources, government sources, and the internet. Ideally, all of this data could be linked together in a way that makes it both more accessible and more meaningful. Since 2012, BBC News Labs has been using the data extraction tool Juicer to attempt just that.

The machine watches the RSS feeds of around 850 global news outlets, aggregating and extracting news articles from the BBC and outside sources. It then assigns semantic tags to the stories and organizes them into one of four categories: organizations, locations, people, and things. So if a journalist is looking for the latest stories on President Trump or articles associated with companies in the AI sector, Juicer quickly searches the web and provides a list of related content. Iain Collins of the BBC explains and demos the technology below in 3 minutes:

In the not too distant future, Juicer may also be used to enhance the user experience by creating pop-up news facts when readers hover over certain words. BBC News Labs is also experimenting with adding this capability to video content by overlaying facts on different parts of an image or shot.
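The tagging and lookup steps that Juicer performs can be sketched roughly as follows. This is only an illustration under stated assumptions: the `KNOWN_ENTITIES` table, the function names, and the sample articles are hypothetical, and simple substring matching stands in for whatever entity extraction the BBC actually uses.

```python
from collections import defaultdict

# Hypothetical lookup table mapping known names to the four categories.
KNOWN_ENTITIES = {
    "BBC": "organizations",
    "Reuters": "organizations",
    "London": "locations",
    "Donald Trump": "people",
    "artificial intelligence": "things",
}


def tag_article(text):
    """Return {category: set of entity names found in the text}.

    Uses naive case-insensitive substring matching, so it can over-match.
    """
    tags = defaultdict(set)
    lowered = text.lower()
    for name, category in KNOWN_ENTITIES.items():
        if name.lower() in lowered:
            tags[category].add(name)
    return dict(tags)


def build_index(articles):
    """Map each entity name to the titles of articles that mention it."""
    index = defaultdict(list)
    for title, text in articles.items():
        for names in tag_article(text).values():
            for name in names:
                index[name].append(title)
    return index


# Hypothetical articles for illustration only.
articles = {
    "AI in the newsroom": "The BBC is testing artificial intelligence in London.",
    "Wire update": "Reuters reports on Donald Trump.",
}
index = build_index(articles)
print(index["BBC"])  # titles of articles mentioning the BBC
```

A journalist’s search for a person or company then becomes a lookup in `index` rather than a scan of every article.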

Reuters – Data Visualization

In 2016, Reuters partnered with semantic technology company Graphiq to provide news publishers with a wide range of free interactive data visualizations across a spectrum of topics including entertainment, sports, and news. Publishers can access the data via Reuters Open Media Express. Once embedded on a publisher’s website, the data visualizations are updated in real time.

This is an innovative way for news media publishers to draw an audience and provide data-driven news stories that are visually stimulating and easy to understand. Because Graphiq’s algorithms are constantly constructing and updating visualizations, the tool provides speedy access to data. While not all data requires AI to be visualized and displayed, tools like Graphiq allow publishers to display much richer and more connected information than they could with a simple table or chart.

Data visualization is an efficient way to present readers with complex information in a quick-to-read and easy-to-understand format. The information can range from “Apple Stock Prices” to “President Trump’s Popularity” to “Predictive Analytics for Marketing”, all at the click of a button.

The Washington Post – Automated Journalism

The Post has been experimenting with automated news writing (sometimes referred to as “robot journalism” or simply “automated journalism”) using its Heliograf software. The bot made its debut in the summer of 2016 with coverage of the Rio Olympic Games. Heliograf put together news stories by analyzing data about the games as it emerged.

This information is then matched to relevant phrases in a story template, and the machine adds the information to create a narrative that can be published across different platforms. The software can also alert journalists to any anomalies it finds in the data. This meant that during the Olympics, Heliograf was able to keep up with information relating to scores and medal counts in real time, freeing up journalists so they could work on creating other content.
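The template-matching idea described above can be sketched in a few lines. This is a minimal illustration, not Heliograf’s actual design: the template wording, the field names, the sample result, and the 10 percent anomaly rule are all assumptions made for the example.

```python
from string import Template

# Hypothetical story template; placeholders stand in for fields in a results feed.
STORY = Template(
    "$winner of $country won gold in the $event with a time of $time seconds."
)


def write_story(result):
    """Fill the template with one result record."""
    return STORY.substitute(result)


def find_anomalies(times, tolerance=0.10):
    """Flag times that differ from the mean by more than `tolerance` (a fraction).

    This rule is an assumed stand-in for whatever checks a real system applies.
    """
    mean = sum(times) / len(times)
    return [t for t in times if abs(t - mean) / mean > tolerance]


# Made-up result record for illustration only.
result = {
    "winner": "A. Runner",
    "country": "Exampleland",
    "event": "men's 100m",
    "time": "9.81",
}
print(write_story(result))
print(find_anomalies([9.81, 9.89, 9.91, 7.50]))
```

Values flagged by `find_anomalies` would be routed to a journalist for review rather than published automatically.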

Automated journalism products got their start in more data-grounded domains like sports and finance (see the Yahoo! example below) – where raw data about news events could be transferred into a coherent story, and it seems that The Washington Post’s Heliograf is doing much the same thing.

Yahoo! Sports – Automated Journalism

Much of the initial media coverage about “robot journalism” (two or three years ago) involved sports and finance stories at Yahoo!. Despite the company’s decade-long decline (and recent sale to telecom giant Verizon), Yahoo! still boasts a massive following on its news, finance, and sports media properties.

Automated Insights – a prominent natural language generation vendor – features a case study about its work with Yahoo! Sports. Yahoo! claims that by generating content (articles, reports, emails) with data from specific sports teams (or fantasy sports teams) it is able to kill two birds with one stone:

First, the company draws in readers for longer sessions with customized, rich content (based on sports data).

Second, advertisers eagerly look for engaging material and are willing to spend more on ads that will gain more exposure for more time with more users.

Automated Insights explains the basics of its automated journalism product in the video below:

Readers interested in what implementing Wordsmith is like can refer to our Automated Insights case study with the Associated Press.

It’s worth mentioning that there are a number of applications of natural language generation outside of the domain of publishing. AI vendor Yseop has a product that delivers financial and analytics-oriented reports based purely on the information it is “fed.”

While quickly produced, formulaic content for the masses (e.g., Yahoo! Sports) can lead to great efficiencies, there is also an important market for quickly produced (and error-free) content for a company’s own internal communication and insight. Our natural language generation-focused interview with Yseop’s Matthieu Rauscher explores more of the use-cases and applications for finance and business intelligence.

Associated Press – Semantic Discovery, AI for Analytics, Automated Journalism

NewsWhip’s “Resources” section seems to cater to a non-technical audience.

The Associated Press first began using AI for the creation of news content in 2013 to draw data and produce sports and earnings reports. These days the AP newsroom uses NewsWhip to keep ahead of trending news stories on social media such as Twitter, Facebook, Pinterest, and LinkedIn. NewsWhip’s analytics page advertises the following main capabilities:

  • Competitor benchmarking across all social networks
  • Audience engagement around keywords and verticals
  • Identify influencers impacting brand performance

As well as tracking news stories, it can analyze a real-time or historical period on any timescale between 30 minutes and 3 years and provide reporters with real-time alerts or daily digests. Beyond the added benefits of speed and scope, AI technologies like NewsWhip may increase data accuracy and decrease errors in copy – in addition to giving publishers a greater pulse across their

While we’ve seen sector-agnostic media monitoring applications (which we explored in a recent “AI in Industry” interview with Signal Media’s Chief Data Scientist Dr. Miguel Martinez), it’s not surprising that publishers would develop their own set of unique niche tools. NewsWhip will likely see strong competition in the coming 3-4 years as AI replaces the manual work involved with influencer marketing, research, and competitive analysis.

As with Yahoo! Sports, Automated Insights features a case study about its work with the Associated Press. AP isn’t writing long and thoughtful political commentary with the use of AI alone (and probably won’t be anytime soon), but the company does use Automated Insights’ “Wordsmith” product to turn raw earnings data into articles – which is extremely similar to the use case with Yahoo!.

Quartz Digital News – Chatbot Media Interfaces

Quartz is experimenting with a media and news app that resembles “chat” and uses natural language processing to find articles about the events, people, or topics that its users request.

In 2016, Quartz received a £193,000 grant from the Knight Foundation to set up a Bot Studio to create a set of automated tools for journalists. The move is inspired by the fact that today’s news media has moved not just from print to desktop to mobile phones, but also to other Internet-connected devices for the home and car.

Users are interacting with companies through chat, voice, and other innovative new channels, and Quartz wants to find the cutting edge for how media can be consumed, too.

Though the project is clearly in its infancy, initial Bot Studio experiments have involved a new interface that looks and feels like “chat.” Users text in with questions about news events, people, or places, and the app replies with content that it believes will be relevant for them. It’s uncertain whether this specific application of AI will be adopted for media consumption on a large scale, but it is clear that users will gravitate to low-friction ways of getting the information and entertainment they want, and Quartz would rather make the change proactively than react to it.
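The “text a question, get relevant content” interaction can be sketched with a simple word-overlap ranking. This is a deliberately crude stand-in for the natural language processing Quartz uses, whose details are not described here; the `tokens` and `reply` functions and the sample headlines are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def reply(query, articles, limit=2):
    """Return up to `limit` headlines sharing the most words with the query."""
    query_words = tokens(query)
    scored = []
    for headline in articles:
        overlap = len(query_words & tokens(headline))
        if overlap:
            scored.append((overlap, headline))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [headline for _, headline in scored[:limit]]


# Hypothetical headlines for illustration only.
headlines = [
    "Olympics medal count after day five",
    "Central bank holds interest rates",
    "Olympics opening ceremony highlights",
]
print(reply("What happened at the Olympics?", headlines))
```

A production system would need to handle synonyms, named entities, and recency, none of which this overlap score captures.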

Quartz aims to develop bots and AI in applications that will interface seamlessly with all media platforms. Although Quartz is still evaluating what its next steps will be, one idea is a newsroom bot created to assist journalists in their workflow by improving the way reporters can generate data and produce news stories for new media spaces.

The Guardian – Chatbot Media Interfaces

In 2016, The Guardian launched its chatbot via Facebook. To save users time scrolling through or searching for news stories, the chatbot lets them pick from the US, UK, and Australian versions of Guardian News and choose a 6 am, 7 am, or 8 am delivery time, and it then delivers selected news stories every day via Facebook Messenger.

If a user only wants to catch the headlines and sports news, she can; if she also wants to read trending tech and science news, she can add those too. Much like our Quartz example above, the interface replies to chat messages with content relevant to the user’s query.

A screenshot of Guardian’s chatbot.

Concluding Thoughts on AI Applications in Journalism

Generally speaking, what can be automated will be automated – and we can expect journalism to be no different. However, we don’t predict that the present AI developments in journalism will send the role of the journalist or writer into a tailspin.

Publications that hire teams of people for simple fact-finding or fact-checking tasks (jobs which are typically handled overseas) will likely be able to “replace” those limited, repetitive roles with a system that may be faster and (certainly with time) more cost-effective. Most other jobs at a large publication or in a newsroom will be “augmented” with additional capabilities to gather and manage data.

Whether the newsroom of 2025 will be run mainly by intelligent machines or will be composed of AI and human reporters working together remains to be seen. What is clear for the present is that AI does have a place in the newsroom, saving time and money and increasing speed and efficiency to help human journalists keep up with the ever-expanding scale of global news media. As usual, here at Emerj we’ll be following (and interviewing) the biggest players in the industry, and shedding light on their use-cases so that our business audience can stay ahead of the curve.

Source: SCMP
