The world’s biggest tech companies are in talks with leading media outlets over major deals to use news content to train artificial intelligence (AI) technology.
OpenAI, Google, Microsoft and Adobe have met with news executives in recent months to discuss copyright issues relating to their AI products such as text chatbots and image generators, according to several people briefed on the talks.
Those people said media companies including News Corp, Axel Springer, The New York Times and The Guardian had been in talks with at least one of the technology companies.
Those involved in the discussions, which are in the early stages, added that deals could include paying a subscription fee for content from media organizations in order to develop the technology that powers chatbots such as OpenAI’s ChatGPT and Google’s Bard.
The talks come as media groups express concern about the threat to the industry posed by the rise of AI, as well as fears about OpenAI and Google’s use of their content without existing agreements. Some companies, such as Stability AI and OpenAI, face lawsuits from artists, photo agencies and programmers, alleging breach of contract and copyright infringement.
In May, at the INMA media conference, News Corp chief executive Robert Thomson summed up the industry’s outrage by saying that “collective intellectual property [of the media] is threatened, and so we must argue vehemently for compensation”.
He added that AI was “designed so that the reader never visits a journalism website, fatally undermining that journalism”.
A deal would set the blueprint for news organizations to do business with generative AI companies around the world.
“Copyright is a crucial issue for all publishers,” said the Financial Times, which is also discussing the issue.
“As a subscription company, we need to protect the value of our journalism and our business model. Engaging in constructive dialogue with relevant companies, as we are doing, is the best way to achieve this.”
Media industry executives want to avoid the mistakes of the early internet age, when many offered articles online for free, a move that ended up undermining their business models. Big tech groups like Google and Facebook have tapped into this information to help build multibillion-dollar online advertising businesses.
As the popularity of generative AI has grown, so have concerns in the news industry, given the technology’s ability to produce compelling snippets of human-like text.
Google recently announced a generative search function, which returns a box of AI-written information above its traditional web links. It has been released in the United States and is being prepared for a worldwide release.
Some discussions currently involve trying to find a pricing model for news content used as training data for AI models. One figure discussed by publishers is US$5 million (R$24.1 million) to US$20 million (R$96.6 million) per year, according to an industry executive.
Mathias Döpfner, chief executive of Axel Springer, which owns the Politico website and has met with leading AI companies (Google, Microsoft and OpenAI), said his first choice would be to create a “quantitative” model similar to that developed by the music industry, where radio stations, nightclubs and streaming services pay record companies each time a track is played. This would first require AI companies to disclose their use of media content, which they currently do not.
Döpfner, whose Berlin-based media company also owns the German tabloid Bild and the newspaper Die Welt, said an annual deal for unlimited use of a media outlet’s content would be a “second best option” because that model would be harder for smaller regional businesses or local news outlets to take advantage of.
“We need an industry-wide solution,” said Döpfner. “We have to work together on this.”
Google has been leading negotiations with UK media, meeting with the Guardian and News UK. The Alphabet-owned company has long-standing partnerships with various media organizations to use content data, such as articles, to ensure it is optimized to appear in its search engine.
The company used the data to train its large language models, according to two people briefed on the arrangement.
“Google has put a licensing deal on the table,” said an executive at a news group. “They have accepted the principle that there has to be payment […], but we didn’t get to the point of talking about numbers. They recognized that there is a conversation about money that we need to have in the coming months, which is the first step.”
After this article was first published, Google said the media executive’s comment about a potential licensing deal “is not accurate. It’s too soon and we’re continuing to work with the ecosystem, including news publishers, to hear their ideas”.
Google does not comment on financial discussions. However, the internet search firm said it is having “ongoing conversations” with news outlets large and small in the US, UK and Europe, and has already trained its AI on “publicly available information” which may include sites with paid access.
The Silicon Valley giant added that it is considering the option of giving publishers more “choice and control” over whether their content would be part of an AI training dataset, as well as allowing websites to opt out of their content being used in searches.
Since launching ChatGPT in November, OpenAI chief Sam Altman has met with News Corp and The New York Times, according to people familiar with the discussions. The company acknowledged that it has held conversations with publishers and publishing associations around the world about how they could work together.
Developing a financial model for using news content to train AI will be extremely difficult, say editorial leaders. A senior executive at a major US publisher said the newspaper industry was working backwards because technology companies released these products without consulting it.
“There was no discussion, and now we have to try to get paid after it happened,” said the executive. “The way they released these products, the total secrecy, the fact that there was no transparency, no communication before it happened, there’s reason to be very pessimistic.”
Media analyst Claire Enders said negotiations are “very complicated at the moment”, adding that with each organization taking its own approach, a single commercial deal for media groups is unlikely and could be counterproductive.
Enders added, “Chatbots will not be reliable tools if they are literally trained basically in the sewers of misogyny and racism that make up most open and accessible text.”
The tech companies building the AI are keen to focus on its usefulness to drive newsroom efficiency and improve journalism, and are content to pay millions to preserve long-standing relationships with the industry, said people involved in the talks.
Microsoft vice chair and president Brad Smith said the company was “in the early days of conversations with the media and publishers, and part of that is just helping everyone learn how models are trained.”
“I think our biggest opportunity is really working with publishers first to think about how they can use AI to drive more revenue,” he added.
Adobe chief executive Shantanu Narayen said he has had meetings with Disney, Sky and the UK’s Daily Telegraph in recent weeks to discuss how Adobe might develop custom models for companies to use its generative AI for images.
Adobe’s model is trained on images from its own image library, as well as openly licensed and public domain content whose copyright has expired. Narayen said bespoke deals and pricing would depend on the company, but customers could add their proprietary content to the tool.
Axel Springer’s Döpfner expressed optimism that there will be deals, because both media organizations and policy makers have understood the scale of the challenge more quickly than during the last great wave of technology disruption.
AI companies know regulation is coming and are afraid of it, he said. “It is in all parties’ interest to find a solution for a healthy ecosystem. If there is no incentive to create intellectual property, there is nothing to start with. Artificial intelligence will become artificial stupidity.”
Translated by Luiz Roberto M. Gonçalves
Cristina Criddle, Madhumita Murgia, Daniel Thomas, Anna Nicolaou and Laura Pitel