As news publishers ink deals with AI companies to train their models with news stories, the price businesses like OpenAI are willing to pay for copyrighted information is coming to light.

The Information reports that OpenAI offers between $1 million and $5 million a year to license copyrighted news articles to train its AI models. That’s one of the first indications of how much AI companies plan to pay for licensed material. It sits alongside a recent report saying Apple is looking to partner with media companies to use content for AI training and is offering at least $50 million over a multiyear period for data. The Verge reached out to OpenAI for comment on the numbers.

The numbers appear roughly similar to some earlier non-AI licensing deals. When Meta launched the Facebook News tab (since discontinued in Europe), it allegedly offered up to $3 million a year to license news stories, headlines, and previews. But it’s not clear whether the total payouts would equal some of the bigger numbers we’ve seen. Google announced in 2020 that it would invest $1 billion in total to partner with news organizations, for instance. Under pressure from a new law, Google also recently agreed to pay Canadian publishers a total of $100 million annually in exchange for linking to their articles.

Today’s large language models have, insofar as we know what’s in their training data, mainly been trained on information from the internet. While some AI models do not disclose how they got their training data, information is often available on which datasets or web crawlers were used. Pricing for training datasets varies by provider, size, and the content of a dataset.
Full exclusive: OpenAI’s news publisher deals reportedly top out at $5 million a year.