I’m with the content creators on this one. One could argue that having an AI model scoop up copyrighted material for training is fine as long as the model doesn’t actually copy it. In that sense it would be no different from a person reading a book or listening to a song and using that knowledge to inform some future work. But nothing prevents an AI model from using the material in a way that violates the copyright. At the very least, it seems the tech companies should pay content creators for training on their work, or compensate them whenever the model draws on it later.
Late last year, the U.K. government kicked off a consultation on proposals that would give tech giants and AI labs like OpenAI a legally sound way of using copyrighted content to train their foundation models.
Under the proposals, artists would have to opt out of having their copyright-protected works scraped by large language models. LLMs like OpenAI’s GPT-4 and Google’s Gemini rely on huge amounts of data to generate humanlike responses in the form of text, images, video and audio.
This led to concerns from the U.K.’s creative industries, as it would place the onus on content creators to request that their work not be used to train AI models, which, they argue, would amount to giving their valuable work away.