According to a report from The Verge, The New York Times (NYT) has implemented new measures to prevent its content from being used to develop artificial intelligence (AI) models. Recent updates to the publication’s Terms of Service, first reported by Adweek, explicitly prohibit the use of any of its content (text, photographs, images, audio/video clips, “look and feel,” metadata, or compilations) for training machine learning or AI systems.
The revised terms also impose strict rules on automated tools, such as website crawlers, that access, use, or collect NYT content: such tools may no longer be employed without prior written consent from the publication. To underscore the seriousness of compliance, the NYT has indicated that violations of the new guidelines may result in unspecified fines or penalties.
Notably, despite these updates, there have been no apparent changes to the NYT’s robots.txt file, which tells search engine crawlers which URLs they may access.
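For context, a publisher that did want to block AI-training crawlers at the robots.txt level could list their user-agents there. The entries below are a hypothetical illustration (GPTBot and CCBot are real crawler identifiers used by OpenAI and Common Crawl, respectively), not a reflection of the NYT’s actual file at the time of the report:

```
# Hypothetical robots.txt entries blocking known AI-training crawlers.
# Not present in the NYT's robots.txt when this article was written.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Because robots.txt is advisory rather than enforceable, the Terms of Service update gives the publication a contractual lever that the exclusion protocol alone does not.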
Observers speculate that this proactive step by the NYT could be a response to recent changes in Google’s privacy policy. The tech giant’s updated policy disclosed that it may collect publicly available data from the internet to improve AI services such as Bard and Cloud AI. The concern is that many large language models, including those behind popular services like OpenAI’s ChatGPT, may have been trained on vast datasets containing copyrighted or otherwise protected content scraped from the internet without authorization from the original creators.
This move by The New York Times underlines the growing awareness of the implications of using copyrighted materials to develop and train AI systems. As the AI landscape continues to evolve, stakeholders are navigating the delicate balance between technological innovation and intellectual property rights.