Australian books and plays used to train AI systems

28 September, 2023

Earlier this week, The Atlantic published a search tool that lets authors check whether their books appear in a dataset that has been used to train generative AI systems. The search tool can be accessed here.
 
The dataset, ‘Books3’, contains approximately 183,000 pirated books, plays and other literary works. Generative AI systems have been developed or trained on these works, which were scraped or copied without the permission of their authors. Organisations including Meta, EleutherAI and Bloomberg have all used the dataset to train their language models.
 
A number of published plays written by Guild members are contained in the Books3 dataset.
 
The US Authors Guild has recently filed a class action for copyright infringement against ChatGPT creator OpenAI over its use of pirated book datasets. Class actions brought by authors are also pending against Meta and Google.
 
If your work has been used in the Books3 dataset without your permission, please contact us at [email protected].
 
You can also read the Authors Guild’s advice on the next steps you can take here.
You can read the Guild’s Position Paper on AI in performance and interactive writing here.

Know another writer who would benefit from these resources? Your Guild is here to support them and ensure their rights are protected. Forward this email to them and let them know they can get industrial advice and access all our rates and contracts by joining here: awg.com.au/register.