AI chatbots need more books to learn from. These libraries are opening their stacks.
Briefly

Harvard University is releasing nearly one million books dating back to the 15th century, alongside old newspapers and government documents, to aid AI researchers. The effort comes as tech companies face legal challenges over the unauthorized use of copyrighted works for AI training. By using public domain materials, these companies can access rich cultural, historical, and linguistic data while reducing legal risk. Supported by gifts from Microsoft and OpenAI, the initiative aims to improve AI's grasp of human knowledge and to benefit libraries and their communities.
It is a prudent decision to start with public domain data because that's less controversial right now than content that's still under copyright.
We're trying to move some of the power from this current AI moment back to these institutions.
Libraries also hold significant amounts of interesting cultural, historical and language data that's missing from the past few decades of online commentary.
Supported by unrestricted gifts from Microsoft and ChatGPT maker OpenAI, the Harvard-based Institutional Data Initiative is working with libraries.
Read at Boston.com