Search results
The world’s largest open-source open-data library. Mirrors Sci-Hub, Library Genesis, Z-Library, and more.
Removing overlap (deduplication) Text and metadata extraction. Support long-term archival of human knowledge, while getting better data for your model! Contact us to discuss how we can work together. The world’s largest open-source open-data library. Mirrors Sci-Hub, Library Genesis, Z-Library, and more.
🏛️ Long-term archive. The datasets used in Anna’s Archive are completely open, and can be mirrored in bulk using torrents. Learn more…
The world’s largest open-source open-data library. Mirrors Sci-Hub, Library Genesis, Z-Library, and more.
Anna's Archive is a search engine for shadow libraries created by the pseudonymous Anna. It was founded in direct response to law enforcement efforts to close down Z-Library in 2022 . It describes itself as a project that aims to "catalog all the books in existence" and to "track humanity's progress toward making all these books easily ...
What is Anna’s Archive? § Anna’s Archive is a non-profit project with two goals: Preservation: Backing up all knowledge and culture of humanity. Access: Making this knowledge and culture available to anyone in the world. All our code and data are completely open source.
Files mirrored by Anna’s Archive: 8,641,738 (87.534%) Last updated: 2024-07-02; Torrents by Anna’s Archive; Example record on Anna’s Archive; Main website; Digital Lending Library; Metadata documentation (most fields) Scripts for importing metadata; Anna’s Archive Containers format
AAC (Anna’s Archive Container) is a single item consisting of metadata, and optionally binary data, both of which are immutable. It has a globally unique identifier, called AACID. Collection. Each AAC belongs to a collection, which by definition is a list of AACs that are semantically consistent. That means that if you make a significant change to the format of the metadata, then you have to create a new collection.
Jul 7, 2024 · Datasets - Anna’s Archive. 📚 The largest truly open library in human history. ⭐️ We mirror Sci-Hub and LibGen. We scrape and open-source Z-Lib, DuXiu, and more. 📈 35,495,093 books, 103,135,237 papers — preserved forever. All our code and data are completely open source. Learn more….
Mar 22, 2024 · Datasets Z-Library scrape. If you are interested in mirroring this dataset for archival or LLM training purposes, please contact us. Z-Library has its roots in the Library Genesis community, and originally bootstrapped with their data.