Google's book scans are not of archival quality

Suber, Peter

Published February 17, 2006 | https://doi.org/10.63485/mqww6-aa739

Jim Jacobs, Thoughts on Google Book Search, Diglet, February 16, 2006. Excerpt:

Yesterday, I went to the Stanford EE Computer Systems Colloquium to hear Daniel Clancy, the Engineering Director for the Google Book Search Project....Clancy mentioned that Google was NOT going for archival quality (indeed COULD not) in their scans and were ok with skipped pages, missing content and less than perfect OCR -- he mentioned that the OCR process AVERAGED one word error per page of every book scanned!. The key point that I took away from this is that Google book project IS NOT an alternative to library/archive/archival/preservation scans....When I asked if there would be links to libraries on ALL results pages, he hemmed and hawed a bit and wouldn't say one way or the other. He mentioned about the difference between the publisher-supplied content and the library-supplied content and seemed to hint that the publisher-supplied content is subject to stricter licensing agreements....92% of the world's books are not generating revenues for copyright holders or publishers!...Someone asked what had surprised him the most since he started. One thing he was surprised about was that about 70% of the book project use was coming from India.

Additional details

UUID: 47b83524-21d8-4726-9934-f97618e16cd7
GUID: tag:blogger.com,1999:blog-3536726.post-114018727677765746
URL: https://legacy.earlham.edu/~peters/fos/2006/02/googles-book-scans-are-not-of-archival.html

Issued: 2006-02-17T14:33:00Z
Updated: 2006-02-17T14:41:16Z
