Google's book scans are not of archival quality
Creators & Contributors
Jim Jacobs, Thoughts on Google Book Search, Diglet, February 16, 2006. Excerpt:
Yesterday, I went to the Stanford EE Computer Systems Colloquium to hear Daniel Clancy, the Engineering Director for the Google Book Search Project.... Clancy mentioned that Google was NOT going for archival quality (indeed COULD not) in their scans and were ok with skipped pages, missing content and less than perfect OCR -- he mentioned that the OCR process AVERAGED one word error per page of every book scanned! The key point that I took away from this is that Google book project IS NOT an alternative to library/archive/archival/preservation scans.... When I asked if there would be links to libraries on ALL results pages, he hemmed and hawed a bit and wouldn't say one way or the other. He mentioned about the difference between the publisher-supplied content and the library-supplied content and seemed to hint that the publisher-supplied content is subject to stricter licensing agreements.... 92% of the world's books are not generating revenues for copyright holders or publishers!... Someone asked what had surprised him the most since he started. One thing he was surprised about was that about 70% of the book project use was coming from India.
Additional details
Description
Jim Jacobs, Thoughts on Google Book Search, Diglet, February 16, 2006.
Identifiers
- UUID
- 47b83524-21d8-4726-9934-f97618e16cd7
- GUID
- tag:blogger.com,1999:blog-3536726.post-114018727677765746
- URL
- https://legacy.earlham.edu/~peters/fos/2006/02/googles-book-scans-are-not-of-archival.html
Dates
- Issued
- 2006-02-17T14:33:00Z
- Updated
- 2006-02-17T14:41:16Z