How well do search engines index the OA repositories?
Creators
Frank McCown and three co-authors, Search Engine Coverage of the OAI-PMH Corpus, IEEE Internet Computing, March/April 2006.
Abstract: The major search engines are competing to index as much of the Web as possible. Having indexed much of the surface Web, search engines are now using a variety of approaches to index the deep Web. At the same time, institutional repositories and digital libraries are adopting the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to expose their holdings, some of which are indexed by search engines and some of which are not. To determine how much of the current OAI-PMH corpus search engines index, we harvested nearly 10M records from 776 OAI-PMH repositories. From these records we extracted 3.3M unique resource identifiers and then conducted searches on samples from this collection. Of this OAI-PMH corpus, Yahoo indexed 65%, followed by Google (44%) and MSN (7%). Twenty-one percent of the resources were not indexed by any of the three search engines.
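The sampling approach the abstract describes (reduce the harvested records to unique resource identifiers, draw a sample, check each search engine) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `estimate_coverage`, the `"none"` key, the example URLs, and the stand-in `engine_a`/`engine_b` lookups are all hypothetical, and a real study would replace the lookups with actual search-engine queries.

```python
import random
from typing import Callable, Dict, Iterable, List


def estimate_coverage(
    identifiers: Iterable[str],
    engines: Dict[str, Callable[[str], bool]],
    sample_size: int,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate the fraction of unique identifiers each engine has indexed.

    `engines` maps an engine name to a hypothetical lookup that returns
    True when that engine has the identifier indexed.
    """
    if not engines:
        raise ValueError("at least one engine is required")

    # Deduplicate first: many harvested records can point at one resource.
    unique: List[str] = sorted(set(identifiers))
    if not unique:
        raise ValueError("no identifiers to sample")

    # Draw a reproducible random sample without replacement.
    rng = random.Random(seed)
    sample = rng.sample(unique, min(sample_size, len(unique)))

    # Query each engine once per sampled identifier.
    hits = {name: [check(url) for url in sample] for name, check in engines.items()}

    coverage = {name: sum(flags) / len(sample) for name, flags in hits.items()}

    # Share of sampled identifiers that no engine has indexed.
    missed = sum(1 for row in zip(*hits.values()) if not any(row))
    coverage["none"] = missed / len(sample)
    return coverage


# Made-up identifiers, with duplicates to exercise the deduplication step.
urls = [f"http://repo.example.org/item/{i}" for i in range(100)]
urls += urls[:10]

# Stand-in lookups (assumptions for illustration only).
fake_engines = {
    "engine_a": lambda u: int(u.rsplit("/", 1)[1]) % 2 == 0,
    "engine_b": lambda u: int(u.rsplit("/", 1)[1]) % 5 == 0,
}

result = estimate_coverage(urls, fake_engines, sample_size=100)
```

Because each lookup runs once per sampled identifier, the per-engine figures and the "not indexed by any" figure come from the same set of responses.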
Additional details
Identifiers
- UUID
- a413450f-726f-4e40-8509-57dc72311ca0
- GUID
- tag:blogger.com,1999:blog-3536726.post-114187436182976780
- URL
- https://legacy.earlham.edu/~peters/fos/2006/03/how-well-do-search-engines-index-oa.html
Dates
- Updated
- 2006-03-09T03:19:21Z
- Issued
- 2006-03-09T03:15:00Z