Rogue Scholar Posts

Published in iPhylo

Some quick notes on possibilities for text-mining BHL (in rough order of priority). Any text-mining would have to be robust to OCR errors. I've created a group of OCR-related papers on Mendeley.

Published in iPhylo

Anyone who works with taxonomic databases knows that they contain errors. Some taxonomic databases are restricted in scope to a particular taxon in which one or more people have expertise; these then get aggregated into larger databases, which may in turn be aggregated by databases whose scope is global. One consequence of this is that errors in one database can be propagated through many other databases.

Published in iPhylo

When I think of the Biodiversity Heritage Library (BHL) or GBIF I tend to think of taxonomy and biodiversity. Folk wisdom has it that BHL is full of old books, mostly pre-1923. Great for finding old taxonomic names, or nice artwork, but not exactly "modern" biology. GBIF is mainly about displaying organism distributions based on museum specimens, the primary data of taxonomic research.

Published in iPhylo

Following on from exploring links between GBIF and GenBank here I'm going to look at links between GBIF and the primary literature, in this case articles scanned by the Biodiversity Heritage Library (BHL). The OCR text in BHL can be mined for a variety of entities. BHL itself has used uBio's tools to identify taxonomic names in the OCR text, and in my BioStor project I've extracted article-level metadata and geographic co-ordinates.

Published in iPhylo

Here are some quick notes on how BHL could use Mendeley as a "CiteBank". As a repository of bibliographic data: if the goal is to assemble a "bibliography of life" then there are various ways this could be done. Taxon-specific bibliographies: create groups that are taxon-specific (or find existing groups in Mendeley).

Published in iPhylo

I've recently updated my database of links between animal taxonomic names and literature identifiers, which now has over 280,000 names linked to some form of identifier (127,000 of these being DOIs). You can see the current version at http://iphylo.org/~rpage/itaxon/ . As an experiment I've added a feature to list the number of names for each journal.

Published in iPhylo

Following on from my earlier post "Linking taxonomic names to literature: beyond digitised 5×3 index cards", I've been slowly updating my latest toy: http://iphylo.org/~rpage/itaxon . This site displays a database mapping over 200,000 animal names to the primary literature, using a mix of identifiers (DOIs, Handles, PubMed, URLs) as well as links to freely available PDFs where they are available.

Published in iPhylo

David King et al.'s paper "Towards the bibliography of life" (http://dx.doi.org/10.3897/zookeys.150.2167) has just appeared in a special issue of ZooKeys. I've written a number of posts on this topic, so I've a few comments. King et al. survey some of the issues, but don't really tackle the big issue of how we're going to build this.

Published in iPhylo

Browsing EOL I stumbled upon the recently described fish Protoanguilla palau, shown below in an image by rairaiken2011. Two things struck me. The first is that the EOL page for this fish gives absolutely no clue as to where you would go to find out more about it (apart from an unclickable link to the Wikipedia page http://en.wikipedia.org/wiki/Protoanguilla - seriously, a link that isn't clickable?), despite the fact this fish