Messages de Rogue Scholar

language
Publié in iPhylo

On eof the things BioNames will need to do is match taxon names to classifications. For example, if I want to display a taxonomic hierarchy for the user to browse through the names, then I need a map between the taxon names that I've collected and one or more classifications. The approach I'm taking is to match strings, wherever possible using both the name and taxon authority.

Publié in iPhylo

Quick note to self about possible way to using fuzzy matching when searching for taxonomic names. Now that I'm using Cloudant to host CouchDB databases (e.g., see BioStor in the the cloud) I'd like to have a way to support fuzzy matching so that if I type in a name and misspelt it, there's a reasonable chance I will still find that name. This is the "did you mean?" feature beloved by Google users.

Publié in iPhylo

In Arthur C. Clarke's short story The Nine Billion Names of God Tibetan monks hire two programmers to help them generate all the the possible names of God. The monks believe that the purpose of the Universe is to generate those names, once that goal is achieved the Universe will end.

Publié in iPhylo

Google Refine is an elegant tool for data cleaning. One of its most powerful features is the ability to call "Reconciliation Services" to help clean data, for example by matching names to external identifiers. Google Refine comes with the ability to use Freebase reconciliation services, but you can also add external services.

Publié in iPhylo

I've recently updated my database of links between animal taxonomic names and literature identifiers, which now has over 280,000 names linked to some form of identifier (127,000 of these being DOIs). You can see the current version here:http://iphylo.org/~rpage/itaxon/As an experiment I've added a feature to list the number of names for each journal.

Publié in iPhylo

As part of my Quixotic attempt to construct a wiki of taxonomic names, I'm building a database of names and links. My current plan is to seed this with the NCBI taxonomy. What I want to do is flesh out the NCBI taxonomy with authorities and links to the original literature. At the moment the NCBI taxonomy is almost "nude", lacking links to the literature behind the names.