Postagens de Rogue Scholar

language
Publicados in iPhylo

Note to self for upcoming discussion with JournalMap.As of Monday August 25th, BioStor has 106,617 articles comprising 1,484,050 BHL pages. From the full text for these articles, I have extracted 45,452 distinct localities (i.e., geotagged with latitude and longitude). 15,860 BHL pages in BioStor pages have at least one geotag, these pages belong to 5,675 BioStor articles.In summary, BioStor has 5,675 full-text articles that are geotagged.

Publicados in iPhylo

The new look Biodiversity Heritage Library has just launched. It's a complete refresh of the old site, based on the Biodiversity Heritage Library–Australia site. If you want an overview of what's new, BHL have published a guide to the new look site. Congrats to involved in the relaunch.One of the new features draws on the work I've been doing on BioStor.

Publicados in iPhylo

Quick note on an experimental version of BioStor that is (mostly) hosted in the cloud. BioStor currently runs on a Mac Mini and uses MySQL as the database. For a number of reasons (it's running on a Mac Mini and my knowledge of optimising MySQL is limited) BioStor is struggling a bit.

Publicados in iPhylo

If we are ever going to link biodiversity data together we need to have some way of ensuring persistent links between digital records. This isn't going to happen unless people take persistent identifiers seriously.I've been trying to link specimen codes in publications to GBIF, with some success, so imagine my horror when it started to fall apart.

Publicados in iPhylo

Following on from exploring links between GBIF and GenBank here I'm going to look at links between GBIF and the primary literature, in this case articles scanned by the Biodiversity Heritage Library (BHL). The OCR text in BHL can be mined for a variety of entities. BHL itself has used uBio's tools to identity taxonomic names in the OCR text, and in my BioStor project I've extracted article-level metadata and geographic co-ordinates.

Publicados in iPhylo

Recently I've been thinking about the best ways to make article-level metadata from BioStor more widely available. For example, for someone visiting the BHL site there is no easy way to find articles, which are the basic unit for much of the scientific literature. How hard would it be to add articles to BHL?