Rogue Scholar Posts

Published in iPhylo

Note to self. The challenge of finding specimen citations in papers keeps coming around. It seems that this is basically the same problem as finding citations to papers, and can be approached in much the same way. If you want to build a database of references from scratch, one way is to scrape citations from papers (e.g., from the "literature cited" section), convert those strings into structured data, and add those to your database.
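To make the "strings into structured data" step concrete for specimens, here is a minimal sketch in Python. It assumes specimen citations look like an uppercase institution acronym followed by a catalogue number (e.g., "ZFMK 188762"); the pattern, the function name `find_specimen_codes`, and the field names are illustrative assumptions, and real specimen citations are far messier than this.

```python
import re

# Assumed shape of a specimen code: 2-6 uppercase letters (institution
# acronym), an optional space or hyphen, then a catalogue number of at
# least three digits. This is a simplification for illustration only.
SPECIMEN_CODE = re.compile(r"\b([A-Z]{2,6})[ -]?(\d{3,})\b")


def find_specimen_codes(text):
    """Return a list of dicts, one per candidate specimen code in text."""
    return [
        {"institution": m.group(1), "catalog_number": m.group(2)}
        for m in SPECIMEN_CODE.finditer(text)
    ]


if __name__ == "__main__":
    sample = "The holotype ZFMK 188762 and paratype MVZ-12345 were examined."
    for record in find_specimen_codes(sample):
        print(record)
```

A pattern this loose will also match things that are not specimens (grant numbers, sequence accessions), so in practice the candidates would need checking against a list of known institution codes.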

Published in iPhylo

Garnett et al. recently published a paper in PLoS Biology that starts with the sentence "Lists of species matter". This paper (one of a forthcoming series) is pretty much the kind of paper I try and avoid reading.

Published in Henry Rzepa's Blog

Another occasional conference report (day 1). So why is one about “persistent identifiers” important, and particularly to the chemistry domain? The PID most familiar to most chemists is the DOI (digital object identifier). In fact there are many; some 60 types have been collected by ORCID (themselves purveyors of researcher identifiers). They sometimes even have different names;

Published in iPhylo

Given that it's the start of a new year, and I have a short window before teaching kicks off in earnest (and I have to revise my phyloinformatics course), I'm playing with a few GBIF-related ideas. One topic which comes up a lot is annotating and correcting errors. There has been some work in this area [1][2] but it strikes me as somewhat complicated.

Published in iPhylo

I've been banging on about having citable, persistent identifiers for specimens, so was suitably impressed when Derek Sikes posted a comment on iPhylo that Arctos already does this. For example, here is a DOI for a specimen: http://dx.doi.org/10.7299/X7VQ32SJ. So, we're all done, right? Not quite.
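As a small illustration of working with a specimen DOI like the one above, the following Python sketch pulls the bare DOI out of a resolver URL and rebuilds a link via the doi.org resolver. The function names `doi_from_url` and `resolver_url` are hypothetical, and the regular expression is a rough approximation of DOI syntax, not a full validator.

```python
import re
from urllib.parse import urlparse, unquote

# Rough check for a DOI: "10.", a 4-9 digit registrant prefix, a slash,
# then a non-empty suffix. An approximation, not a complete DOI grammar.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_from_url(url):
    """Extract the DOI from a resolver-style URL, or return None."""
    path = unquote(urlparse(url).path).lstrip("/")
    return path if DOI_PATTERN.match(path) else None


def resolver_url(doi):
    """Build a link to the DOI through the doi.org resolver."""
    return "https://doi.org/" + doi


if __name__ == "__main__":
    doi = doi_from_url("http://dx.doi.org/10.7299/X7VQ32SJ")
    print(doi)
    print(resolver_url(doi))
```

Storing the bare DOI rather than the full URL keeps the identifier independent of whichever resolver hostname happens to be in use.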

Published in iPhylo

Continuing the theme of trying to map specimens cited in the literature to the equivalent GBIF records, consider the GBIF record http://data.gbif.org/occurrences/685591320, which according to GBIF is specimen "ZFMK 188762" (a [sic] holotype of Praomys hartwigi). This is odd, because the original publication of this name (Eisentraut, M. 1968. Beitrag zur Saugetierfauna von Kamerun.

Published in iPhylo

Based on recent discussions my sense is that our community will continue to thrash the issue of identifiers to death, repeating many of the debates that have gone on (and will go on) in other areas. To be trite, it seems to me we have three criteria: cheap, resolvable, and persistent. We get to pick two. Cheap and resolvable means URLs, which everybody is nervous about because they break.

Published in iPhylo

One reason I'm pursuing the theme of specimen identifiers (and identifiers in general) is the central role they play in annotating databases. To give a concrete example, I (among others) have argued for a wiki-style annotation layer on top of GenBank to capture things such as sequencing errors, updated species names, etc. Annotation is a lot easier if we have consistent identifiers for the things being annotated.