Rogue Scholar Posts

Published in iPhylo

Two ongoing challenges in biodiversity informatics are getting data into a form that is usable, and linking that data across different projects and platforms. A recent and interesting approach to this problem is the "data journal", as exemplified by the Biodiversity Data Journal. I've been exploring some data from this journal that has been aggregated by GBIF and EOL, and have come across a few issues.

Published in iPhylo

I spent last Friday and Saturday at Research in the 21st Century: Data, Analytics and Impact (hashtag #ReCon_15) in Edinburgh. Friday 19th was conference day, followed by a hackday at CodeBase. There's a Storify archive of the tweets so you can get a sense of the meeting. Sitting in the audience, a few things struck me. No identifier wars: DOIs have won and are everywhere.

Published in iPhylo

This is a quick writeup of an analysis I did to make the case that the list of names held by the Index of Organism Names (ION) (part of Thomson Reuters) would be very useful for GBIF. I must declare a bias, in that I've spent a good chunk of the last 3-4 years exploring the ION database and investigating ways to link the taxonomic names it contains to the primary taxonomic literature, culminating in building BioNames.

Published in iPhylo

Playing with the "material examined" tool I've been working on, I wondered whether I could make use of it in, say, a spreadsheet. Imagine that I have a spreadsheet of museum codes and want to look those up in GBIF. I could create a service for Open Refine, but Open Refine is a bit big and clunky: you have to fire up a Java application and point your browser at it, and it isn't as intuitive or as flexible as a spreadsheet.

Published in iPhylo

The six finalists for the GBIF Ebbe Nielsen Challenge have been announced by GBIF. The finalists each receive a €1,000 prize, and now have the possibility to refine their work and compete for the grand prize of €20,000 (€5,000 for second place). As the rather cheesy quote above suggests, I think the challenge has been a success in terms of the interest generated, and the quality of the entrants.

Published in iPhylo

I've put together a working demo of some code I've been working on to discover GBIF records that correspond to museum specimen codes. The live demo is at http://bionames.org/~rpage/material-examined/ and the code is on GitHub. To use the demo, simply paste in a specimen code (e.g., "MCZ 24351") and click Find, and it will do its best to parse the code, then go off to GBIF and see what it can find.

Published in iPhylo

The GBIF Ebbe Nielsen Challenge has closed and we have 23 submissions for the jury to evaluate. There's quite a range of project types (and media, including sound and physical objects), and it's going to be fascinating to evaluate all the entries (some of which are shown below). This is the first time GBIF has run this challenge, so it's gratifying to see so much creativity in response to the challenge.

Published in iPhylo

Below I sketch what I believe is a straightforward way GBIF could tackle the issue of annotating and cleaning its data. It continues a series of posts on this topic: "Annotating GBIF: some thoughts", "Rethinking annotating biodiversity data", and "More on annotating biodiversity data: beyond sticky notes and wikis". Let's simplify things a little and state that GBIF at present is essentially an aggregation of Darwin Core Archive files.

Published in iPhylo

Each year about this time, as I ponder what to devote my time to in the coming year, I get exasperated and frustrated that each year will be like the previous one, and biodiversity informatics will seem no closer to getting its act together. Sure, we are putting more and more data online, but we are no closer to linking this stuff together, or building things that people can use to do cool science with.