Rogue Scholar Posts

Published in iPhylo

The following is a guest post by Bob Mesibov. No winner yet in the second Darwin Core Million for 2020, but there are another two and a half weeks to go (to 30 September). For details of the contest see this iPhylo blog post. And please don’t submit a million RECORDS, just (roughly) a million DATA ITEMS. That’s about 20,000 records with 50 fields in the table, or about 50,000 records with 20 fields, or something arithmetically similar.
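As a quick check of the arithmetic above, here is a minimal sketch that treats a "data item" as one cell in the table, i.e. records multiplied by fields. The function name `data_items` is hypothetical and used only for illustration.

```python
def data_items(records: int, fields: int) -> int:
    """Count data items, assuming one item per cell (records x fields)."""
    return records * fields


# The two table shapes mentioned in the post both come to about a million items.
for records, fields in [(20_000, 50), (50_000, 20)]:
    print(f"{records} records x {fields} fields = {data_items(records, fields)} data items")
```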

Published in iPhylo

The following is a guest post by Bob Mesibov. There's still time (to 31 March) to enter a dataset in the 2020 Darwin Core Million, and by way of encouragement I'll celebrate here the best and worst Darwin Core datasets I've seen. The two best are real stand-outs because both are collections of IPT resources rather than one-off wonders. The first is published by the Peabody Museum of Natural History at Yale University.

Published in iPhylo

Quick notes on modelling taxonomic names in databases, as part of an ongoing discussion elsewhere about this topic. Simple model: one model that is widely used (e.g., ITIS, WoRMS) and which is explicit in Darwin Core Archive works something like this. We have a table for taxa, and we don't distinguish between taxa and their names. The taxonomic hierarchy is represented by the parentID field, which points to a taxon's parent.
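The simple model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the schema used by ITIS, WoRMS, or Darwin Core Archive: only the `parentID` field comes from the post, while the `taxa` table, the `name` field, the example rows, and the `lineage` helper are hypothetical.

```python
from typing import Dict, List, Optional, TypedDict


class Taxon(TypedDict):
    name: str                 # the name lives on the taxon row; there is no separate names table
    parentID: Optional[int]   # key of the parent taxon, or None at the root


# Hypothetical rows, keyed by taxon id.
taxa: Dict[int, Taxon] = {
    1: {"name": "Animalia", "parentID": None},
    2: {"name": "Arthropoda", "parentID": 1},
    3: {"name": "Insecta", "parentID": 2},
}


def lineage(taxon_id: int) -> List[str]:
    """Follow parentID links up to the root and return the names, root first."""
    path: List[str] = []
    current: Optional[int] = taxon_id
    while current is not None:
        row = taxa[current]
        path.append(row["name"])
        current = row["parentID"]
    path.reverse()
    return path


print(" > ".join(lineage(3)))
```

Because a taxon and its name share one row, this layout has no obvious place to store a second name for the same taxon, which is the limitation the simple model accepts.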

Published in iPhylo

Two ongoing challenges in biodiversity informatics are getting data into a form that is usable, and linking that data across different projects and platforms. A recent and interesting approach to this problem is the "data journal", as exemplified by the Biodiversity Data Journal. I've been exploring some data from this journal that has been aggregated by GBIF and EOL, and have come across a few issues.

Published in iPhylo

Following on from "Annotating and cleaning GBIF data: Darwin Core Archive, GitHub, ORCID, and DataCite", here's a quick and dirty example of using GitHub to help clean up a Darwin Core Archive. The dataset "3i - Cicadellinae Database" has 2,152 species and 4,749 taxa, but GBIF says it has no georeferenced data.