Postagens de Rogue Scholar

language
Publicados in iPhylo

Today I managed to publish some data from a GitHub repository directly to GBIF. Within a few minutes (and with Tim Robertson on hand via Skype to debug a few glitches) the data was automatically indexed by GBIF and its maps updated. You can see the data I uploaded here. The data I uploaded came from this paper: This is the data I used to build the geophylogeny for Banza using Google Earth.

Publicados in iPhylo

A quick note to myself to document a problem with the GBIF classification of liverworts (I've created issue POR-1879 for this). While building a new tool to browse GBIF data I ran into a problem that the taxon "Jungermanniales" popped up in two different places in the GBIF classification, which broke a graphical display widget I was using.

Publicados in rOpenSci - open tools for open science
Autor Scott Chamberlain

rgbif is an R package to search and retrieve data from the Global Biodiverity Information Facilty (GBIF). rgbif wraps R code around the [GBIF API][gbifapi] to allow you to talk to GBIF from R. We just pushed a new verion of rgbif to cran - v0.5.0. Source and binary files are now available on CRAN. There are a few new functions: count_facet, elevation, and installations. These are described, with examples, below.

Publicados in iPhylo

More for my own benefit than anything else I've decided to list some of the things I plan to work on this year. If nothing else, it may make sobering reading this time next year. A knowledge graph for biodiversity Google's introduction of the "knowledge graph" gives us a happy phrase to use when talking about linking stuff together.

Publicados in iPhylo

Given that it's the start of a new year, and I have a short window before teaching kicks off in earnest (and I have to revise my phyloinformatics course) I'm playing with a few GBIF-related ideas. One topic which comes up a lot is annotating and correcting errors. There has been some work in this area [1][2] bit it strikes me as somewhat complicated. I'm wondering whether we couldn't try and keep things simple.

Publicados in iPhylo

Quick notes on yet another attempt to marry the task of editing a taxonomic classification with versioning it in GitHub. The idea of dumping the whole GBIF classification into GitHub as a series of nested folders looks untenable. So, maybe there's another way to tackle the problem.

Publicados in iPhylo

Here's another example of a Darwin Core Archive that is "broken" such that GBIF is missing some information. GBIF data set A checklist to the wasps of Peru (Hymenoptera, Aculeata) comes from Pensoft, and corresponds to the paper: As with the previous example GBIF says there are 0 georeferenced records in this dataset. This is odd, because the ZooKeys page for this article lists three supplementary files, including KML files for Google Earth.

Publicados in iPhylo

Following on from Annotating and cleaning GBIF data: Darwin Core Archive, GitHub, ORCID, and DataCite here's a quick and dirty example of using GitHub to help clean up a Darwin Core Archive. The dataset 3i - Cicadellinae Database has 2,152 species and 4,749 taxa, but GBIF says it has no georeferenced data.

Publicados in iPhylo

This is a quick sketch of a way to combine existing tools to help clean and annotate data in GBIF, particularly (but not exclusively) occurrence data. GitHub The data provider puts a Darwin Core Archive (expanded, not zipped) into a GitHub repository. GBIF forks the repository, cleans the data, and uploads that to GBIF to populate the database behind the portal.