Rogue Scholar

Round trip from identifiers to citations and back again

Published 27 May 2022 in iPhylo

Note to self (basically rewriting last year's "Finding citations of specimens"). Bibliographic data supports going from an identifier to a citation string and back again, so we can do a "round trip." 1. Given a DOI, we can get structured data with a simple HTTP fetch, then use a tool such as citation.js to convert that data into a human-readable string in a variety of formats.
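The first leg of that round trip can be sketched in Python. This is a minimal illustration, not the post's own code: it assumes DOI content negotiation with the CSL-JSON media type, and `format_citation` is a hypothetical stand-in for what a tool such as citation.js does far more completely. The sample record is hand-written for illustration, not fetched.

```python
import json
import urllib.request

# Media type for CSL-JSON, requested via DOI content negotiation.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_doi_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) a request asking doi.org for CSL-JSON."""
    return urllib.request.Request(
        f"https://doi.org/{doi}", headers={"Accept": CSL_JSON}
    )


def fetch_csl(doi: str, timeout: float = 10.0) -> dict:
    """Fetch structured metadata for a DOI (needs network access)."""
    with urllib.request.urlopen(build_doi_request(doi), timeout=timeout) as resp:
        return json.load(resp)


def format_citation(csl: dict) -> str:
    """Hypothetical, deliberately simple formatter: authors, year, title, DOI."""
    authors = ", ".join(
        f"{a.get('family', '')} {a.get('given', '')[:1]}".strip()
        for a in csl.get("author", [])
    )
    date_parts = csl.get("issued", {}).get("date-parts", [])
    year = date_parts[0][0] if date_parts and date_parts[0] else None

    parts = []
    if authors:
        parts.append(authors)
    if year:
        parts.append(f"({year}).")
    if csl.get("title"):
        parts.append(csl["title"].rstrip(".") + ".")
    if csl.get("DOI"):
        parts.append(f"https://doi.org/{csl['DOI']}")
    return " ".join(parts)


# Hand-written sample record (illustrative only).
sample = {
    "author": [{"family": "Page", "given": "Roderic"}],
    "issued": {"date-parts": [[2018]]},
    "title": "Geocoding genomic databases using GBIF",
    "DOI": "10.1101/469650",
}
citation = format_citation(sample)
```

With network access, `format_citation(fetch_csl("10.1101/469650"))` would run the same formatter on whatever metadata the DOI resolver returns.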

BioRxiv, GBIF, Genbank, Geocoding, Specimen Codes, Computer Science, English

Geocoding genomic databases using GBIF

https://doi.org/10.59350/35kwk-1ty15

Published 15 November 2018 in iPhylo

Author Roderic Page

I've put a short note up on bioRxiv about ways to geocode nucleotide sequences in databases such as GenBank. The preprint is "Geocoding genomic databases using GBIF" https://doi.org/10.1101/469650.

Ross Mounce, Specimen Codes, Text Mining, Computer Science, English

Text mining for museum specimen identifiers

https://doi.org/10.59350/xvdw8-nc818

Published 19 May 2015 in iPhylo

Author Roderic Page

This post is a response to Ross Mounce's post "Text mining for museum specimen identifiers". As Ross notes in that post, mining literature for specimen codes is something I've been interested in for a while (search for specimen codes on iPhylo), and @Aime Rankin (formerly an undergraduate student at Glasgow) did some work on this as well. It's great to see progress in this area.

GBIF, Google Docs, Material Examined, Specimen Codes, Web Services, Computer Science, English

Looking up specimen codes in GBIF using Google Spreadsheet

https://doi.org/10.59350/c236y-8bn90

Published 21 April 2015 in iPhylo

Author Roderic Page

Playing with the "material examined" tool I've been working on, I wondered whether I could make use of it in, say, a spreadsheet. Imagine that I have a spreadsheet of museum codes and want to look those up in GBIF. I could create a service for Open Refine, but Open Refine is a bit big and clunky: you have to fire up a Java application and point your browser at it, and it isn't as intuitive or as flexible as a spreadsheet.

GBIF, Genbank, Knowledge Graph, Specimen Codes, Computer Science, English

Linking specimen codes to GBIF

https://doi.org/10.59350/g6gq1-crg31

Published 15 April 2015 in iPhylo

Author Roderic Page

I've put together a working demo of some code I've been working on to discover GBIF records that correspond to museum specimen codes. The live demo is at http://bionames.org/~rpage/material-examined/ and the code is on GitHub. To use the demo, simply paste in a specimen code (e.g., "MCZ 24351") and click Find, and it will do its best to parse the code, then go off to GBIF and see what it can find.
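The parse-then-search flow might look roughly like the Python sketch below. It is not the demo's actual code: the regular expression and the helper names `parse_specimen_code` and `gbif_search_url` are assumptions made for illustration, and real specimen codes come in many more shapes than this pattern accepts. The `institutionCode` and `catalogNumber` query parameters are those of the public GBIF occurrence search API.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlencode

# Assumed pattern: an upper-case institution acronym, an optional
# separator, then a numeric catalogue number (e.g. "MCZ 24351").
SPECIMEN_CODE = re.compile(r"^\s*([A-Z]{2,6})[\s:\-]*([0-9]+)\s*$")


def parse_specimen_code(code: str) -> Optional[Tuple[str, str]]:
    """Split a code into (institution code, catalogue number), or None."""
    match = SPECIMEN_CODE.match(code)
    if match is None:
        return None
    return match.group(1), match.group(2)


def gbif_search_url(institution_code: str, catalog_number: str, limit: int = 20) -> str:
    """Build a GBIF occurrence search URL for the parsed code."""
    query = urlencode(
        {
            "institutionCode": institution_code,
            "catalogNumber": catalog_number,
            "limit": limit,
        }
    )
    return f"https://api.gbif.org/v1/occurrence/search?{query}"


parsed = parse_specimen_code("MCZ 24351")
url = gbif_search_url(*parsed) if parsed else None
```

Fetching `url` returns JSON whose `results` array lists candidate occurrence records. Matching on these two fields alone can return several records or none, so any result would still need checking.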

BHL, NHM, Pteralopex, Specimen Codes, Computer Science, English

Linking data from the NHM portal with content in BHL

https://doi.org/10.59350/fwy42-qza35

Published 18 December 2014 in iPhylo

Author Roderic Page

One reason I'm excited by the launch of the NHM data portal is that it opens up opportunities to link publications about specimens in the NHM to the records of the specimens themselves.

Crossref, DataCite, DOI, Identifiers, Specimen Codes, Computer Science, English

Quick thoughts on specimen identifiers

https://doi.org/10.59350/8y7v3-6jc97

Published 20 April 2012 in iPhylo

Author Roderic Page

Based on recent discussions, my sense is that our community will continue to thrash the issue of identifiers to death, repeating many of the debates that have gone on (and will go on) in other areas. To be trite, it seems to me we have three criteria: cheap, resolvable, and persistent. We get to pick two. Cheap and resolvable means URLs, which everybody is nervous about because they break.

Darwin Core Riplet, Duplicates, GBIF, Identifiers, Specimen Codes, Computer Science, English

How many specimens does GBIF really have?

https://doi.org/10.59350/2d3dv-8q010

Published 23 February 2012 in iPhylo

Author Roderic Page

Duplicate records are the bane of any project that aggregates data from multiple sources.

Darwin Core Riplet, Data Mining, Museum, Specimen Codes, Computer Science, English

Extracting museum specimen codes from text

https://doi.org/10.59350/6qy4m-eg641

Published 26 January 2012 in iPhylo

Author Roderic Page

Quick note about a tool I've cobbled together as part of the phyloinformatics course, which addresses a long-standing need I and others have to extract specimen codes from text. I've had this code kicking around for a while (as part of various never-finished data mining projects), but never got around to releasing it until now.
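As a rough idea of what extracting specimen codes from text involves, here is a small regex-based Python sketch. It is not the tool described in the post: the pattern, the normalisation to "ACRONYM NUMBER", and the function name `extract_specimen_codes` are assumptions for illustration. The sample sentence is made up, and "ABC12345" is a hypothetical code.

```python
import re
from typing import List

# Assumed pattern: 2-6 upper-case letters, an optional space or hyphen,
# then at least three digits. Real codes are far more varied.
CODE_IN_TEXT = re.compile(r"\b([A-Z]{2,6})[ -]?(\d{3,})\b")


def extract_specimen_codes(text: str) -> List[str]:
    """Return the unique codes found in text, normalised, in first-seen order."""
    seen = set()
    codes = []
    for match in CODE_IN_TEXT.finditer(text):
        code = f"{match.group(1)} {match.group(2)}"
        if code not in seen:
            seen.add(code)
            codes.append(code)
    return codes


# Made-up sentence mixing three spellings of specimen codes.
text = "Material examined: MCZ 24351, MCZ-24351 and ABC12345."
codes = extract_specimen_codes(text)
```

A pattern this loose will also pick up things that are not specimen codes (grant numbers, GenBank-style accessions), so a list of known institution acronyms would be a sensible filter to add.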

Rogue Scholar Posts

Round trip from identifiers to citations and back again

Geocoding genomic databases using GBIF

Text mining for museum specimen identifiers

Looking up specimen codes in GBIF using Google Spreadsheet

Linking specimen codes to GBIF

Linking data from the NHM portal with content in BHL

Quick thoughts on specimen identifiers

How many specimens does GBIF really have?

Extracting museum specimen codes from text