Rogue Scholar

Published 15 November 2018 in iPhylo

I've put a short note up on bioRxiv about ways to geocode nucleotide sequences in databases such as GenBank. The preprint is "Geocoding genomic databases using GBIF" https://doi.org/10.1101/469650.

GBIF, Genbank, Georeferencing, Note To Self; Computer and Information Sciences; English

Notes on finding georeferenced sequences in GenBank

https://doi.org/10.59350/k5syz-mpb09

Published 6 October 2017 in iPhylo

Author Roderic Page

Notes on how many georeferenced DNA sequences there are in GenBank, and how many could potentially be georeferenced.

GBIF, Genbank, Knowledge Graph, Specimen Codes; Computer and Information Sciences; English

Linking specimen codes to GBIF

https://doi.org/10.59350/g6gq1-crg31

Published 15 April 2015 in iPhylo

Author Roderic Page

I've put together a working demo of some code I've been working on to discover GBIF records that correspond to museum specimen codes. The live demo is at http://bionames.org/~rpage/material-examined/ and code is on GitHub. To use the demo, simply paste in a specimen code (e.g., "MCZ 24351") and click Find and it will do its best to parse the code, then go off to GBIF and see what it can find.
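The parse-then-query idea can be sketched roughly as follows. This is not the demo's actual code: the regular expression and the function names `parse_specimen_code` and `gbif_search_url` are assumptions made for illustration, and real specimen codes are far messier than this pattern allows. The `institutionCode` and `catalogNumber` parameters are those of the public GBIF occurrence search API.

```python
import re
from urllib.parse import urlencode

# Assumed pattern: an upper-case institution acronym, an optional separator,
# then a catalogue number with an optional single-letter prefix.
SPECIMEN_CODE = re.compile(r"^\s*([A-Z]{2,})[\s:\-]*([A-Za-z]?\d+)\s*$")


def parse_specimen_code(code):
    """Split a code such as 'MCZ 24351' into (institution, catalogue number).

    Returns None when the text does not fit the assumed pattern.
    """
    match = SPECIMEN_CODE.match(code)
    if match is None:
        return None
    return match.group(1), match.group(2)


def gbif_search_url(institution, catalog_number):
    """Build a GBIF occurrence search URL for one institution and catalogue number."""
    query = urlencode({"institutionCode": institution, "catalogNumber": catalog_number})
    return "https://api.gbif.org/v1/occurrence/search?" + query


parsed = parse_specimen_code("MCZ 24351")
if parsed is not None:
    print(gbif_search_url(*parsed))
```

The sketch only builds the request URL; fetching it and deciding which of the returned occurrences really is the specimen is the harder part.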

Genbank, Museum Collections; Computer and Information Sciences; English

The Experimenter’s Museum: GenBank, Natural History, and the Moral Economies of Biomedicine

https://doi.org/10.59350/wqdnj-fhz15

Published 8 April 2014 in iPhylo

Author Roderic Page

An undergraduate student (Aime Rankin) doing a project with me on citation and impact of museum collections came across a paper I hadn't seen before. Unfortunately the paper is behind a paywall, but here's the abstract (you can also get a PDF here). It's well worth a read. It argues that sequence databases such as GenBank are essentially the equivalent of the great natural history museums of the 19th century. There are several ironies here.

Genbank, NCBI, Type Specimens; Computer and Information Sciences; English

NCBI taxonomy database now shows type material

https://doi.org/10.59350/217kh-1j345

Published 24 January 2014 in iPhylo

Author Roderic Page

Scott Federhen told me about a nice new feature in GenBank that he's described in a piece for NCBI News. The NCBI taxonomy database now shows a list of type material (where known), and the GenBank sequence database "knows" about types. Here's the summary: You can query for sequences from type using the query "sequence from type"[filter]. This could lead to some nice automated tools.
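As a hint of what such an automated tool might start from, here is a minimal sketch that wraps the filter in an NCBI E-utilities `esearch` request. The helper names `type_sequence_query` and `esearch_url`, the choice of the `nuccore` database, and the optional organism restriction are assumptions for illustration; only the filter string itself comes from the post.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def type_sequence_query(organism=None):
    """Return an Entrez query for sequences from type material.

    If an organism name is given, the query is narrowed with an [Organism] term.
    """
    term = '"sequence from type"[filter]'
    if organism:
        term = f'{term} AND "{organism}"[Organism]'
    return term


def esearch_url(term, db="nuccore", retmax=20):
    """Build an esearch URL; db and retmax defaults are illustrative choices."""
    return EUTILS_ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


print(esearch_url(type_sequence_query("Rana")))
```

The URL returns a list of sequence identifiers; a follow-up `efetch` call would be needed to retrieve the records themselves.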

DNA Barcoding, Genbank, GPS, Guest Post; Computer and Information Sciences; English

Guest post: response to "Putting GenBank Data on the Map"

https://doi.org/10.59350/fdkqx-47m65

Published 12 December 2013 in iPhylo

Author Roderic Page

The following is a guest blog post by David Schindel and colleagues and is a response to the paper by Antonio Marques et al. in Science doi:10.1126/science.341.6152.1341-a. Marques, Maronna and Collins (1) rightly call on the biodiversity research community to include latitude/longitude data in database and published records of natural history specimens.

Annotation, Error, GBIF, Genbank, Identifiers; Computer and Information Sciences; English

Yet more reasons to have specimen identifiers: annotating GenBank sequences

https://doi.org/10.59350/k46hh-dz648

Published 1 March 2012 in iPhylo

Author Roderic Page

One reason I'm pursuing the theme of specimen identifiers (and identifiers in general) is the central role they play in annotating databases. To give a concrete example, I (among others) have argued for a wiki-style annotation layer on top of GenBank to capture things such as sequencing errors, updated species names, etc. Annotation is a lot easier if we have consistent identifiers for the things being annotated.

Frogs, GBIF, Genbank, Geophylogeny, KML; Computer and Information Sciences; English

Linking GBIF and Genbank

https://doi.org/10.59350/hj161-hh554

Published 21 February 2012 in iPhylo

Author Roderic Page

As part of my mantra that it's not about the data, it's all about the links between the data, I've started exploring matching GenBank sequences to GBIF occurrences using the specimen_voucher codes recorded in GenBank sequences. It's quickly becoming apparent that this is not going to be easy.
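To give a flavour of the matching problem, the sketch below splits a voucher written in the structured INSDC form (institution code, optional collection code, specimen id, separated by colons) and compares it with the `institutionCode` and `catalogNumber` fields of an occurrence record. The names `Voucher`, `parse_voucher`, `normalise` and `same_specimen` are hypothetical, as are the example values, and the normalisation rule is an assumption rather than the approach taken in the post. Free-text vouchers fall through unparsed, which is one reason the matching is hard.

```python
from typing import NamedTuple, Optional


class Voucher(NamedTuple):
    institution: Optional[str]
    collection: Optional[str]
    specimen_id: str


def parse_voucher(value):
    """Split 'institution:collection:id' or 'institution:id'; otherwise keep the raw text."""
    parts = [part.strip() for part in value.split(":")]
    if len(parts) == 3:
        return Voucher(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        return Voucher(parts[0], None, parts[1])
    return Voucher(None, None, value.strip())


def normalise(text):
    """Upper-case the text and drop everything except letters and digits."""
    return "".join(ch for ch in text.upper() if ch.isalnum())


def same_specimen(voucher, institution_code, catalog_number):
    """Crude equality test between a parsed voucher and an occurrence record."""
    if voucher.institution is None:
        return False  # unstructured voucher: no institution to compare
    return (normalise(voucher.institution) == normalise(institution_code)
            and normalise(voucher.specimen_id) == normalise(catalog_number))


voucher = parse_voucher("MCZ:Herp:A-12345")  # made-up example value
print(voucher, same_specimen(voucher, "mcz", "A 12345"))
```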

Bio2RDF, Crossref, DOI, Genbank, Integration; Computer and Information Sciences; English

TDWG Challenge - what is RDF good for?

https://doi.org/10.59350/wwpmk-0es54

Published 19 October 2011 in iPhylo

Author Roderic Page

Last month, feeling particularly grumpy, I fired off an email to the TDWG-TAG mailing list with the subject "Lobbing grenades: a challenge". Here's the email: In the context of the TDWG meeting (happening as we speak and which I'm following via Twitter, hashtag #tdwg) Joel Sachs asked me whether I had any specific data in mind that could form the basis of a discussion. So, here goes.

Dark Taxa, DNA Barcoding, Genbank, NCBI, Post-taxonomic; Computer and Information Sciences; English

Dark taxa: GenBank in a post-taxonomic world

https://doi.org/10.59350/xhvv2-xjt24

Published 12 April 2011 in iPhylo

Author Roderic Page

In an earlier post (Are names really the key to the big new biology?), I questioned Patterson et al.'s assertion in a recent TREE article (doi:10.1016/j.tree.2010.09.004) that names are key to the new biology. In this post I'm going to revisit this idea by doing a quick analysis of how many species in GenBank have "proper" scientific names, and whether the number of named species has changed over time.
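One way to picture such an analysis is a simple name classifier. The heuristic below is an assumption, not the method used in the post: it treats a name as "proper" if it looks like a Latin binomial or trinomial, and everything else (for example names containing "sp." plus a code) as informal. The function names and the example names are made up for illustration.

```python
import re

# Assumed heuristic: capitalised genus, lower-case species epithet,
# optional lower-case subspecies epithet, and nothing else.
PROPER_NAME = re.compile(r"^[A-Z][a-z]+ [a-z\-]+( [a-z\-]+)?$")


def has_proper_name(name):
    """Return True if the name looks like a formal binomial or trinomial."""
    return PROPER_NAME.match(name.strip()) is not None


def proportion_named(names):
    """Fraction of names that pass the heuristic (0.0 for an empty input)."""
    names = list(names)
    if not names:
        return 0.0
    return sum(has_proper_name(name) for name in names) / len(names)


# Illustrative names only, not data from the post.
sample = [
    "Homo sapiens",
    "Canis lupus familiaris",
    "Rana sp. ABC-2011",
    "uncultured bacterium",
]
print(proportion_named(sample))
```

Grouping the names by the year each taxon was added and applying `proportion_named` per year would give the trend over time, provided those dates are available.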
