Computer and Information Sciences · English · Blogger

Syntaxus baccata

Thoughts about bibliographic metadata, programming, statistics, taxonomy, and biology.

I finished the General Plugin system for Citation.js a few days ago (more on that later), so I could finally publish a new beta release. Now, after that half-finished piece of code had been blocking other work for a long while, I can at last start… fixing bugs, and closing other items in the backlog. One of the items that has been on the backlog for a long time, and was on the backlog of the previous major version too, was sorting out BibJSON.


Below is part two of a small series on ctj rdf, a new program I made to transform ContentMine CProjects into SPARQL-queryable, Wikidata-linked RDF. Here’s a more detailed example. We are using my dataset, available at 10.5281/zenodo.845935. It was generated from 1000 articles that mention ‘Pinus’ somewhere. This one has 15326 statements, of which 8875 (57.9%) can be mapped, taking ~50 seconds.
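As a quick check of the numbers above, here is a minimal Python sketch of the arithmetic. The function name `mapped_share` is made up for this illustration and is not part of ctj; only the two counts come from the post.

```python
def mapped_share(total: int, mapped: int) -> float:
    """Return the percentage of statements that could be mapped."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * mapped / total


# Counts quoted in the post for the 'Pinus' dataset.
total_statements = 15326
mapped_statements = 8875

share = round(mapped_share(total_statements, mapped_statements), 1)
print(f"{share}% of the statements can be mapped")
```

Rounded to one decimal, this agrees with the 57.9% quoted above.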


Below is part one of a small series on ctj rdf, a new program I made to transform ContentMine CProjects into SPARQL-queryable, Wikidata-linked RDF. ctj has been around for longer; it started as a way to find my way around the ContentMine pipeline of tools, but turned out to uncover a lot of possibilities for further processing the output of this pipeline (1, 2). The recent addition of ctj rdf expands on this.


Originally posted on the ContentMine blog. Lars Willighagen, orcid:0000-0002-4751-4637. Final Report of my fellowship at the ContentMine. Proposal: My proposal was to extract facts about various conifer species by analysing text from papers with software suited for analysing text and the tools provided by the ContentMine. These facts were then to be converted into JSON, and then made viewable with an HTML (+CSS/JS) interface.


Citation.js now supports BibJSON. How did I do that without actually updating Citation.js? Well, apparently I supported it all along. I've supported the quickscrape output format since July last year, and that turned out to be BibJSON. How convenient. I'll update the demo and docs to reflect this revelation (currently it just says "quickscrape's JSON scheme"), and, now that I can find actual documentation, make some improvements to the parser.
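To give an idea of what reading BibJSON involves, here is a minimal Python sketch that maps a few BibJSON fields onto CSL-JSON. It is not the Citation.js parser (which is JavaScript): the function name `bibjson_to_csl`, the choice of fields, the default `article-journal` type and the example record are all assumptions made for illustration, and real quickscrape output may look different.

```python
def bibjson_to_csl(record: dict) -> dict:
    """Map a handful of BibJSON fields onto a CSL-JSON item (illustrative only)."""
    csl = {"type": "article-journal"}  # assumption: treat every record as an article

    if "title" in record:
        csl["title"] = record["title"]

    # BibJSON authors are objects with a "name"; keep the name unparsed as a literal.
    authors = [{"literal": a["name"]} for a in record.get("author", []) if "name" in a]
    if authors:
        csl["author"] = authors

    if "year" in record:
        csl["issued"] = {"date-parts": [[int(record["year"])]]}

    journal = record.get("journal")
    if isinstance(journal, dict) and "name" in journal:
        csl["container-title"] = journal["name"]

    # Identifiers are a list of {"type": ..., "id": ...} objects; pick out the DOI.
    for identifier in record.get("identifier", []):
        if identifier.get("type", "").lower() == "doi" and "id" in identifier:
            csl["DOI"] = identifier["id"]

    return csl


# Made-up example record, not taken from any real article.
example = {
    "title": "An example article",
    "author": [{"name": "Jane Doe"}],
    "year": "2016",
    "journal": {"name": "Example Journal"},
    "identifier": [{"type": "doi", "id": "10.1234/example"}],
}
converted = bibjson_to_csl(example)
print(converted)
```

Keeping author names as `literal` values avoids guessing where the family name starts; a real parser would probably want to split them.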


This week, the big achievement is the addition of a multi-step form to add the semantic triples from last week to Wikidata with QuickStatements, which we talked about before too. The new '+' icon in table rows now links to a page where you can curate the statement and add Wikidata IDs where necessary. At the last step, you get a table of the existing data, the added identifiers and, soon, their Wikidata labels.
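For illustration, here is a minimal Python sketch of how a curated triple could be turned into a QuickStatements command. It assumes the tab-separated syntax (item, property, value, with string values in double quotes) as I understand it; the function name `quickstatements_line` and the IDs in the example are placeholders, and this is not the code behind the form.

```python
import re

ITEM_ID = re.compile(r"Q[1-9]\d*")
PROPERTY_ID = re.compile(r"P[1-9]\d*")


def quickstatements_line(subject: str, prop: str, value: str) -> str:
    """Build one tab-separated command from a curated triple (illustrative only)."""
    if not ITEM_ID.fullmatch(subject):
        raise ValueError(f"not a Wikidata item ID: {subject!r}")
    if not PROPERTY_ID.fullmatch(prop):
        raise ValueError(f"not a Wikidata property ID: {prop!r}")

    if not ITEM_ID.fullmatch(value):
        # Anything that is not an item ID is treated as a plain string here.
        # Assumption: tabs and double quotes are rejected rather than escaped.
        if "\t" in value or '"' in value:
            raise ValueError("string values may not contain tabs or double quotes")
        value = f'"{value}"'

    return "\t".join([subject, prop, value])


# Placeholder IDs, chosen only to show the shape of the output.
item_command = quickstatements_line("Q1", "P2", "Q3")
string_command = quickstatements_line("Q1", "P2", "some text")
print(item_command)
print(string_command)
```

Validating the IDs before building the line means that a statement which still lacks a Wikidata ID fails early instead of producing a broken command.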


In Weekly Report 10 I talked about searching for answers to the question "What height does a grown Pinus sylvestris normally have?". In the post, I looked at some of the articles returned by the query "Pinus sylvestris"[Abstract] AND height, and found interesting information in the tables. The next step was to extract this information. So that's what I did.


Earlier this week, tarrow published factvis, short for fact visualisation. I decided to have a go at the design, and I made this, in the style of cardlists. Note: If my version and tarrow's version of factvis look very similar, my changes have probably been pushed to the master branch already. [Screenshot of my factvis design] The facts being visualised come from the ContentMine.


This week I wanted to extend my program, the one with the lists of cards containing information. In the past few weeks, I made examples with topics such as conifers and zika. In the previous report I explained how I got the facts, and how other people could get them as well. But I felt it was a bit too complex, and above all too messy.


Last week I wanted to look into extracting more facts, and the relation between found species and compounds. This would be done by extending ami. However, it became clear there will be big improvements to ami in the future, and things like ChemicalTagger and OSCAR are planned to be implemented anyway. It's better to wait for those things to complete before extending it for my own purposes. Instead I improved the card page for future use.