Rogue Scholar Posts

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

A new version of pdftools has been released to CRAN. Go get it while it’s hot: install.packages("pdftools"). This version has two major improvements: low level text extraction and encoding improvements. About PDF textboxes: A pdf document may seem to contain paragraphs or tables in a viewer, but this is not actually true.

Published in rOpenSci - open tools for open science
Author Thomas J. Leeper

There is no problem in science quite as frustrating as other people’s data. Whether it’s malformed spreadsheets, disorganized documents, proprietary file formats, data without metadata, or any other data scenario created by someone else, scientists have taken to Twitter to complain about it. As a political scientist who regularly encounters so-called “open data” in PDFs, I find this problem particularly irritating.

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Scientific articles are typically locked away in PDF format, a format designed primarily for printing but not so great for searching or indexing. The new pdftools package allows for extracting text and metadata from pdf files in R. From the extracted plain-text one could find articles discussing a particular drug or species name, without having to rely on publishers providing metadata, or pay-walled search engines.

Published in OpenCitations blog

Very VERY occasionally I read a paper that is so well written, and which addresses the points so accurately and so eloquently, that I rejoice. The paper by Pettifer et al. entitled "Ceci n’est pas un hamburger: modelling and representing the scholarly article", which appeared in Learned Publishing last October [1], is one of this special handful.

Published in iPhylo

Continuing on from my previous post Viewing scientific articles on the iPad: towards a universal article reader, here are some brief notes on the PLoS iPad app that I've previously been critical of. There are two key things to note about this app. The first is that it uses the page turning metaphor. The article is displayed as a PDF, a page at a time, and the user swipes the page to turn it over.

Published in Science in the Open
Author Cameron Neylon

As has been noted in a few places, Neil Withers, one of the editors of the soon-to-be newest Nature journal, Nature Chemistry, put out a request last week for input on a range of issues to do with how people use journals, formats, and technical widgets. Egon Willighagen, Rich Apodaca, and Oscar the Journal Munching Robot (masquerading as Peter Murray-Rust, or is that the other way around?) have already posted responses.