Rogue Scholar Posts

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

The new rOpenSci spelling package provides utilities for spell checking common document formats including LaTeX, markdown, manual pages, and DESCRIPTION files. It also includes tools especially for package authors to automate spell checking of R documentation and vignettes. Spell Checking Packages: The main purpose of this package is to quickly find spelling errors in R packages.

Published in rOpenSci - open tools for open science
Author Kyle Bocinsky

The package FedData has gone through software review and is now part of rOpenSci. FedData includes functions to automate downloading geospatial data available from several federated data sources (mainly sources maintained by the US Federal government). Currently, the package enables extraction from six datasets: The National Elevation Dataset (NED) digital elevation models (1 and 1/3 arc-second;

Published in rOpenSci - open tools for open science
Author Mara Averick

Contributing to an open-source community without contributing code is an oft-vaunted idea that can seem nebulous. Luckily, putting vague ideas into action is one of the strengths of the rOpenSci Community, and their package onboarding system offers a chance to do just that.

Published in rOpenSci - open tools for open science
Author Nicholas Tierney

This is a phrase that comes up when you first get a dataset. It is also ambiguous. Does it mean to do some exploratory modelling? Or make some histograms, scatterplots, and boxplots? Is it both? Starting down either path, you often encounter the non-trivial growing pains of working with a new dataset.

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Last week we released an update of the tesseract package to CRAN. This package provides R bindings to Google’s OCR library Tesseract. It can be installed with install.packages("tesseract"). The new version ships with the latest libtesseract 3.05.01 on Windows and macOS. Furthermore, it includes enhancements for managing language data and using tesseract together with the magick package.

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Last week, version 1.0 of the magick package appeared on CRAN: an ambitious effort to modernize and simplify high-quality image processing in R. This R package builds upon the Magick++ STL, which exposes a powerful C++ API to the famous ImageMagick library. The best place to start learning about magick is the vignette, which gives a brief overview of the overwhelming amount of functionality in this package.

Published in rOpenSci - open tools for open science
Author Scott Chamberlain

What is Taxonomy? Taxonomy in its most general sense is the practice and science of classification. It can refer to many things. You may have heard or used the word taxonomy to indicate any sort of classification of things, whether it be companies or widgets. Here, we’re talking about biological taxonomy, the science of defining and naming groups of biological organisms.

Published in rOpenSci - open tools for open science
Authors Rich FitzJohn, Os Keyes, Stephanie Locke, Jeroen Ooms, Bob Rudis

Most of us who work in R just want to Get Stuff Done™. We want a minimum amount of friction between ourselves and the data we need to wrangle, analyze, and visualize. We’re focused on solving a problem or gaining insights into a new area of research. We rely on a rich, community-driven ecosystem of packages to help get our work done and likely make an unconscious assumption that there is a safety net out there, protecting us from harm.

Published in rOpenSci - open tools for open science
Authors Noam Ross, Alice Daish, Laura DeCicco, Molly Lewis, Nistara Randhawa, Jennifer Thompson, Nicholas Tierney

Two years ago at #runconf15, there was a great discussion about best practices for organizing R-based analysis projects that yielded a nice guidance document describing research compendia. Compendia, as we described them, were minimal products of reproducible research, using parts of R package structure to organize the inputs, analyses, and outputs of research projects.