Posts from Rogue Scholar

Published in rOpenSci - open tools for open science
Author Mahmoud Ahmed

A few months ago, I wasn’t sure what to expect when looking at fluorescence microscopy images in published papers. I looked at the accompanying graph to understand the data or the point the authors were trying to make. Often, the graph represents one or more measures of so-called co-localization, but I couldn’t figure out how to interpret them. It turned out that reading the images is simple.

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Last week Google and friends released the new major version of their OCR system: Tesseract 4. This release builds upon 2+ years of hard work and has completely overhauled the internal OCR engine. From the tesseract wiki: We have now also updated the R package tesseract to ship with the new Tesseract 4 on macOS and Windows. It uses the new engine by default, and the results are extremely impressive!

Published in rOpenSci - open tools for open science
Author Thomas Klebel

Every R package has its story. Some packages are written by experts, some by novices. Some are developed quickly, others were long in the making. This is the story of jstor, a package which I developed during my time as a student of sociology, working in a research project on the scientific elite within sociology.

Published in rOpenSci - open tools for open science
Author Rafael Pilliard Hellwig

Background: Surveys are ubiquitous in the social sciences, and the best of them are meticulously planned out. Statisticians often decide on a sample size based on a theoretical design, and then proceed to inflate this number to account for “sample losses”. This ensures that the desired sample size is achieved, even in the presence of non-response.

Published in rOpenSci - open tools for open science
Author Max Joseph

Hundreds of thousands of people in east Africa have been displaced and hundreds have died as a result of torrential rains which ended a drought but saturated soils and engorged rivers, resulting in extreme flooding in 2018. This post will explore these events using the R package smapr, which provides access to global satellite-derived soil moisture data collected by the NASA Soil Moisture Active-Passive (SMAP) mission and abstracts away some of

Author Jeroen Ooms

This week version 2.0 of the mongolite package has been released to CRAN. Major new features in this release include support for MongoDB 4.0, GridFS, running database commands, and connection pooling. Mongolite is primarily an easy-to-use client to get data in and out of MongoDB. However, it supports a growing number of advanced features like aggregation, indexing, map-reduce, streaming, encryption, and enterprise authentication.

Author Dom Bennett

In this technote I will outline what phylotaR was developed for, how to install it and how to run it with some simple examples. What is phylotaR? In any phylogenetic analysis it is important to identify sequences that share the same orthology – homologous sequences separated by speciation events. This is often performed by simply searching an online sequence repository using sequence labels.

Author Matthew Strimas-Mackey

eBird is an online tool for recording bird observations. The eBird database currently contains over 500 million records of bird sightings, spanning every country and nearly every bird species, making it an extremely valuable resource for bird research and conservation. These data can be used to map the distribution and abundance of species, and assess how species’ ranges are changing over time. This dataset is available for download as a text file;

Authors Sean Hughes, Angela Li, Ju Kim, Malisa Smith, Ted Laderas

Motivation: A few weeks ago, as part of the rOpenSci Unconference, a group of us (Sean Hughes, Malisa Smith, Angela Li, Ju Kim, and Ted Laderas) decided to work on making the UMAP algorithm accessible within R. UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique that allows the user to reduce high-dimensional data (multiple columns) into a smaller number of columns for visualization purposes (github,