I always forget how to deal with logged values in ggplot—particularly things that use the natural log.
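For reference, here's a minimal sketch (with made-up data) of one way to handle a natural-log axis: scales::log_trans() defaults to base e, unlike scale_y_log10().

```r
library(ggplot2)
library(scales)

# Made-up, right-skewed data for illustration
df <- data.frame(x = 1:100, y = exp(rnorm(100, mean = 5)))

ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # log_trans() uses base e by default, so breaks land at powers of e
  scale_y_continuous(trans = log_trans(),
                     breaks = trans_breaks("log", exp),
                     labels = label_comma(accuracy = 1))
```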
This post is about The Papermill Alarm: an API for detecting potential papermill products. There’s a field of study called ‘stylometry’ where we look at the statistical properties of someone’s writing and use that to model their ‘style’. People write in idiosyncratic ways.
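As a toy illustration of that idea (not the Papermill Alarm's actual model), you can represent each text by the relative frequencies of its words and compare the resulting profiles:

```r
# Two made-up snippets of text standing in for two "authors"
texts <- c(
  a = "we find that the results are consistent with the model",
  b = "the results of the model are discussed in the appendix"
)

# Relative word frequencies as a crude stylometric fingerprint
profiles <- lapply(strsplit(tolower(texts), "\\s+"), function(words) {
  table(words) / length(words)
})

# Distance between the two profiles on the words they share
shared <- intersect(names(profiles$a), names(profiles$b))
sum(abs(profiles$a[shared] - profiles$b[shared]))
```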
In one of the assignments for my data visualization class, I have students visualize the number of essential construction projects that were allowed to continue during New York City’s initial COVID shelter-in-place order in March and April 2020. It’s a good dataset for practicing visualizing amounts and proportions and for practicing dplyr’s group_by() and summarize(), and it shows some interesting trends.
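The general pattern looks something like this, using a tiny made-up stand-in for the real NYC data:

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the essential-construction data
construction <- tribble(
  ~borough,    ~category,
  "Brooklyn",  "Affordable housing",
  "Brooklyn",  "Schools",
  "Queens",    "Utility",
  "Manhattan", "Affordable housing",
  "Manhattan", "Affordable housing"
)

# Count projects by borough and category
construction %>%
  group_by(borough, category) %>%
  summarize(total = n(), .groups = "drop") %>%
  arrange(desc(total))
```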
Diagrams! You can download PDF, SVG, and PNG versions of the marginal effects diagrams in this guide, as well as the original Adobe Illustrator file, here: PDFs, SVGs, and PNGs; Illustrator .ai file. Do whatever you want with them! They’re licensed under Creative Commons Attribution-ShareAlike (BY-SA 4.0). I’m a huge fan of doing research and analysis in public.
In my latest post for the Starschema blog, I discuss the end of dashboards and what comes next. Read the full post here.
This was how being a newcomer to rOpenSci OzUnconf 2019 felt. It was incredible to be a part of such a diverse, welcoming, and inclusive environment. I thought it would be fun to blog about how it all began, and the twists and turns we experienced along the way as we developed the gghdr package. The package provides tools for plotting highest density regions with ggplot2 and was inspired by the package hdrcde developed by Rob J Hyndman.
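For a sense of what a highest density region is, here's a small sketch using hdrcde directly (assumed usage; see gghdr's documentation for the ggplot2 layers): the 50% HDR is the smallest set of intervals containing 50% of the estimated density, which for a bimodal sample can be two separate intervals.

```r
library(hdrcde)

set.seed(42)
x <- c(rnorm(200, mean = -2), rnorm(200, mean = 3))  # bimodal sample

# Endpoints of the 50% and 95% highest density regions
hdr(x, prob = c(50, 95))

# Base-graphics HDR boxplot; gghdr provides ggplot2 equivalents
hdr.boxplot(x)
```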
My relationship with reading borders on pathological (and by “borders on” I mean “has literally been a topic of discussion in therapy”). I mean, I’ve gotten it under control somewhat—we’ll use my 2014 Goodreads Reading Challenge as the bar for “a bit out of control”—which means I can take a look back on my 2021 year in books without too much self-recrimination.
Read the previous post first! This post is a sequel to the previous one on Bayesian propensity scores and won’t make a lot of sense on its own. Read that one first! In my previous post about how to create Bayesian propensity scores and how to legally use them in a second stage outcome model, I ended up using frequentist models for the outcome stage.
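For orientation, here's a rough, frequentist-only sketch of that two-stage workflow (propensity model, then inverse-probability-weighted outcome model), with all variable names and data made up; the post itself swaps a Bayesian model into the first stage.

```r
library(dplyr)

set.seed(1234)

# Simulated data with made-up confounders x1 and x2
d <- tibble(
  x1 = rnorm(500),
  x2 = rnorm(500),
  treatment = rbinom(500, 1, plogis(0.5 * x1 - 0.25 * x2)),
  outcome = 2 * treatment + x1 + x2 + rnorm(500)
)

# Stage 1: propensity score model (Bayesian in the actual posts; plain glm here)
ps_model <- glm(treatment ~ x1 + x2, data = d, family = binomial(link = "logit"))

# Convert propensity scores to inverse probability weights
d <- d %>%
  mutate(propensity = predict(ps_model, type = "response"),
         ipw = (treatment / propensity) + ((1 - treatment) / (1 - propensity)))

# Stage 2: weighted outcome model
lm(outcome ~ treatment, data = d, weights = ipw)
```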
This post combines two of my long-standing interests: causal inference and Bayesian statistics. I’ve been teaching a course on program evaluation and causal inference for a couple of years now, and it has become one of my favorite classes ever.
In most of my research, I work with country-level panel data where each row is a country in a specific year (Afghanistan in 2010, Afghanistan in 2011, and so on), also known as time-series cross-sectional (TSCS) data.
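In data-frame terms, that structure looks like this (the outcome values are invented purely for illustration):

```r
library(tibble)

# One row per country-year; outcome values here are made up
panel <- tribble(
  ~country,      ~year, ~outcome,
  "Afghanistan", 2010,  0.42,
  "Afghanistan", 2011,  0.45,
  "Albania",     2010,  0.61,
  "Albania",     2011,  0.63
)
```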