Rogue Scholar

Published 23 March 2024

In this post we’ll explore the link between Bayes factors and cross-validation as discussed in Fong & Holmes 2020: On the marginal likelihood and cross-validation. I’ll then argue why this is a reason not to trust Bayes factors too much. This is a follow-up to Three ways to compute a Bayes factor, though I will repeat all the important bits here.

Computer and Information Sciences · English

Brms hacking: linear predictors for random effect standard deviations

https://doi.org/10.59350/rd59h-hca42

Published 17 February 2024

Author Martin Modrák

brms is a great package. It allows you to put predictors on a lot of things. Its power is, however, not absolute: one thing it doesn’t let you do directly is use data to predict variances of random/varying effects.

Computer and Information Sciences · English

The SBC package - check your models before you wreck yourself

https://doi.org/10.59350/rwg8s-q3838

Published 1 November 2023

Author Martin Modrák

To celebrate a new paper out in Bayesian Analysis, let’s talk simulation-based calibration checking (SBC). SBC is a method where you use simulated datasets to verify that you implemented your model correctly and/or that your sampling algorithm works. It was introduced by Talts et al. and has been known and used for a while, but was considered to have a few shortcomings, which we try to address.

Computer and Information Sciences · English

Using brms to model reaction times contaminated with errors

https://doi.org/10.59350/nda32-61r30

Published 1 April 2021

Author Martin Modrák

Nathaniel Haines made a neat tweet showing off his model of reaction times that handles possible contamination with both implausibly short reaction times (e.g., if people make an anticipatory response that is not actually based on processing the stimulus of interest) and implausibly large reaction times (e.g., if their attention drifts away from the task, but they snap back to it after having “zoned out” for a few seconds). Response times that

Computer and Information Sciences · English

Three ways to compute a Bayes factor

https://doi.org/10.59350/80dxt-sxd75

Published 28 March 2021

Author Martin Modrák

This post was inspired by a very interesting paper on Bayes factors: Workflow Techniques for the Robust Use of Bayes Factors by Schad, Nicenboim, Bürkner, Betancourt and Vasishth. I would specifically recommend it for its introduction to what a hypothesis actually is in the Bayesian context and for its insights into what Bayes factors are.

Computer and Information Sciences · English

Enforcing line-break rules in RMarkdown via Pandoc

https://doi.org/10.59350/21xxj-t5x81

Published 16 December 2020

Author Martin Modrák

Generating documents via RMarkdown is fun! So I recently used RMarkdown to generate reports that were written in Czech. Interestingly, Czech has rules about certain words that are not allowed to be the last on a line of text: those are almost all single-letter words and a few abbreviations. MS Word is actually smart enough to enforce this policy, but this does not happen for the HTML and PDF outputs from RMarkdown.

Computer and Information Sciences · English

Approximate Densities for Sums of Variables: Negative Binomials and Saddlepoint

https://doi.org/10.59350/dwf8j-hqb80

Published 20 June 2019

Author Martin Modrák

I recently needed to find the distribution of the sum of non-identical but independent negative binomial (NB) random variables. Although for some special cases the sum is itself NB, an analytical solution is not feasible in the general case.

Computer and Information Sciences · English

Thank you: Statistics as a Journey

https://doi.org/10.59350/3s60r-q9963

Published 24 March 2019

Author Martin Modrák

Recently another high-profile piece on abandoning statistical significance by Amrhein, Greenland & McShane was published. I have mixed feelings about this, me and my Twitter bubble are mostly like “Another one of those?!”… But how did I get from not doing almost any statistics five years ago to considering myself a cool insider that can look down on a prominent piece by a group of lifelong experts?

Computer and Information Sciences · English

A Plea for Tests in R Markdown

https://doi.org/10.59350/yzane-9xe90

Published 17 September 2018

Author Martin Modrák

I’ve read Yihui Xie’s thoughtful response to the I don’t like notebooks talk from JupyterCon 2018. And I agree with basically everything Yihui said, only one point felt like it could give a wrong impression. It states: This reads as if there is no room for automated tests in markdown/notebooks.

Computer and Information Sciences · English

Kangaroo and DESeq2 (ENBIK 2018)

https://doi.org/10.59350/s90tr-b0q76

Published 11 June 2018

Author Martin Modrák

For the Czech national bioinformatics conference (ENBIK) I prepared a short presentation on Type S and Type M errors and how to use simulations to understand what your method might do before conducting an experiment. I show how the t-test can fail, inspired by Andrew Gelman’s take on power = .06, and how DESeq2 (used to determine differentially expressed genes) does a good job of mitigating false positives at the cost of increased false negatives.

Martin Modrák

Cross-validation --- a fourth way to compute a Bayes factor
