Computer and Information Sciences · Hugo

Martin Modrák

Recent content on Martin Modrák
Published

In this post we’ll explore the link between Bayes factors and cross-validation as discussed in Fong & Holmes 2020: On the marginal likelihood and cross-validation. I’ll then argue why this is a reason not to trust Bayes factors too much. This is a follow-up to Three ways to compute a Bayes factor, though I will repeat all the important bits here.
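One ingredient of that link can be checked numerically: by the chain rule, the log marginal likelihood equals the sum of one-step-ahead log predictive densities, each conditioned on the data seen so far. The sketch below uses a Beta-Bernoulli model, which is an assumption chosen for illustration because it has a closed form; it is not taken from the paper or the post, and all function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal_likelihood(y, a=1.0, b=1.0):
    """Closed-form log p(y) for Bernoulli data under a Beta(a, b) prior."""
    s, n = sum(y), len(y)
    return log_beta(a + s, b + n - s) - log_beta(a, b)


def sequential_log_predictive(y, a=1.0, b=1.0):
    """Sum of log p(y_i | y_1, ..., y_{i-1}), predicting one point at a time."""
    total = 0.0
    successes = 0
    for i, yi in enumerate(y):
        # Posterior predictive probability of a 1 after seeing i points.
        p_one = (a + successes) / (a + b + i)
        total += math.log(p_one if yi == 1 else 1.0 - p_one)
        successes += yi
    return total


y = [1, 0, 1, 1, 0, 1]
print(log_marginal_likelihood(y), sequential_log_predictive(y))
```

The two numbers agree for any ordering of the data. Note that the first terms of the sum are predictions made from the prior alone or from very few observations.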

Published

To celebrate a new paper out in Bayesian Analysis, let’s talk simulation-based calibration checking (SBC). SBC is a method where you use simulated datasets to verify that you implemented your model correctly and/or that your sampling algorithm works. It was introduced by Talts et al. and has been known and used for a while, but was considered to have a few shortcomings, which we try to address.
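A minimal sketch of the basic SBC loop, assuming a conjugate normal model (prior N(0, 1), observations N(theta, 1)) so that exact posterior draws stand in for a sampler. The model, the function name `sbc_ranks` and its parameters are illustrative assumptions, not the setup used in the paper.

```python
import random


def sbc_ranks(n_sims=1000, n_obs=5, n_draws=19, seed=1):
    """Return the rank of each simulated true parameter among posterior draws."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        # 1. Draw a "true" parameter from the prior.
        theta = rng.gauss(0.0, 1.0)
        # 2. Simulate a dataset given that parameter.
        y = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        # 3. Draw from the posterior (exact here; normally an MCMC fit).
        post_var = 1.0 / (n_obs + 1)
        post_mean = sum(y) * post_var
        draws = [rng.gauss(post_mean, post_var ** 0.5) for _ in range(n_draws)]
        # 4. Record how many posterior draws fall below the true value.
        ranks.append(sum(d < theta for d in draws))
    return ranks


ranks = sbc_ranks()
counts = [ranks.count(r) for r in range(20)]
print(counts)
```

If the model and the sampler are both correct, the ranks are uniformly distributed over 0 to `n_draws`, so the printed counts should be roughly equal. A bug in step 3 (for example, a wrong posterior variance) would show up as a U-shaped or peaked histogram.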

Published

Nathaniel Haines made a neat tweet showing off his model of reaction times that handles possible contamination with both implausibly short reaction times (e.g., if people make an anticipatory response that is not actually based on processing the stimulus of interest) and implausibly long reaction times (e.g., if their attention drifts away from the task, but they snap back to it after having “zoned out” for a few seconds). Response times that

Published

Generating documents via RMarkdown is fun! So I recently used RMarkdown to generate reports that were written in Czech. Interestingly, Czech has rules on some words that are not allowed to be the last on a line of text - those are almost all single-letter words and a few abbreviations. MS Word is actually smart enough to enforce this policy, but this does not happen for the HTML and PDF outputs from RMarkdown.
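The usual way to enforce such a rule is to replace the space after each affected word with a non-breaking space (U+00A0). Below is a minimal sketch of that idea as a plain text transformation. It covers only the single-letter prepositions and conjunctions (a, i, k, o, s, u, v, z), ignores the abbreviations, and does not show how to hook into the RMarkdown pipeline. The function name is hypothetical and this is not necessarily the approach the post takes.

```python
import re

# Czech single-letter prepositions and conjunctions, in both cases.
SINGLE_LETTER_WORDS = "aikosuvzAIKOSUVZ"

# A standalone single letter, followed by spaces and then another word.
_PATTERN = re.compile(r"(?<!\w)([" + SINGLE_LETTER_WORDS + r"]) +(?=\w)")


def bind_single_letter_words(text):
    """Replace the space after single-letter words with a non-breaking space."""
    return _PATTERN.sub("\\1\u00a0", text)


print(repr(bind_single_letter_words("Byl jsem v lese a u vody")))
```

The lookbehind `(?<!\w)` keeps the pattern from matching the last letter of a longer word, and because the replacement does not consume the following word, runs of consecutive single-letter words are all bound.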

Published

Contents: The Approximation - Big Picture; Saddlepoint for Sum of NBs; Implementing the Approximation in Stan; A Simple Baseline; Eyeballing Masses; Evaluating Performance; Summing up; Saddlepoint Approximations for Other Families

I recently needed to find the distribution of a sum of non-identical but independent negative binomial (NB) random variables. Although for some special cases the sum is itself NB, an analytical solution is not feasible in the general case.
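For small counts, the distribution of such a sum can be computed by brute-force convolution of the individual probability mass functions, which gives a reference to compare any approximation against. The sketch below assumes the mean/dispersion parameterization (mean `mu`, dispersion `phi`, variance `mu + mu**2 / phi`) and truncates the support at `max_k`. The function names are hypothetical, and this is not necessarily the baseline the post uses.

```python
import math


def nb_log_pmf(k, mu, phi):
    """Log PMF of a negative binomial with mean mu and dispersion phi."""
    return (math.lgamma(k + phi) - math.lgamma(k + 1) - math.lgamma(phi)
            + phi * math.log(phi / (mu + phi))
            + k * math.log(mu / (mu + phi)))


def nb_pmf_table(mu, phi, max_k):
    """PMF values for k = 0, ..., max_k."""
    return [math.exp(nb_log_pmf(k, mu, phi)) for k in range(max_k + 1)]


def convolve_truncated(p, q):
    """PMF of the sum of two independent counts, kept for k = 0, ..., len(p) - 1."""
    max_k = len(p) - 1
    out = [0.0] * (max_k + 1)
    for i, p_i in enumerate(p):
        for j in range(max_k - i + 1):
            out[i + j] += p_i * q[j]
    return out


def sum_of_nb_pmf(mus, phis, max_k=200):
    """PMF of a sum of independent NB variables, truncated at max_k."""
    total = [1.0] + [0.0] * max_k  # point mass at zero
    for mu, phi in zip(mus, phis):
        total = convolve_truncated(total, nb_pmf_table(mu, phi, max_k))
    return total


pmf = sum_of_nb_pmf([2.0, 4.0], [1.0, 2.0])
print(pmf[:5])
```

The example inputs are one of the special cases where the sum is itself NB: both components have the same ratio `mu / phi`, so the result matches an NB with mean 6 and dispersion 3. The cost grows quadratically with `max_k`, which is why this brute-force approach does not scale to large counts.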

Published

Recently another high-profile piece on abandoning statistical significance by Amrhein, Greenland & McShane was published. I have mixed feelings about this; me and my Twitter bubble are mostly like “Another one of those?!”… But how did I get from not doing almost any statistics five years ago to considering myself a cool insider that can look down on a prominent piece by a group of lifelong experts?

Published

For the Czech national bioinformatics conference (ENBIK) I prepared a short presentation on Type S and Type M errors and how to use simulations to understand what your method might do before conducting an experiment. I show how the t-test can fail, inspired by Andrew Gelman’s take on power = .06, and how DESeq2 (used to determine differentially expressed genes) does a good job at mitigating false positives at the cost of increased false negatives.
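A minimal sketch of this kind of simulation: draw many replicated estimates around a small true effect, keep only the statistically significant ones, and check how often they have the wrong sign (Type S) and by how much they overestimate the magnitude (Type M). For simplicity it assumes a normally distributed estimate with known standard error (a z-test) rather than a t-test. The true effect of 2 and standard error of 8.1 are illustrative assumptions, and the function name is hypothetical.

```python
import random


def type_s_m(true_effect, se, n_sims=200_000, z_crit=1.96, seed=1):
    """Simulate power, Type S rate and exaggeration ratio for a z-test."""
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, se)
        if abs(estimate) > z_crit * se:
            significant.append(estimate)
    power = len(significant) / n_sims
    # Type S: a significant estimate whose sign differs from the true effect.
    wrong_sign = sum((e > 0) != (true_effect > 0) for e in significant)
    type_s = wrong_sign / len(significant)
    # Type M: average magnitude of significant estimates relative to the truth.
    mean_abs = sum(abs(e) for e in significant) / len(significant)
    exaggeration = mean_abs / abs(true_effect)
    return power, type_s, exaggeration


power, type_s, exaggeration = type_s_m(true_effect=2.0, se=8.1)
print(f"power={power:.3f}, Type S={type_s:.3f}, exaggeration={exaggeration:.1f}")
```

With such a low-powered design, a significant result says little about the sign of the effect and greatly overstates its size. Running the simulation before collecting data makes this visible.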