Posts from Rogue Scholar

Published in Jabberwocky Ecology

I had an interesting conversation with someone the other day that made me think I needed one last frequency distribution post, so that I don't discourage people from moving forward with interesting questions. As a quantitative ecologist I spend a fair amount of time trying to figure out the best way to do things. In other words, I often want to know what the best available method is for answering a particular question.

Published in Science in the Open
Author Cameron Neylon

A very interesting paper from Caroline Savage and Andrew Vickers was published in PLoS ONE last week detailing an empirical study of data sharing by PLoS journal authors. The results themselves, that one in ten corresponding authors provided data, are not particularly surprising, mirroring as they do previous studies of data sharing, both formal [pdf] and informal (also from Vickers; I assume this is a different data set).

Published in Jabberwocky Ecology

This is a table of contents of sorts for five posts on the visualization, fitting, and comparison of frequency distributions. The goal of these posts is to expose ecologists to the ideas and language related to good statistical practices for addressing frequency distribution data. The focus is on simple distributions and likelihood methods.

Published in Jabberwocky Ecology

Summary: Likelihood, likelihood, likelihood (and maybe some other complicated approaches), but definitely not r^2 values from fitting regressions to binned data. A bit more nitty-gritty detail: In addition to causing issues with parameter estimation, binning-based methods are also inappropriate when trying to determine which distribution provides the best fit to empirical data.
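As a rough illustration of likelihood-based comparison, the sketch below fits two candidate distributions to the same unbinned sample and compares them. Everything specific here is an assumption made for the example, not something stated in the post: the two candidates (exponential and log-normal), the simulated sample, the helper names, and the use of AIC as the comparison criterion.

```python
import math
import random


def exponential_loglik(data):
    """Log-likelihood of an exponential fit at its maximum likelihood rate."""
    rate = len(data) / sum(data)  # MLE of the rate is 1 / mean
    return len(data) * math.log(rate) - rate * sum(data)


def lognormal_loglik(data):
    """Log-likelihood of a log-normal fit at its maximum likelihood parameters."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n                           # MLE of the log-scale mean
    var = sum((v - mu) ** 2 for v in logs) / n   # MLE of the log-scale variance
    return sum(
        -v - 0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        for v in logs
    )


def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik


# Simulated data (an assumption for illustration): drawn from a log-normal.
random.seed(1)
sample = [random.lognormvariate(0.0, 0.5) for _ in range(2000)]

aic_exponential = aic(exponential_loglik(sample), n_params=1)
aic_lognormal = aic(lognormal_loglik(sample), n_params=2)
best = "lognormal" if aic_lognormal < aic_exponential else "exponential"
print(best, round(aic_exponential, 1), round(aic_lognormal, 1))
```

Both fits use every raw observation, so the comparison does not depend on a choice of bin widths.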

Published in Jabberwocky Ecology

Summary: Don’t bin your data and fit a regression. Don’t use the CDF and fit a regression. Use maximum likelihood or other statistically grounded approaches that can typically be looked up on Wikipedia. A bit more detail: OK, so you’ve visualized your data and after playing around a bit you have an idea of what the basic functional form of the model is. Now you want to estimate the parameters.
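A minimal sketch of maximum likelihood estimation on unbinned data follows. It assumes an exponential distribution purely for illustration (the excerpt does not name a distribution), and the function names and simulated sample are hypothetical. For the exponential, the maximum likelihood estimate of the rate has a closed form: the number of observations divided by their sum.

```python
import math
import random


def exponential_mle(data):
    """Maximum likelihood estimate of the exponential rate: n / sum(x)."""
    return len(data) / sum(data)


def exponential_log_likelihood(data, rate):
    """Log-likelihood of the data under an exponential with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)


# Simulated sample with a known rate, so the estimate can be checked.
random.seed(42)
true_rate = 2.0
sample = [random.expovariate(true_rate) for _ in range(5000)]

rate_hat = exponential_mle(sample)
print(f"estimated rate: {rate_hat:.3f}")

# The estimate maximizes the likelihood: nearby rates score lower.
ll_at_mle = exponential_log_likelihood(sample, rate_hat)
ll_above = exponential_log_likelihood(sample, rate_hat * 1.1)
ll_below = exponential_log_likelihood(sample, rate_hat * 0.9)
```

No histogram or regression is involved; the estimate comes directly from the raw observations.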

Published in Jabberwocky Ecology

Beyond simple histograms there are two basic methods for visualizing frequency distributions. Kernel density estimation is essentially a generalization of the idea behind histograms. The basic idea is to put a miniature distribution (e.g., a normal distribution) at the position of each individual data point and then add up those distributions to get an estimate of the frequency distribution.
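The idea described above can be sketched in a few lines. This is an illustrative version only: the normal kernel follows the example in the text, but the bandwidth value, the data, and the function name `gaussian_kde_estimate` are assumptions made for the example.

```python
import math


def gaussian_kde_estimate(data, bandwidth):
    """Build a density function by summing one normal kernel per data point.

    The sum is divided by the number of points so the result integrates to 1.
    """
    n = len(data)
    scale = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return scale * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical observations and bandwidth, chosen only for illustration.
data = [1.2, 1.9, 2.1, 2.4, 3.8, 4.0, 4.1, 6.5]
density = gaussian_kde_estimate(data, bandwidth=0.5)

# Evaluate on a grid and check that the estimate integrates to about 1.
step = 0.05
grid = [i * step for i in range(-100, 300)]
area = sum(density(x) * step for x in grid)
print(f"area under the estimated density: {area:.4f}")
```

The bandwidth plays the role that bin width plays in a histogram: smaller values give a bumpier estimate, larger values a smoother one.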

Published in Jabberwocky Ecology

Well, I guess that grant season was a bit of an optimistic time to try to do a 4-part series on frequency distributions, but I’ve got a few minutes before heading off to an all-day childbirth class so I thought I’d see if I could squeeze in part 2. OK, so you have some data and you’d like to get a rough visual idea of its frequency distribution. What do you do now? There are 3 basic approaches that I’ve seen used: Histograms.

Published in iPhylo

For the last two days I've been participating in a NESCent meeting on Dryad, a "repository of data underlying scientific publications, with an initial focus on evolutionary biology and related fields". The aim of Dryad is to provide a durable home for the kinds of data that don't get captured by existing databases such as GenBank and TreeBASE (for example, the Excel spreadsheets, Word files, and tarballs of data that, if they are lucky, make it