Technology and language

by Angus Grieve-Smith

I’ve written in the past about instrumentalism, the scientific practice of treating theories as tools that can be evaluated by their usefulness, rather than as claims that can be evaluated as true or false. If you haven’t tried this way of looking at science, I highly recommend it! But if theories are tools, what are they used for? What makes a theory more or less useful?

You might be familiar with Arthur C. Clarke’s Third Law, “Any sufficiently advanced technology is indistinguishable from magic.” Clarke tucked this away in a footnote without explanation, but it fits in with the discussion of magic in Chapter III of James Frazer’s magnum opus The Golden Bough. These two works have shaped a lot of my thoughts about science, technology and the way we interact with our world.

The big buzz over the past few years has been Data Science. Corporations are opening Data Science departments and staffing them with PhDs, and universities have started Data Science programs to sell credentials for these jobs. As a linguist I’m particularly interested in this new field, because it includes research practices that I’ve been using for years, like corpus linguistics and natural language processing.

The Problem

You’ve probably heard the joke about the two people camping in the woods who encounter a hungry predator. One person stops to put on running shoes. The other says, “Why are you wasting time? Even with running shoes you’re not going to outrun that animal!” The first replies, “I don’t have to outrun the animal, I just have to outrun you.” For me this joke highlights a problem with the way some people argue about climate change.

In 1936, Literary Digest magazine made completely wrong predictions about the Presidential election. They did this because they polled based on a bad sample: driver’s licenses and subscriptions to their own magazine. Enough people who didn’t drive or subscribe to Literary Digest voted, and they voted for Roosevelt. The magazine’s editors’ faces were red, and they had the humility to put that on the cover.

Last month I wrote those words on a slide I was preparing to show to the American Association for Corpus Linguistics, as a part of a presentation of my Digital Parisian Stage Corpus. I was proud of having a truly representative sample of theatrical texts performed in Paris between 1800 and 1815, and thus finding a difference in the use of negation constructions that was not just large but statistically significant.

At the beginning of June I participated in the Trees Count Data Jam, experimenting with the results of the census of New York City street trees begun by the Parks Department in 2015. I had seen a beta version of the map tool created by the Parks Department’s data team that included images of the trees pulled from the Google Street View database. Those images reminded me of others I had seen in the @everylotnyc twitter feed.

Data Science is all the rage these days. But this current craze focuses on a particular kind of data analysis. I conducted an informal poll as an icebreaker at a recent data science party, and most of the people I talked to said that it wasn’t data science if it didn’t include machine learning. Companies in all industries have been hiring “quants” to do statistical modeling. Even in the humanities, “distant reading” is a growing trend.

I wrote recently that if you want to be confident in generalizing observations from a sample to the entire population, your sample needs to be representative. But maybe you’re skeptical. You might have noticed that a lot of people don’t pay much attention to representativeness, and somehow there are hardly any consequences for them. But that doesn’t mean that there are never consequences, for them or other people.

Recently I’ve talked about the different standards for existential and universal claims, how we can use representative samples to estimate universal claims, and how we know if our representative sample is big enough to be “statistically significant.” But I want to add a word of caution to these tests: you can’t get statistical significance without a representative sample.
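
The caution about representativeness can be illustrated with a small simulation. This is only a sketch under made-up assumptions, not a reconstruction of any real poll: the function name `simulate_poll`, the population size, the 30% share of "list members" and the support rates of 35% and 65% are all hypothetical numbers chosen for illustration. It compares a very large sample drawn only from one subgroup with a much smaller simple random sample from the whole population.

```python
import random


def simulate_poll(seed=0, population_size=100_000, member_share=0.3):
    """Compare a large biased sample with a small random sample.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Build a made-up population of (is_member, votes_for_a) pairs.
    # "Members" stand in for whoever appears on the pollster's lists.
    population = []
    for _ in range(population_size):
        is_member = rng.random() < member_share
        # Assumed support for candidate A: 35% of members, 65% of others.
        support = 0.35 if is_member else 0.65
        population.append((is_member, rng.random() < support))

    true_share = sum(vote for _, vote in population) / population_size

    # Biased sample: 20,000 voters, but drawn only from list members.
    member_votes = [vote for is_member, vote in population if is_member]
    biased_sample = rng.sample(member_votes, 20_000)
    biased_estimate = sum(biased_sample) / len(biased_sample)

    # Representative sample: only 1,000 voters, drawn from everyone.
    random_sample = rng.sample(population, 1_000)
    random_estimate = sum(vote for _, vote in random_sample) / len(random_sample)

    return true_share, biased_estimate, random_estimate


if __name__ == "__main__":
    true_share, biased, representative = simulate_poll()
    print(f"true share for A:          {true_share:.3f}")
    print(f"biased sample (n=20000):   {biased:.3f}")
    print(f"random sample (n=1000):    {representative:.3f}")
```

Under these assumptions the large biased sample lands far from the true share while the small random sample lands close to it. A significance test run on the biased sample would still report a tiny margin of error, but that margin describes the list members, not the whole population.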