How many tweets would your code crunch if it could crunch Twitter, or why holidays are bad for using Twitter’s streaming API

Twitter has emerged as a convenient source of data for those who want to explore social media. The company provides several access endpoints through APIs. There is a REST API for collecting past tweets and a streaming API for collecting tweets in real time. R has libraries for working with both. As is usual in data collection, the catchphrase is “more” – we want more tweets, ideally all that are relevant to our research question. While the REST API is rate-limited (a user can submit 180 requests per 15 minutes, with each request returning up to 100 tweets), the streaming API holds the promise of delivering much more. The nagging question, though, is “how much?”
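To put a number on the REST ceiling quoted above, here is a back-of-the-envelope sketch (written in Python rather than R, purely for illustration). Only the 180-requests-per-15-minutes and 100-tweets-per-request figures come from the text. The constant and function names are hypothetical, and the sketch assumes a single user whose every request returns a full 100 tweets, counting whole rate-limit windows only.

```python
# Figures quoted in the text for the REST API rate limit.
REQUESTS_PER_WINDOW = 180   # requests allowed per window
WINDOW_MINUTES = 15         # length of one rate-limit window
TWEETS_PER_REQUEST = 100    # tweets returned per request (assumed always full)


def max_tweets(minutes: float) -> int:
    """Upper bound on tweets collected in `minutes`, counting whole windows only."""
    full_windows = int(minutes // WINDOW_MINUTES)
    return full_windows * REQUESTS_PER_WINDOW * TWEETS_PER_REQUEST


if __name__ == "__main__":
    print(max_tweets(15))        # one window
    print(max_tweets(60))        # one hour
    print(max_tweets(24 * 60))   # one day
```

Under these assumptions the ceiling works out to 18,000 tweets per window, 72,000 per hour, and 1,728,000 per day, which is the baseline any streaming collection has to beat.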

FiveThirtyEight’s Riddler #1 – using R to evaluate the answers

Last week, the prominent data journalism blog FiveThirtyEight.com launched The Riddler, a section dedicated to math- and probability-related puzzles. The deadline for submitting solutions to the first riddle has passed, and this post illustrates how you can use R to evaluate potential answers without doing any analytic derivations.

Data from Surprising Origins: Looking at the Work of R. Luke Dubois

When R. Luke Dubois sat down with a group of students for lunch on Friday, November 20th, he could have begun by introducing himself: Known as R. Luke Dubois or just Luke Dubois, he is an artist based in New York City with many notable works related to data, some of which have been on display at the Zilkha Gallery since the beginning of the semester. But instead he began by asking us what we were working on.

At first, most of us fidgeted nervously in silence. We hadn’t been expecting the spotlight to be on us. After a couple of awkward moments, I offered up an explanation of my final project for my data analysis class. Dubois responded with interest and gave some suggestions. After that, other students slowly began to come forward with their ideas, and he continued to react excitedly. He then powered up the projector behind him and showed us some related work by other artists, such as Fernanda Viegas and Martin Wattenberg.

After this discussion had gone on for a while, I realized that Dubois wasn’t going to talk about himself or his work unless he was prodded to. I turned the spotlight back on him by asking whether he thought of himself as an artist or a data researcher. Dubois jerked his eyes upwards, and when they re-centered on us he had a funny smile on his face. “I’m a musician,” he responded. “I play the cello badly. I played so badly that I switched to a computer.”

Data and a Polygraph: A Look into Data Journalism

As the uses and value of data become better known, more and more novel ways of exploring and presenting data are coming to the forefront of the Internet. When Wesleyan invited one of these explorers, Matt Daniels, to give a talk on data journalism and media art, I immediately dug into his portfolio. Daniels hosts his projects on a website called Polygraph and has so far focused only on exploring data related to music. I was immediately transfixed by the name of his site – polygraph isn’t a word commonly connected to data or information – and, having blanked on the definition, Googled it. I found the following:

pol·y·graph [ˈpälēˌɡraf]

NOUN

  1. A machine designed to detect and record changes in physiological characteristics, such as a person’s pulse and breathing rates, used especially as a lie detector.

With this definition swirling in my head, I came to Daniels’ talk eager to learn what he was all about. Daniels, one of the many young creators who are storming the tech industry, began by clicking to a slide of the visualization that made him “internet famous.” He explained that the goal of this project was to look at the use of unique words by rappers in their songs. The visualization charts this usage alongside the number of unique words used by authors ranging from Shakespeare to Melville. The visualization was followed by some text that further fleshed out what he had discovered. And there you find Daniels’ formula, the foundation of this data journalism he has fallen into: a code narrative + a prose narrative = an interesting and interactive read.
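For readers curious what “counting unique words” might look like in code, here is a minimal sketch in Python. It is not Daniels’ actual method, which the talk summary above does not spell out. The function name, the crude tokenizer, and the optional `sample_size` cutoff (a hypothetical parameter for comparing texts of different lengths on an equal footing) are all assumptions made for illustration.

```python
import re
from typing import Optional


def unique_word_count(text: str, sample_size: Optional[int] = None) -> int:
    """Count distinct words in `text`, optionally within the first `sample_size` words."""
    # Crude tokenizer: lowercase, then keep runs of ASCII letters, allowing one
    # internal straight apostrophe (so "don't" counts as a single word).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if sample_size is not None:
        words = words[:sample_size]  # compare equal-length samples
    return len(set(words))


if __name__ == "__main__":
    print(unique_word_count("The cat and the hat"))
```

A real analysis would have to decide how to treat slang spellings, plurals, and curly apostrophes, all of which this tokenizer ignores.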

Introducing: The Data Analysis Minor

The Quantitative Analysis Center has always had the goal of bringing together students from across different departments and disciplines through the art of data analysis. While data and quantitative analysis may seem connected only to work in math and computer science, it is really a broad skill set that can complement work in economics, psychology, sociology, English, and much more.

Pride yourself on whittling down a chaotic data set? Enjoy making snazzy graphs? Love seeing stories unfold from visualizations? Finally, there is now a way to officially bring together the QAC’s programs and your main major. With the QAC Data Analysis Minor, these skills can officially be declared as a part of your college education. Overall, this five-course minor requires one basic knowledge course; two foundation courses in mathematics, statistics, or computing; and two applied electives. Not bad, eh?

I know what you’re thinking. Finally, there is a way to learn how to master messy data and make snazzy graphs, and to get credit for it on your college diploma. So what are you waiting for? Acquire this awesome skill set, enter the world of the QAC, and declare your data analysis minor today!