Photo of the day – April 14, 2016

With the rainy days finally over, students in the Introduction to Text Mining course (QAC 386) decided to hold class outside, which they did on the lawn near Allbritton Hall. The topic of the day was tree parsing using the openNLP package in R. In the photo, left to right: first row: Trisha Arora ’16, … Read more

Is it our future to be agnostic?

As interest in data and data analysis grows, students who want a career in the field will have to work harder to understand its boundaries and guidelines. It won’t always be as simple as it was for Evan Thorne ’15, who came to Wesleyan thinking he wanted to study Economics before discovering the QAC department. “I was starting to work with data sets in my math and computer science classes when I heard about big data,” Evan said, explaining that soon after he began taking classes with the QAC, he realized he wanted this as a career.

But what is “this”? Data science? Data analysis? Data manipulation? Sometimes it can be hard to define. But Evan did not flounder when I asked him to define his job at CKM Advisors, the company that hired him right after graduation. He began by explaining that an analyst is someone who takes in what’s readily available and dissects it to look at basic stats and trends. Data scientists, however, are able to find things that aren’t readily available – unstructured data – and take it in raw. “Every data scientist is an analyst in a way,” Evan explained, “but it’s at a much bigger level.” At CKM, Evan is a data scientist, and he is responsible for the entire analytic process: data ingestion, wrangling, manipulation, analysis, and visualization.

Read more

The hidden power of calendars: history of Oscars told through their telecast schedules

Sundays seem natural for large TV events. Why wouldn’t they? The NFL’s Super Bowl has been on Sundays forever. It feels like the proper order of things that the Academy Awards ceremony is also on a Sunday, every year, somewhere near the end of February or the start of March. Yet a simple dataset of telecast dates shows that this practice is relatively recent, and that for a long while things were quite different. For a quick summary of the data, look at the chart below: it shows the progression of the ceremony dates from the most distant (1953) to the closest (2014). For more details on why the changes occurred, keep reading.

[Figure: chart of Academy Awards ceremony dates, 1953 to 2014]

Read more

The Academy and Public Opinion: Will Leo Finally Win an Oscar?

With the Oscars ceremony just two days away, it is nearly impossible to escape the media buzz around the potential winners. Many commentators believe that the Oscar for best actor in a leading role should go to Leonardo DiCaprio. Analysis of data from the Academy’s database shows that, even for the superstars, nothing is written in stone. A little background reading, aided by the New York Times Article API, revealed a history of intricate balancing between the Academy, the studios, and the public.

Read more

Data Visualization and the Transparency of Truth

Transparency is a hot issue: in politics, in business, and in journalism, people are itching to know how truthful the truths they’re being fed really are. However, truth is no longer as easy to gauge as it once was. It turns out that the public can be fed information that is technically true, yet at the same time only one version of the truth.

A good example to look at is weather forecasts. Most people think of weather as easy, straightforward data to access. There are tons of websites that allow search by location (e.g., www.weather.com), and on TV news we can be given an explanation of a weather chart, as seen below:

Read more