Racism and Diversity in Data Analysis

In light of the recent election, it is more important than ever to look at how and where we are responsible for perpetuating prejudice. In a previous article, DataCrunch introduced the concept of “Weapons of Math Destruction,” which are data models built from a limited or biased sample of data that result in toxic feedback loops. Because this description is most often applied to artificial intelligence, there is little discussion of how it could also illuminate the workings of the human mind. While many might want to think of this narrow-mindedness as beneath the mental capacity of human beings, such a viewpoint is dangerous in that it makes having a conversation about prejudice difficult. Like artificial intelligence, humans create internal models in their minds, which often lead to the creation of ideas such as stereotypes. The flaw of these models is that they are based on the past, and if not updated or addressed they will continue to run on the same data they were originally fed, even if times have changed.

At an individual level, racism is a toxic feedback loop. People want to be able to predict how other people will behave, and it can be far too easy to create “a binary prediction that all people of that race [or gender, sexuality, religion, political group, etc.] will behave that same way” (O’Neil, 23). If an issue like this develops in a technical model, it isn’t too difficult to go back and manually adjust the data input or change the important factors. But people with racist beliefs “don’t spend a lot of time hunting down reliable data to train their twisted models” (23). They will gladly continue to absorb data that seems to confirm their beliefs, and will reject data that challenges them.


Seasons of Internships

For a long time, I’ve feared the moment when my summers would be turned over to internships. I can’t remember how long I’ve known that internships are important, but it has probably been for as long as I’ve known about applying for college. My relationship with the idea of internships has gone through stages, sliding almost daily between seeing them as silly resume builders and seeing them as valuable and necessary work experience. I recently decided that I wanted to pursue some sort of consulting internship, and then felt a drop in my stomach similar to the one I felt when I decided to apply to Wesleyan. But while there is a large and personalized application process still ahead of me, I don’t want to feel as scared as I did then. With this in mind, I sat down with Asie Makarova ’17 and Taylor Chin ’18 to discuss two of the main myths about internships and what truths, based on their experience, lie beneath.

Asie and Taylor had similar beginnings to their internship journeys. “I started pretty early applying to things Junior fall,” Asie remembered. She found her connection through LinkedIn, by reaching out to a friend’s dad who then put her in contact with FTI Consulting. Taylor also came across his internship on LinkedIn, when he noticed that an old friend from high school had connections at an energy intelligence software company called EnerNOC. From there, both Taylor and Asie were offered interviews at their respective companies.


Peeking Into Design’s Toolbox: Design, Data, and the Liberal Arts Education

In 1994, a small company called Marvel acquired the rights to sell children’s toys and comic books based on their characters. At the time they were riding the wave of the comic book boom, a period when comic book consumption and production reached a sudden high. Marvel entered this period with high hopes, and followed the lead of other comic book companies in pursuit of success. This follow-the-leader approach turned against them when the market collapsed in 1997, forcing Marvel to declare bankruptcy.

All of this happened before Marvel Entertainment was the media powerhouse we know today. Now, it seems as if Marvel is expanding into every corner of product design, churning out movies and TV series with a built-in comic book and merchandise market at such a pace that some are calling this Marvel’s Golden Age. This approach is startlingly different from the company’s mantra in 1997, leading many Marvel enthusiasts to ask themselves what has changed between then and now.

Wesleyan alum Peter Olson ’97 was hired by Marvel in 2004, the year before Marvel changed their name from Marvel Enterprises to Marvel Entertainment, a move that made their expansionist dreams quite clear. Peter’s main assignment was to re-launch Marvel’s website, in the hope that they could rebuild through better online communication with fans. But Peter knew that, in order to really reach their full potential, Marvel needed to become a business. While working there, he landed on a golden question for Marvel’s future: “How can we take Marvel’s data and turn it into something useful for fans?” One of the results that came from this line of thinking was a visualization of all the character relations in the Marvel Universe, color-coded by the major franchises. Shown below, each node represents a character, and the thickness of each edge corresponds to the number of interactions between the two characters it connects. Peter was only a cog in a large mechanical shift within Marvel, but the thinking that led to the creation of this data visualization is representative of the change that took place after Marvel’s bankruptcy in 1997: they stopped thinking about how they could use their data merely to market products, and instead focused on drawing in customers by using their data for interactive and proactive design.
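To make the idea of a character network more concrete, here is a minimal sketch in Python of how interaction counts could be turned into edge thicknesses. It is not Marvel’s actual method or data: the cast lists are made-up sample data, the function names `interaction_counts` and `edge_widths` are hypothetical, and treating “appearing in the same issue” as one interaction is an assumption made purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Made-up sample data (an assumption): each inner list is the cast of one issue.
appearances = [
    ["Spider-Man", "Iron Man", "Captain America"],
    ["Spider-Man", "Iron Man"],
    ["Iron Man", "Captain America"],
    ["Spider-Man", "Iron Man"],
]


def interaction_counts(issues):
    """Count how often each pair of characters appears in the same issue."""
    counts = Counter()
    for cast in issues:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(cast)), 2):
            counts[pair] += 1
    return counts


def edge_widths(counts, min_width=1.0, max_width=8.0):
    """Scale counts linearly so the busiest pair gets the thickest edge."""
    if not counts:
        return {}
    top = max(counts.values())
    return {
        pair: min_width + (max_width - min_width) * n / top
        for pair, n in counts.items()
    }


edges = interaction_counts(appearances)
widths = edge_widths(edges)

for pair in sorted(edges):
    print(pair, edges[pair], round(widths[pair], 2))
```

Each key in `edges` is one edge of the network, and the matching value in `widths` is the line thickness a plotting library could use when drawing it. Color-coding by franchise would need an extra lookup from character to franchise, which is left out here.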


The Invisible and Pervasive Power of College Rankings

This article is inspired by and quotes from Weapons of Math Destruction by Cathy O’Neil, a book about O’Neil’s growing disillusionment with the data economy as she learned that data can be used to fuel toxic feedback loops. This post is the first in a series DataCrunch will be doing based on the examples cited in her book.

 

When preparing to apply to college, one of the first references that people turn to is a list of college rankings. Almost every newspaper or journal has one: Forbes, Princeton Review, U.S. News. They are a big deal within higher education, with students and parents often using the lists as a point of reference when choosing where to apply. But the scope of influence goes beyond that. Alumni and teachers will also look at these lists to decide if they want to apply or donate money. These simple rankings of colleges have become something of a bible in higher education, one that destines a school to fly or flop based entirely on its ranking.

Does this sound scary to you? It should. It’s hard to truly understand the amount of power we give to these lists until you step back and look at how far the cycle of impact spans. The process of applying for college has become so much more than just “applying.” High schools will start prepping students their freshman year to be wary of their grades, ranked GPA, AP scores, extracurriculars, volunteer work, honors society, SAT scores, ACT scores…. And when high schoolers are stressing out about how much there is to do, they surely don’t think back to those college rankings that they started reading with their parents for fun. But the truth is that those rankings are the center point of a vicious feedback loop that now controls our higher education system.


What is Star Trek? IBM’s Alchemy Language API gives a dual answer

In case you have not noticed from the multiple TV ads, for a few years now IBM has been positioning itself as a Big Data company, with its Watson platform and cloud-based services.  One of them is the Alchemy Language API, which packs together functions for text analysis and information retrieval.  As part of learning how to handle this API from R, I tried it on a news story about a sci-fi book publishing business.  Overall, the results were strong, although not without some amusing quirks…


Why are Textbooks so Expensive?

As Drop/Add week comes to an end, students are finishing up one of the most dreaded activities of the semester: acquiring textbooks. Whether you have already purchased all your textbooks or are heading to Broad Street to pick up the final ones, you will end the week having dealt with one of the worst cases of sticker shock possible. Because while the mile-long line is annoying, nothing is more horrifying than seeing your purchase total turn to a three-digit number for the nine textbooks your English class requires. Of course, there are cheaper options: you can buy used or rent new/used copies, borrow, lend, sell back after buying, have parents pay, pay part, pay full. But selling books back can be difficult, and asking parents for help can be hard to navigate, especially if the money situation is tight. And with the prices sitting at a cringe-worthy level no matter what, paying for textbooks has become a serious concern for most college students.

Have you ever wondered why your textbooks are so expensive? Even normal books aren’t always cheap, but textbook prices have soared far above that level. There is some disparity in the data, but on average it is reported that textbook prices have risen 800% in the last 30 years. And since choosing not to buy a textbook could hurt your grade, this raises questions about a rigged college system that favors those with money, even after you get past the golden gates. So what’s causing prices to be so high?


How does Netflix keep getting it right?

In 2013, Netflix came out with Orange is the New Black, one of the first original series to debut on an online streaming network. It was an immediate success, and ushered in years of Netflix continuously “getting it right”: House of Cards, Arrested Development, Unbreakable Kimmy Schmidt, numerous Marvel shows, Sense8 and, most recently, Stranger Things.

What’s fascinating is that there seems to be a pattern of streaming video networks coming out with great original shows while cable TV shows are declining in quality and originality. When Stranger Things came out in early July of 2016, Netflix had another hit, and I heard many people saying in awe, “How does Netflix keep getting it right?”

It turns out there’s a secret to their success: Big Data.


New Age Research in a New Age World: An Interview with the QAC’s Congressional Politics Research Lab

When seen through a news report or a computer screen, the impact of current political research can seem very disconnected from what’s really taking place. It can be hard to understand the results and implications of politicians’ behaviors and opinions without an already-written history book. But with this new age of media presentation comes the new age digging tool of data analysis, which is once again proving to be the key to decoding today’s political discourse.

And that’s not its only use. Once again, Wes students are proving it possible not only to use online politics for research purposes, but also to get a foot in the door of Data in the Real World. In April, John Murchison ’16, Grace Wong ’18, and Joli Holmes ’17 attended the Midwest Political Science Association conference in Chicago to present a poster about their research on congressional politics.


Photo of the day – April 14, 2016

With rainy days finally over, students from the Introduction to Text Mining course (QAC 386) decided to hold the class outside, which they successfully did on the lawn near Allbritton Hall. The topic of the day was tree parsing using the openNLP package in R.


In the photo, left to right: first row: Trisha Arora ’16, Taran Carr ’16, Antonio Robayo ’16, and Jack Trowbridge ’16; second row: Grace Wong ’18; third row: Sara Eismont ’18.