Should data scientists <3 emojis?

Within a sea of New York start-ups hides a data journalism lab that treats the heart tacked on at the end of your tweet with the same seriousness as its 140 characters. PRISMOJI, founded by ex-Facebook data scientist Hamdan Azhar, hunts and reports on patterns in tweet trends. From the Swift/Kanye debacle to Brexit, the …

Data Ethics: Are these dilemmas really new?

As conversations around the ethics of data usage grow, both on and off campus, different lines are being drawn in the sand. But this is still a new frontier: there are no standardized rules, and companies and organizations encounter new dilemmas every day.

That might slowly be changing, though. In March of this year, New America released a set of guidelines for the ethical use of data in higher education. Using data analysis to record and track university students has been a controversial topic, especially in the face of Trump’s immigration policies, as these data could potentially be used to out undocumented students. However, even before that, there were concerns about how universities could and couldn’t use student data, and about where the line fell between the universities’ own gain and the students’ personal freedoms.


Crowdsourcing Data Analysis: The complexities of free data labor in a data hungry market

Companies don’t know where to look to find the data analysts they need. A February 2017 article reported that 40% of major companies are struggling to find reliable data analysts to hire. According to TechTarget, “a lack of skills remains one of the biggest data science challenges,” and many tech magazines have reported something similar. This has led companies to sponsor campaigns encouraging people to learn coding, and universities to create comprehensive data analysis training programs. But it has also led to the widespread use of crowdsourced data analysis. Crowdsourcing, while not a new tool in data science, has recently become extremely popular as a way for companies to fulfill their data analysis needs, from gritty data cleaning to full-blown model creation. Last month DataCrunch reported on Kaggle, a website that allows companies to host competitions around a dataset they need analyzed in some way. Another example is DrivenData, which does activism work itself but runs its projects through a similar competition layout. In the competition model, the participant or group whose model the company judges to be the best receives a cash prize. However, these competitions attract so many submissions that the chance of winning the prize is rather low.


Racism and Diversity in Data Analysis

In light of the recent election, it is more important than ever to look at how and where we are responsible for perpetuating prejudice. In a previous article, DataCrunch introduced the concept of “Weapons of Math Destruction,” which are data models built from a limited or biased sample of data that result in toxic feedback loops. Since this explanation is most often applied to artificial intelligence, there is little discussion about how this description could also illuminate the workings of the human mind. While many might want to think of this narrow-mindedness as beneath the mental capacity of human beings, such a viewpoint is dangerous in that it makes having a conversation about prejudice difficult.
