Going Back to the Basics?

With data science languages, learning the basics can sometimes be the hardest part. The QAC offers several .25 credit classes that introduce students to the essentials of different languages, but fitting all the necessary information into half a semester can be difficult. This past quarter, Professor Pavel Oleinikov used a website called DataCamp to help his students get comfortable with the basics of Python. DataCamp is an online collection of data science lessons that teaches users through videos and repetitive exercises. The website has an in-browser code box that lets users code right on the site without downloading any software. Each lesson takes roughly 30 minutes to an hour to complete, making it a convenient way to nail down a specific skill.


Detecting Trends in Community Engagement: At Wesleyan and Beyond

When it comes to activism and community service, Wesleyan has always tried to stay ahead of the curve. But this can be difficult, as the concerns and trends of community engagement are constantly shifting. Often, new topics will seemingly erupt out of nowhere, and it will take a while for word to spread. There are so many existing concerns that it can be difficult for new voices to be heard and for old voices to catch on to the changes. It might seem as though the trends in community engagement are shifting constantly, without any pattern. But can technology detect one?


Crowdsourcing Data Analysis: The complexities of free data labor in a data-hungry market

Companies don’t know where to look to find the data analysts they need. A February 2017 article reported that 40% of major companies are struggling to find reliable data analysts to hire. According to TechTarget, “a lack of skills remains one of the biggest data science challenges,” and many tech magazines have reported something similar. This has led companies to sponsor campaigns encouraging people to learn to code, and universities to create comprehensive data analysis training programs. But it has also led to the widespread use of crowdsourced data analysis. Crowdsourcing, while not a new tool in data science, has recently become extremely popular as a way for companies to fulfill their data analysis needs, from gritty data cleaning to full-blown model creation. Last month DataCrunch reported on Kaggle, a website that allows companies to host competitions built around a dataset they need analyzed in some way. Another example is DrivenData, which does activism work itself but runs its projects through a similar competition format. Under the competition model, the participant or group whose model the company judges to be the best receives a cash prize. However, these competitions attract so many submissions that the chance of winning the prize is rather low.


Making Books Unfamiliar: The Art of Novel Analytics

On March 2nd, Matthew Jockers gave a talk at Wesleyan about his research on using quantitative methods for analysis in literature. His talk was titled “Novel Analytics: From James Joyce to the Bestseller Code.” The following article is an exploration of his talk and the ideas he brought forward.

What makes something a piece of art? This might sound like a pretty theoretical question, but English professor Matthew L. Jockers believes that it is possible to take a technical approach.

“Art shows how things are perceived, not known,” Jockers explained. This is a definition that could cause tension in the literary world. After all, writing is messy, personal, and painfully subjective. And yet – “We tend to emphasize the idea that the text is withholding an ‘essential truth,’” Jockers explained. In this way, a literary critic wants to be able to anticipate a certain meaning, causing an endless tug-of-war between objectivity and subjectivity. Jockers does not wrestle with this tension, as evidenced by his book The Bestseller Code, in which he uses analysis to tackle that all-elusive question: What makes a bestselling novel?


Can We Utilize Passion in Data Science?

It can be easy to think of data science as cut-and-dried analysis consisting solely of numbers. But according to Economics major Leah Giacalone ’17, if people think of it that way, it’s just because they haven’t tried it yet. “Personally, I’ve always found being able to code super exciting,” she said. “The first time I wrote code and then it worked was the most exciting thing ever. I always tell people that and they don’t believe me.”

If you are someone who doesn’t believe in the passion underlying data science, then maybe it’s time to give it a go, because an increasing number of companies are harnessing passion to power solutions to their problems. An example of this is Kaggle, a website founded in 2010 that allows companies to post their data and research problems online so that people from around the world can compete to create the best solution. Kaggle is using the overflow of big data to its advantage to create a sort of Kickstarter for data science. It’s engaging, fresh, and possibly a good way for data analysis hopefuls to break the ice with coding.
