Need an Introduction to the QAC?

As Drop/Add has ended, we have all finalized our schedules – for better or for worse – and decided which classes we will be taking for the next semester. A lot of you will be taking classes at the Quantitative Analysis Center; this year’s WesMaps shows that almost every QAC class is at or over max capacity for enrollment. However, maybe some of you wanted to take a class to develop some data-related skills, but had absolutely no idea where to start. I’ve had many friends come to me saying that they’ve realized these might be good skills to have, but that they feel like they don’t understand enough to even pick a course. While before I may have given a few recommendations, I would now give only one: QAC201.

Going Back to the Basics?

With data science languages, sometimes learning the basics can be the hardest part. The QAC offers several .25 credit classes that introduce students to the necessities of different languages, but even fitting all the necessary information into half a semester can be difficult. This past quarter, Professor Pavel Oleinikov used a website called DataCamp to help his students get comfortable with the basics of Python. DataCamp is an online collection of data science lessons that teaches users through videos and repetitive exercises. The website has an in-browser code box that allows users to code right on the website without having to download any software. Each lesson takes roughly 30 minutes to 1 hour to complete, making it a convenient way to nail down a specific skill.

Students in Pavel’s Working with Python class really enjoyed being assigned DataCamp lessons as homework. “We only have 3 hours, and that may seem long but that’s not a lot of time considering the concepts that we’re learning,” said Anthony Price. These .25 credit classes move quickly, and so there isn’t much time to backtrack if students are lost. And students can always Google around for answers, but sometimes the vast amount of material returned can be overwhelming. This is why it is important to have resources in place so that students don’t give up before they get comfortable.

Detecting Trends in Community Engagement: At Wesleyan and Beyond

When it comes to activism and community service, Wesleyan has always tried to stay ahead of the curve. But this can be difficult, as the concerns and trends of community engagement are constantly shifting. Often, new topics will seemingly erupt out of nowhere, and it will take a while for word to spread. There are so many existing concerns that it can be difficult for new voices to be heard and for old voices to catch on to the changes. It might seem as though the trends in community engagement are shifting constantly, without any pattern. But can technology detect one?

Wesleyan’s Text Mining class was assigned the task of investigating this dilemma. They were asked to analyze the relationship between approaches to community engagement in the past and what people desire from it in the future. For past data, they collected the text of old Argus articles tagged “community engagement.” These articles were meant to illuminate what kinds of activities were most popular. Present data was collected through focus groups that were asked about the current state of community engagement at Wesleyan and how it could be improved. From this data, class groups hoped to discover how much the current activities overlap with the desires of the focus groups, as well as identify which community engagement topics are popular and which ones are new.

Crowdsourcing Data Analysis: The complexities of free data labor in a data-hungry market

Companies don’t know where to look to find the data analysts they need. A February 2017 article reported that 40% of major companies are struggling to find reliable data analysts to hire. According to TechTarget, “a lack of skills remains one of the biggest data science challenges,” and many tech magazines have reported something similar. This has led companies to sponsor campaigns encouraging people to learn coding and universities to create comprehensive data analysis training programs. But it has also led to the widespread use of crowdsourced data analysis. Crowdsourcing, while not a new tool in data science, has recently become extremely popular as a way for companies to fulfill their data analysis needs, from gritty data cleaning to full-blown model creation. Last month DataCrunch reported on Kaggle, a website that allows companies to host competitions around a dataset they need analyzed in some way. Another example is DrivenData, which does activism work itself but runs its projects through a similar competition format. In the competition model, the participant or group whose model the company chooses as the best receives a cash prize. However, these competitions draw so many submissions that the chance of winning the prize is rather low.

Making Books Unfamiliar: The Art of Novel Analytics

On March 2nd, Matthew Jockers gave a talk at Wesleyan about his research on using quantitative methods for analysis in literature. His talk was titled “Novel Analytics: From James Joyce to the Bestseller Code.” The following article is an exploration of his talk and the ideas he brought forward.

What makes something a piece of art? This might sound like a pretty theoretical question, but English professor Matthew L. Jockers believes that it is possible to take a technical approach.

“Art shows how things are perceived, not known,” Jockers explained. This is a definition that could cause tension in the literary world. After all, writing is messy, personal, and painfully subjective. And yet – “We tend to emphasize the idea that the text is withholding an ‘essential truth,’” Jockers explained. In this way, a literary critic wants to be able to anticipate a certain meaning, causing an endless tug-of-war between objectivity and subjectivity. Jockers does not wrestle with this tension, as evidenced by his book The Bestseller Code, in which he uses analysis to tackle that all-elusive question: What makes a bestselling novel?

Jockers and his co-author Jodie Archer nailed down the qualities that make a book a bestseller by analyzing 30 years’ worth of New York Times bestselling novels. The idea of taking an analytical look at novels was appealing to Jockers because, while a literature enthusiast, he is largely interested in the way parts fit together. Rather than focusing on the novel itself, Jockers believes we should focus on the relationship readers have with the literal words on the page. “Books are a map of grammar, syntax, and word order used to direct our attention to a certain meaning,” he explained. This transforms the earlier abstract question into a new, more concrete one: How do best-selling writers write?
