NumberCrunching – DataCrunch | a QAC student blog

Counting ships

November 26, 2021 by tplante

Areas of highest ship density from Jan. 2015 to Aug. 2015 (before backlog) vs. Jan. 2021 to Nov. 2021 (during backlog)

by Trey Plante ’24

This past semester I took Working with Remote Sensing Data (QAC234) and developed a project focused on counting container ships in the Long Beach Harbor, California. Here are some details of what I did.

What is Star Trek? IBM’s Alchemy Language API gives a dual answer

September 20, 2017 / September 20, 2016 by poleinikov

In case you have not noticed from the multiple TV ads, for a few years now IBM has been positioning itself as a Big Data company, with its Watson platform and cloud-based services. One of them is the Alchemy Language API, which packs together functions for text analysis and information retrieval. As part of learning how to handle this API from R, I tried it on a news story about a sci-fi book publishing business. Overall, the results were strong, although not without some amusing quirks…

How many tweets would your code crunch if it could crunch Twitter, or why holidays are bad for using Twitter’s streaming API

January 14, 2016 / January 13, 2016 by poleinikov

Twitter has emerged as a convenient source of data for those who want to explore social media. The company provides several access endpoints through APIs. There is a REST API for collecting past tweets and a streaming API for collecting tweets in real time. R has libraries for working with both. As is usual in data collection, the catchphrase is “more”: we want more tweets, ideally all that are relevant to our research question. While the REST API is rate-limited (a user can submit 180 requests per 15 minutes, with each request returning 100 tweets), the streaming API holds the promise of delivering much more. The nagging question, though, is “how much?”
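To put the REST limit in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the limits are exactly as quoted above (180 requests per 15-minute window, 100 tweets per request) and that every request comes back full, which is the best case rather than the typical one. The function name `rest_ceiling` is made up for this illustration.

```python
# REST API limits as quoted in the post (assumed unchanged and always fully used)
REQUESTS_PER_WINDOW = 180   # requests one user may submit per window
WINDOW_MINUTES = 15         # length of the rate-limit window, in minutes
TWEETS_PER_REQUEST = 100    # tweets returned by each request (best case)


def rest_ceiling(minutes: int) -> int:
    """Upper bound on tweets collectable in `minutes`, counting whole windows only."""
    full_windows = minutes // WINDOW_MINUTES
    return full_windows * REQUESTS_PER_WINDOW * TWEETS_PER_REQUEST


per_window = rest_ceiling(15)     # one 15-minute window
per_hour = rest_ceiling(60)       # four windows
per_day = rest_ceiling(24 * 60)   # ninety-six windows
print(per_window, per_hour, per_day)  # 18000 72000 1728000
```

Under these assumptions, a single user tops out at 18,000 tweets per window, or about 1.7 million per day. That is the baseline any streaming collection would have to beat.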

FiveThirtyEight’s Riddler #1 – using R to evaluate the answers

December 16, 2015 / December 14, 2015 by poleinikov

Last week, the prominent data journalism blog FiveThirtyEight.com launched The Riddler, a section dedicated to math- and probability-related puzzles. The deadline for submitting solutions to the first riddle has passed, and this post illustrates how you can use R to evaluate potential answers without doing any analytic derivations.