Monthly Archives: October 2012

Data Science & Urban Planning

Here I describe some early ideas about Data Science & urban planning, based on recent conversations* I’ve had and casual reading I’ve been doing. I will touch on Las Vegas, Brooklyn, the Hubway visualization competition, and FourSquare. Metric: Return on Community. This weekend’s NYT magazine has an article by Timothy Pratt: “‘If You Fix […]

Next Semester: Applied Data Science (Statistics W4249)

Next semester, Ian Langmore and I will be offering a new course, Applied Data Science, in the Department of Statistics. Short description: Data scientists wear many caps. This course presents two from opposite ends of the spectrum. Coding best practices will be taught using test-driven development, version control, and collaboration. The Python programming language will […]

Week 7: hunch.com, Recommendation Engines, SVD, Alternating Least Squares, Convexity, Filter Bubbles

Each week Cathy O’Neil blogs about the class. Cross-posted from mathbabe.org. Last night in Rachel Schutt’s Columbia Data Science course we had Matt Gattis come and talk to us about recommendation engines. Matt graduated from MIT in CS, worked at SiteAdvisor, and co-founded hunch, where he was CTO and which was recently acquired by eBay. Here’s what […]

Tonight’s Guest Speaker: Matt Gattis from eBay

Matt Gattis specializes in machine learning pertaining to recommendation engines. eBay acquired the recommendation engine startup he co-founded, hunch.com, where he was the CTO and responsible for R&D. Currently, he is continuing to develop recommendation engine technology for merchandising at eBay. Prior to hunch, he worked on heuristics for detecting web security threats at SiteAdvisor, which […]

Weekly Data Viz #5

Each Tuesday, Eurry Kim, a student in our class, will pick one example of data visualization to share with us. Eurry writes: I have a Sankey Diagram from ProPublica this week! What’s more is that the authors wrote about how they made it! You’ll see that they employed the NYT API to obtain the data (we can […]

Course Announcements (10/16): Homework #3, Final Project and a Reading about Kaggle

Here are links: Homework #3 (assigned 10/3, due 10/24) and the Final Project (assigned 10/10), [Updated] to include a link to Kaz’s infographic that explains the second part of the final project visually. Here’s an article making the rounds about how some of the top competitors on Kaggle had all taken Andrew Ng’s Coursera Machine Learning course. Seems relevant to […]

10 Important Data Science Ideas

Here’s a list of 10 important ideas we’ve explored this semester so far. 10. Interdisciplinary Data Science teams My experience at Google, along with DJ Patil’s piece on Building Data Science teams, informs my understanding of the importance of interdisciplinary teams. The students who showed up to take this class are from across departments and disciplines. […]
