Monthly Archives: November 2012

Week 13: MapReduce

Each week Cathy O’Neil blogs about the class. Cross-posted from mathbabe.org. This week in Rachel Schutt’s Data Science course at Columbia we had two speakers. The first was David Crawshaw, a Software Engineer at Google who was trained as a mathematician, worked on Google+ in California with Rachel, and now works in NY on search. […]

Tonight’s Guest Speakers: David Crawshaw and Josh Wills

Tonight we have two guest speakers, David Crawshaw and Josh Wills, both of whom I’ve had the pleasure of working with at Google. I hesitate to call them “data engineers” because that term is as problematic or potentially overloaded as “data scientist”, but suffice it to say that they’ve both worked as software engineers and […]

Weekly Data Viz #11

Each Tuesday, Eurry Kim, a student in our class, picks one example of data visualization to share with us. Eurry writes: I was watching Amanda Cox’s EYEO talk on YouTube a couple of weeks ago and she said something that really stuck with me – There’s this idea that some detail you want to leave […]

Data & Hubris

This is a guest post by Professor Matthew Jones, from Columbia’s History department, who has been attending the course. Data & Hubris In the wake of the recent election, data people, those that love them, and especially those that idealize them, exploded in Schadenfreude about the many errors of the traditional punditocracy. Computational statistics and data […]

Weekly Data Viz #10

Each Tuesday, Eurry Kim, a student in our class, picks one example of data visualization to share with us. This week we did it on a Wednesday. Eurry writes: A few people have asked me about my process for building visualizations. It’s kind of flattering! Well, it’s a simple answer and not far from what […]

Week 12: Predictive modeling, Data Leakage, Model Evaluation

Each week Cathy O’Neil blogs about the class. Cross-posted from mathbabe.org. This week’s guest lecturer in Rachel Schutt’s Columbia Data Science class was Claudia Perlich. Claudia has been the Chief Scientist at m6d for 3 years. Before that she was in a data analytics group at the IBM center that developed Watson, the computer that won […]

Tonight’s Guest Speaker: Claudia Perlich

Claudia Perlich currently serves as Chief Scientist at m6d. In this role, Claudia designs, develops, analyzes and optimizes the machine learning that informs brands on how to find their best prospective customers. She and the team of m6d scientists live and breathe web-wide data to fuel new business and marketplace intelligence. An active industry speaker […]

Experiments, A/B Testing and Causal Modeling

Screenshot of article by Brian Christian from Wired magazine, The A/B Test: Inside the Technology That’s Changing the Rules of Business from April 2012

Dear Students,

I want to address explicitly why Causal Modeling and Experiments are part of this course. The last two lectures have addressed observational studies and causal modeling and a bit on experiments. [...]

Week 11: Estimating Causal Effects

Each week Cathy O’Neil blogs about the class. Cross-posted from mathbabe.org. This week in Rachel Schutt’s Data Science course at Columbia we had Ori Stitelman, a data scientist at Media6Degrees. We also learned last night of a new Columbia course: STAT 4249 Applied Data Science, taught by Rachel Schutt and Ian Langmore. More information can […]

Tonight’s Guest Speaker: Ori Stitelman; Two Announcements

Tonight’s guest speaker is Ori Stitelman from Media6Degrees (m6d). Earlier in the semester we had his colleague, Brian Dalessandro, speak to us about classifiers, logistic regression and evaluation. Ori will be talking about causal modeling. It will be interesting to think about how your understanding of Data Science has evolved since Brian visited us.

Ori Stitelman is a Senior Data Scientist at m6d. His responsibilities include prototyping methods for improving m6d’s display advertisement targeting product, creating fraud detection tools, and developing methods for estimating the causal effect of advertising. Ori received a Ph.D. in Biostatistics from the University of California, Berkeley, where his primary research focus was on developing methods for estimating causal effects.

Two Announcements:

(1) Eurry and Kaz won Best Data Narrative in the Hubway Competition! Congratulations!!

(2) As a reminder, next week’s lecture is on Monday night (November 19th) rather than Wednesday night because of the long Thanksgiving weekend.
