Monthly Archives: September 2012

My Data is Bigger Than Yours

I buried this in a p.s. at the end of my previous post, and I think it’s sufficiently funny (by which I mean, hysterical) that I need to bring it to the top so you don’t miss it. Brian Dalessandro, this week’s guest speaker, sent me this video that he and his data science teammate, […]

On Inspiring Students and Being Human

Dear Students, Lest you think (yes, I know I used that turn of phrase in posts before. I like it.) that I am bragging about my character traits (in which case you don’t know me well enough yet – I never brag, and who isn’t human?), wipe that thought from your mind, and read on. On […]

Week 4: The Data Science Process, k-means, Classifiers, Logistic Regression and Evaluation

Each week Cathy O’Neil blogs about the class. Cross-posted from This week our guest lecturer for the Columbia Data Science class was Brian Dalessandro. Brian works at Media6Degrees as a VP of Data Science, and he’s super active in the research community. He’s also served as co-chair of the KDD competition. Before Brian started, […]

Tow Teas, Thursdays, 5-7pm

Kaushik, a student in our class who is also taking “Frontiers of Computational Journalism” (which sounds interesting!) at the Tow Center for Digital Journalism at the Columbia Journalism School, sends along the following: Dear Rachel, In response to Phillip’s post about the talk in the Sociology department on Thursday, I thought I might […]

Computational Social Science talk on Thursday, September 27th

Phillip, a student in our class, writes: Dear Rachel, We are running a workshop at the sociology department, which hosts Professor Michael Macy on Thursday. Please see the invitation below. This talk might be of interest for […]

Human Ingenuity

Read the full paper here: Presented, in part, as inspiration: observe the elegance and simplicity of the model; the deep insight that solved a problem as massive as ranking sites on the web with a solution involving eigenvectors. Presented, also, for the student discussion on Anderson’s article.

Course Announcements (Tuesday 9/25)

– Room: We have a bigger room! Reminder that going forward we are in 313 Fayerweather.
– Guest lecturer: Tomorrow’s guest lecturer is Brian Dalessandro, VP of Data Science at Media 6 Degrees (m6d). He will be teaching “Classification, Logistic Regression and Evaluation”. From their site, “Brian joined m6d as head of the data science […]

Weekly Data Viz #2

Each Tuesday, Eurry Kim, a student in our class, will pick one example of data visualization to share with us. Eurry writes: Here’s the visualization for this week: This fancy interactive time-series plot is based on a CSV file that Ben Fry found in a Wikipedia footnote, no scraping necessary! The graph breaks […]

The Data Science Process

Dear Students, Now that we’ve had our first guest lecture, I’d like to revisit the general framework I proposed for thinking about the data science process on the first day of class (when I generalized the example from Google Plus), and show how Jake’s lecture fits within this framework. Throughout the semester we’ll see that […]

Curse of dimensionality

This is a guest post by Professor Matthew Jones, from Columbia’s History department, who has been attending the course. I invited him to give his perspective on the course thus far. Few things lurk as much a challenge and instigation in data mining (or machine learning or the data sciences) as the “curse of dimensionality.” […]

