Monthly Archives: September 2012
They don’t have to settle for models at all
Dear Students, Check out this piece “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”, published in Wired magazine four years ago by Chris Anderson, the editor-in-chief. In the data science world, four years is a long time (if you go by how long the term “data science” has even existed). In […]
Data Science Vocab List
We’ve started using vocabulary and concepts that are the language of data science. I’ll start listing some of them here. If you can get yourself to the point where you are able to explain each of these concepts to someone else in a clear way in a sentence or two, you’re in good shape. Of […]
Talking to CEOs
The CEO of RealDirect.com, Doug Perlson, visited on Wednesday so you could ask him questions to help better inform your (hypothetical) data strategy for RealDirect in your (hypothetical) capacity as Chief Data Scientist for HW #1, question 2. I really appreciate (non-hypothetical) Doug taking his time to come talk to us! Here are questions you […]
Week 2: Simulated Chaos, RealDirect, linear regression, k-nearest neighbors
Cathy O’Neil blogs about the class each week. Crossposted from mathbabe.org. Data Science Blog: Today we started by discussing Rachel’s new blog, which is awesome and people should check it out for her words of data science wisdom. The topics she’s riffed on so far include: Why I proposed the course, EDA (exploratory data analysis), […]
Visualizing Bill Cleveland’s original Data Science Proposal
I described the origins and short history of Data Science in week 1. The origins include a 2001 action plan by William Cleveland, a statistician, written when he was at Bell Labs, to propose Data Science as a new academic discipline. A student in our class, Eurry Kim (with permission), created the following: […]
Week 1 Report: Current View of the Scope of the Course
“data science”: a collection of best practices, taught to you by experts in the field eager to come teach you, filling a gap we see in current education. Data Science: research area, Columbia University Institute for Data Sciences. We’re at Columbia; we showed up for a Data Science class; we represent Columbia’s interdisciplinary research community. What […]
Big Data Domain Surfing (Part 1)
Dear Students, As mentioned, we have diverse backgrounds in this class. And lest there be any confusion, I am not talking about our ethnicities, home countries, or spoken languages. I’m talking about the academic spaces we each inhabit, which has me thinking along the lines of Data Science as having the potential to be the […]