Category: Models
Next-Gen Data Scientists
The following is a prologue to a discussion of what makes for a good data scientist. Data is information and is extremely powerful. Models and algorithms that use data can literally change the world. Quantitatively minded people have always been able to solve important problems, so this is nothing new, and there’s always been data, so […]
Week 4: The Data Science Process, k-means, Classifiers, Logistic Regression and Evaluation
Each week Cathy O’Neil blogs about the class. Cross-posted from mathbabe.org. This week our guest lecturer for the Columbia Data Science class was Brian Dalessandro. Brian works at Media6Degrees as a VP of Data Science, and he’s super active in the research community. He’s also served as co-chair of the KDD competition. Before Brian started, […]
The Data Science Process
Dear Students, Now that we’ve had our first guest lecture, I’d like to revisit the general framework I proposed for thinking about the data science process on the first day of class (when I generalized the example from Google Plus), and show how Jake’s lecture fits within this framework. Throughout the semester we’ll see that […]
They don’t have to settle for models at all
Dear Students, Check out this piece, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” published in Wired magazine four years ago by Chris Anderson, the editor-in-chief. In the data science world, four years is a long time (if you go by how long the term “data science” has even existed). In […]