Category Archives: Course Topics

Philosophy of Data Science: Embrace the Practical and the Profound

This is my last blog post for Statistics 4242, Introduction to Data Science at Columbia University. All final projects have been turned in; grades have been given; the semester is over. I reserve the right to start blogging again at a later date. Dear Students, From the beginning, this course viewed Data Science simultaneously in […]

Kaggle Visualization Competition in Our Honor!

Dear Students, There is a new Kaggle Visualization Competition in our honor! I encourage you all to enter it! I received this email from Will Cukierski from Kaggle. This email was sent to me and Chris Mulligan. (See the p.s. for the Legend of Chris Mulligan.) Yours, Rachel Chris and Rachel, Thanks to your blog […]

My Strata Talk: Next-Gen Data Scientists

Dear Students, I’ll be giving a talk at Strata in February about this course and our experiences together: http://strataconf.com/strata2013/public/schedule/detail/27529 I’m bringing it up now, even though it’s more than two months off, because I plan to stop blogging about the class when the semester is finished. Here’s the abstract: Data Science is an emerging field […]

Class of 2013 hackNY Fellows

The following is from Chris Wiggins, a professor in the department of Applied Mathematics and Applied Physics at Columbia. Chris’s name has come up multiple times throughout the semester, including on the very first day (What is Data Science?) and on the last day during the student presentations. Dear Rachel: I’m emailing to ask your help getting […]

Week 14: Student Presentations, Synthesis of Semester

Each week Cathy O’Neil blogs about the class. Cross-posted from mathbabe.org. Thank you, Cathy, for doing such a wonderful job this semester capturing the course in this way, and also for being a respected voice in the classroom, a question-asker and role model for the students. Here’s our class photo, and Cathy’s blog post follows. Cathy’s post captures the presentation given by a subset of students, which represented a collaboration of many/most students in this course, as part of their work for a think piece. More on this to come at a later date. It also captures my synthesis of the semester.

[Image: class photo]
In the final week of Rachel Schutt’s Columbia Data Science course, we heard from two groups of students as well as from Rachel herself. [...]

The Stars of Data Science

[Image: VizStars]
This is another part of the students’ final project. A small group designed a survey to assess their classmates on different dimensions that capture the skills of a data scientist, and administered the survey to their classmates. The questions were of the form “Do you know what ___ means?”, or “Have you ever implemented ____?”. The students were well aware of potential biases in their questions, the limitations of self-reporting, etc. The survey was a great first pass.

This is an innovative way of describing and visualizing Data Scientists: it captures the variability among data scientists, and it makes it possible to construct effective Data Science teams by creating “constellations” of these stars, or by overlaying the stars on top of each other to create “complete” data science teams. The visualization and survey represented an improvement over the data science profiles I gave them at the beginning of the semester. This was a collaborative effort among many students including Adam Obeng, Eurry Kim, Christina Gutierrez, Kaz Sakamoto, and Vaibhav Bhandari. Full report of last lecture still to come.
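For readers who like to see an idea in code, here is a minimal sketch of what “overlaying the stars” could mean. It is not the students’ actual method. It assumes each star is a set of self-reported scores between 0 and 1, one per skill, and that a team’s star takes the strongest member’s score on each skill. The skill names in SKILLS and the helper functions overlay and gaps are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical skill dimensions; the survey's real dimensions may differ.
SKILLS = [
    "statistics",
    "machine learning",
    "programming",
    "visualization",
    "communication",
    "domain expertise",
]

Profile = Dict[str, float]  # skill name -> self-reported score in [0, 1]


def overlay(profiles: List[Profile]) -> Profile:
    """Overlay several stars: keep the highest score on each skill.

    A skill missing from a profile counts as 0.0, and an empty team
    scores 0.0 everywhere.
    """
    return {
        skill: max((p.get(skill, 0.0) for p in profiles), default=0.0)
        for skill in SKILLS
    }


def gaps(team: Profile, threshold: float) -> List[str]:
    """List the skills on which the overlaid team star falls below threshold."""
    return [skill for skill in SKILLS if team[skill] < threshold]
```

Under these assumptions, a team would count as “complete” at a given threshold when gaps(overlay(members), threshold) comes back empty. Taking the maximum is only one possible choice; an average or a sum would describe a team differently.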

Columbia University Institute for Data Sciences and Engineering Graduate Programs

The Institute for Data Sciences and Engineering is in the process of developing interdisciplinary graduate certification programs, certificates and Master’s degrees to support IDSE’s educational mission. Part-time, full-time and online study opportunities will be available beginning Fall 2013. Information about these programs and application procedures for inaugural classes will be posted as it becomes available. […]

Data Science Classes Forming Across the Country

[Image: A9UueTYCYAIexQN.jpg]

Last night the students gave their guest lecture. It was awesome! We’ll have a more detailed report tomorrow, but this image was already posted on Twitter, so I thought I’d get it up here as a sneak preview of the rest of the lecture. Part of the students’ design concept was constellations and stars, so they have another nice visualization of “data science profiles” as stars. It will make more sense when you see it. Kaz Sakamoto, Eurry Kim and Vaibhav Bhandari created this as part of a larger class collaboration.

Weekly Data Viz #12

Each Tuesday, Eurry Kim, a student in our class, picks one example of data visualization to share with us. This is the last one. Thanks for taking on this challenge this semester, Eurry. You did an awesome job! Eurry writes: With all this talk about big data, big potential, and big problems, I was feeling […]

In Defense Of Statistics Or, Why Data Scientists Should Make Understanding Statistics a Priority

This is a guest post from our course’s teaching assistant, Ben Reddy. Ben is a second-year PhD student in the statistics department. He’s had the chance to view the class from a unique angle, so I wanted us to get his perspective. Also, thank you, Ben, for your support this semester! We’re nearing […]
