I’m teaching Introduction to Data Science for the second year. We just started last week, and here are some of the significant differences between this year and last year:
(1) Added another professor: I am team teaching this year with Dr. Kayur Patel who is a computer scientist at Google. Crudely speaking we can think of data science as having foundations in computer science and statistics. Given I have a solid foundation in statistics, I wanted to have a collaborator who could bring the perspective of a computer scientist. Kayur has a background in human-computer interaction and machine learning. Kayur and I share a similar vision for what data science could be, so I felt that working with him to develop the course a second time through would refine my ideas and help make them more robust, as well as introduce new concepts into the course.
(2) Definition of data science: Last year the course examined the central question of “what is data science?” and started with the working definition that data science is what data scientists do. We ended the semester with a more expansive definition: data science is the study of the space of problems that can be solved with data* (*using tools and methods taught in this course). So this year we begin the course with this definition, and continue to revise and refine our understanding. Data science is a living, breathing and changing organism, so a course on the subject needs to be flexible and responsive.
(3) Course themes and concepts:
Thinking like a data scientist: This includes thinking like a computer scientist, thinking like a statistician, thinking like a social scientist, scoping problems, being creative and asking good questions. In short, it means developing habits of mind.
Humanist Approach to Data Science: We will not just focus on the tools, math, models, algorithms, and code, but on the human side as well. I like this definition of humanist: “a person having a strong interest in or concern for human welfare, values, and dignity.” Being humanist in the context of data science means recognizing the role your own humanity plays in building models and algorithms, thinking about qualities you have as a human that a computer does not have (which includes the ability to make ethical decisions), and thinking about the humans whose lives you are impacting when you unleash a model onto the world.
The Data Science Process: We’ll organize the course around the data science process.
(4) Use of the blog: We expect that Kayur will primarily be the one updating the blog this semester and reflecting on the course, though I might chime in sometimes as well.