Category: Exploratory Data Analysis
Exploratory Data Analysis with Time-stamped Event Data
In the age of Big Data, one of the most common data types is time-stamped events. This post focuses on (1) explaining what time-stamped event data is and (2) describing the Exploratory Data Analysis (EDA) you can do with it. It’s best to start your analysis with EDA so you can gain intuition for the data […]
The Data Science Process
Dear Students, Now that we’ve had our first guest lecture, I’d like to revisit the general framework I proposed for thinking about the data science process on the first day of class (when I generalized the example from Google Plus), and show how Jake’s lecture fits within this framework. Throughout the semester we’ll see that […]
Data Scientist Profiles
An example of a data scientist profile of one of the students in our class
What were you thinking when you made us do those data scientist profiles?
I had four primary reasons for going through that exercise:
Reason 1: Cultivating self-awareness
Reason 2: Illustrating the importance of standardization in visualization
I wanted to demonstrate how to standardize visualizations of individuals as a mix of characteristics. (You should think about how you might do it, and then also ask yourself whether you think a standardized visualization has any value.) In this particular case:
(a) standardizing the x-axis: I used the main buckets that I thought were approximately some of the skills one needs as a data scientist. I’m not tied to these
Exploratory Data Analysis
Exploratory Data Analysis (EDA) is often relegated to Chapter 1 (by which I mean the “easiest” and lowest-level chapter) of standard introductory statistics textbooks and then forgotten about for the rest of the book. Notable examples of textbooks used in statistics curricula that embrace EDA are Andrew Gelman’s books (which are by no means introductory). […]