Author Archives: Rachel Schutt

“Big Data on Campus”

For the final project, you’re working on developing a story or hypothesis around the theme of Data Science and Education. This article, “Big Data on Campus”, appeared in the New York Times (in cooperation with the Chronicle of Higher Education) over the summer and explores ways in which universities are starting to use technology that [...]

Data Science for Change, an Infographic by Kaz Sakamoto

Kaz Sakamoto is a student in our class, and he created the following infographic to explain the process our class will use to create the Think Piece. Kaz is a Master of Science in Urban Planning candidate at Columbia’s Graduate School of Architecture, Planning, and Preservation (GSAPP); one of his interests is where public participation and technology intersect. He currently works for the New York City Economic Development Corporation (NYC EDC), which is the City’s official economic development corporation; some of its successful projects include the High Line, Coney Island, East River Ferry service, and the upcoming Cornell+Technion campus on Roosevelt Island. He works in the asset management and GIS departments, where he has been analyzing the economic impact of College Point Corporate Park in Queens and digitizing project data for the EDC. In his free time he likes to ride trains, explore the city, and make maps… usually in that order.

Kaggle In-class Essay-Scoring Competition, Think Piece on Data Science & Education

Dear Students, This final project will be an investigation of the relationship between Data Science and Education from two perspectives: (1) the application of Data Science to the education sector and (2) Data Science in a university setting. You will: apply machine learning algorithms to education-related data sets; investigate and recommend ways in which Data [...]

Week 6: Kaggle, crowdsourcing, decision trees, random forests, social networks, and Google’s hybrid research environment

Each week Cathy O’Neil blogs about the class. Cross-posted from mathbabe.org. Yesterday we had two guest lecturers, who each took up approximately half the time. First we welcomed William Cukierski from Kaggle, a data science competition platform. Will went to Cornell for a B.A. in physics and to Rutgers to get his Ph.D. in biomedical [...]

Tonight’s Guest Lecturers: Will Cukierski from Kaggle and David Huffaker from Google

Tonight we have two guest lecturers. It’s going to be a jam-packed night! William Cukierski is a data scientist at Kaggle. He has a bachelor’s degree in physics from Cornell University and a Ph.D. in biomedical engineering from Rutgers University, where he studied applications of machine learning in cancer research. Prior to joining Kaggle, he [...]

Weekly Data Viz #4

Each Tuesday, Eurry Kim, a student in our class, will pick one example of data visualization to share with us. Eurry wrote:
Hi Rachel,
For this week’s viz, I decided on the following New York Times graphic:

Source: Jason DeParle and Matthew Ericson/The New York Times 05/09/09

http://www.nytimes.com/interactive/2009/05/09/us/0509-safety-net.html

New York Times again?? Yes, I have a good reason. Wait for it.[...]

Course announcements: Thursday (10/11) Columbia Data Viz event

Mike, a student in our class, writes about what should be a useful workshop. Their flyer appears below and I have to say, Mike, I wish it were more visually appealing given it’s a data visualization workshop. Mike has a sense of humor. He can take it. (Update: Mike responded “Yeah, I know sorry. We [...]

Exploratory Data Analysis with Time-stamped Event Data

In the age of Big Data, one of the common data types is time-stamped events. This post focuses on (1) Explaining what time-stamped event data is and (2) Describing the Exploratory Data Analysis (EDA) you can do with it. It’s best to start your analysis with EDA so you can gain intuition for the data [...]

Course Announcements (10/7) Data Visualization Event Monday

I received this message from a student in our class about a data visualization event tomorrow at Columbia. Hi Rachel, I’ve just come across this panel on data visualization tomorrow in Avery Hall: http://www.gsappevents.org/event/cluster Even if one can’t make it there (e.g., because one is going to the early lab session with Jared), it’s worth [...]

The Case for Data Science

Dear Students, Data Science is an emerging field in industry, yet it is not well defined as an academic subject. This is the first course at Columbia that has the term “Data Science” in the title. So recently, Allen Bernard, a freelance journalist working on an article for CIO.com about the emerging role of the data scientist, asked [...]
