  • QR - Quantitative Reasoning
  • ES - Environmental Science

This course is a continuation of Data Science I. Publicly available data is often valuable, but it is rarely offered in a ready-to-use format and requires “data wrangling” before it can be analyzed. We will begin with advanced data wrangling of publicly available data from the social and natural sciences. We will then progress to critical evaluation of the data and develop the skills to generate reproducible analysis reports. Students interested in analyzing data from the social or natural sciences should take this course. Students who complete this course will be able to:

  1. Perform advanced data wrangling of publicly available data sets,
  2. Build custom functions to streamline data analysis,
  3. Perform simulations to explore how small changes in variables affect the results,
  4. Make code used for data analysis publicly available, and
  5. Produce a final report that integrates explanatory text with computer code that transforms data, fits models to the data, and visualizes the results.

The course will emphasize rigorous practices that lead to reproducible research through scripting of analyses and versioning of data and results. Course examples will use publicly available data, and students will be encouraged to bring data from their own research to the class. Students who do not have data will be able to select from several data sets from the social and physical sciences. Students will need to use either a personal laptop or a COA loaner laptop for class and programming exercises. Evaluation will be through class participation, quizzes, homework, and a final project.


ES1064 Data Science I

