Data Science I: Visualization
How can one summarize information and data and convey their meaning to others? What is an effective data visualization? What is an ineffective or dishonest one? And, for that matter, what is data? This course will explore these questions by introducing students to the broad field of information visualization. Students will learn about different types of visualizations that may be used to explore variation and covariation, to show the evolution of processes through time and space, and to represent parts of a whole. Much of the work of this course will be carried out using computers and the R programming language, but we will also explore non-computational approaches to visualization. Students will develop skills in data collection, data cleaning, the creation of different types of data visualizations (e.g., bar charts, scatter plots, density plots, heat maps, violin plots, time series, and interactive graphics), and effective data communication while working on problems and case studies inspired by and based on real-world questions. We will also critique and reflect upon data visualizations in our daily lives. Students will also gain familiarity with descriptive statistics and with ways to organize and summarize categorical and numerical data to pick out key information.
This course is designed to serve as an introduction to programming in R. Students will learn to gain insight from data, to use literate programming and version control so that these insights are reproducible by others, and to develop code collaboratively. Students who successfully complete this course will be able to work with large data sets, transform those data, and implement effective visualizations. Throughout the course we will be using GitHub, ggplot2, R Markdown, gganimate, Shiny, and the tidyverse packages for data manipulation. This course is intended to appeal to a wide range of students. The skills and habits of mind taught in this course are applicable not only in the sciences and social sciences, but in almost all fields. Evaluation will be based on several short homework and lab assignments, participation in in-class activities, and a final project.