Course code:

ES2045

Level:

IM - Introductory/Intermediate

Class size limit:

15

Typically offered:

Upon occasion

There isn’t really much consensus on what Data Science actually is. Some argue that it’s a nascent interdisciplinary field that encompasses elements of computer science, applied statistics, and data visualization. Others insist that it’s a “fourth paradigm” of science: first empirical, then theoretical, then computational, and now “data-driven”. Whatever it is, what may matter most is understanding what Data Science does and what you can learn to do with it!

This course will be both a broad overview of how Data Science is done in the “real world” and a focused introduction to applying practical Data Science skills. It would be a great class for those interested in an introductory exploration of Data Science as a primary area of focus, as well as for those looking to add a degree of technical expertise with data to pre-existing work and interests. We’ll be focusing on four major areas of application: (1) properly building data-driven questions and hypotheses; (2) obtaining, organizing, and transforming data; (3) exploratory data analysis and pattern recognition; and, as time permits, (4) data visualization and communication. Leaving this class, students will be able to immediately apply these concepts to a broad array of interests. For example, students should be able to generate hypotheses from disparate data sources (such as minutes from ACM), obtain and transform data from websites (for instance, by accessing the U.S. Census Bureau API), explore and analyze data to discover patterns (such as in spatial biological data sets), and visualize such data for easy communication to peers and laypersons.

Classes will be taught as a mix of live coding exercises, lectures, and group discussions. No prior programming experience is required (we’re going to be learning to use the R programming language in this course!), but a familiarity with computers and data will be helpful. Students will need to use either their personal laptop or a COA loaner laptop for class and programming exercises. Evaluation will be through class participation and discussion, several data investigation exercises, and a final project. The data investigations will take the form of written analyses of several well-known data sets as well as investigations of synthetic ones created specifically for the course. The final project will take the form of an oral presentation of an analysis. It can be done either in a group or individually and may be on any topic of sufficient interest to the student(s) involved.

Level: Introductory/Intermediate. Prerequisites: none. Class limit: 15. Lab fee: none.

Prerequisites:

None.

Always visit the Registrar's Office for the official course catalog and schedules.