Course Syllabus

MATH 2080 Applied Data Science

Division: Natural Science and Math
Department: Mathematics
Credit/Time Requirement: Credit: 2; Lecture: 2; Lab: 0
Prerequisites: Math 2040 with a C or better and Math 1100 with a C or better.

Semesters Offered:
Semester Approved: Fall 2020
Five-Year Review Semester: Fall 2025
End Semester: Spring 2026

Optimum Class Size: 20
Maximum Class Size: 25

Course Description

Students will get an introduction to Python programming, data analysis tools, and the necessary statistics to acquire, clean, analyze, explore, and visualize data using real-life data sets. Using statistics, students will learn to make data-driven inferences and decisions, and to communicate those results effectively. This course is designed for students outside of engineering and the sciences. Students with majors in engineering or science should take Math 3080 instead.

Justification

Data collection and analysis are ubiquitous and are fast becoming a prerequisite for economic success in business. This course provides a subset of the tools necessary to leverage data for prediction. It will support the bachelor’s degree in software engineering by providing relevant mathematics coursework.

Student Learning Outcomes

Students will acquire data through web scraping and data APIs.
Students will clean and reshape messy datasets.
Students will learn to use statistical software to deploy statistical methods including generalized linear regression, cluster analysis, and classification.
Students will apply dimensionality reduction and perform basic analysis of network data.
Students will evaluate outcomes, make decisions based on data, and effectively communicate those results.

Course Content

This course will include an introduction to data analysis tools in Python; descriptive statistics; data structures with NumPy and pandas; introductory hypothesis testing and statistical inference; web scraping and data acquisition via APIs; generalized linear regression; classification methods, including logistic regression, k-nearest neighbors, decision trees, support vector machines, and neural networks; data visualization; clustering methods; dimensionality reduction, including principal component analysis; network analysis; rating, ranking, and elections; cleaning and reformatting messy datasets using regular expressions or dedicated tools such as OpenRefine; natural language processing; and the ethics of big data. This course supports a learning environment where perspectives are recognized, respected, and seen as a source of strength.