Data Science covers the theories, methodologies, and tools for applying statistical concepts and computational techniques to data analysis problems in science, engineering, medicine, business, and other fields. The objective is to inspect, clean, transform, and model data in order to discover useful information, suggest conclusions, and support decision-making. It is an emerging field that plays a critical role in almost every discipline of today’s science and technology and has become an indispensable component of business, industry, and enterprise.
Data Science is a highly interdisciplinary field. Its methodologies are mostly derived from statistical theory. The computational algorithms that implement these methodologies are based on numerical computation and optimization, and are often executed on large-scale hardware platforms composed of many computing units and storage devices. These kinds of data analyses can be applied to a wide range of specific problems across the natural and social sciences and serve as the foundation for artificial intelligence. Data Science can be extensively applied to economics, biology, health care, quantitative social science (including global health and environmental science), and the humanities (e.g., digital media). Numerous new applications are being discovered, and established techniques are being applied in new ways to solve emerging problems. Meanwhile, a variety of career opportunities are open to students with appropriate training in interdisciplinary data science.
Major Requirements
(Not every course listed is offered every term, and the course list will be updated periodically. Please refer to the online Course Catalog for courses offered in 2023-2024.)
Data Science
Divisional Foundation Courses
Course Code | Course Name | Course Credit |
Choose one of the following two calculus courses | ||
MATH 101 | Introductory Calculus | 4 |
MATH 105 | Calculus | 4 |
And choose two of the following courses (PHYS 121 is strongly recommended) | ||
BIOL 110 | Integrated Science – Biology | 4 |
CHEM 110 | Integrated Science – Chemistry | 4 |
PHYS 121 | Integrated Science – Physics | 4 |
INTGSCI 205 | Integrated Science – Research Methods and Science Communication | 4 |
Interdisciplinary Courses
Course Code | Course Name | Course Credit |
COMPSCI 201 | Introduction to Programming and Data Structures | 4 |
STATS 302 | Principles of Machine Learning | 4 |
STATS 303 | Statistical Machine Learning | 4 |
STATS 401 | Data Acquisition and Visualization | 4 |
STATS 402 | Interdisciplinary Data Analysis | 4 |
Disciplinary Courses
Course Code | Course Name | Course Credit |
MATH 201 | Multivariable Calculus | 4 |
MATH 202 | Linear Algebra | 4 |
MATH 206 | Probability and Statistics | 4 |
STATS 211 | Introduction to Stochastic Processes | 4 |
COMPSCI 301 | Algorithms and Databases | 4 |
MATH 304 | Numerical Analysis and Optimization | 4 |
MATH 305 | Advanced Linear Algebra | 4 |
Electives
Courses listed in the table below are recommended electives for the major. The course list reflects the most recent intellectual organization of major electives. Depending on the academic year in which you matriculated, some of the courses below may be requirements for your major. To verify required courses, always consult the requirements for the relevant class year in the bulletin of the year in which you matriculated unless you have been approved to complete the major requirements of a subsequent year. (See Ability to Meet Major Requirements Published in Years Subsequent to Year of Matriculation.)
Programming and Software Engineering | ||
COMPSCI 101 | Introduction to Computer Science | 4 |
COMPSCI 203 | Discrete Math for Computer Science | 4 |
COMPSCI 205 | Computer Organization and Programming | 4 |
COMPSCI 303 | Search Engines | 4 |
COMPSCI 306 | Introduction to Operating Systems | 4 |
COMPSCI 308 | Design and Analysis of Algorithms | 4 |
COMPSCI 310 | Introduction to Databases | 4 |
COMPSCI 311 | Computer Network Architecture | 4 |
COMPSCI 320 | Software Reliability | 4 |
COMPSCI 401 | Cloud Computing | 4 |
Machine Learning and AI | ||
STATS 102 | Introduction to Data Science | 4 |
STATS 304 | Bayesian and Modern Statistics | 4 |
COMPSCI 402 | Artificial Intelligence | 4 |
STATS 403 | Deep Learning | 4 |
STATS 404 | Probabilistic Graphical Models | 4 |
Signal Processing | ||
COMPSCI 207 | Image Data Science | 4 |
COMPSCI 302 | Computer Vision | 4 |
COMPSCI 304 | Speech Recognition | 4 |
Interdisciplinary Data Analytics | ||
ECON 211 | Intelligent Economics: An Explainable AI approach | 4 |
SOSC 320 | Data in the World: Applied Social Statistics | 4 |