Data Science Graduate Certificate

According to the August 2018 LinkedIn Workforce Report for the United States, demand for data scientists far outstrips supply. Shortages of data science skills are present in almost every large US city. The national shortage exceeds 150,000 people, with particularly acute shortages in New York City, the San Francisco Bay Area, and Los Angeles. As more organizations rely on big data to make decisions, data science has become increasingly important across all industries, not just tech and finance. Data scientists need IT expertise, the ability to work with extremely large data sets, and the ability to communicate analytical findings to other professionals who can use those findings for strategic decision-making.

The data science graduate certificate (15 credit hours) is a collaborative program among business analytics, computer science, and information systems. This certificate program prepares students to apply both data management and analytic techniques (including statistical and data/text mining approaches) to extract meaningful insights from both structured and unstructured data. Students acquire hands-on experience with relevant analytics and data management tools, programming languages, data models, and IT architectures for data integration, analysis, and visualization. Working in teams, students communicate results of analysis effectively (visually and verbally) to a broad audience.

Graduate credits earned may apply to MS and MBA programs, subject to approval by the MS/MBA program directors.

Learning outcomes

Learning outcomes for data science certificate students include:

  • Knowledge of how to integrate data from multiple sources and manage integrated data under a proper data management architecture.
  • Knowledge of how to apply analytics techniques and algorithms (including statistical and data/text mining approaches) to large data sets to extract meaningful insights.
  • Acquisition of hands-on experience with relevant analytics and data management tools, programming languages, data models, and IT architectures for data integration, analysis, and visualization.
  • Ability to communicate results of analysis effectively (visually and verbally) to a broad audience.

Curriculum (15 credit hours)

There are two certificate tracks for Lindner College of Business graduate students. Certificate-only students will follow the MS-IS track.

MS Business Analytics courses
Course No. | Course Title | Course Description | Credit Hours
BANA 6037
Data Visualization
This course provides an introduction as well as hands-on experience in data visualization. It introduces students to design principles for creating meaningful displays of quantitative and qualitative data to facilitate managerial decision-making.
2
BANA 6043
Statistical Computing
This course covers the use of computer tools for data management and analysis. The focus is on a few popular data management and statistical software packages such as SQL, SAS, SPSS, S-PLUS, R, and JMP, although others may be considered. Data management and manipulation techniques, including queries in SQL, will be covered. Elementary analyses may include measures of location and spread, correlation, detection of outliers, table creation, graphical displays, and comparison of groups, as well as specialized analyses.
2
CS 6052
Intelligent Data Analysis
This course will introduce students to the theoretical and practical aspects of the field of data mining. Algorithms for data mining will be covered and their relationships with statistics, mathematics, and algorithm design foundations will be explored in detail.
3
BANA 7042
Statistical Modeling
Nonlinear regression and generalized linear models. Logistic regression for dichotomous and polytomous responses with a variety of links. Count data regression including Poisson and negative binomial regression. Variable selection methods. Graphical and analytic diagnostic procedures. Overdispersion. Generalized additive models. Limited dependent variable regression models (Tobit) and panel data models.
2
BANA 7046
Data Mining I
This is a course in statistical data mining with emphasis on hands-on data analysis experience, using various statistical methods and major statistical software (SAS and R) to analyze large, complex real-world data. Topics include data processing, variable selection for linear regression and generalized linear regression, out-of-sample cross-validation, generalized additive models, nonparametric smoothing methods, classification and regression trees, neural networks, and Monte Carlo simulation.
2
IS 7034**
Data Warehousing for Business Intelligence
This course is designed for the comprehensive learning of data warehousing technology for business intelligence. Data warehouses are used to store (archive) data from operational information systems. Data warehouses are useful in generating valuable control and decision-support business intelligence for many organizations in adjusting to their competitive environment. This course will introduce students to the design, development and operation of data warehouses. Students will apply and integrate the data warehousing and business intelligence knowledge learned in this course in leading software packages.
2
IS 8034
Big Data Integration
This course presents an overview of the principles of data integration, the fundamental basis for developing useful and flexible business intelligence platforms. Modern data integration needs differ from traditional approaches in four main dimensions that parallel differences between big data and traditional data: volume, velocity, variety, and veracity.
2
MS Information Systems/certificate-only courses
Course No. | Course Title | Course Description | Credit Hours
BANA 6037
Data Visualization
This course provides an introduction as well as hands-on experience in data visualization. It introduces students to design principles for creating meaningful displays of quantitative and qualitative data to facilitate managerial decision-making.
2
BANA 6043
Statistical Computing
This course covers the use of computer tools for data management and analysis. The focus is on a few popular data management and statistical software packages such as SQL, SAS, SPSS, S-PLUS, R, and JMP, although others may be considered. Data management and manipulation techniques, including queries in SQL, will be covered. Elementary analyses may include measures of location and spread, correlation, detection of outliers, table creation, graphical displays, and comparison of groups, as well as specialized analyses.
2
CS 6052
Intelligent Data Analysis
This course will introduce students to the theoretical and practical aspects of the field of data mining. Algorithms for data mining will be covered and their relationships with statistics, mathematics, and algorithm design foundations will be explored in detail.
3
BANA 7038*
Data Analytics Methods
This course covers the fundamental concepts of applied data analysis methods. Various aspects of linear and logistic regression models are introduced, with emphasis on real data applications. Students are required to analyze data using the major statistical software packages SAS and R.
2
IS 7034**
Data Warehousing for Business Intelligence
This course is designed for the comprehensive learning of data warehousing technology for business intelligence. Data warehouses are used to store (archive) data from operational information systems. Data warehouses are useful in generating valuable control and decision-support business intelligence for many organizations in adjusting to their competitive environment. This course will introduce students to the design, development and operation of data warehouses. Students will apply and integrate the data warehousing and business intelligence knowledge learned in this course in leading software packages.
2
IS 7036**
Data Mining for Business Intelligence
This course is designed for the in-depth learning of data-mining knowledge and techniques in the context of business intelligence. The topics include association rules, classification, clustering and text mining. Students will apply and integrate the business intelligence knowledge learned in this course in leading software packages.
2
IS 8034
Big Data Integration
This course presents an overview of the principles of data integration, the fundamental basis for developing useful and flexible business intelligence platforms. Modern data integration needs differ from traditional approaches in four main dimensions that parallel differences between big data and traditional data: volume, velocity, variety, and veracity.
2

* For MS-IS and certificate-only students, the prerequisite for BANA 7038 is BANA 6043.

** The prerequisite for both IS 7034 and IS 7036 is either IS 6030 (Data Management, two credits) or IS 7032 (Database Design, two credits).

Admission to the data science certificate program is open for all three semesters. However, we recommend that students apply for either the fall or spring semester because many certificate courses are not offered in the summer semester.

Admission requirements

Admission to this program is very competitive and space is limited.

Applicants must have a baccalaureate degree (BS or BA) and the appropriate coursework as described below.

Applicants must provide transcripts and official university course descriptions showing that they earned a grade of B or better (at least 3.0 on a 4.0 scale) in background courses, at the undergraduate level or higher, in the following areas. Course descriptions should be added in the online application.

  • Database design and management: including experience with SQL queries and design and development of relational databases;
  • Statistics: including basic knowledge of probability distributions, hypothesis testing, and linear regression; and
  • Programming: including knowledge of data structures and experience developing medium-to-large software in C++ and/or Java.

Academic Director

Peng Wang, PhD

Assistant Professor, Department of Operations, Business Analytics, and Information Systems

3326 Carl H. Lindner Hall