Hadoop? MapReduce? Spark? Hive? Making sense of the tools used to analyze big data can seem confusing and overwhelming at times. Dr. Harrison and Dr. Shan will help you understand how these components function and how they form the core of big data analytics systems. The emphasis of this course will be on understanding the fundamental principles of big data systems using Hadoop and Spark.
Spark allows the processing of huge volumes of data in real time and is a dominant choice for performing analytics at scale. Similarly, the Hadoop Distributed File System (HDFS) forms the backbone of most big data systems. In this course, participants will learn the theory behind how these tools work so they can understand when, and how, to implement them effectively. The relative strengths and weaknesses of various big data systems will be highlighted to explain how Spark has emerged as a popular choice for analyzing dynamic, high-velocity, and high-volume data.
Participants will also get hands-on experience using HDFS and Spark to illustrate the power of big data analytics.
AGENDA AND TOPICS
This is an introductory course in Big Data and Spark, but it will go beyond the basics to introduce some technical components. Most big data analytics will be performed using Spark and HiveQL, a query language based on SQL. Participants will also use basic Linux commands for operating Hadoop. This course is appropriate for those who want to learn more about how Spark and HDFS function and those who are looking to begin a career in big data analytics.
$750, which includes breakfast, lunch, snacks, and free parking for both days.
Andrew Harrison is an Assistant Professor of Information Systems in the Lindner College of Business at the University of Cincinnati. His research interests include consumer fraud, deception, security systems, privacy, media capabilities, and virtual worlds. http://business.uc.edu/academics/departments/obais/faculty/andrew-harrison.html#bio
Zhe (Jay) Shan is an Assistant Professor in the Department of Operations, Business Analytics, and Information Systems (OBAIS) in the Lindner College of Business at the University of Cincinnati. He earned his Ph.D. in Business Administration and Operations Research from the Smeal College of Business at Penn State University in 2011. Before joining UC OBAIS, he worked for two years as an Assistant Professor of Information Systems at the Manhattan College School of Business. Personal website: http://homepages.uc.edu/~shanze/