MIS 447: Data Warehousing

Textbook Information

Required:  Kimball, R., & Ross, M. (2022). The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd ed.). Indianapolis, IN: John Wiley & Sons, Inc. (ISBN-10: 1118530802, ISBN-13: 978-1118530801)

Students may opt to get the 2nd edition. The chapters referenced in the course are from the 3rd edition and students may need to find the corresponding chapter in the 2nd edition.

Published Remarks

None

Hardware Requirements

None

Software Requirements

Students will use Excel and Tableau.  These are available free to students with their Penn State username and password.  No additional purchase is required.

Proctored Exams

None

Course Description

PSU Course Description

This course focuses on fundamental principles and applications of data warehousing. Topics covered include data warehouse planning, design, and implementation. MIS 447 Data Warehousing (3) This course focuses on fundamental principles and applications of data warehousing. Issues related to data warehouse planning, design, and implementation are also covered in this course. Through case studies in various business domains, course exercises and projects, students will learn practical dimensional modeling techniques, extract/transformation/load (ETL) logic, ETL design considerations, and report generation. Essentially, students will learn how to align multiple sources of data through data warehousing architectures for deriving valuable business insights through subsequent business intelligence operations. The course begins by reviewing basic database modeling principles, and then introduces dimensional modeling in the context of the various data warehousing architectures (e.g., Kimball, Corporate Information Factory, hybrid architectures). Key concepts related to data warehousing including dimensional table characteristics, fact table characteristics and granularity, types of dimensions, types of fact tables, dimension attribute hierarchies, consolidated fact tables, slowly changing dimension techniques, and multivalued dimensions and weighting factors are covered in the course. Also, key advanced database management techniques such as views, procedures, and triggers will be introduced. Building on these core concepts, the course also covers related concepts including the role of online analytical processing (OLAP) and packaged analytic solutions, enterprise data warehouse business architecture and matrix, data warehousing lifecycle, ETL subsystems and tasks, ETL system planning, ETL design and development process and tasks, and data modeling best practices for big data. Case studies from various business domains and processes are included throughout the course. Examples of these domains include retail sales, order management, procurement, accounting, healthcare, insurance, transportation, and telecommunication. The case studies are used to illustrate the concepts as well as provide a context for hands-on exercises. Through course assignments and group projects, students have an opportunity to gain hands-on experience with data warehouse design, development, and prototype implementation.

 

Pre-requisite course and skills

MIS 336 Database Management Systems

Additional pre-requisite skills are addressed in the Orientation module. These pages serve as a refresher of the information learned in MIS 336.

Course Objectives

At the end of this course, students should be familiar with the learning objectives of this class.

  • Explain data warehousing basic definitions, concepts, structures and operations
  • Describe the processes used in developing and managing data warehousing
  • Describe the role of data warehouses in decision support for stakeholders
  • Apply principles of data warehousing in addressing appropriate business analytics questions’

Module learning objectives can be found throughout the course on the module overview pages.

Course Structure

The course has 7 modules. Each module will have a case study, a content quiz, and a discussion board. 

Module 1 – Introduction to Data Warehousing

Module 2 – Data Warehousing Architecture

Module 3 – Data Modeling for Data Warehouses

Exam 1

Module 4 – Extract, Transform, Load (ETL) Process

Module 5 – Data Quality and Governance

Module 6 – Business Intelligence and Data Analytics

Module 7 – Data Warehousing in the Cloud

Final Exam

Grading

Participation, case studies, quizzes, and exams based on the following percentages:

Grading

Assignment Name

Percentage

Participation (Discussion Boards and other assignments)

20%

Case Studies (7)

30%

Module Quizzes (7)            

25%

Exam 1 and Final Exam

25%

Total

100%

Assignments

Your final letter grade will be based on your scores from your participation, case studies, quizzes, and two exams.

Participation

There are several assignments listed under participation, with the majority being discussion boards in each module. For each discussion board post, you will write two parts – New Insights and Making a Connection. More information on these two sections is provided on each discussion board assignment. You should also respond to at least two peer’s responses. Other assignments that will contribute to your participation grade include the pre- and post-concept quizzes and the class introduction forum in the orientation module.

Case Studies

Case studies are an individual assessment activity. The case studies are designed to help students develop the skill associated with a technique covered in the applicable modules. Assignments will involve explaining concepts, intuitive logic underlying data modeling techniques, applications, data modeling, as well as interpreting models.

Quizzes

Each module has a multiple choice quiz based on module content and readings. The quizzes will be primarily based on the module content that it is found in (i.e. Module 1 Quiz will be mostly drawn from Module 1), but each quiz will have a few questions from previous modules. This technique is used to encourage you to continue to refresh your learning on prior module content, which helps deepen your learning and move information into long-term memory.

Exam 1 and Final Exam

There will be two exams in this course. Exam 1 will be based on Modules 1-3. The Final Exam will be based on Modules 1 – 7. Questions will be multiple choice. All quizzes and exams will be administered in Canvas. Quizzes and exams do have a time limit listed on each.