SQL Server Business Intelligence Design Considerations
By: Siddharth Mehta
The next layer in the solution is developing a Data Warehouse by designing and implementing the dimensional model. Before we start designing the dimensional model, we need to understand the key considerations for developing a data warehouse. In this chapter, we would understand why a Data Warehouse is required and how is adds value in the BI solution architecture.
The first reason for creating a Data Warehouse in the Business Intelligence (BI) process is the increasing load on the Online Transaction Processing (OLTP) system. With the passage of time, data keeps accumulating in the database. If the data is archived in some offline system, it may not be usable for analytics. If the OLTP system does not archive data, it would keep on growing large and would start needing more resources to keep up the performance of the system. The Data Warehouse acts as an active archival data store for all the transactional systems in an enterprise.
With the continuous accumulation of large volumes of data from different transactional systems into the Data Warehouse, data needs to be stored in a denormalized format for smooth data retrieval. A typical dimensional model uses a star or snowflake design that is easy to understand and relate to business needs, supports simplified business queries, and provides superior query performance by minimizing table joins.
As data is consolidated and collected from various transactional systems in a Data Warehouse, it provides a multi-dimensional view of various kinds of business activities in an enterprise. It is very easy to make out from this point that considering the volume and variety of data in the data warehouse, this data can be used to derive a wide-range of analytics than any single OLTP system. For example, there may be one OLTP system for Sales, Production, HR, Finance, CRM, etc. When data from all these individual OLTP systems is gathered into a Data Warehouse, each section of data can be associated with the rest of the dimensions. Questions like what are the products that generated the maximum profit in the North American region in the last 10 years having a client base in the age group of 30-40 yrs can be answered. Answering such questions may need data from multiple OLTP systems and would also need aggregating a large volume of data. This is where the data warehouse proves as the ultimate source of data.
In our business scenario, AdventureWorksDW is the Data Warehouse version of the AdventureWorks database. In case if you look closely at the solution, you would understand that ETL would be required to populate data from either the OLTP system or Staging area into Data Warehouse. Consider downloading the AdventureWorks SSIS sample that loads data from AdventureWorks OLTP database into the AdventureWorks Data Warehouse. Its in the form of a single ETL package with a large number of parallel and sequential tasks to load the data warehouse. Here is a portion of that SSIS Package.
In the next chapter we would look at the specifics of dimensional modeling, which would help us understand why the AdventureWorks DW model is modeled in the way it is. Also we would discuss how it addresses our business scenario.
- For an elaborate understanding of DW design considerations, consider reading this article.