Slowly Changing Dimension SCD
What is Slowly Changing Dimension ?
Slowly Changing Dimension are the dimensions in which the data changes slowly over time, rather than changing regularly on a time basis.
For example : a product price changes over time, people may change their name due to some reasons, country and state name may change over time, people change their address due to relocations.
Types Of Slowly Changing Dimension :
Slowly Changing Dimensions are often categorized into three types –
- SCD Type 1
- SCD Type 2
- SCD Type 3
1. SCD TYPE 1 :
In SCD Type 1, new information simply replaces the original information. In other words, no history is kept.
For example : We have following table –
Customer Key | Name | City |
1001 | Ravi | Delhi |
After Ravi moved from Delhi to Chennai, the new information replaces the new record, and we have the following table:
Customer Key | Name | City |
1001 | Ravi | Chennai |
Advantages:
This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information.
Disadvantages:
All history is lost. By applying this methodology, it is not possible to trace back in history.
For example, in this case, the company would not be able to know that Christina lived in Illinois before.
2. SCD Type 2 :
In SCD Type 2, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.
For Example : In our previous example, we have the following table:
Customer Key | Name | City |
1001 | Ravi | Delhi |
After Ravi moved from Delhi to Chennai, we add the new information as a new row into the table:
Customer Key | Name | State |
1001 | Ravi | Delhi |
1005 | Ravi | Chennai |
Advantages:
— This allows us to accurately keep all historical information.
Disadvantages:
— This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern.
— This necessarily complicates the ETL process.
3. SCD Type 3 :
In SCD type 3, there is two column for same attribute. The new value is loaded into ‘current/new’ column and the old one into ‘old/previous’ column. There will also be a column that indicates when the current value becomes active.
For Example : In our previous example, we have the following table:
Customer Key | Name | City |
1001 | Ravi | Delhi |
After Ravi moved from Delhi to Chennai, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2016):
Customer Key | Name | Original City | Current City | Effective Date |
1001 | Ravi | Delhi | Chennai | 15-JAN-2016 |
Advantages:
— This does not increase the size of the table, since new information is updated.
— This allows us to keep some part of history.
Disadvantages:
— Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Ravi later moves to Bangalore on December 15, 2016, the Chennai information will be lost.
Hopefully, this information about slowly changing dimensions will help you all.
For types of dimensions – Read Here