Slowly changing dimensions


Slowly Changing Dimensions

January 25, 2012

The “Slowly Changing Dimension” problem is a common and persistent issue in data warehousing. Simply this applies to records where attributes change over time. Here is an example;  Sue is a customer with a major retailer. She first lived in Potter Heigham, Norfolk. So, the original entry in the customer lookup table has the following record:

lngCustomerURN strName Town County
1001 Sue Potter Heigham Norfolk

Later she moved to Cambridge, Cambridgeshire in March, 2009. How should we modify its customer table to reflect the change? This is the “Slowly Changing Dimension” problem.

There are in general three ways to solve this type of problem, and they are categorized as follows:

Type 1: Replace the old record with a new one, the only problem is that no trace of the old record exists and any historical information is lost.  If for example, we had sales for one region associated with her, when the replacement is done, the sale will move with her.

Type 2: We add a new record to the customer table. We retain the existing history of old transactions against the customer and new transactions are linked to the new customer record.

Type 3: The original record is modified to reflect the change, this generally involves having an additional column (or columns).  This is a simplified Customer record to show how a type 3 record could look.

lngCustomerURN strName strPreviousTown strPreviousCounty strCurrentTown strCurrentCounty
1001 Sue Potter Heigham Norfolk Cambridge Cambridgeshire

It is important to recognise that even with a type 2 change additional fields need to exist in the table in order to ensure that the data retain their integrity.

lngCustomerURN strName strTown strCounty blnActive blnDateActive
1001 Sue Potter Heigham Norfolk False 22/11/2003
1001 Sue Cambridge Cambridgeshire True 12/03/2009

There is a type 6 change which basically incorporates all the combined functionality of types 1, 2 and 3 combined.  Type 6 was first described by Ralph Kimball who called it a ‘hybrid approach’ combining the three basic techniques.

  • Type 2, creating new rows to capture change
  • Type 3, adding attributes to reflect an alternative view
  • Type 1, which are overwritten for all earlier dimension rows.
Advertisements