The SAPZilla Community Network

Compressing Snapshots in Data Services

Sometimes, when capturing data in a data warehouse, we need to store time-variant pieces of data about a transaction. This somewhat blurs the lines between a traditional fact table and a dimension, since in the traditional model, time-variance is mainly the domain of a dimension.

Take the example of a production backlog at a manufacturer. When an order is placed, particularly in an organization that manufactures large or complex goods, it may take some time to fulfill. Maintaining a consistent backlog is also key to production planning that’s not beset with shutdowns, inefficiencies, or missed delivery dates.

Keeping a backlog at a granular level generally requires tracking backlog on an order-by-order basis. That way, anything about an order (that’s in your warehouse) can be analyzed to look for trends in the business. There’s just one issue: keeping a snapshot of every order in backlog for the full amount of time it’s in backlog can take up a lot of space. For instance, take a mid-size company where the average order is in backlog for three months and the company receives 10,000 orders per year. About 2,500 orders are then in backlog on any given day, so a daily snapshot adds nearly a million records per year. After a while, that can really add up. It’s no wonder Bill Inmon said, “The management of these every day, ongoing changes can consume an enormous amount of resources and can be very, very complex. It is this type of load that rivets the attention of the data architect.” (Snapshots in the Data Warehouse, pg. 2, white paper at http://inmoncif.com/inmoncif-old/www/library/whiteprs/ttsnap.pdf)
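
The sizing estimate above can be checked with a few lines of Python. This is only a rough sketch: the function name and parameters are made up for illustration, and it assumes orders arrive at a steady rate through the year and that a snapshot is taken on all 365 days.

```python
def daily_snapshot_rows_per_year(orders_per_year, avg_months_in_backlog,
                                 snapshot_days=365):
    """Estimate rows written per year by a daily order-level backlog snapshot.

    Assumption (not from the article): orders arrive at a steady rate, so the
    average number of open orders is arrival rate times time in backlog.
    """
    # Average number of orders sitting in backlog on any given day.
    avg_open_orders = orders_per_year * avg_months_in_backlog / 12
    # One snapshot row per open order per snapshot day.
    return round(avg_open_orders * snapshot_days)


rows = daily_snapshot_rows_per_year(orders_per_year=10_000,
                                    avg_months_in_backlog=3)
print(rows)  # 912500
```

With the article's figures this comes to 912,500 rows per year, which is where "nearly a million" comes from. Changing `snapshot_days` to the number of working days would give a smaller estimate.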

Example

Read further: http://sapbiblog.com/2013/12/17/compressing-snapshots-in-data-servi...
