Cari Blog Ini

27 Maret 2011

IT - Data Warehousing - Fact and Dimension Tables

Data warehouses are built using dimensional data models
which consist of fact and dimension tables.

Dimension Tables

Dimension tables are used to describe dimensions; they contain
dimension keys, values and attributes.

For example,
* the time dimension would contain every hour, day, week, month,
quarter and year
that has occurred since you started your business operations.
* Product dimension could contain a name and description of products you sell,
their unit price, color, weight and other attributes as applicable.

Dimension tables are typically small, ranging from a few to several
thousand rows.
Occasionally dimensions can grow fairly large, however.

For example,
* a large credit card company could have a customer dimension with
millions of rows.
Dimension table structure is typically very lean,
for example customer dimension could look like following:

Customer_key
Customer_full_name
Customer_city
Customer_state
Customer_country

Although there might be other attributes that you store in the
relational database,
data warehouses might not need all of those attributes.

For example,
* customer telephone numbers, email addresses and other contact information
would not be necessary for the warehouse.
Keep in mind that data warehouses are used to make strategic decisions
by analyzing trends.
It is not meant to be a tool for daily business operations.
On the other hand, you might have some reports that do include data elements
that aren't necessary for data analysis.
Most data warehouses will have one or multiple time dimensions.
Since the warehouse will be used for finding and examining trends,
data analysts will need to know when each fact has occurred.
The most common time dimension is calendar time.
However, your business might also need a fiscal time dimension
in case your fiscal year does not start on January 1st as the calendar year.

Most data warehouses will also contain product or service dimensions
since each business typically operates by offering either products or
services to others.
Geographically dispersed businesses are likely to have a location dimension.



Fact Tables

Fact tables contain keys to dimension tables as well as measurable facts
that data analysts would want to examine.

For example,
* a store selling automotive parts might have a fact table recording a
sale of each item.
* The fact table of an educational entity could track credit hours
awarded to students.
* A bakery could have a fact table that records manufacturing of
various baked goods.

Fact tables can grow very large, with millions or even billions of rows.
It is important to identify the lowest level of facts
that makes sense to analyze for your business this is often referred
to as fact table "grain".

For instance,
* for a healthcare billing company
it might be sufficient to track revenues by month;
daily and hourly data might not exist or might not be relevant.
On the other hand,
the assembly line warehouse analysts might be very concerned in
number of defective goods that were manufactured each hour.
Similarly a marketing data warehouse might be concerned by
the activity of a consumer group with a specific income-level rather
than purchases made by each individual.


Sumber :
http://sqlserverpedia.com/wiki/Data_Warehousing_-_Fact_and_Dimension_Tables