Beginner - Splitting a source table into Dimension and Fact tables

Question

Beginner - Splitting a source table into Dimension and Fact tables

Andrew P

SSCarpal Tunnel

Points: 4649
More actions
July 20, 2015 at 9:33 pm

#298902

I am in a situation similar to below, where a source table has both measures and context, where the amount paid columns are used heavily in reporting:
PAYMENT
PAYMENT_ID (INT, PK)
CLIENT_ID (INT foreign key to client table)
ACCOUNT_ID (INT)
DATE_OF_PAYMENT (DATETIME)
INVOICE_ID (INT)
AMT_PAID_BY_CHEQUE (DECIMAL, used in aggregations)
AMT_PAID_BY_CASH (DECIMAL, used in aggregations)
AMT_PAID_BY_CREDIT (DECIMAL, used in aggregations)
If this were transformed to a snowflake or hybrid schema table, where one of the main functions would be measuring payments by payment type, is there a better way than below? Is the below a decent solution?
DIM_PAYMENT
PAYMENT_ID (INT, PK)
CLIENT_ID (INT foreign key to client dimension)
DATE_OF_PAYMENT (Foreign key to date table, or datetime)
INVOICE_ID (INT, foreign key to invoice dimension)
FACT_PAYMENT_COMPONENT
PAYMENT_ID INT
PAYMENT_COMPONENT_IND (Indicator for Cash, Cheque or Credit)
AMOUNT (DECIMAL)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

newbie_dev SSCommitted Points: 1916 More actions · Answer 1

In my opinion, you should keep the payment related info only in the fact, not in the dimension. And you should create a dimension for all the entities(dimensions for client, account and invoice)

And in the fact to keep the reference for these three dimensions and your measures: the amount and the payment date.

You should have all the references to the dimensions in the fact table, not in the dimension.

For the payment type you could create a separate dimension to contain all three values, but the easiest way would be the way you implement it - with a degenerate dimension.

Hope this helps.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 2

The payment table looks almost as a fact table to me. Just remove the payment_id and invoice_id, and group by the other columns. You need to create a dimension for client, one for account (or merge both into a single dimension depending on how you relate them) and one for time (payment_date). You need to define if cheque, cash and credit should be a dimension or are separate measures (which can be added up). I'd leave them as separate measures.

This can go more complex if you want to include your invoice date in your payments fact. If each account represents a different product, or something else that we can't see.

It could be something like this:

PAYMENT

ACCOUNT_ID (INT)

DATE_OF_PAYMENT (DATE)

AMT_PAID_BY_CHEQUE (DECIMAL, used in aggregations)

AMT_PAID_BY_CASH (DECIMAL, used in aggregations)

AMT_PAID_BY_CREDIT (DECIMAL, used in aggregations)

ACCOUNT

ACCOUNT_ID

CLIENT_NAME

LOCATION

SOME_MORE_COLUMNS_TO_DESCRIBE_ACCOUNT

TIME

DATE

LONG_DATE

SHORT_DATE

WEEK

MONTH

QUARTER

SEMESTER

YEAR

DAY_OF_WEEK

ETC

EDIT: Try to get The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling book

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Martin Schoombee SSCoach Points: 19128 More actions · Answer 3

In my opinion, the first design you've had wasn't far off. I would do the following:

PAYMENT

PAYMENT_ID (INT, PK)

CLIENT_ID (INT foreign key to client table)

ACCOUNT_ID (INT)

DATE_OF_PAYMENT (DATETIME)

INVOICE_ID (INT)

PAYMENT_TYPE_ID (INT)

AMT_PAID (DECIMAL)

PAYMENT_TYPE

PAYMENT_TYPE_ID (INT, PK)

PAYMENT_TYPE_NAME (VARCHAR)

...

(Assuming that you already have a dimension for date, account and client)

Add a dimension for payment type. Your fact table will have multiple entries per invoice...once for each payment type. Doing it that way will allow you to analyze the data by both invoice and payment type.