Introduction
In yesterday’s discussion we looked at creating entities and their related attributes. We learnt that a ‘model’ could be very loosely compared to a database, an entity to a table and attributes to the fields of a table.
Today we are going to load the raw data(member) into our entities and look at all the data within Master Data Services Explorer.
In loading the data, we shall be using a spreadsheet source and SSIS to load the entities. Should you want the packages used in our examples, please feel free to contact me at steve.simon@sqlpass.org
Some background
When we create an entity as we did yesterday, and we save that entity, Master Data Services (unbeknown to us) creates an appropriate staging table in the background. Raw data to be loaded into Master Data Service from no matter what the source, must first pass through this staging table. The original source data may be data from a SQL Server table or perhaps from an Excel spreadsheet. My users tend to bulk insert via a spreadsheet so this is the method that I plan to utilize in this discussion.
The raw data for ‘color’ may be seen below:
Let’s look at the various columns within this spreadsheet.
Note that in column “D” and “E” are the code and name. As mentioned in my previous post, the “key” as we know it from the normal table concept is the “code”. The value of the key is the name.
There are three other weird fields. Those are columns “A”, “B”, and “C”. These are three system required fields in order to insert the data correctly and there usage will become more apparent a bit later.
The ‘ImportType’ is the most critical. An import type of ‘1’ refers to an insert. A ‘2’ indicates an update, an import type of 4 is a hard delete (which we shall talk about in detail when we discuss daily loads). For a complete list of ‘ImportType’ s the user is referred to the following URL.
http://technet.microsoft.com/en-us/library/ee633854.aspx
The import status ID (located in column B) tells us the status of the records that we are trying to insert.
The status of the import process. Possible values are:
- 0, which you specify to indicate that the record is ready for staging.
- 1, which is automatically assigned and indicates that the staging process for the record has succeeded.
- 2, which is automatically assigned and indicates that the staging process for the record has failed.
Last, but not least is column “C” which contains the batch tag. This will become more important to us (as we progress in this series of presentations) as we shall be using the same ‘BatchTag’ over and over again when we discuss the production nightly loads (in a following portion of the article).
Now that we have our source data ‘file’, it is time to put together an SSIS package to insert the data (from the spreadsheet) into the Staging table. The staging table (stg.Color_Leaf), created by Master Data Services, may be seen below:
Note that the table has a few ‘extra’ fields that we have not discussed. These fields are not required to stage the data.
Creating our load package.
I have taken the liberty of showing (below), a screen dump of my load package.
Note that the package also contains ‘data flows’ for Types and Products. We shall be looking at ‘Color’ in our example. The process to load other entities is similar to what I am currently describing.
Notice that to access the sources spreadsheet, I have used an “Excel connection manager” and for the destination staging table, an OLE DB destination connection.
The data flow is shown below:
Having run this portion of the package, one will see the following data within the staging table.
The astute reader will note that in the screen dump above, the ImportType is set to 2 EXCEPT for the last record(to be discussed further down). I changed the ImportType to 2 for preparation of the paper so that I am able to show the batch waiting to be processed FROM this staging table to its FINAL destination in the MDS database. One important point to state at this point in time is, do not look for a ‘color’ table per se. Data storage within MDS is not handled in a similar fashion as it is with a standard SQL Server database.
Having run our package and thus having loaded our staging table, did something ‘strange’ under the covers. Master Data Services woke itself you and if you look within the Integration Management tab of Master Data Manager, you will note that there is a batch there that requires processing (See the screen dump below).
What we now need to do is to “push” this data from the staging table into the MDS database (for production usage). In short, this data is not yet visible to the user.
To complete the process, simply click on ‘Start Batches’.
The system will ask you which version. To date we have not discussed versioning. Simply accept Version_1 by clicking OK.
You will note that our job is ‘Queued to run’ (See the status box) . The run process could take a minute or so to complete.
Once complete, the status will change (see below).
You will note that there we 13 records were updated HOWEVER one was rejected!!! The reason that it was rejected was that we requested that Master Data Services updates our colors. Record 14 from the spreadsheet above has a color “pers” (Dutch for purple) AND MORE IMPORTANTLY as ImportType of 1. This color does ALREADY EXISTS in the Master Data Service table and therefore cannot be INSERTED! Once again, Master Data Services has covered your back!!!
More about this when we discuss view your errors in the error views.
An important point to push home is that all other entities load in a similar fashion.
To look at our data that we have just loaded, we once again go into the Master Data Manager, choose Explorer, click on the ‘Entities’ tab and opt for ‘Color’.
Once we choose color, the color entity is brought up.
Dealing with the mother Product Table
A screen dump of the mother product table is shown below:
Note that the product table has the color already shown. Let us take a look at how this is obtained from the spreadsheet shown in the NEXT screen dump.
Note that column “R” contains the color. Whilst the ‘code’ and the ‘name’ within the ‘color’ load that we just ran, are the same, what in essence is stored within column “R” is the code (the KEY). It is almost as if Master Data Services will do a “Look Up” once loaded. This will become more evident when we discuss our daily load processing (in an upcoming ‘episode’).
It should be noted that in the creation of the Product Entity, the color attribute was set up NOT as free form BUT rather as a “Domain Based” attribute. Why?
It was done in this manner to permit the end user to update a color and yet limit the user to only a prescribed subset of colors. Please see the screen dump immediately below.
Wrapping up
Today we have seen
1) That data within a spreadsheet may be used as a data source for our Master Data Services entities.
2) We have seen that the data (on entry into Master Data Services) is stored within a staging table.
3) We have seen what the data looks like within this staging table.
4) We have looked at an SSIS package to load the data from a spreadsheet into the staging table.
5) We have seen how the Integration Management is able to push the data from the staging table into Master Data Services.
6) We have seen how the product table (mother table or center of the star from yesterday’s discussion) contains the KEY ONLY for the color attribute. The description comes totally from the color entity. Very similar to our SQL look up.
In the next portion of this series
1) We shall be looking at creating a view to actually look at the data that is stored within Master Data Services.
2) We shall be looking at the extremely important concept of a “Current Flag”
3) We shall be looking at extracting data from Master Data Services to be used with SQL Server Reporting Services.
In the interim, should you have any questions or recommendations, I would love to hear from you. You may reach me at steve.simon@sqlpass.org
Happy programming.