March 11, 2013 at 7:41 am
Hi,
I am new to SSIS.
I have to load 50 millions of records(full load) into a SQL Server table.
After the full load I have to create another package for incremental load based on Hash value.
Please suggest me one of the below mentioned approach to get good performance.
-Merge command
-Look Up
-SCD
Thanks in advance
March 11, 2013 at 3:12 pm
Use the Merge, of those 3.
March 12, 2013 at 3:04 am
Hi Rocky´s
I totally agree with that, the use of the merge command is for me the better option. I designed several ETL packages the last year and I tried the 3 options, and at the end I always used the Merge command.
Here I leave you a code snippet from my personal library. This code makes the following:
- Log the start time and name of the package in a custom log table with an identity column
- Merge into a target table the results of a query (as a source). You can adapt the source query to your needs
- The columns that match the merge condition are updated, if not, they are inserted.
- At the end the finish time is log together with the row count of updated and inserted records.
-- I use a table to log the start and finish time, package name, number of rows inserted and number of rows updated
insert into SSIS_Log
(PackageName,DTStart)
values
(?,GetDate()); -- Map the System variable "PackageName"
declare
@mergeResultsTable table (MergeAction VARCHAR(20));
declare
@insertCount int,
@updateCount int;
MERGE Table_A AS TARGET
USING (
SELECT
Col1
,Col2
.....
,ColN
FROMTable_B
) AS SOURCE
ON (TARGET.Key1 = SOURCE.Key1 and TARGET.Key_2 = SOURCE.Key_2)
-- Update
WHEN MATCHED THEN UPDATE SET
TARGET.Col1= SOURCE.Col1
,TARGET.Col2= SOURCE.Col2
,TARGET.UpdateDT=GetDate() -- I create a UpdateDT column in the target table to store the update datetime
-- etc
--Insert
WHEN NOT MATCHED BY TARGET THEN INSERT
(
Col1
,Col2
,CreateDT -- I create a CreateDT column in the target table to store the insert (or creation) datetime
-- etc
)
VALUES
( SOURCE.Col1
,SOURCE.Col2
,GetDate()
-- etc
)
OUTPUT $action into @mergeResultsTable;
update SSIS_Log
set UpdatedCount = -- Log the count of updated rows
(SELECT COUNT(*)
FROM @mergeResultsTable
WHERE MergeAction = 'UPDATE')
, InsertedCount = -- Log the count of inserted rows
(SELECT COUNT(*)
FROM @mergeResultsTable
WHERE MergeAction = 'INSERT')
,DTFinish = GetDate()
where Idx =
(
select MAX(Idx)
from SSIS_Log
);
If your target table should be treated as a SCD I strongly recommend to you the following article by Adam Aspin:
Slowly changing dimensions using T-SQL MERGE[/url]
If you have further question just write me.
Kind Regards
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply