Prevent Loading bad data into Table from excel spreadsheet

  • How do I prevent the bad data from excel loading into a table?

    I already have a pacakge loading from excel spreadsheet to a Table.

    How can i modify (Data flow task: excel to table)) that and add email when ever there is bad data and prevent bad data loading into a table?

  • komal145 - Tuesday, March 7, 2017 12:54 PM

    How do I prevent the bad data from excel loading into a table?

    I already have a pacakge loading from excel spreadsheet to a Table.

    How can i modify (Data flow task: excel to table)) that and add email when ever there is bad data and prevent bad data loading into a table?

    Define 'bad data'.
    Use a combination of Conditional Splits and Error Redirects to trap everything of interest.
    Put error rowcounts into variables so that you can create conditional logic in your package to send e-mails depending on the rowcounts.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I personally prefer to use a stage table with a GUID column to stage all data before they load into the warehouse.  This way I can identify bad data, keep it staged, allow the good data to flow through and then come back to view, correct and re-load the bad data later.  When I have an error in stage, I log the error to a log table which is later read by an alerting process.  This way I can keep a close eye on what needs attention.

    Note, I do try to massage/handle all data coming in to the warehouse in order to eliminate manual intervention.

  • We already have stage Table and first we load to stage and then actual DW. But the requirement mentions to evn stop loading process and send email if you find any bad data.

  • Phil Parkin - Tuesday, March 7, 2017 1:01 PM

    komal145 - Tuesday, March 7, 2017 12:54 PM

    How do I prevent the bad data from excel loading into a table?

    I already have a pacakge loading from excel spreadsheet to a Table.

    How can i modify (Data flow task: excel to table)) that and add email when ever there is bad data and prevent bad data loading into a table?

    Define 'bad data'.
    Use a combination of Conditional Splits and Error Redirects to trap everything of interest.
    Put error rowcounts into variables so that you can create conditional logic in your package to send e-mails depending on the rowcounts.

    Bad data can be anything ...even if the tab name is misspelled " need to send email " and further stop the process from loading and send email.
    Currently I have two "excel spreadsheets" in two Data flow task where DFT 1 loads excel1 and DFT2 loads Excel2

  • komal145 - Thursday, March 9, 2017 12:42 PM

    Phil Parkin - Tuesday, March 7, 2017 1:01 PM

    komal145 - Tuesday, March 7, 2017 12:54 PM

    How do I prevent the bad data from excel loading into a table?

    I already have a pacakge loading from excel spreadsheet to a Table.

    How can i modify (Data flow task: excel to table)) that and add email when ever there is bad data and prevent bad data loading into a table?

    Define 'bad data'.
    Use a combination of Conditional Splits and Error Redirects to trap everything of interest.
    Put error rowcounts into variables so that you can create conditional logic in your package to send e-mails depending on the rowcounts.

    Bad data can be anything ...even if the tab name is misspelled " need to send email " and further stop the process from loading and send email.
    Currently I have two "excel spreadsheets" in two Data flow task where DFT 1 loads excel1 and DFT2 loads Excel2

    To detect tab name issues, you're going to have to write some C# code. Or somehow swallow the error in SSIS when the expected tab name is not present (I do this in code, so I'm not sure how to implement this option).
    There is no single solution to what you are trying to do. Instead, you first need to list all of the different types of 'bad data' you want to handle and then work out, for each one, the best way to do it.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • komal145 - Thursday, March 9, 2017 11:55 AM

    We already have stage Table and first we load to stage and then actual DW. But the requirement mentions to evn stop loading process and send email if you find any bad data.

    You can also place foreign key and check constraints on the staging table.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Thursday, March 9, 2017 12:56 PM

    komal145 - Thursday, March 9, 2017 11:55 AM

    We already have stage Table and first we load to stage and then actual DW. But the requirement mentions to evn stop loading process and send email if you find any bad data.

    You can also place foreign key and check constraints on the staging table.

    True, but such a method would bomb out on the first error.

    If there are likely to be several errors, it's probably better to try to capture them all, by whatever devious means, and then fail. Otherwise, you're into re-run hell.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin - Thursday, March 9, 2017 12:53 PM

    komal145 - Thursday, March 9, 2017 12:42 PM

    Phil Parkin - Tuesday, March 7, 2017 1:01 PM

    komal145 - Tuesday, March 7, 2017 12:54 PM

    How do I prevent the bad data from excel loading into a table?

    I already have a pacakge loading from excel spreadsheet to a Table.

    How can i modify (Data flow task: excel to table)) that and add email when ever there is bad data and prevent bad data loading into a table?

    Define 'bad data'.
    Use a combination of Conditional Splits and Error Redirects to trap everything of interest.
    Put error rowcounts into variables so that you can create conditional logic in your package to send e-mails depending on the rowcounts.

    Bad data can be anything ...even if the tab name is misspelled " need to send email " and further stop the process from loading and send email.
    Currently I have two "excel spreadsheets" in two Data flow task where DFT 1 loads excel1 and DFT2 loads Excel2

    To detect tab name issues, you're going to have to write some C# code. Or somehow swallow the error in SSIS when the expected tab name is not present (I do this in code, so I'm not sure how to implement this option).
    There is no single solution to what you are trying to do. Instead, you first need to list all of the different types of 'bad data' you want to handle and then work out, for each one, the best way to do it.

    Yes, Script task need to validate the excel file columns and if everything is good then proceed laoding into a table , other wise email saying bad data for excel . Can you please provide me with some example if you have worked before with c# coding to validate cells?

  • komal145 - Monday, March 13, 2017 11:31 AM

    Yes, Script task need to validate the excel file columns and if everything is good then proceed laoding into a table , other wise email saying bad data for excel . Can you please provide me with some example if you have worked before with c# coding to validate cells?

    I've written C# code to validate column names, column order and worksheet names, but not the data itself. I think of this as 'structural' validation.

    I do the data validation by loading to SQL Server using SSIS and using a mixture of error redirects in data flows and metadata-driven custom validation queries (for business rules) called from procs..

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply