Data Verification functions

  • Hi All,

    In SSIS, Do we have data verification functions like

    - ISNUMERIC() // to check whether the input varchar data contains only numbers

    - ISALPHANUMERIC() // to check whether the input data is alphanumeric

    - ISDATE() etc. // to check whether the input date is valid

    If not, how can we verify whether our input data is valid?

    Any way to implement these?

    Any suggestion would be of great help.

    Thanks in advance. :-)

  • You do not have these functions out of the box in SSIS, in the sense that they are not present in the Derived Column Transformation.

    However, if you read your source with a SQL statement, you can include those functions there, as long as they exist in the dialect of SQL that you use.
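    For example, with a SQL Server source you can do the checks in the source query itself with the T-SQL built-ins (the table and column names below are made up for illustration; note that ISNUMERIC is a rough check, since it also accepts values like '$12' or '1e5'):

```sql
-- Hypothetical staging table; ISNUMERIC and ISDATE are T-SQL built-ins.
SELECT  RawValue,
        RawDate,
        CASE WHEN ISNUMERIC(RawValue) = 1 THEN 1 ELSE 0 END AS IsNumericFlag,
        CASE WHEN ISDATE(RawDate)    = 1 THEN 1 ELSE 0 END AS IsDateFlag
FROM    dbo.StagingTable;
```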

    Another option is to use a script task/component to do your data validation. There you can use the full power of .NET to do your validation.
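    In a script component this comes down to TryParse-style checks. Here is a minimal sketch of that validation logic, written in Python purely for illustration; in the actual SSIS script component you would use the .NET equivalents (Double.TryParse, DateTime.TryParseExact, Regex.IsMatch):

```python
import re
from datetime import datetime

def is_numeric(value: str) -> bool:
    """Rough equivalent of Double.TryParse: accepts integers and decimals."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def is_alphanumeric(value: str) -> bool:
    """True if the value contains only letters and digits."""
    return re.fullmatch(r"[A-Za-z0-9]+", value) is not None

def is_date(value: str, fmt: str = "%Y-%m-%d") -> bool:
    """Equivalent of DateTime.TryParseExact for one expected format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False
```

    Each row flowing through the component can then be flagged, or redirected to an error output, based on checks like these.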

    A final option is to use a data conversion component. If the conversion fails, reroute the row to an error destination.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

    Thanks for the quick reply. :-)

  • This all assumes that the invalid data can even get into your pipeline in the first place of course.

    If you have appropriate datatypes already set up for your pipeline data, much of the invalid data won't even get loaded & you won't get a chance to analyse it as part of the package.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (9/7/2010)


    This all assumes that the invalid data can even get into your pipeline in the first place of course.

    If you have appropriate datatypes already set up for your pipeline data, much of the invalid data won't even get loaded & you won't get a chance to analyse it as part of the package.

    I think you mean setting the datatypes at the source component, am I right?

    Can't you re-route the error rows from the source to a specific destination? (and then you can analyze all these error rows)

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • <<This all assumes that the invalid data can even get into your pipeline in the first place of course.

    If you have appropriate datatypes already set up for your pipeline data, much of the invalid data won't even get loaded & you won't get a chance to analyse it as part of the package. >>

    It's a good thought. I was looking for these functions during transformations (to set profiling rules), as the source data I'm getting is not set up with proper datatypes. Yes, I agree with and appreciate the method you quoted.

  • da-zero (9/7/2010)


    Phil Parkin (9/7/2010)


    This all assumes that the invalid data can even get into your pipeline in the first place of course.

    If you have appropriate datatypes already set up for your pipeline data, much of the invalid data won't even get loaded & you won't get a chance to analyse it as part of the package.

    I think you mean setting the datatypes at the source component, am I right?

    Can't you re-route the error rows from the source to a specific destination? (and then you can analyze all these error rows)

    Yeah - terminology escaped me - brain is fogged today as I'm on support 🙂

    You can do what you suggest, of course - but the reason I posted is that it's not really in the spirit of what the OP was asking, which seems to me to be

    1) Connect to source data

    2) Validate source data

    --do some special processing for data failing validation

    3) Pass validated data to target

    Your suggestion won't work that way, because some validation errors are probably not caused by data type problems. So the validation path splits ...

    And if the source is an Excel spreadsheet ... we both know that some of the data probably won't make it past the Excel driver without some pain.

    Phil

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • da-zero (9/7/2010)


    Phil Parkin (9/7/2010)


    This all assumes that the invalid data can even get into your pipeline in the first place of course.

    If you have appropriate datatypes already set up for your pipeline data, much of the invalid data won't even get loaded & you won't get a chance to analyse it as part of the package.

    I think you mean setting the datatypes at the source component, am I right?

    Can't you re-route the error rows from the source to a specific destination? (and then you can analyze all these error rows)

    If you want to deal with bad data at the source that does not make it through your source component, you may have to resort to using the error output from the source.



    Alvin Ramard
    Memphis PASS Chapter

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help

  • There are some data issues that you may not be able to deal with without using a script transform.

    I once had to deal with non-printing characters, including {CR}{LF} in the middle of the record, being read from a text file. The file was fixed-width, with data going into 80 or so columns. These bad characters could be found in any column. The only way I found to deal with them easily was to read the data in as one wide column and pass it to a VB transform to do the clean-up and splitting into columns.
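    That approach (read each record as one wide column, strip the rogue control characters, then split on fixed widths) can be sketched like this; Python is used here for illustration only (the original used a VB script transform), and the column widths are made up:

```python
import re

# Matches ASCII control characters, including embedded CR/LF.
CONTROL_CHARS = re.compile(r"[\x00-\x1F\x7F]")

def clean_record(record: str) -> str:
    """Remove non-printing characters from a raw fixed-width record."""
    return CONTROL_CHARS.sub("", record)

def split_fixed_width(record, widths):
    """Split a cleaned record into columns at the given fixed widths."""
    columns, pos = [], 0
    for width in widths:
        columns.append(record[pos:pos + width].rstrip())
        pos += width
    return columns

# Example: three columns of widths 5, 3 and 4, with a stray CR/LF inside.
raw = "AB\r\nCDE12 3XYZ "
fields = split_fixed_width(clean_record(raw), [5, 3, 4])
```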



    Alvin Ramard
    Memphis PASS Chapter

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help

  • Alvin Ramard (9/7/2010)


    There are some data issues that you may not be able to deal with without using a script transform.

    --

    Agreed. You do what you can with the tools available, for the sake of speed and simplicity, but sometimes you just have to roll your sleeves up and do it the long/hard way.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (9/7/2010)


    Alvin Ramard (9/7/2010)


    There are some data issues that you may not be able to deal with without using a script transform.

    --

    Agreed. You do what you can with the tools available, for the sake of speed and simplicity, but sometimes you just have to roll your sleeves up and do it the long/hard way.

    Phil, sounds like you've done this before. Remember what dealing with strings was like in VBScript compared to using .NET? I wouldn't want to have to go back!!!



    Alvin Ramard
    Memphis PASS Chapter

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help

  • Alvin Ramard (9/7/2010)


    Phil Parkin (9/7/2010)


    Alvin Ramard (9/7/2010)


    There are some data issues that you may not be able to deal with without using a script transform.

    --

    Agreed. You do what you can with the tools available, for the sake of speed and simplicity, but sometimes you just have to roll your sleeves up and do it the long/hard way.

    Phil, sounds like you've done this before. Remember what dealing with strings was like in VBScripts compared to using .NET? I wouldn't want to have to go back!!!

    I have indeed - very similar to what you were describing, in fact. Rogue non-printable characters in source data were causing me plenty of issues for one recent import = time to write some code.

    I still get to use VBScript quite a bit as it's part of the IDE for an application I develop in. So the contrast between that and .NET is usually fresh in my mind!

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin
