May 16, 2021 at 5:45 pm
Good Morning,
Have a Question, we have a FACETS feed it has about 50 columns in it. it is daily feed it get loads with all fields (Dates, Money, Strings) as varchar(500) to the table called STAGING_FACETS in sql server.
so here I need to cleanse the data and loads to final table (PR_FACETS) . in this final table data type specific columns, so using a procedure (or any other way also fine),.
here my question is
before Insert/Update for date columns I am using ISDATE() check to make sure the source /stage has valid dates in it
is there any alternative way to di this?
also if there is any non date values in the column that need to e exclude from updating / inserting to target table.
also here the main question is if there is any non dates values (except nulls, valid date values ) that need to be exclude and process the valid ones with out stopping / break the procedure. no hard break....
any advise please?
or any posts/ resources that I can refer?
Thank you
Areshmi
May 16, 2021 at 6:25 pm
It may be that this could mostly be done during the original insert into the staging table. In order to know that, we'd need to know what the "Facets Feed" looks like. Can you post what the "record format" is and a couple of rows of the original data from the feed?
--Jeff Moden
Change is inevitable... Change for the better is not.
May 17, 2021 at 3:08 pm
ISDATE() isn't perfect (https://stackoverflow.com/questions/32755924/isdate-in-not-working-as-expected-in-sql-server)
You may need some other guard clauses or checks to be sure the value is a date. Unfortunately, dates are just not stored in a consistent format across systems.
May 17, 2021 at 4:20 pm
You could use TRY_CAST or TRY_CONVERT to attempt to convert the value to an appropriate date data type (date, datetime). You would have to build the code to attempt to convert the non-null values - and then determine what to do with those rows that cannot be converted.
How is this data loaded to the staging table? Is it through BCP or SSIS or something else? It may be possible to modify that process to perform your data type validation - redirecting 'bad' rows to a separate table/file for further analysis and correction, but it really depends on how you want to process the data.
Further to that - if you have control over how the FACETS file/feed is built then you can and should validate the data type and format on output. With dates you should always output the data in a non-ambiguous format - the best option for SQL Server would be 'YYYYMMDDTHH:MM:SS.nnn' but you can also leave out the T and just use 'YYYYMMDD HH:MM:SS.nnn'.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
May 20, 2021 at 5:54 pm
Good Morning,
Apologize for delay in reponse please see attached sample file ( i mocked little bit) so all these fileds need to be loaded to final table(same structure as stg except in final table they are relevant data formats where as in STG they all just strings )
can you please advise the better way to process each column in terms of data types, for example the date fields how to validate them before loading to final table like wise how to check amounts fields to include them with pennies if any?
Thank yoy in advance
asiti
May 20, 2021 at 5:56 pm
added Jeff please help, sorry for delay as it is bit delay to get the file to me
Thank you
May 20, 2021 at 5:58 pm
We are planning to process with SSIS, but we are not married to any particular solution. can you please guide me for some sample / post that i can do this below exacty said by you. Thank you in advance. Asiti
How is this data loaded to the staging table? Is it through BCP or SSIS or something else? It may be possible to modify that process to perform your data type validation - redirecting 'bad' rows to a separate table/file for further analysis and correction, but it really depends on how you want to process the data.
May 20, 2021 at 8:55 pm
Having a quick look at what you attached, here's what I would do...
I'd lay out a staging table with the correct columns with the correct datatypes.
I'd then setup to do the imports using BULK INSERT along with using the error checking feature, which can be setup to sequester bad rows in a separate file for later repair without stopping the main load. It can also tell you what's wrong with the failed rows but that can require a little bit of a trick if you want it to be humanly readable.
And, no... I wouldn't (and haven't yet) use SSIS for any of this. You can get some pretty exquisite control over the process using a stored procedure and bulk insert.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 21, 2021 at 7:44 am
What action do you wish to take if validation fails (eg, invalid date) for a particular row?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
May 21, 2021 at 3:08 pm
Hello PHil,
Would like to insert these records into separate table (so team can look into it later to check) we dont want to process that record at all. please let me know if you wish me to explain more.
Thank you
asiti
May 21, 2021 at 3:10 pm
Thanks Jeff, sure I will work for SQL side only. no ssis
in this case everyday the staging table ret refreshed (truncate and load from file as all columns nvarchar(500))
so from here we need to check with final table and then load to final table (can be update or insert)
Can you guide me any of resources to use bulk insert and sql procedure that you mentioned in your earlier solution
Thank you
Asiti
May 21, 2021 at 3:23 pm
As you are familiar with SSIS, you will probably know that you can redirect errors to an error table.
If your target staging table is set up with all of the expected data types, you should therefore be able to redirect all the INSERT failures without too much trouble.
Of course, you will also want to know which column caused the error and what the error was. I can help you with that part, if this is how you would like to approach the problem.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
May 21, 2021 at 7:52 pm
Thank you Phil.
Would it be advisable to do it in SQL server itself, as we have some senior management they opted to use SQL server only (son interms of procedure / manual code etc)
any advise? please
Thanks
Asiti
May 22, 2021 at 12:54 am
Thank you Phil.
Would it be advisable to do it in SQL server itself, as we have some senior management they opted to use SQL server only (son interms of procedure / manual code etc)
any advise? please
Thanks
Asiti
To be sure, Phil is correct. The same thing that I propose without SSIS could be done in SSIS. I tend to be a bit anti-SSIS simply because of the way I've seen people abuse it but, done correctly, SSIS is an equally good option for this particular task.
If you have senior management that would prefer not using SSIS, that's up to you and them.
The key here is that you don't need a separate test for validating things like dates, etc. It CAN be done by using the correct datatype in the table being loaded and any rows with such detectable errors can be sequestered by either method without it terminating the whole run.
Pick one of the two methods and run with it.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply