Hi, i've babysat our 3am , 12 erp source (one step per source) ssis pkg/subpkgs based sqlagent job for months now at my new job. Twice now in the last week or so one sales view column "Revenue" from one specific erp has brought the job down. The reason ssis says it is failing is because that column violates constraints. The column cannot be null. And its final landing data type is float. And the source is sql server 2014 which is older than the version where our pks run , 2019.
So what i've done twice now is go to that erp and run the sales view adhoc with 2 extra things...a condition where revenue is null and an extra column that casts revenue to float. And found no problems.
so i reran just that step in the etl on each occasion and it ran fine. Below i show the errors. I started looking at posts on the web using these errors as search criteria but dont see anything relevant to me. I suspect that the proliferation of nolock hints in the view are to blame. And that before me, someone complained about deadlocks and IT unknowingly introduced a new problem by introducing dirty reads.
What does the community think? I'm going to ask our dba to identify for me who might be running things at that time of day on that erp that might be increasing the chances of false reads.
April 6, 2024 at 6:24 pm
Can the column in the source table ever be null?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
April 7, 2024 at 12:57 pm
thx phil. no i looked closely at the view definition too. but that is when / after all the records are committed.
when i strip out all the left joins, what this really comes down to is an inner join between a sales detail table and a general ledger "extension" table.
The latter table has a currency rate that is used for the revenue calculation depending on a company code in the sales detail. If that calc is used, the rate is multiplied by a value (call it A) from the sales detail.
Otherwise the revenue is set to a different value (call it B) from the sales detail table (no multiplication).
what i've done thus far is 1) run the view where revenue is null, 2) run the join where A, B or the rate is null. Nothing.
Both times this occurred, i reran the erroring etl step and it ran fine. I've asked my dept why dirty reads are used in that view and will post the answer back here once i get it.
as to whether any of these columns can be null temporarily persisted as their records are being built, i dont know. i wouldnt have an economical way of checking either. it might be cheaper for me to find a way to run the query without dirty reads even if it has to be at a different time of day. and make conclusions from there.
April 8, 2024 at 7:48 am
Have you considered redirecting error rows to an error table, rather than simply failing the package? This could help you track down what's happening.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
April 8, 2024 at 4:26 pm
yes, i usually consider that road but am very careful not to create a new "cottage industry". By that I mean a whole new process whereby people have to take additional steps they never had to take before that is more economically addressed a different way. In this case, I'm so lazy that i'd prefer to take out the dirty reads and watch from the sidelines for a month or two. And even then, perhaps wait before doing anything to see what kind of answers come back to the question of what could possibly be running at the same time on that aging erp that could give up a null in dirty reads.
I meant it merely as a diagnostic tool, not a permanent change. But taking out the dirty reads sounds like a good idea too.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply