January 27, 2016 at 6:43 pm
The error ,message from ssis goes as follows:
[Execute SQL Task] Error: Executing the query "EXEC [dbo].[sp_load_Medicare_Part_A_Data];" failed with the following error: "Invalid column name 'DUPLICATE'.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
Now see my stored proc code ( attached )
January 27, 2016 at 6:52 pm
Any reason why you need this procedure to add a column named DUPLICATE to a table, i.e. why is the column not already on the table if it is actually needed?
To your issue, I cannot pinpoint the source but based on the error message I will venture a guess that it is related to the column named DUPLICATE. Sorry of that seems obvious but the error is pretty clear. I would avoid implementing DDL in a procedure where possible. There are too many reasons not to do it.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 27, 2016 at 7:26 pm
Orlando Colamatteo (1/27/2016)
Any reason why you need this procedure to add a column named DUPLICATE to a table, i.e. why is the column not already on the table if it is actually needed?
I have to agree with Orlando here. If you already have the column in the table, why would you need to add it again? I don't see where you drop the column later, so why not just add it once and be done with it? Please don't infer that I'm suggesting that you add it and then drop it later - that will just cause you problems.
That being said, try changing your check to the following and see if it works for you:
IF (SELECT COUNT(*) FROM sys.columns WHERE object_id = OBJECT_ID('CMS.STHMCDPTA') AND name = 'DUPLICATE') = 0
BEGIN
ALTER TABLE CMS.STHMCDPTA ADD DUPLICATE Char(1);
END;
January 27, 2016 at 7:31 pm
As a bit of a sidebar, I have to ask about the performance of your procedure. I saw a scalar function, non-SARGable join predicates, non-SARGable where clause predicates and a case where you drop and then recreate a physical table. You may be able to get a significant performance increase by addressing some of these things, but it would take a lot of work.
January 28, 2016 at 1:51 am
Ed Wagner (1/27/2016)
As a bit of a sidebar, I have to ask about the performance of your procedure. I saw a scalar function, non-SARGable join predicates, non-SARGable where clause predicates and a case where you drop and then recreate a physical table. You may be able to get a significant performance increase by addressing some of these things, but it would take a lot of work.
There's so much dodgy coding in the stored procedure, it's difficult to know where to begin. So, tackling the original issue of the duplicate column called "duplicate", you don't need to add a column to a table to identify duplicates:
DELETE c
FROM [CMS].[STHMCDPTA] C
INNER JOIN [mhpdw].[EDW].[MEMBER].[DMember] B -- remote
ON C.BeneEqtblBicHicnNum = (B.HCFANBR + 'A')
INNER JOIN VISIT_IN V -- missing schema
ON V.mem_nbr = B.membernbr
WHERE REPLACE(ClmDgNs1D,'.','') = REPLACE(V.DIAG_I_1,'.','') -- missing table aliases
AND CONVERT(CHAR(8),V.SERV_DT,112) = ClmFromDt -- what datatype is ClmFromDt?
-- if ClmFromDt is DATE then use CAST(V.SERV_DT AS DATE) = ClmFromDt
In addition to the notes in the code, remote joins require very careful handling for performance.
I'd focus on some of the basic stuff first, such as getting all of the details correct in the queries - schemas for tables, table aliases, correct NULL handling, and avoiding if possible all those nasty string comparisons.
If you take care to format your code nicely - any code you're working with, whether you wrote it or not - you will find it easier to read in the next time you encounter it.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 28, 2016 at 5:31 am
ChrisM@Work (1/28/2016)
Ed Wagner (1/27/2016)
As a bit of a sidebar, I have to ask about the performance of your procedure. I saw a scalar function, non-SARGable join predicates, non-SARGable where clause predicates and a case where you drop and then recreate a physical table. You may be able to get a significant performance increase by addressing some of these things, but it would take a lot of work.There's so much dodgy coding in the stored procedure, it's difficult to know where to begin. So, tackling the original issue of the duplicate column called "duplicate", you don't need to add a column to a table to identify duplicates:
DELETE c
FROM [CMS].[STHMCDPTA] C
INNER JOIN [mhpdw].[EDW].[MEMBER].[DMember] B -- remote
ON C.BeneEqtblBicHicnNum = (B.HCFANBR + 'A')
INNER JOIN VISIT_IN V -- missing schema
ON V.mem_nbr = B.membernbr
WHERE REPLACE(ClmDgNs1D,'.','') = REPLACE(V.DIAG_I_1,'.','') -- missing table aliases
AND CONVERT(CHAR(8),V.SERV_DT,112) = ClmFromDt -- what datatype is ClmFromDt?
-- if ClmFromDt is DATE then use CAST(V.SERV_DT AS DATE) = ClmFromDt
In addition to the notes in the code, remote joins require very careful handling for performance.
I'd focus on some of the basic stuff first, such as getting all of the details correct in the queries - schemas for tables, table aliases, correct NULL handling, and avoiding if possible all those nasty string comparisons.
If you take care to format your code nicely - any code you're working with, whether you wrote it or not - you will find it easier to read in the next time you encounter it.
It sounds like you went into the same tailspin I did. There's a lot to consider. π
I was curious about the join predicates as well. The data type question with the Char(8) versus a date is one this, but the ClmDgNs1D is the one that has me thinking. In addition to being a performance problem, this might be a real logic problem. In medical diagnostic codes, if you remove the periods, you might have something that ends up matching when it shouldn't match. Example:
CO.4.51 <> CO.4.5.1
CO.45.1 <> CO.4.51
Even though these values are different, they match if you remove the periods. I'm certainly no expert at medical billing, but I have to believe that there could be duplicate codes when you remove part of the string. By the logic in the statement, if you have 2 people who were diagnosed with similar codes on the same date (excluding time) then they're flagged as duplicates. I presume the ones flagged as duplicates aren't billed, so this will eventually hit the bottom line. If medicare is never billed, they aren't going to pay it. If they don't pay it, the patient is then hit with a bill that medicare should have paid but was deduped out of the system.
I know medical billing is more of an art than a science and there's a lot of "creativity" on both sides. Please tell me I'm wrong in this case.
January 28, 2016 at 6:59 am
Reply to all:
1.) The table gets copied at run time via another process from another db. For the purposes of processing, I need to add the column "DUPLICATE" ( This is very important )
2.) I ran the sp using ssms and it gave the same error message.
3.) My issue is why didn't the sp add the column to the table ? This is my strongest question. Had it done that, I would not have gotten the error message. I am thinking if I had included a "GO" after the IF for adding columns, it would have worked well. I will test that and let you know.
3.)As for performance issues, I am not too worried. It is ok even if it takes many hours.
January 28, 2016 at 7:25 am
mw112009 (1/28/2016)
3.) My issue is why didn't the sp add the column to the table ? This is my strongest question. Had it done that, I would not have gotten the error message. I am thinking if I had included a "GO" after the IF for adding columns, it would have worked well. I will test that and let you know.
Did you try the alternate method of checking for the column I posted above? Did that solve the problem?
January 28, 2016 at 7:38 am
Reply to all:
Please ignore this post.
I found the issue.
I had done things in an unusual sequence and that caused the bug.
See the script I created below. If you run this Sql sever will
not let you create a stored proc. But I was able to ( wired trick ).
1.) If you run this script as it is you get an error.
SQL sever will not let you create the SP. This is reasonable.
Why? At run time the table a does not have a column DUPLICATE
So it says you cannot do an update and stops you from creating
the SP. ( This is perfect , makes sense )
2.) What did I do wrong then.
I tested some of my code while writing and was able to cheat
SQL server.
If you follows the steps in this order you can create a sp.
The order is 1, manually add the column ,2,3
Then what happened was the table was deleted, recreated
( Then I expected the SP to add ther column at run time and
then do the update )-- BUT IT FAILED!
So SSMS is right!
Extra Credit Question π Can you add a column at run time
and then do an update inside the SP ?
Answer: Use dynamic SQL ( I am hoping that is the only way )
------------
create table a ( name varchar(100) ) ;---1
go
create procedure abc---2
as
Begin
IF Not Exists( Select name FROM SYS.COLUMNS where OBJECT_ID = OBJECT_ID('a') AND name='DUPLICATE' )----4
Begin
ALTER TABLE a
ADD DUPLICATE CHAR(1);
End;
UPDATE C
SET C.DUPLICATE = '1'
FROM
a c;
End
go
EXEC abc;---3
Select * FROM a;
drop table a;
drop procedure abc;
--1,4,2,3 - creates the bug
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply