May 7, 2008 at 4:56 am
Issue:
------
I am working on a health care system and currently facing problem with NULL values being introduced as part of my SQL query.
What is the query trying to do?
The query is trying to retrieve ids from summary table for a practice where the patient are dead and find out if there are same patients in other systems like clinics or hospitals where this patient might still be showing as alive. This is basically a data cleansing query.
Query:
------
DECLARE @REGION char(1)
SET @REGION = 'A'
DECLARE @PRAC_ID varchar(5)
SET @PRAC_ID = '11111'
SELECT Max(D.DATAAGE) DATAAGE,
ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
D.DataSource_ID,
DS.DataSource_Description,
DS.Primary_Care_System ,
@REGION as Region
FROM Demography D,
Demography_Summary ds,
DataSource_IDs DS
WHERE ds.LINK_ID = D.LINK_ID
AND ds.LINK_ID IN
(
SELECT LINK_ID
FROM Demography_Summary
WHERE DEAD = 1
AND PRAC_ID = @PRAC_ID
AND Status < 2
)
AND D.DataSource_ID = DS.DataSource_ID
AND (DS.Primary_Care_System = 1 OR DS.Secondary_Care_Clinic = 1)
AND D.DataSource_ID <> 36
AND D.DEAD = 0
AND D.Status < 2
GROUP BY ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
D.DataSource_ID,
DS.DataSource_Description,
DS.Primary_Care_System
ORDER BY ds.Surname, ds.FName1
Result shown:
2007-02-26 00:00:00.0002012956481402203330TestingTest2222250clinic1, Primary Care1T
2008-05-06 15:53:40.0772012956481402203330TestingTestNULL50clinic1, Primary Care1T
Now I don't want the second row that is showing a NULL value for Practice ID (as they are not a valid value for practice id ) and only the first row needs to be shown.
I have tried using D.PRAC_ID is NOT NULL but it doesnt seem to work. Can anyone help with this please.
Thanks in advance,
Sree
May 7, 2008 at 6:15 am
Not knowing your structure I can't rewrite the query myself, but I suspect the issue is you've got a join that's preventing you from eleminating the NULL values. Instead of using old style syntax you should rewrite the query using ANSI standard joins
SELECT...
FROM Demography D
INNER JOIN Demography_Summary ds,
ON D.DataSource_ID = DS.DataSource_ID
...
FROM...
Also, the IN query can be changed to a JOIN as well and will perform much better for it.
If you really want detailed answers though you'll need to publish your structure as well.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 7, 2008 at 6:21 am
Hi Grant,
I had rewritten the query to resolve this issue but also to improve performance as below:
DECLARE @REGION char(1)
SET @REGION = 'T'
DECLARE @PRAC_ID varchar(5)
SET @PRAC_ID = '11060'
SELECT Max(D.DATAAGE) DATAAGE,
ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System ,
@REGION as Region
FROM Demography_Summary ds
INNER JOIN Demography D
ON (ds.LINK_ID = D.LINK_ID)
INNER JOIN DataSource_IDs DI
ON D.DataSource_ID = DI.DataSource_ID
RIGHT OUTER JOIN (SELECT LINK_ID
FROM Demography_Summary
WHERE DEAD = 1
AND PRAC_ID = '11060'
AND Status < 2 ) as dead
ON ds.LINK_ID = dead.LINK_ID
WHERE (DI.Primary_Care_System = 1 OR DI.Secondary_Care_Clinic = 1)
AND D.DataSource_ID <> 36
AND D.DEAD = 0
AND D.Status < 2
AND ds.LINK_ID NOT IN (Select LINK_ID from Demography_Summary where PRAC_ID IS NULL)
GROUP BY ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System
But this is still giving me the same two results. I believe the issue is due to the fact that LINK_ID with actual PRAC_ID and NULL PRAC_ID are same 🙁
The table structure as such are very large but if it will help, I can post the table structure with the column that we need and the relationship between the tables.
I am also trying to re-write the whole querying to eliminate any NULL values, hope something works 😉
Thanks for you help,
Sree
May 7, 2008 at 6:28 am
Much better query. But this part:
AND ds.LINK_ID NOT IN (Select LINK_ID from Demography_Summary where PRAC_ID IS NULL)
Again could be written as an OUTER JOIN. That's possibly your issue right there. With this better proc I don't think you necessarily need to post the structure.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 7, 2008 at 6:33 am
[font="Verdana"]Sree, as you stated earlier, your query is returning the Null though you have used Is Null in your query. Can you try and compare with Euqal To (=) / Not Equal To (<>) and turning the ANSI_NULLS setting to ON?
by the way, Grant will it decrease the performance?
Mahesh[/font]
MH-09-AM-8694
May 7, 2008 at 6:44 am
Hi Grant,
I used an RIGHT outer join as I wanted to retain all values from the right table but it doesnt seem to work, its returning the same two results
DECLARE @REGION char(1)
SET @REGION = 'T'
DECLARE @PRAC_ID varchar(5)
SET @PRAC_ID = '11060'
SELECT Max(D.DATAAGE) DATAAGE,
ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System ,
@REGION as Region
FROM Demography_Summary ds
INNER JOIN Demography D
ON (ds.LINK_ID = D.LINK_ID)
INNER JOIN DataSource_IDs DI
ON D.DataSource_ID = DI.DataSource_ID
RIGHT OUTER JOIN ( SELECT LINK_ID
FROM Demography_Summary
WHERE DEAD = 1
AND PRAC_ID = '11060'
AND Status < 2 ) as dead
ON ds.LINK_ID = dead.LINK_ID
RIGHT OUTER JOIN (Select LINK_ID
from Demography_Summary
where PRAC_ID IS NOT NULL) as invalid
ON ds.LINK_ID = invalid.LINK_ID
WHERE (DI.Primary_Care_System = 1 OR DI.Secondary_Care_Clinic = 1)
AND D.DataSource_ID <> 36
AND D.DEAD = 0
AND D.Status < 2
-- AND ds.LINK_ID NOT IN (Select LINK_ID from Demography_Summary where PRAC_ID IS NULL)
GROUP BY ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System
Hi Mahesh,
The ANSI_NULLS setting is already on.
Thanks to both of you, I am still trying 😀
Sree
May 7, 2008 at 7:10 am
Sreemati,
I rewrote your query using a LEFT OUTER JOIN because I understand them better. The order is get the deceased patients for your practice and then go out and find matches where the Patient is not marked as deceased. Without knowing all your table structures and having some sample data this is the best I can do:
[font="Courier New"]DECLARE @REGION CHAR(1)
SET @REGION = 'T'
DECLARE @PRAC_ID VARCHAR(5)
SET @PRAC_ID = '11060'
SELECT
MAX(D.DATAAGE) DATAAGE,
DS1.LINK_ID,
D.PAT_ID,
DS1.FName1,
DS1.Surname,
D.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System ,
@REGION AS Region
FROM
Demography_Summary DS1 INNER JOIN -- this practice deceased patients
Demography D ON
DS1.Link_ID = D.Link_ID INNER JOIN
DataSource_IDs DI ON
D.DataSource_ID = DI.DataSource_ID LEFT JOIN
Demography_Summary DS2 ON -- other practices with the same patients
DS1.Link_ID = DS2.Link_ID AND
DS1.Prac_Id <> DS2.Prac_Id AND-- eliminates the same practice
DS1.DEAD <> DS2.DEAD -- eliminates records where the patient is dead
WHERE
-- get the patients for this practice who are deceased and status < 2
DS1.DEAD = 1 AND
DS1.PRAC_ID = '11060' AND
DS1.Status < 2 AND
-- eliminate nulls from other practices/no matching records from other practices
DS2.LinkID IS NOT NULL
GROUP BY
DS1.LINK_ID,
D.PAT_ID,
DS1.FName1,
DS1.Surname,
D.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System [/font]
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 7, 2008 at 8:00 am
Hi All,
It worked, I mean the query had been right since the beginning, it was the data. It was not an actual NULL but a TEXT value NULL and hence even if I used IS NOT NULL, IS NULL or = NULL or <> NULL, it never returned anything.
Sometimes the most obvious things are cause for hours of foundling for no reasons, anyways, hope next time I look for most obvious things first 😉
What I did? Ctrl 0 to turn the value to an actual NULL in the value field.
DECLARE @REGION char(1)
SET @REGION = 'T'
DECLARE @PRAC_ID varchar(5)
SET @PRAC_ID = '11060'
SELECT MAX(D.DATAAGE) DATAAGE,
ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
ds.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System ,
@REGION as Region
--INTO #temp_value1
FROM Demography_Summary ds
INNER JOIN Demography D
ON (ds.LINK_ID = D.LINK_ID)
INNER JOIN DataSource_IDs DI
ON D.DataSource_ID = DI.DataSource_ID
RIGHT OUTER JOIN (SELECT LINK_ID
FROM Demography_Summary
WHERE DEAD = 1
AND PRAC_ID = '11060'
AND Status < 2 ) as dead
ON ds.LINK_ID = dead.LINK_ID
WHERE (DI.Primary_Care_System = 1 OR DI.Secondary_Care_Clinic = 1)
AND D.DataSource_ID <> 36
AND D.DEAD = 0
AND D.Status < 2
AND D.PRAC_ID IS NOT NULL
GROUP BY ds.LINK_ID,
D.PAT_ID,
ds.FName1,
ds.Surname,
D.PRAC_ID,
ds.PRAC_ID,
D.DataSource_ID,
DI.DataSource_Description,
DI.Primary_Care_System
Thanks for all your help and esp grant for his patience.
Cheers,
Sree
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply