March 19, 2017 at 9:37 pm
These table contain many more columns in reality but I am simplifying to narrow to the issue.
Table: tblPerson
PersonID int PK
FirstName VARCHAR(100)
LastName VARCHAR(100)
Table: tblPersonPhoneNumber
PersonPhoneNumberID int PK
PersonID int
PhoneNumber VARCHAR(20)
IsPrimary BIT
PhoneNumberType int (1-Home, 2-Business, 3-Cell)
LastUpdate DATETIME
The report requirement is for me to pull the Primary Phone Number for each Person record JOINED by PersonID. That should have been simple BUT we have a data issue where
for any given PersonID in the tblPersonPhoneNumber table, there can be many PhoneNumber records and PhoneNumberTypes. There SHOULD be only one record with the IsPrimary bit set
to true. But the data issue is that we have more than one record with IsPrimary set to true. (ie. a Cell Phone and a Home record both set to true). So a new requirement was made to get around this and
that is to chose the IsPrimary record that has the more recent LastUpdate date only. I have tried a few things below but the return results are taking too long. Best time was 35 seconds and worst was like 3 minutes from all the things I tried just to return one Person with Primary Phone. One problem is the tblPersonPhoneNumber contains over 5 million records.
Here is what I have tried:
CREATE TABLE #PersonPhoneNumberMaxLastUpdate
(
PersonID INT PRIMARY KEY NONCLUSTERED,
PhoneNumber VARCHAR(20),
LastUpdate DateTime
)
CREATE INDEX IX_PPNMAX_LastUpdate ON #PersonPhoneNumberMaxLastUpdate(LastUpdate)
In the above code it was taking too long to load the temp table. .
I have also tried embedding grouping within the join instead of loading a temp table below:
SELECT
p.FirstName,
p.Lastname,
ppn.PhoneNumber
FROM tblPerson p
LEFT JOIN
(
SELECT a.PersonID,a.PhoneNumber FROM tblPersonPhoneNumber a
INNER JOIN (SELECT PersonID, Max( LastUpdate ) [LastUpdate]
FROM tblPersonPhoneNumber WHERE IsPrimary = 1
GROUP BY PersonID ) b ON a.PersonID = b.PersonID AND a.LastUpdate = b.LastUpdate
WHERE a.IsPrimary = 1
) ppn ON ppn.PersonID = p.PersonID
WHERE p.PersonID = 1
and finally I tried putting sub select code in a View like this and joining to the view
CREATE VIEW vwPersonPhoneNumberMaxLastUpdate
AS
SELECT a.PersonID,a.PhoneNumber FROM tblPersonPhoneNumber a
INNER JOIN (SELECT PersonID, Max( LastUpdate ) [LastUpdate]
FROM tblPersonPhoneNumber WHERE IsPrimary = 1
GROUP BY PersonID ) b ON a.PersonID = b.PersonID AND a.LastUpdate = b.LastUpdate
WHERE a.IsPrimary = 1
None of the above options worked in getting the query time down. I would like to get the time down to no more
than 5 seconds. Fixing the data issue isn't an option right now. So I have to come up with a way to get just one
of the primary records for any given PersonID. Any help would be much appreciated in coming up with a query
that can accomplish what I want and not have a big performance hit due to the 5 million records.
March 20, 2017 at 12:33 am
You could try to get the Primary number first ...
WITH ctePhoneNumber AS (
SELECT
pn.PersonID,
pn.PhoneNumber,
rn = ROW_NUMBER() OVER (PARTITION BY BY pn.PersonID ORDER BY pn.LastUpdate DESC)
FROM tblPersonPhoneNumber AS pn
-- WHERE IsPrimary = 1
)
SELECT
p.PersonID,
p.FirstName,
p.Lastname,
ppn.PhoneNumber
FROM tblPerson p
LEFT JOIN ctePhoneNumber AS ppn
ON p.PersonID = ppn.PersonID
WHERE ppn.rn = 1
OR ppn.rn IS NULL
March 20, 2017 at 1:05 am
Thank you. I will give this a try and let you know the results.
March 20, 2017 at 4:56 am
-- Here's the (fairly standard) APPLY version:
SELECT
p.PersonID,
p.FirstName,
p.Lastname,
x.PhoneNumber
FROM tblPerson p
OUTER APPLY (
SELECT TOP(1)
pn.PhoneNumber
FROM tblPersonPhoneNumber pn
WHERE pn.PersonID = p.PersonID
AND pn.IsPrimary = 1
ORDER BY pn.LastUpdate DESC
) x
-- you'll want an appropriate index for this to work well:
-- CREATE INDEX ix_Whatever ON tblPersonPhoneNumber (PersonID,LastUpdate) INCLUDE (PhoneNumber)
If it takes more than a second, post up an actual execution plan as a .sqlplan attachment
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply