July 3, 2017 at 10:18 pm
I have a small table Members (1000 rows) that contains member information.
I have a large view History (1bn rows) that contain member transaction history.
I need to find the latest transaction in History for each Member. I tried using Cross Apply, but am getting a time out error when running. I only have limited read rights on the server, so I can not see the view definition, nor can I run an execution plan.
My thought process below was to
1.) Choose a DATEOFSERVICE within the last 30 days to limit rows to work with
2.) Group from most common attribute to least
However, this still did not work and could not return results within the 6 minute timeout. Any ideas for a better way to approach this issue?
SELECT * FROM MEMBERS M
CROSS APPLY (SELECT TOP 1 DATEOFSERVICE,BATCHID, MEMBERID FROM vwHISTORY
GROUP BY MEMBERID,BATCHID, DATEOFSERVICE HAVING MEMBERID = M.MEMBERID AND DATEOFSERVICE >= '20170601'
ORDER BY DATEOFSERVICE DESC) A
July 4, 2017 at 1:30 am
SirWittles - Monday, July 3, 2017 10:18 PMI have a small table Members (1000 rows) that contains member information.
I have a large view History (1bn rows) that contain member transaction history.I need to find the latest transaction in History for each Member. I tried using Cross Apply, but am getting a time out error when running. I only have limited read rights on the server, so I can not see the view definition, nor can I run an execution plan.
My thought process below was to
1.) Choose a DATEOFSERVICE within the last 30 days to limit rows to work with
2.) Group from most common attribute to leastHowever, this still did not work and could not return results within the 6 minute timeout. Any ideas for a better way to approach this issue?
SELECT * FROM MEMBERS M
CROSS APPLY (SELECT TOP 1 DATEOFSERVICE,BATCHID, MEMBERID FROM vwHISTORY
GROUP BY MEMBERID,BATCHID, DATEOFSERVICE HAVING MEMBERID = M.MEMBERID AND DATEOFSERVICE >= '20170601'
ORDER BY DATEOFSERVICE DESC) A
I think you're overcomplicating it. If all you want is the most recent row (by DateOfService) from the history view, then something like this should do it:SELECT m.*, a.*
FROM MEMBERS m
CROSS APPLY (
SELECT TOP(1)
h.DATEOFSERVICE, h.BATCHID, h.MEMBERID
FROM vwHISTORY h
WHERE h.MEMBERID = M.MEMBERID
ORDER BY h.DATEOFSERVICE DESC
) a
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 4, 2017 at 1:55 am
Without having access to the views and underlying tables/index its impossible to give a good answer.
not knowing the data also doesn't help.
You will need to get your DBA's involved in this in order to determine the best way to do it.
But you could try the following.
-- Assumption - batchid is incremented sequentially
if object_id('tempdb..#t1') is not null
drop table #t1;
select vh.MEMBERID
, MAX(vh.DATEOFSERVICE) as DATEOFSERVICE
, MAX(vh.BATCHID) as BATCHID
into #t1
from vwHISTORY vh
where vh.DATEOFSERVICE >= '20170601' -- try without this to see if performance is different
group by vh.MEMBERID
select m.*
, t.DATEOFSERVICE
, t.BATCHID
from MEMBERS m
inner join #t1 t
on t.MEMBERID = m.MEMBERID
-- Assumption - batchid is NOT incremented sequentially
-- requires 2 steps
-- first one to get the highest date of service
-- second to get the highest batchid for the found date of service
if object_id('tempdb..#t1') is not null
drop table #t1;
select vh.MEMBERID
, MAX(vh.DATEOFSERVICE) as DATEOFSERVICE
, MAX(vh.BATCHID) as BATCHID
into #t1
from vwHISTORY vh
where vh.DATEOFSERVICE >= '20170601' -- try without this to see if performance is different
group by vh.MEMBERID
if object_id('tempdb..#t2') is not null
drop table #t2;
select vh.MEMBERID
, vh.DATEOFSERVICE
, max(vh.BATCHID) as BATCHID
into #t2
from vwHISTORY vh
inner join #t1 t
on t.MEMBERID = vh.MEMBERID
and t.DATEOFSERVICE = vh.DATEOFSERVICE
group by vh.MEMBERID
, vh.DATEOFSERVICE
select m.*
, t.DATEOFSERVICE
, t.BATCHID
from MEMBERS m
inner join #t2 t
on t.MEMBERID = m.MEMBERID
July 4, 2017 at 8:56 am
SirWittles - Monday, July 3, 2017 10:18 PMI have a small table Members (1000 rows) that contains member information.
I have a large view History (1bn rows) that contain member transaction history.I need to find the latest transaction in History for each Member. I tried using Cross Apply, but am getting a time out error when running. I only have limited read rights on the server, so I can not see the view definition, nor can I run an execution plan.
My thought process below was to
1.) Choose a DATEOFSERVICE within the last 30 days to limit rows to work with
2.) Group from most common attribute to leastHowever, this still did not work and could not return results within the 6 minute timeout. Any ideas for a better way to approach this issue?
SELECT * FROM MEMBERS M
CROSS APPLY (SELECT TOP 1 DATEOFSERVICE,BATCHID, MEMBERID FROM vwHISTORY
GROUP BY MEMBERID,BATCHID, DATEOFSERVICE HAVING MEMBERID = M.MEMBERID AND DATEOFSERVICE >= '20170601'
ORDER BY DATEOFSERVICE DESC) A
1) Based on naming convention, you almost certainly have a view in play. That could be G-d knows what.
2) It is ridiculous to think that you can tune a query without being able to see the query plan, even the estimated one.
3) You don't need a GROUP BY to find the most recent something. TOP 1 ... ORDER BY date DESC is all.
4) If you can see the definition of the objects, then we need the INDEXING on the history table to be able to help. If you can't see that, then this also falls under the "ridiculous" category mentioned in item 2.
5) Can you rewrite the query to just do item 3 for ONE of your members? Do this with SET STATISTICS IO ON. and if at all possible just directly access the history table - take the VIEW out of the equation. It is almost certainly not required for your stated need.
6) Barring other information, I am going to say that you are doing a TABLE SCAN on the history table for every member due to suboptimal indexing. Hopefully doing just ONE member will allow the query to complete and you can see the bajillion reads on the history table caused by the table scan.
7) WITH proper indexing the simplified version of this query should take no more than a few hundred ms for all 1000 members.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 4, 2017 at 9:23 am
TheSQLGuru - Tuesday, July 4, 2017 8:56 AMSirWittles - Monday, July 3, 2017 10:18 PMI have a small table Members (1000 rows) that contains member information.
I have a large view History (1bn rows) that contain member transaction history.I need to find the latest transaction in History for each Member. I tried using Cross Apply, but am getting a time out error when running. I only have limited read rights on the server, so I can not see the view definition, nor can I run an execution plan.
My thought process below was to
1.) Choose a DATEOFSERVICE within the last 30 days to limit rows to work with
2.) Group from most common attribute to leastHowever, this still did not work and could not return results within the 6 minute timeout. Any ideas for a better way to approach this issue?
SELECT * FROM MEMBERS M
CROSS APPLY (SELECT TOP 1 DATEOFSERVICE,BATCHID, MEMBERID FROM vwHISTORY
GROUP BY MEMBERID,BATCHID, DATEOFSERVICE HAVING MEMBERID = M.MEMBERID AND DATEOFSERVICE >= '20170601'
ORDER BY DATEOFSERVICE DESC) A1) Based on naming convention, you almost certainly have a view in play. That could be G-d knows what.
2) It is ridiculous to think that you can tune a query without being able to see the query plan, even the estimated one.
3) You don't need a GROUP BY to find the most recent something. TOP 1 ... ORDER BY date DESC is all.
4) If you can see the definition of the objects, then we need the INDEXING on the history table to be able to help. If you can't see that, then this also falls under the "ridiculous" category mentioned in item 2.
5) Can you rewrite the query to just do item 3 for ONE of your members? Do this with SET STATISTICS IO ON. and if at all possible just directly access the history table - take the VIEW out of the equation. It is almost certainly not required for your stated need.
6) Barring other information, I am going to say that you are doing a TABLE SCAN on the history table for every member due to suboptimal indexing. Hopefully doing just ONE member will allow the query to complete and you can see the bajillion reads on the history table caused by the table scan.
7) WITH proper indexing the simplified version of this query should take no more than a few hundred ms for all 1000 members.
Almost certainly a view in play indeed π
Cheers!
July 4, 2017 at 10:02 am
Almost certainly a view in play indeed
Cheers!
I travel a quite a lot for client visits and especially to present at conferences. So I tell you what Jacob: if those three fields can't be easily acquired from a single table or at worst a simple 2-table join, I will by you a dinner should we ever be in the same town at the same time. π
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 4, 2017 at 10:14 am
TheSQLGuru - Tuesday, July 4, 2017 10:02 AMAlmost certainly a view in play indeed
Cheers!
I travel a quite a lot for client visits and especially to present at conferences. So I tell you what Jacob: if those three fields can't be easily acquired from a single table or at worst a simple 2-table join, I will by you a dinner should we ever be in the same town at the same time. π
I don't find that particularly unlikely, so I'm hesitant to agree to this arrangement (unless I don't have to put up any stakes).
I was just pointing out that the OP had already explicitly said History was a view, so no fancy inferences were necessary π
On second thought, though, I have yet to meet any fellow Threadizens, so it might be worth putting up some stake just for that π
Cheers!
July 4, 2017 at 1:42 pm
Jacob Wilkins - Tuesday, July 4, 2017 10:14 AMTheSQLGuru - Tuesday, July 4, 2017 10:02 AMAlmost certainly a view in play indeed
Cheers!
I travel a quite a lot for client visits and especially to present at conferences. So I tell you what Jacob: if those three fields can't be easily acquired from a single table or at worst a simple 2-table join, I will by you a dinner should we ever be in the same town at the same time. π
I don't find that particularly unlikely, so I'm hesitant to agree to this arrangement (unless I don't have to put up any stakes).
I was just pointing out that the OP had already explicitly said History was a view, so no fancy inferences were necessary π
On second thought, though, I have yet to meet any fellow Threadizens, so it might be worth putting up some stake just for that π
Cheers!
It is kind of neat actually to meet someone you have come across on forums. It is actually one of my favorite things about conferences - seeing who I meet FtF while there.
You don't need to put up any stakes for this one. But if you would like to, how about you agree to ask your company to let me spend one hour doing any form of SQL Server performance review they feel they need ... at no cost. :w00t: No risk for you there (well, unless I drop a database or something - then they might be a bit peeved at you). π
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 5, 2017 at 11:51 pm
Thank you for all the input guys. I tried a little of everything to get this tuned, and agree that it is pointless to try to optimize if I can't see basic view and query plan information.
I believe another part of the performance issue is that the member identifier that I was joining and filtering on can contain leading zeros, so it is a varchar field instead of an int.
In any event, I am happy to report I located a much smaller table that was able to give me the same info needed, and the query now processes in seconds as expected.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply