July 24, 2018 at 7:07 am
hi all, i want to make faster e more elegant a procedure, if possible transorm into a single query.
this the specifications:
the procedure extract n random records betweeen those with greater date.
if the number of docs with greater date is lower then n, the procedure will consider the docs with previous date too.
this is a simplified version of the proc to focus the logic:
set nocount on
declare @x int=4 -- number of docs to return
declare @test-2 table(docId int, docDate datetime)
declare @validDates table(docDate datetime)
insert into @test-2(docId, docDate)
values (1, '20180724')
,(2, '20180724')
,(3, '20180720')
,(4, '20180719')
,(5, '20180719')
,(6, '20180719')
,(7, '20180719')
,(8, '20180710')
,(9, '20180710')
,(10, '20180710')
declare @docDate datetime
declare @docCount int = 0
declare @totCount int = 0
declare @stopLoop int = 0
DECLARE c CURSOR FOR
select docdate, COUNT(*) docCount from @test-2
group by docDate
order by 1 desc
OPEN c
FETCH NEXT FROM c INTO @docDate, @docCount
WHILE @@FETCH_STATUS = 0 and @stopLoop = 0
BEGIN
print '@docDate=' + cast(@docDate as varchar) + '@docCount=' + cast(@docCount as varchar)
insert into @validDates(docDate) values(@docDate)
set @totCount = @totCount + @docCount
if @totCount>=@x begin
set @stopLoop=1
end
FETCH NEXT FROM c INTO @docDate, @docCount
END
CLOSE c;
DEALLOCATE c;
set rowcount @x
select t.*, NEWID() ordine from @test-2 t inner join @validDates d on t.docDate = d.docDate
order by ordine
July 24, 2018 at 7:39 am
DECLARE @CutOffDate DATE = '20180719'
SELECT TOP(@x) *
FROM @test-2
WHERE docDate <= @CutOffDate
ORDER BY docDate DESC, NEWID()
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 24, 2018 at 8:00 am
The main reason that your procedure is slow is that you are using a CURSOR. SQL is optimized for set-based operations, but the CURSOR forces it run for each member of the set separately rather than operating on the set as a whole. Getting rid of the CURSOR will almost entirely be a far greater improvement than any other optimization that you can do.
Here is another approach that is a bit more flexible than the TOP()/BOTTOM(). It was specifically designed for paging results, so it can be used to show results in the middle of the range rather than just the top/bottom.
select docdate, docId
from @test-2
order by docDate desc, NEWID()
OFFSET 0 ROWS
FETCH NEXT @x ROWS ONLY
The execution plans for top/bottom and offset/fetch are superficially identical. I didn't dig into it to see if there were any buried differences.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply