select random from the records with last date

  • hi all, i want to make faster e more elegant a procedure, if possible transorm into a single query.
    this the specifications:

    the procedure extract n random records betweeen those with greater date.
    if the number of docs  with  greater date is lower then n, the procedure will consider the docs with previous date too.

    this is a simplified version of the proc to focus the logic:

    set nocount on
    declare @x int=4 -- number of docs to return

    declare @test-2 table(docId int, docDate datetime)
    declare @validDates table(docDate datetime)

    insert into @test-2(docId, docDate)
    values (1, '20180724')
    ,(2, '20180724')
    ,(3, '20180720')
    ,(4, '20180719')
    ,(5, '20180719')
    ,(6, '20180719')
    ,(7, '20180719')
    ,(8, '20180710')
    ,(9, '20180710')
    ,(10, '20180710')

    declare @docDate datetime
    declare @docCount int = 0
    declare @totCount int = 0
    declare @stopLoop int = 0
    DECLARE c CURSOR  FOR
        select docdate, COUNT(*) docCount from @test-2
        group by docDate
        order by 1 desc
    OPEN c
    FETCH NEXT FROM c INTO @docDate, @docCount

    WHILE @@FETCH_STATUS = 0 and @stopLoop = 0
    BEGIN
        
        print '@docDate=' + cast(@docDate as varchar) + '@docCount=' + cast(@docCount as varchar)
        insert into @validDates(docDate) values(@docDate)

        set @totCount = @totCount + @docCount

        if @totCount>=@x begin
            set @stopLoop=1
        end
        
        FETCH NEXT FROM c INTO @docDate, @docCount
    END 

    CLOSE c;
    DEALLOCATE c;

    set rowcount @x

    select t.*, NEWID() ordine from @test-2 t inner join @validDates d on t.docDate = d.docDate
    order by ordine

  • DECLARE @CutOffDate DATE = '20180719'

    SELECT TOP(@x) *

    FROM @test-2

    WHERE docDate <= @CutOffDate

    ORDER BY docDate DESC, NEWID()

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The main reason that your procedure is slow is that you are using a CURSOR.  SQL is optimized for set-based operations, but the CURSOR forces it run for each member of the set separately rather than operating on the set as a whole.  Getting rid of the CURSOR will almost entirely be a far greater improvement than any other optimization that you can do.

    Here is another approach that is a bit more flexible than the TOP()/BOTTOM().  It was specifically designed for paging results, so it can be used to show results in the middle of the range rather than just the top/bottom.


    select docdate, docId
    from @test-2
    order by docDate desc, NEWID()
    OFFSET 0 ROWS
    FETCH NEXT @x ROWS ONLY

    The execution plans for top/bottom and offset/fetch are superficially identical.  I didn't dig into it to see if there were any buried differences.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply