estimated # of rows when modification count > 0

  • hi,

    I've got a table with a range of different integer values in 2 different columns and stats have been calculated on the data. Then, I've got modifications to the data like inserting a few rows into the table (not enough to trigger auto stats). I'm observing the following behavior.

    1. when querying on values that are outside the histogram values, seems like estimated rows will be equal to # of modifications to the table. Could anyone confirm that this is expected behavior?
    2. when querying on values within the histogram range, estimated # rows appears to be based on some calculation. But, would anyone please explain what the calculation is?

    Here's a sample script.

    --create table for testing purposes
    create table testtab
    (col1 int, col2 int)

    --insert 100k rows using value of 1
    DECLARE @count1 INT;
    SET @count1 = 1;
    WHILE @count1 <= 100000
    BEGIN
    INSERT INTO testtab VALUES(1, @count1)
    SET @count1 = @count1 + 1;
    END;

    --insert 10 rows with value of 2
    DECLARE @count2 INT;
    SET @count2 = 1;
    WHILE @count2 <= 10
    BEGIN
    INSERT INTO testtab VALUES(2, @count2)
    SET @count2 = @count2 + 1;
    END;

    --insert 5 rows with a value of 3
    DECLARE @count3 INT;
    SET @count3 = 1;
    WHILE @count3 <= 5
    BEGIN
    INSERT INTO testtab VALUES(3, @count3)
    SET @count3 = @count3 + 1;
    END;

    --insert 50k rows with a value of 4
    DECLARE @count4 INT;
    SET @count4 = 1;
    WHILE @count4 <= 50000
    BEGIN
    INSERT INTO testtab VALUES(4, @count4)
    SET @count4 = @count4 + 1;
    END;

    --execute query with filter condition to generate stats. Also, use optimization level = FULL
    select col2 from testtab where col1 = 1 OPTION (QUERYTRACEON 8757)

    --insert some more data, but not enought to trigger auto stats
    DECLARE @count5 INT;
    SET @count5 = 1;
    WHILE @count5 <= 13
    BEGIN
    INSERT INTO testtab VALUES(5, @count5)
    SET @count5 = @count5 + 1;
    END;

    --now, run query with predicate where value is outside of range of values.
    --estimated # of rows shows 13!
    select col2 from testtab where col1 = 6 OPTION (QUERYTRACEON 8757)

    --clear plan cache
    DBCC FREEPROCCACHE

    --go back and requery for col1 = 1 and estimated # of rows shows 100009!
    --how did optimizer estimated the 100009?
    select col2 from testtab where col1 = 1 OPTION (QUERYTRACEON 8757)

    Thanks!

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • With all the automatic query tuning stuff they did in 2022, who knows how the optimizer estimates things now?

    It'll be interesting if anyone shows up that actually knows.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply