Cardinality Estimation

Question

Post reply

Cardinality Estimation

manujaidka

SSC Veteran

Points: 269
More actions
July 15, 2014 at 11:14 pm

#291156

May I know how query engine estimated 68412.4 rows when we execute below query on sql2012 adventureworks database?
SELECT
*
FROM Production.TransactionHistory AS TH
WHERE
TransactionID BETWEEN 100000 AND 168412
I understand that all rows in this table are distinct and 1/113443 gives 0.00000881500 as all density value but, I am unable to guess how 68412.4 has been derived.
Many Thanks,
Manu

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004582 More actions · Answer 1

I don't know for sure but when was the last time you rebuilt stats and are you using just sampled stats or full stats?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

manujaidka SSC Veteran Points: 269 More actions · Answer 2

Jeff,

Stats were updated today itself with full scan. In fact, I downloaded this database today morning only.

Question strike me while reading http://sqlperformance.com/2014/01/sql-plan/cardinality-estimation-for-multiple-predicates

Thanks,

MJ

@SQLFRNDZ SSCrazy Eights Points: 9095 More actions · Answer 3

No of estimated rows are derived from last updated statistics. as long as your stats are updated the estimated number of rows will match to the actual number of rows.

[font="Tahoma"]
--SQLFRNDZ[/url]
[/font]

manujaidka SSC Veteran Points: 269 More actions · Answer 4

I am happy with the results provided by sql engine but, my question is how exactly this figure (estimated rows 68412.4) is derived.

Thanks,

Manu Jaidka

Grant Fritchey SSC Guru Points: 399016 More actions · Answer 5

What about the histogram in the statistics. How many rows would fall within the range you specified within the histogram?

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Steve Jones - SSC Editor SSC Guru Points: 737751 More actions · Answer 6

This shows some of the calculations used in the new and old estimators. The 2012 version is the old one:

http://blogs.technet.com/b/dataplatforminsider/archive/2014/03/17/the-new-and-improved-cardinality-estimator-in-sql-server-2014.aspx

It's not a complex calculation, but it requires a bit of work for you to gather information and determine what values are being used.

This white paper also walks through some of the calculations involved: http://technet.microsoft.com/en-us/library/dd535534(v=SQL.100).aspx

manujaidka SSC Veteran Points: 269 More actions · Answer 7

Grant,

There are 113443 rows in this table and all rows are distinct rows but, I ma confused why optimizer estimated 68412.4 rows than 68412??

Thank you..

Grant Fritchey SSC Guru Points: 399016 More actions · Answer 8

manujaidka (7/16/2014)
Grant,
There are 113443 rows in this table and all rows are distinct rows but, I ma confused why optimizer estimated 68412.4 rows than 68412??
Thank you..

Don't just look at the uniqueness of the data. Look at the distribution of the data within the histogram. You might see that there are two rows showing an estimated value of 1, or you might see 200 rows with varying estimates. But the cardinality estimation doesn't just run off the density. it also uses the histogram and if there are 200 rows in the histogram with varying estimates, it can help you understand what's going on.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

manujaidka SSC Veteran Points: 269 More actions · Answer 9

Thanks everyone for your prompt replies:-)

I was mistakenly taking whole column density (0.000008815) into consideration rather than the predicate one (0.603061).

Predicate--> WHERE TH.TransactionID BETWEEN 100000 AND 168412 --> Selectivity is 68413 / 113443 = 0.603061

Optimizer got 68412.4 by multiplying number of rows in table with selectivity--> 113442*0.603061

Thanks again to you guys!!

Cardinality Estimation

Cookies on SQLServerCentral