March 1, 2010 at 10:48 am
agustingarzon (3/1/2010)
I double posted by mistake, sorry. Here are all the plans in the attached zip.
I am having the same problem as Paul on the sqlplan attachments. The zip worked though. Thanks,...
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 1, 2010 at 10:55 am
Please...try this:
WITH Numbered AS
(
    -- Number the rows in ID order so the paging predicate can be applied
    SELECT CF.ID,
           rn = ROW_NUMBER() OVER (ORDER BY CF.ID ASC)
    FROM dbo.ContactForms CF
),
Keys AS
(
    -- Just the twenty key values for the requested page
    SELECT TOP (20)
           N.ID
    FROM Numbered N
    WHERE N.rn >= 30000
    ORDER BY N.rn
)
SELECT data.*
FROM Keys K
INNER LOOP JOIN dbo.ContactForms data  -- LOOP hint: twenty seeks instead of a scan
    ON data.ID = K.ID
OPTION (FORCE ORDER);  -- keep the twenty keys as the outer (driving) input
Paul
edit: code clarity/layout
March 1, 2010 at 10:57 am
The text file in the ZIP is empty 🙁
The problem in your execution plan was the merge join. A poor choice by the optimizer.
Paul
March 1, 2010 at 11:00 am
Try it like this:
WITH
cteNumber AS
(
    -- Number the rows in ID order
    SELECT ID, ROW_NUMBER() OVER (ORDER BY ID ASC) AS IXX
    FROM dbo.ContactForms
),
ctetop20 AS
(
    -- Twenty keys for the requested page
    SELECT TOP 20 num.*
    FROM cteNumber num
    WHERE num.IXX > 15000
)
SELECT data.*
FROM dbo.ContactForms data
INNER JOIN ctetop20 num
    ON data.ID = num.ID;
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 1, 2010 at 11:04 am
Paul White (3/1/2010)
The text file in the ZIP is empty 🙁
The problem in your execution plan was the merge join. A poor choice by the optimizer.
Paul
Hmm, I thought the problem was that it was doing the larger index first, when it should do the smaller one first. But I guess that doesn't mean as much in a merge join, so ... hmm ...
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 1, 2010 at 11:05 am
Still a risk of the optimizer choosing a merge join with that one, Barry.
March 1, 2010 at 11:11 am
RBarryYoung (3/1/2010)
Hmm, I thought the problem was that it was doing the larger index first, when it should do the smaller one first. But I guess that doesn't mean as much in a merge join, so ... hmm ...
It means nothing with a one-to-one merge join, given pre-sorted inputs, as we have here.
The optimizer is seriously under-estimating the cost of scanning that clustered index.
Loop join is optimal with small inputs (not necessarily small tables).
The twenty rows coming from my non-clustered index need to be the outer (driving) input.
Twenty seeks into the clustered index shouldn't take too long, even with huge LOBs off-row.
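For what it's worth, a lighter-touch variant of the same idea (a sketch, not the posted solution, reusing the CTEs from the 10:55 query above): OPTION (LOOP JOIN) constrains only the join algorithm and leaves the join order up to the optimizer, so it is less of a guarantee than FORCE ORDER.

WITH Numbered AS
(
    SELECT CF.ID,
           rn = ROW_NUMBER() OVER (ORDER BY CF.ID ASC)
    FROM dbo.ContactForms CF
),
Keys AS
(
    SELECT TOP (20) N.ID
    FROM Numbered N
    WHERE N.rn >= 30000
    ORDER BY N.rn
)
SELECT data.*
FROM Keys K
JOIN dbo.ContactForms data
    ON data.ID = K.ID
OPTION (LOOP JOIN);  -- nested loops required, but join order left free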
Paul
March 1, 2010 at 11:17 am
Paul White (3/1/2010)
Still a risk of the optimizer choosing a merge join with that one, Barry.
True, but I prefer to try to get the optimizer to "do the right thing" without using hints to force it, if I can. That way, it still has a chance of adapting to future changes in statistical counts and distributions.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 1, 2010 at 11:21 am
RBarryYoung (3/1/2010)
Paul White (3/1/2010)
Still a risk of the optimizer choosing a merge join with that one, Barry.
True, but I prefer to try to get the optimizer to "do the right thing" without using hints to force it, if I can. That way, it still has a chance of adapting to future changes in statistical counts and distributions.
Indeed. In general I agree. My concern is that this thread needs a 'quick win' that is guaranteed. We can tweak later. We already know the optimizer has good statistics on the columns concerned (the indexes were rebuilt), so this is just a case of it getting things wrong. Maybe it won't choose a merge join for your query. This time. 😀
Paul
March 1, 2010 at 11:22 am
Paul White (3/1/2010)
...The optimizer is seriously under-estimating the cost of scanning that clustered index.
...
Yes, that's the big question here: why is it doing that? Its RowCount estimate is 66 for a Clustered Index Scan with no predicate, on a table with 15,020 rows. Something is not right there. It sounds like bad statistics, but I don't see the little warning icon you usually get for that.
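If bad statistics are the suspect, one quick sanity check (a sketch; the statistics name below is a guess, so substitute the real clustered index name):

-- Inspect the histogram the optimizer is using for the clustered index
-- (PK_ContactForms is hypothetical - use the actual index/statistics name)
DBCC SHOW_STATISTICS ('dbo.ContactForms', 'PK_ContactForms');

-- Or rule bad statistics out entirely by refreshing them with a full scan
UPDATE STATISTICS dbo.ContactForms WITH FULLSCAN;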
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 1, 2010 at 11:24 am
Paul White (3/1/2010)
RBarryYoung (3/1/2010)
Paul White (3/1/2010)
Still a risk of the optimizer choosing a merge join with that one, Barry.True, but I prefer to try to get the optimizer to "do the right thing" with using hints to force it if I can. That way, it still has a chance of adapting to future changes in statistical counts and distributions.
Indeed. In general I agree. My concern is that this thread needs a 'quick win' that is guaranteed. We can tweak later. ...
Paul
Oops, I must have missed that. Sorry.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 1, 2010 at 11:29 am
RBarryYoung (3/1/2010)
Oops, I must have missed that. Sorry.
That's just my view - I may be wrong about it. Never any need to apologise.
It's 7:30am and I need to go very shortly.
I look forward to catching up on this later.
I'll leave it in your capable hands.
March 1, 2010 at 11:41 am
Last point: I have just looked at the 'what-about' execution plan. Here (with 500 set as the limit, and no clustered index) the optimizer chooses a much better plan. I would expect it to be very much faster than the others. The clustered index was dropped for this one, so no merge join was possible (there's no covering non-clustered index for the final SELECT). Re-instating the clustered index and using my last posted query should be just as fast, if not slightly faster (fetching row data via the clustered index should be a bit quicker than the RID lookups). I doubt there would be much in it for twenty rows, though.
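A quick way to compare the variants empirically (a sketch; run each candidate query between these statements and compare logical reads and elapsed time in the Messages tab):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- ...run a candidate query here...

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;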
Paul
March 1, 2010 at 11:52 am
Dear Barry, the exec plan for your suggestion is the same as the NewExec one I just posted.
The approach is OK, but it's doing a clustered index scan through the first 15,000 records in your example.
But if I add an ORDER BY num.IXX ASC as Paul suggested, the exec plan swaps the large scan for a seek operation that gets exactly the 20 required rows.
I haven't run the test to see if it gets any faster; I will in a couple of hours and post back.
Paul, if nvarchar(max) columns are stored off-row as you said and as explained in the documentation, why would the clustered index behave this way? The 3 nvarchar(255) fields add up to 0.4 KB in the largest row. I can't find a way to read the table's "large value types out of row" setting; I know of sp_tableoption, but that only sets the value. Perhaps it's storing the values in-row, but that's not the default.
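One way to read the current setting back (a sketch; sys.tables exposes it in SQL Server 2005 and later, where 0 means values are kept in-row when they fit, which is the default):

-- 0 = large value types stored in-row when they fit; 1 = always off-row
SELECT name, large_value_types_out_of_row
FROM sys.tables
WHERE name = N'ContactForms';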
March 1, 2010 at 12:06 pm
agustingarzon (3/1/2010)
Dear Barry, the exec plan for your suggestion is the same as the NewExec one I just posted.
The approach is OK, but it's doing a clustered index scan through the first 15,000 records in your example.
But if I add an ORDER BY num.IXX ASC as Paul suggested, the exec plan swaps the large scan for a seek operation that gets exactly the 20 required rows.
I haven't run the test to see if it gets any faster; I will in a couple of hours and post back.
Paul, if nvarchar(max) columns are stored off-row as you said and as explained in the documentation, why would the clustered index behave this way? The 3 nvarchar(255) fields add up to 0.4 KB in the largest row. I can't find a way to read the table's "large value types out of row" setting; I know of sp_tableoption, but that only sets the value. Perhaps it's storing the values in-row, but that's not the default.
Just to clarify, are you saying that the ORDER BY fixes it?
Hmm, maybe that's all it is then? TOP is really supposed to have an ORDER BY, but it isn't enforced syntactically. So maybe the optimizer just gets dumb sometimes without it?
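For reference, a sketch of the earlier query with the ORDER BY that agustingarzon describes added (legal in the CTE because TOP is present):

WITH
cteNumber AS
(
    SELECT ID, ROW_NUMBER() OVER (ORDER BY ID ASC) AS IXX
    FROM dbo.ContactForms
),
ctetop20 AS
(
    SELECT TOP 20 num.*
    FROM cteNumber num
    WHERE num.IXX > 15000
    ORDER BY num.IXX ASC  -- the addition reported to turn the scan into a seek
)
SELECT data.*
FROM dbo.ContactForms data
INNER JOIN ctetop20 num
    ON data.ID = num.ID;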
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]