Tuning help

Question

Post reply

Tuning help

Analyst

Ten Centuries

Points: 1063
More actions
June 5, 2008 at 3:01 am

#188512

Hi,
I have a table with appr. 10 million records.(history data)
While trying to retrieve records with a max query it takes around 4 minutes (appr. 20 lakh records)
I have included indexes ; clustered index on the data but so far no partioning has been done
Please let me know Will retieving data be faster if partioning is done?
And also steps for the same ...
Thanks!

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

ChiragNS One Orange Chip Points: 26137 More actions · Answer 1

Hi

Is your query using the clustered index?

Does your query have a where clause? If it does and you partition your data according to the condition in the where clause AND your query does not need to access more than 1 partition then i think partition might be helpfull. Whew:D

Other opinions most welcome..

"Keep Trying"

Analyst Ten Centuries Points: 1063 More actions · Answer 2

Yes teh indexes are used in the where clause, a glimpse of the query. Highlighted are the indexes used:

Createddatetime uses clustered index and sid uses non clustered

SELECT * FROM Ex A WHERE

( A.CreatedDateTime in (select max(CS.CreatedDateTime) from Ex CS where CS.conId = A.ConId group by Conid))

AND ( A.SId in (Select Max(CS.SId) from EX CS where CS.ConId = A.ConId and CS.CreatedDateTime = A.CreatedDateTime group by ConId))

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 3

There's "used" and there's "used"

Do the indexes get scanned or is there a seek operation against them? If scanned, how many rows are reported as being scanned?

Also, the code snippet you showed was using IN clauses with select statements. These should be converted into JOIN statements. They will perform better.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Analyst Ten Centuries Points: 1063 More actions · Answer 4

Analyst

Ten Centuries

Points: 1063

June 5, 2008 at 7:00 am

#824767

Am using 'IN' since its a self join

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 5

Not a problem. Just alias the table. Self-join or not, you're looking at the equivalent of a cursor.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

noeld SSC Guru Points: 96590 More actions · Answer 6

jananik (6/5/2008)
Yes teh indexes are used in the where clause, a glimpse of the query. Highlighted are the indexes used:
Createddatetime uses clustered index and sid uses non clustered
SELECT * FROM Ex A WHERE
( A.CreatedDateTime in (select max(CS.CreatedDateTime) from Ex CS where CS.conId = A.ConId group by Conid))
AND ( A.SId in (Select Max(CS.SId) from EX CS where CS.ConId = A.ConId and CS.CreatedDateTime = A.CreatedDateTime group by ConId))

you may want to try:

With CTE

AS

( SELECT col1,.... -- Use column names not "*"

ROW_NUMBER() OVER ( Partition by CondId, order by createdDateTime DESC, Sid DESC) rn

)

SELECT col1,.... -- Use column names not "*"

FROM CTE

where rn = 1

* Noel

RBarryYoung SSC Guru Points: 143329 More actions · Answer 7

Actually, the only thing that I see wrong with your query is that you are using GROUP BY in your subSelects when you do not need to.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

GSquared SSC Guru Points: 260824 More actions · Answer 8

Try this, see if it doesn't speed up the query a bit:

;with

CTE_CreateTime (ConID, Created) as

(select conid, max(createddatetime)

from Ex

group by conid),

CTE_SID (ConID, Created, [SID]) as

(select ex.conid, ex.createddatetime, max([sid])

from Ex

inner join CTE_CreateTime cte1

on ex.conid = cte1.conid

and ex.createddatetime = cte1.created

group by ex.conid, ex.createddatetime)

SELECT a.*

FROM Ex A

inner join CTE_SID

on A.conid = cte_sid.conid

and a.createddt = cte_sid.created

and a.[sid] = cte_sid.[sid]

I recommend replacing the "select *" with a specific list of columns. Even if it's the whole table, it's a good idea, for various reasons.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

RBarryYoung SSC Guru Points: 143329 More actions · Answer 9

GSquared (6/5/2008)
Try this, see if it doesn't speed up the query a bit:
;with
CTE_CreateTime (ConID, Created) as
(select conid, max(createddatetime)
from Ex
group by conid),
CTE_SID (ConID, Created, [SID]) as
(select ex.conid, ex.createddatetime, max([sid])
from Ex
inner join CTE_CreateTime cte1
on ex.conid = cte1.conid
and ex.createddatetime = cte1.created
group by ex.conid, ex.createddatetime)
SELECT a.*
FROM Ex A
inner join CTE_SID
on A.conid = cte_sid.conid
and a.createddt = cte_sid.created
and a.[sid] = cte_sid.[sid]

Actually, this gives almost identical performance and query plans as the "No Group By" version:

SELECT *

FROM Ex A

WHERE A.CreatedDateTime in (select max(CS.CreatedDateTime) from Ex CS

where CS.conId = A.ConId

--group by Conid --dont need these

)

AND A.SId in (Select Max(CS.SId) from EX CS

where CS.ConId = A.ConId

and CS.CreatedDateTime = A.CreatedDateTime

--group by ConId --dont need these

)

The "No Group By" version is about 00.01% faster according to the query plans because it eliminates an insignificant Sort step. Both are about 10% faster than the original.

To do any better, we really need to see your table DDL and your query plan.

Both are about

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

John.Sansom SSCertifiable Points: 7335 More actions · Answer 10

Hi Everyone,

Just to confuse the issue, given that you are working with historical (and therefore I assume relatively static) data, it sounds like you may have a good case for a Data Warehouse solution and might wish to concider the use of Analysis Services as a more robust approach to working with your data.

Cheers,

John

John Sansom (@sqlBrit) | www.johnsansom.com

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 11

jananik (6/5/2008)
While trying to retrieve records with a max query it takes around 4 minutes (appr. 20 lakh records)

That's 2,000,000 rows that you're returning to the screen... don't measure how long that takes... measure how long it takes to insert into a table with no indexes using SELECT/INTO... see how long that takes... better measure would just be to select into variables that throw the data away... that would be a closer measure of how the server is going to handle things

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)