April 8, 2013 at 2:48 pm
Just fair warning, I am definitely a newbie, I fell into this work so to speak.
I have a query that is taking between 30 seconds to 30 minutes depending on how much data is being run at the current time. I used SQL Server Management Studio Express to analyze the query and show me the execution plan with statistics ON.
I have since discovered where the problem is. I have one table in my database where we collect information, this information is updated daily. In fact it's not really updated, it's brand new data everyday collected from various websites.
Because this table collects brand new data daily it is not indexed; it's a HEAP. There are between 4-6 million rows of data on any given day depending on how much data is run that day. We clear all records older than 3 days from this table every night.
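(For reference, the nightly purge described above would look something like this; the table and column names come from the query later in this post, and the exact statement is a sketch rather than our production code:)

```sql
-- Remove everything collected more than 3 days ago
DELETE FROM TFX_V2_UnfilteredRates
WHERE timeInserted < DATEADD(dd, -3, GETDATE())
```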
Now that I've explained the table a little, here's where the problem comes in: (This is not the entire stored procedure just the part that takes 98% operator cost time)
-------------------------------------------------------------------
-- GET THE RATES IN TEMP TABLE
-------------------------------------------------------------------
SELECT u.*,
       NULL AS RateTypeID,
       NULL AS RoomTypeID,
       NULL AS RatePriority,
       NULL AS RoomPriority
INTO   #RESULTS
FROM   TFX_V2_UnfilteredRates u
WHERE  SiteID IN
       (
           SELECT * FROM #SITES
       )
AND    HotelID IN
       (
           SELECT * FROM #HOTELS
       )
AND    timeInserted > @JobDate
DROP TABLE #SITES
DROP TABLE #HOTELS
----------------------------------------------
It takes far too long to scan this table for the appropriate rates. Any suggestions/ideas on how to make this process run more efficiently would be greatly appreciated. I've thought about creating an index for the TFX_V2_UnfilteredRates table, which is the one that is 5 million rows give or take. However, I am worried that it would slow down my inserts on that table, since we need to insert data very quickly.
If you need more info please let me know, however I was trying to prevent from pasting the entire stored procedure.
Thanks
-Josh
April 8, 2013 at 4:01 pm
Heya Josh,
You're most likely on the right track with the heap, but for optimization items there's a bunch of data most of us prefer to see before we start bounding through possibilities... primarily because we'd like to show you the right way on the first try.
If you take a look in my signature below, in the 'help on indexing/tuning' link there's a list of what we'll want and it walks you through getting it for us. Short form: All tables involved and their indexes (including the other tables), the query itself in full form, and the .sqlplan from an actual execution (as the heap stands now).
From there, we'll be able to analyze its connections and expectations and get you on track pretty quickly.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
April 8, 2013 at 4:54 pm
Here you go, I tried to include all relevant information.
Again thanks for helping!
-Josh
April 8, 2013 at 6:59 pm
jrestuccio (4/8/2013)
Here you go, I tried to include all relevant information. Again thanks for helping!
-Josh
Nope, not what I'd want to see. I'd want to see the following:
1) DDL for the tables including the DDL for indexes
2) Actual execution plan as a .sqlplan file, not XML plans you included in the spreadsheets.
3) your current code
April 9, 2013 at 7:35 am
Did you look at all the sheets within the excel workbook?
It has the stored procedure (my code). Also, it is a SQL 2000 database and the Save Execution Plan option is greyed out when I try to save it.
Sorry for the silly question but what are you needing when you say DDL for all tables and indexes?
What's a quick way to pull that?
I thought I had this info on the "Relevant Tables and Indexes" sheet.
-Josh
April 9, 2013 at 8:06 am
jrestuccio (4/9/2013)
Did you look at all the sheets within the excel workbook? It has the stored procedure (my code). Also, it is a SQL 2000 database and the Save Execution Plan option is greyed out when I try to save it.
Sorry for the silly question but what are you needing when you say DDL for all tables and indexes?
What's a quick way to pull that?
I thought I had this info on the "Relevant Tables and Indexes" sheet.
-Josh
What we want to see is DDL (CREATE TABLE statements and CREATE INDEX statements) and sample data (INSERT statements). This is all detailed in the link in my signature about best practices when posting questions. The idea here is that we can recreate your tables and such in our environment. We also need to see the actual execution plan.
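As a rough sketch, on SQL 2000 you can pull most of this from Query Analyzer with the system procedures (the table name is the one from this thread; apply the same to the other tables involved):

```sql
-- Column definitions, constraints, and index list for the heap table
EXEC sp_help 'dbo.TFX_V2_UnfilteredRates'

-- Index details only
EXEC sp_helpindex 'dbo.TFX_V2_UnfilteredRates'
```

Alternatively, Enterprise Manager can script the full CREATE TABLE / CREATE INDEX statements: right-click the table, then All Tasks > Generate SQL Script.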
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 9, 2013 at 12:35 pm
Try this index on your heap table... not quite sure why you are worried about data load speeds. It seems you are already deleting data over three days old and adding new, so I am going to assume you have a period at some time to perform this task. You could drop the index first and then recreate it once loaded... that may help... but an index on your date may also assist the delete.
Anyways... food for thought... here is some test data to play with. It's not a replica of your system, but it does have 5M rows and some hotels/sites.
kind regards
USE [tempdb]
GO
IF OBJECT_ID('dbo.Hotel') IS NOT NULL DROP TABLE [dbo].[Hotel]
GO
IF OBJECT_ID('dbo.Site') IS NOT NULL DROP TABLE [dbo].[Site]
GO
IF OBJECT_ID('dbo.TransData') IS NOT NULL DROP TABLE [dbo].[TransData]
GO
SELECT TOP 5000000 ---- NOTE 5 MILLION rows
TranID = IDENTITY(INT, 1, 1),
SiteID = 1 + CAST(Abs(Checksum(Newid()) % 10000) AS INT),
HotelID = 1 + CAST(Abs(Checksum(Newid()) % 1000 ) AS INT),
SalesAmount = CAST(Rand(Checksum(Newid())) * 99 + 1 AS DECIMAL(5, 2)),
TransDate = Dateadd(dd, Abs(Checksum(Newid()))%Datediff(dd, '2010', '2013'), '2010')
INTO TransData
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
CROSS JOIN sys.all_columns ac3
SELECT TOP 500
SiteID = 1 + CAST(Abs(Checksum(Newid()) % 10000) AS INT)
INTO Site
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
GO
SELECT TOP 50
HotelID = 1 + CAST(Abs(Checksum(Newid()) % 1000) AS INT)
INTO Hotel
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
GO
set statistics time, IO on
declare @jobdate as datetime
set @jobdate = '2011-07-09'
SELECT Hotel.HotelID, Site.SiteID, TransData.TransDate
FROM TransData INNER JOIN
Site ON TransData.SiteID = Site.SiteID INNER JOIN
Hotel ON TransData.HotelID = Hotel.HotelID
WHERE (TransData.TransDate = @jobdate)
set statistics time, IO OFF
/*create index*/
CREATE NONCLUSTERED INDEX [NIX_transdata] ON [dbo].[TransData]
(
[TransDate] ASC,
[HotelID] ASC,
[SiteID] ASC
) ON [PRIMARY]
GO
set statistics time, IO on
declare @jobdate as datetime
set @jobdate = '2011-07-09'
SELECT Hotel.HotelID, Site.SiteID, TransData.TransDate
FROM TransData INNER JOIN
Site ON TransData.SiteID = Site.SiteID INNER JOIN
Hotel ON TransData.HotelID = Hotel.HotelID
WHERE (TransData.TransDate = @jobdate)
set statistics time, IO OFF
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
April 9, 2013 at 12:37 pm
Also, if you are adding and deleting data, you really should create a clustered index on the table. As a heap it will just continue to grow and not reuse empty space allocated to the table.
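(A minimal sketch of that suggestion, using the table and column names from earlier posts; an IDENTITY column or similar would work equally well as the clustering key, so treat this as illustrative rather than prescriptive:)

```sql
-- Clustering on the insert-date column keeps new rows appending at the
-- end of the table and lets the nightly purge free pages for reuse.
CREATE CLUSTERED INDEX CIX_UnfilteredRates_timeInserted
    ON dbo.TFX_V2_UnfilteredRates (timeInserted)
```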
April 9, 2013 at 12:49 pm
Lynn Pettis (4/9/2013)
Also, if you are adding and deleting data, you really should create a clustered index on the table. As a heap it will just continue to grow and not reuse empty space allocated to the table.
Thanks Lynn
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
April 9, 2013 at 1:39 pm
Thank you SO MUCH for the replies, I am trying to take it all in.
Let me describe a little bit about why the data is deleted so regularly and why it's important to not slow down insert speeds.
Imagine 1000 searches being performed at one time, and as soon as the search is done, it collects the data and inserts the data into the TFX_v2_unfilteredrates table. The reason we need this to happen quickly is because we have a massive amount of searches to complete, and since the data needs to be FRESH it needs to be collected within 1 day.
The reason for deleting the data is merely to keep the size of the table down to a workable size, since I believe our queries are poorly written...
-Josh
April 10, 2013 at 10:14 am
jrestuccio (4/9/2013)
Thank you SO MUCH for the replies, I am trying to take it all in. Let me describe a little bit about why the data is deleted so regularly and why it's important to not slow down insert speeds.
Imagine 1000 searches being performed at one time, and as soon as the search is done, it collects the data and inserts the data into the TFX_v2_unfilteredrates table. The reason we need this to happen quickly is because we have a massive amount of searches to complete, and since the data needs to be FRESH it needs to be collected within 1 day.
The reason for deleting the data is merely to keep the size of the table down to a workable size, since I believe our queries are poorly written...
-Josh
Assuming that your hardware is spec'ed appropriately for the workload, even 1,000 inserts of 5,000 rows each (5 million rows total) probably shouldn't take more than five or ten minutes, even with a couple or three indexes to update. This code inserts 5 million rows into a test table - it ran on my DEV server in about 30 seconds:
create table dbo.insert_test (date_collected datetime, hotel_name varchar(20), hotel_id int, rate decimal(6,2))
go
set nocount on
declare @i int = 1
while @i<=1000
begin
insert into dbo.insert_test
select dateadd(day, (@i*-1), getdate()), 'xxxxxxxxxxxxxxxxxxxx', t.N, 1.32 * t.N
from dbo.tally t
where t.N <= 5000
set @i = @i+1
end
If you're avoiding indexes to save a few minutes once a day when the table is loaded with new data, but the absence of indexes is costing you several minutes every time you run the query, you're shooting yourself in the foot. Before you suffer through the poor performance of your query much longer, I'd investigate the impact of a proper indexing scheme on both the insert/delete process and the query performance to make sure you're striking an appropriate balance.
Jason Wolfkill
April 10, 2013 at 1:31 pm
Sorry, got inundated yesterday, but wanted to catch up with this. As mentioned, we really need everything for tuning assistance. The article describes the easiest ways to get it for us.
jrestuccio (4/9/2013)
Let me describe a little bit about why the data is deleted so regularly and why it's important to not slow down insert speeds.
Even with a clustered index you won't see a significant difference in insert speeds. Start doing indexed views and triggers and that's a different story, or overloading the non-clustered indexing.
Imagine 1000 searches being performed at one time, and as soon as the search is done, it collects the data and inserts the data into the TFX_v2_unfilteredrates table.
Not really an issue. I/O is I/O. You might even get better performance with a split index to avoid hotspotting and using a really loose fillfactor, but that's a different story.
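(As a rough illustration of the loose-fillfactor idea, applied to the index suggested earlier in the thread; the exact percentage is an assumption and would need testing against your actual insert pattern:)

```sql
-- A loose fillfactor leaves free space on each index page at build time,
-- reducing page splits when heavy concurrent inserts land on the same pages.
CREATE NONCLUSTERED INDEX NIX_transdata
    ON dbo.TransData (TransDate, HotelID, SiteID)
    WITH FILLFACTOR = 70
```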
The reason we need this to happen quickly is because we have a massive amount of searches to complete, and since the data needs to be FRESH it needs to be collected within 1 day.
The reason for deleteing the data is merely to keep the size of the table down to a workable size. Since I believe our queries are poorly written........
It's hard not to have poorly written queries when those queries have no indexes to work from. Get indexes into place and you may find you can actually keep your archive.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
April 12, 2013 at 8:32 am
Hey everyone, thanks again for the replies!
I am still a little confused on the best way to supply the DDL information, in the articles you link to it says the SP_help output should work, which I included in my spreadsheet.
Also, the "save execution plan" is greyed out in SSMS express, so what's the best way to get you the execution plan?
Again this is a SQL Server 2000 database.
Thanks again! Also, not sure if I am breaking any forum rules or anything (hope not) but how much would any of you charge as a consultant to take a hands on look?
-Josh
April 12, 2013 at 8:46 am
jrestuccio (4/12/2013)
Also, the "save execution plan" is greyed out in SSMS express, so what's the best way to get you the execution plan?
See the article in Craig's sig. Describes how to get an exec plan in SQL 2000, easiest way to save it is in a spreadsheet. Haven't looked at your spreadsheet to see what's in there, a little busy with stuff.
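A sketch of one way to capture a text plan on SQL 2000, run from Query Analyzer (the procedure name here is hypothetical; substitute your own, then paste the plan rows into a spreadsheet):

```sql
-- Returns the actual execution plan as rows alongside the query's results
SET STATISTICS PROFILE ON
GO
EXEC dbo.YourStoredProcedure  -- hypothetical name, substitute yours
GO
SET STATISTICS PROFILE OFF
GO
```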
Thanks again! Also, not sure if I am breaking any forum rules or anything (hope not) but how much would any of you charge as a consultant to take a hands on look?
If by hands-on you mean remote, disconnected and across a couple of oceans, roughly $100/hr with a minimum of 8 hours. If you want in-person then I can't help you there.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 12, 2013 at 8:47 am
Lynn Pettis (4/8/2013)
jrestuccio (4/8/2013)
Here you go, I tried to include all relevant information.Again thanks for helping!
-Josh
2) Actual execution plan as a .sqlplan file, not XML plans you included in the spreadsheets.
SQL 2000 forum. Hence assumed to be a SQL 2000 server unless specified otherwise. Hence no XML plans.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability