November 18, 2005 at 8:20 am
The following are two update queries I received from a user. These queries are taking too long to execute and are causing our server to slow down drastically. I have documented the queries and have provided record counts. I know the user has tried different indexes (Clustered verses Non-Clustered.) Please reveiw the queries and let me know what are some things I can try to make these queries more efficient. Please suggest the best indexes to use and intermediate steps that may be used to simplify the Update Statements. One thing I noticed is that the user is doing a LEFT JOIN when really this could be an INNER JOIN. I do not know if this would slow this query that much though. (We have SQL Server 2000, SP3.)
-- FIRST UPDATE STATMENT
-- Try Update Statement with one join.
-- This still causes problems ... taking too long and slowing up the server.
UPDATE pemaster.dbo.tradeDaily
SET
sm_ind = s.StarMine_Indicator_Region_100,
sm_dt = s.as_of_date
FROM pemaster.dbo.tradeDaily AS d
LEFT JOIN pemaster.dbo.starmine_gbl_daily AS s
ON d.cusiph = s.then_cusip_sedol
AND d.td = s.as_of_date
-- FYI: These Selects on the TradeDaily and Starmine_Gbl_Daily Tables are also slow
SELECT DISTINCT td FROM pemaster.dbo.tradedaily --(50 Million Records)
SELECT DISTINCT as_of_date FROM pemaster.dbo.starmine_gbl_daily --(44 Million Records)
-- SECOND UPDATE STATMENT
--TradeDaily Table - Contains Pricing Data for over 2700 Compainies for
----over 2500 days. Originates from the QAI Front-End Application.
--Holt Table - Contains Security Valuation Data. Received from the Holt Vendor. Received weekly and monthly.
--Starmine Table - Contains Analysts Forecast Revisions for each company.
----Comes from a CD. Static data. Never updated. Received from the Starmine ----Vendor.
-- Updates the TradeDaily Table by joining the Holt, Starmine and Sloan Tables.
-- The updates are only for the new fields added to the TradeDate Table.
-- This is causing problems ... taking too long and slowing up the server.
UPDATE pemaster.dbo.tradeDaily
SET
pct_best = h.H_PCT_BEST,
holt_dtser = h.dtstr,
sm_ind = s.StarMine_Indicator_Region_100,
sm_dt = s.as_of_date,
sloan_cfc = c.sloan_cfc,
sloan_dt = c.bestAvailDate
FROM pemaster.dbo.tradeDaily AS d
LEFT JOIN pemaster.dbo.holt AS h
ON d.cusiph = h.H_cusip
AND h.dtstr =
-- For every record in the TradeDaily Table find the most recent Holt Record
-- searching back two years previous of the TradeDaily Date.
(SELECT MAX(h2.dtstr) FROM pemaster.dbo.holt AS h2
WHERE h2.H_cusip = d.cusiph AND h2.dtstr <= d.td
AND h2.dtstr > DATEADD(m,-2,d.td))
LEFT JOIN pemaster.dbo.starmine_gbl_daily AS s
ON d.cusiph = s.then_cusip_sedol
AND d.td = s.as_of_date
AND s.Security_Country_Code = 'US'
LEFT JOIN pemaster.dbo.sloan_cfc AS c
ON c.cusiph = d.cusiph
AND c.bestAvailDate =
-- For every record in the TradeDaily Table find the most recent Sloan Record
-- searching back two years previous of the TradeDaily Date.
(SELECT MAX(c2.bestAvailDate) FROM pemaster.dbo.sloan_cfc AS c2
WHERE d.cusiph = c2.cusiph AND c2.bestAvailDate <= d.td
AND c2.bestAvailDate > DATEADD(y,-2,d.td))
November 18, 2005 at 8:34 am
It is difficult to fully analyze these queries without knowing which indexes you already have on your tables. One thing I would recommend doing is taking these queries, placing them in Query Analyzer, and using the Index Tuning wizard. SQL Server will recommend indexes for you and tell you how much of an efficiency gain those index suggestions will give you.
Do not run the Index Tuning wizard against your production environment as it creates hypothetical indexes on your tables as it does it's analysis. Do this in your test sysytem and see what kind of improvements you see.
November 19, 2005 at 9:57 pm
"These Selects on the TradeDaily and Starmine_Gbl_Daily Tables are also slow
SELECT DISTINCT td FROM pemaster.dbo.tradedaily --(50 Million Records)
SELECT DISTINCT as_of_date FROM pemaster.dbo.starmine_gbl_daily --(44 Million Records)
"
What does "slow" mean ?
Please post the output of:
1. the detailed execution plan information for both statements by running "set showplan_all on"
2. run sp_spaceused 'dbo.tradedaily','true'
3. run sp_spaceused 'dbo.starmine_gbl_daily','true'
Please describe the server hardware including:
Number of processors, processor speed, bus speed, L2 cache, disk drive size and speed.
For disk drive layout:
What drives are in a RAID - 0, 1 , 5,10 ?
Where are the user database files? e.g. are the transaction log (ldf) and data (mdf) on seperate drives.
Is the tempdb on a seperate drive?
Are the tempdb data and transaction log on seperate drives?
How many datafiles for tempdb?
SQL = Scarcely Qualifies as a Language
November 20, 2005 at 10:38 am
Yup, I'd lose the OUTER JOIN if an INNER JOIN will do. Also, like someone else asked, "What is slow"? I normally shoot for about 500,000 rows per minute (sometimes you get lucky with more but that's what I shoot for) on such updates on a "quiet" table. That means that 44 million records should take somewhere around 88 minutes to update.
And, since this will be locking extents and possibly the whole table, you will having locking waits if other people are using the table with committed reads.
Slowing up the server? You bet it will... it's probably expanding the log file and it's probably doing it at 10% of the current size because most people don't make it big enough to handle things like this and they don't use a fixed expansion size. As it expands the log file, the whole world waits not to mention the disk activity logging that many records.
44 millions records is a huge update and the log file will be huge... consider breaking it into, say, 20 - 40 coherent pieces so it doesn't have to lock the whole table and the log doesn't get so big. For goodness sake, I hope you don't have a clustered index on any of the columns you're updating and (gotta say this to make me feel better), I hope the table has a primary key that isn't being updated.
You might also want to consider turning on SELECT INTO/BUCKCOPY and turning your UPDATE into a SELECT/INTO on a working table (prevents logging if other conditions met, as well. See Books Online). Once complete, you can apply indexes and a new primary key, rename the old table as something else and rename the new table as the old table. The renaming, no matter the size of the table, takes about 65 milliseconds so you're users would never see the rename. Of course, if hard disk space is a problem, this method is not for you. Although this method is VERY fast, it's still going to take a while just because of the shear volume but at least it stands a chance of getting done sometime in the same week.
The bottom line is, one way or another, you're going to have to divide and conquer on this one...
--Jeff Moden
Change is inevitable... Change for the better is not.
November 20, 2005 at 10:41 am
Almost forgot... does the table being updated have any triggers? That really make things slow for this big an update especially if those triggers are writing to audit tables or have RBAR (Row By Agonizing Row) code in them or some exotic joins to some long winded views.
If you use the SELECT/INTO method with the rename, don't forget to drop the triggers on the old table and to reinstantiate your triggers on the new one.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 20, 2005 at 10:51 am
One more thing! Don't even think of using the SELECT/INTO/RENAME method if others are updating the table at the same time or inserting new rows.... YOU WILL LOSE DATA. The SELECT/INTO method is for mass one-time updates while no one else is using adding to or modifying the table!
--Jeff Moden
Change is inevitable... Change for the better is not.
November 20, 2005 at 2:32 pm
You update WHOLE table, not only rows you need.
Correct options:
UPDATE d --pemaster.dbo.tradeDaily
SET
sm_ind = s.StarMine_Indicator_Region_100,
sm_dt = s.as_of_date
FROM pemaster.dbo.tradeDaily AS d
LEFT JOIN pemaster.dbo.starmine_gbl_daily AS s
ON d.cusiph = s.then_cusip_sedol
AND d.td = s.as_of_date
OR
UPDATE pemaster.dbo.tradeDaily
SET
sm_ind = s.StarMine_Indicator_Region_100,
sm_dt = s.as_of_date
FROM pemaster.dbo.starmine_gbl_daily AS s
WHERE pemaster.dbo.tradeDaily.cusiph = s.then_cusip_sedol
AND pemaster.dbo.tradeDaily.td = s.as_of_date
Same kind of changes to second query.
_____________
Code for TallyGenerator
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply