June 1, 2006 at 9:15 am
I'm trying to run an update query between two tables. They share much of the same structure with 1 field being the clustered index on each table (Phone field). I have dropped and recreated each of the clustered indexes yet the time for me to run an update query using an inner join is absolutely horrendous.
Some background information on the problem:
Table1 has clustered index on Phone (22.4Million Records)
Table2 has clustered index on Phone (17.11Million Records)
Sample query:
update T2 set T2.Field1=T1.Field1,T2.Field2=T1.Field1 from Table2 T2 inner join Table1 T1 on T2.Phone=T1.Phone
My first test on this was to simply try to update 1000 records and the query has been running over 8 minutes and still nothing.
I've removed all indexes on both tables in order to remove that from being a possiblity, I have also tried forcing the QA to use Index Hints, and finally I have verified there are no transactions blocking this and still just simply can't get this query to update in anything I would consider reasonable.
Any ideas of where else I might be able to check?
June 1, 2006 at 9:19 am
What is the query execution plan ? (Paste the SQL into Query Analyser and hit CTRL-L)
Are any of the columns being updated foreign keys to other tables ?
Are there any triggers on the table being updated ?
June 1, 2006 at 9:29 am
No foreign keys, and no triggers.
These are both stand-alone tables that I'm basically just trying to update with a new quarterly update (Table1 is last Quarter, Table2 is this quarter basically) and I'm hoping this copy and paste shows you what you're asking for (or if you know of a good way to display it to you graphically let me know)
SET SHOWPLAN_ALL ON 11 1 0 NULL NULL 1 NULL NULL NULL NULL NULL NULL NULL NULL SETON 0 NULL
update C3 set C3.LC_Adsa=C2.LC_Adsa, C3.Attempts=C2.Attempts,C3.Available=1,C3.Updated=1,C3.LastDistributed=C2.LastDistributed from Cold3 C3 inner join Cold2 C2 on C3.Phone=C2.Phone 12 2 0 NULL NULL 2 NULL 1.5614053E+7 NULL NULL NULL 6409.3833 NULL NULL UPDATE 0 NULL
|--Index Update(OBJECT[Cold_Leads].[dbo].[Cold3].[IX_Updated]), SET[Updated1008]=[Cold3].[Updated], [Phone1009]=[C3].[Phone], [IdxBmk1007]=RaiseIfNull([Bmk1003]))) 12 3 2 Index Update Update OBJECT[Cold_Leads].[dbo].[Cold3].[IX_Updated]), SET[Updated1008]=[Cold3].[Updated], [Phone1009]=[C3].[Phone], [IdxBmk1007]=RaiseIfNull([Bmk1003])) NULL 1.5614053E+7 1.0079491E-2 31.228107 32 6409.3833 NULL NULL PLAN_ROW 0 1.0
|--Sort(ORDER BY[Cold3].[Updated] ASC, [C3].[Phone] ASC, [Bmk1003] ASC, [Act1006] ASC)) 12 4 3 Sort Sort ORDER BY[Cold3].[Updated] ASC, [C3].[Phone] ASC, [Bmk1003] ASC, [Act1006] ASC) NULL 3.1228106E+7 1598.491 762.12024 47 6166.9053 [C3].[Phone], [Bmk1003], [Cold3].[Updated], [Act1006] NULL PLAN_ROW 0 1.0
|--Split 12 5 4 Split Split NULL [Act1006] 3.1228106E+7 0.0 217.81604 47 3624.9395 [C3].[Phone], [Bmk1003], [Cold3].[Updated], [Act1006] NULL PLAN_ROW 0 1.0
|--Clustered Index Update(OBJECT[Cold_Leads].[dbo].[Cold3].[IX_Phone]), SET[Cold3].[Updated]=1, [Cold3].[LastDistributed]=[C2].[LastDistributed], [Cold3].[Attempts]=[C2].[Attempts], [Cold3].[Available]=1, [Cold3].[LC_ADSA]=[C2].[LC_ADS 12 6 5 Clustered Index Update Update OBJECT[Cold_Leads].[dbo].[Cold3].[IX_Phone]), SET[Cold3].[Updated]=1, [Cold3].[LastDistributed]=[C2].[LastDistributed], [Cold3].[Attempts]=[C2].[Attempts], [Cold3].[Available]=1, [Cold3].[LC_ADSA]=[C2].[LC_ADSA]) NULL 1.5614053E+7 1.0132615E-2 15.614054 75 3407.1235 [C3].[Phone], [Bmk1003], [Cold3].[Updated], [Expr1005] NULL PLAN_ROW 0 1.0
|--Top(ROWCOUNT est 0) 12 8 6 Top Top NULL NULL 1.5614053E+7 0.0 1.5614053 68 3389.9377 [Bmk1000], [C3].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed], [Cold3].[Updated] NULL PLAN_ROW 0 1.0
|--Parallelism(Gather Streams) 12 9 8 Parallelism Gather Streams NULL NULL 1.5614053E+7 0.0 128.39944 68 3388.3765 [Bmk1000], [C3].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed], [Cold3].[Updated] NULL PLAN_ROW -1 1.0
|--Hash Match(Aggregate, HASH[Bmk1000]), RESIDUAL[Bmk1000]=[Bmk1000]) DEFINE[C3].[Phone]=ANY([C3].[Phone]), [Cold3].[Updated]=ANY([Cold3].[Updated]), [C2].[LC_ADSA]=ANY([C2].[LC_ADSA]), [C2].[Attempts]=ANY([C2].[Attemp 12 10 9 Hash Match Aggregate HASH[Bmk1000]), RESIDUAL[Bmk1000]=[Bmk1000]) [C3].[Phone]=ANY([C3].[Phone]), [Cold3].[Updated]=ANY([Cold3].[Updated]), [C2].[LC_ADSA]=ANY([C2].[LC_ADSA]), [C2].[Attempts]=ANY([C2].[Attempts]), [C2].[LastDistributed]=ANY([C2].[LastDistributed]) 1.5614053E+7 652.91821 455.58984 68 3259.9771 [Bmk1000], [C3].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed], [Cold3].[Updated] NULL PLAN_ROW -1 1.0
|--Parallelism(Repartition Streams, PARTITION COLUMNS[Bmk1000])) 12 11 10 Parallelism Repartition Streams PARTITION COLUMNS[Bmk1000]) NULL 1.5614053E+7 0.0 153.95184 78 2151.469 [Bmk1000], [C3].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed], [Cold3].[Updated] NULL PLAN_ROW -1 1.0
|--Hash Match(Inner Join, HASH[C3].[Phone])=([C2].[Phone]), RESIDUAL[C2].[Phone]=[C3].[Phone])) 12 12 11 Hash Match Inner Join HASH[C3].[Phone])=([C2].[Phone]), RESIDUAL[C2].[Phone]=[C3].[Phone]) NULL 1.5614053E+7 614.02094 390.46759 78 1997.5171 [Bmk1000], [C3].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed], [Cold3].[Updated] NULL PLAN_ROW -1 1.0
|--Bitmap(HASH[C3].[Phone]), DEFINE[Bitmap1010])) 12 13 12 Bitmap Bitmap Create HASH[C3].[Phone]) [Bitmap1010] 1.7115258E+7 0.0 106.77637 48 265.37921 [Bmk1000], [C3].[Phone], [Cold3].[Updated] NULL PLAN_ROW -1 1.0
| |--Parallelism(Repartition Streams, PARTITION COLUMNS[C3].[Phone])) 12 14 13 Parallelism Repartition Streams PARTITION COLUMNS[C3].[Phone]) NULL 1.7115258E+7 0.0 106.77637 48 265.37921 [Bmk1000], [C3].[Phone], [Cold3].[Updated] NULL PLAN_ROW -1 1.0
| |--Index Scan(OBJECT[Cold_Leads].[dbo].[Cold3].[IX_Updated] AS [C3]), ORDERED FORWARD) 12 15 14 Index Scan Index Scan OBJECT[Cold_Leads].[dbo].[Cold3].[IX_Updated] AS [C3]), ORDERED FORWARD [Bmk1000], [C3].[Phone], [Cold3].[Updated] 1.7115258E+7 149.18944 9.4134312 48 158.60286 [Bmk1000], [C3].[Phone], [Cold3].[Updated] NULL PLAN_ROW -1 1.0
|--Parallelism(Repartition Streams, PARTITION COLUMNS[C2].[Phone]), WHEREPROBE([Bitmap1010])=TRUE)) 12 16 12 Parallelism Repartition Streams PARTITION COLUMNS[C2].[Phone]), WHEREPROBE([Bitmap1010])=TRUE) NULL 2.2408E+7 0.0 134.71178 128 727.64935 [C2].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed] NULL PLAN_ROW -1 1.0
|--Clustered Index Scan(OBJECT[Cold_Leads].[dbo].[Cold2].[IX_Phone2] AS [C2])) 12 17 16 Clustered Index Scan Clustered Index Scan OBJECT[Cold_Leads].[dbo].[Cold2].[IX_Phone2] AS [C2]) [C2].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed] 2.2408E+7 580.61316 12.324439 128 592.93756 [C2].[Phone], [C2].[LC_ADSA], [C2].[Attempts], [C2].[LastDistributed] NULL PLAN_ROW -1 1.0
June 1, 2006 at 9:33 am
It looks like there is a lot of Parallelism going on. Try adding OPTION (MAXDOP 1) to the end of the query and see if that changes anything.
June 1, 2006 at 9:45 am
The maxdop option didn't do anything for me, however I did notice as I look at the table structure itself that one table has a phone field as char(10) and the other is varchar(10) so I'm modifying the structure currently to see if that might be the root of my problem.
Although if anyone has another idea to look at, let me know.
June 1, 2006 at 9:46 am
Try using SET SHOWPLAN_TEXT ON - it will make the output less verbose an easier to interpret.
One immediate observation is:
Clustered Index Scan
Ideally you'd like to see an index seek. Are statistics up to date on the tables ?
June 1, 2006 at 9:49 am
Also check nullability.
June 1, 2006 at 9:50 am
The statistics should be up to date, both of the indexes were dropped and recreated earlier today as I was trying to troubleshoot the delay.
If my issue with char comparing to varchar doesn't resolve the problem, I'll post the new execution plan. But currently the database is in the process of altering that field.
June 1, 2006 at 10:06 am
Altering a column has the effect of setting ANSI_PADDING on for that column. I've experienced problems where this caused an INDEX SCAN instead of an INDEX SEEK.
Check the value of the STATUS column in SYSCOLUMNS for the 2 columns. The status is a bitmask column that where 0 is the default (NOT NULL, ANSI_PADDING OFF), 8 is NULLABLE and 16 is ANSI_PADDING ON, among other values.
June 1, 2006 at 11:14 am
Run the SELECT part only. Does that have the same performance problem? If not then perhaps the problem could be your clustered index rebuild. Or have you tried removing all indexes including the clustered one?
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 1, 2006 at 11:45 am
If I run a select query it's almost instant doing the comparison, but when I try to run the update it doesn't matter if I try to update a thousand records or a single record, it quite simply just won't run.
However I did look at the syscolumns table and their status for both fields is set to 16. How would I go about altering them to 0 (default) which is what I assume might help with this?
June 1, 2006 at 11:53 am
Have you tried without the clustered index on the target table? Tried selecting the records into a staging table before inserting? I think the status is set based on the ANSI_PADDING setting when the column is created, so you'd have to drop the column , issue the set statement, then recreate. I'm not sure of this though. And I don't think that is the root of your problem.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 1, 2006 at 11:55 am
I'm confused Tim, what do you mean with a staging table between?
Are you thinking I should take the clustered index off the table I'm trying to update in order to see if that will make a difference in the speed?
June 1, 2006 at 11:59 am
Yes - try creating a table to hold teh output of the SELECT, then do the UPDATE from that table (the staging table). And yes, the other thing to try is dropping the clustered index (on a test copy of course). I'm not suggesting that you make these changes part of the solution, but just use them for diagnostics.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 1, 2006 at 12:00 pm
I think what's being suggested is select the primary key, plus the subset of columns that are to be updated, out of 1 table ("Cold2") and into a 'temporary' or staging table. Then use this smaller table to join to for the update of "Cold3".
If the staging table is not indexed, the optimizer may change it's approach, and table-scan the staging table while index-seeking into the larger table being updated. Emphasis on "may".
Viewing 15 posts - 1 through 15 (of 28 total)
You must be logged in to reply to this topic. Login to reply