September 11, 2006 at 11:45 pm
I have a dts package that runs in 4 steps.
The 1st step is to copy data from multiple columns into a single table. This gets done in 15 minutes. The next step is to update all the missing data with an update script. This is where the dts package slows down to an eternity.
It used to run anywhere from 30-45 minutes; now, if I am lucky, it finishes after 65 hours. We moved from an Intel server to an IBM 345 server. I would like to know where to start looking for the reason it takes so long.
Below is a copy of the update script:
-- Update Blood Group Information of each Unit in zcusIss
update zcusIss
set
Blood_Group = Case
when left(sbhx.blood_type,3) = 'APO' then 'Group A'
when left(sbhx.blood_type,3) = 'ANE' then 'Group A'
when left(sbhx.blood_type,3) = 'A1N' then 'Group A'
when left(sbhx.blood_type,3) = 'A1P' then 'Group A'
when left(sbhx.blood_type,3) = 'A2N' then 'Group A'
when left(sbhx.blood_type,3) = 'A2P' then 'Group A'
when left(sbhx.blood_type,3) = 'APO' then 'Group A'
when left(sbhx.blood_type,3) = 'AWP' then 'Group A'
when left(sbhx.blood_type,3) = 'AWN' then 'Group A'
when left(sbhx.blood_type,3) = 'BNE' then 'Group B'
when left(sbhx.blood_type,3) = 'BPO' then 'Group B'
when left(sbhx.blood_type,3) = 'OPO' then 'Group O'
when left(sbhx.blood_type,3) = 'ONE' then 'Group O'
when left(sbhx.blood_type,3) = 'A1B' then 'Group AB'
when left(sbhx.blood_type,3) = 'A2B' then 'Group AB'
when left(sbhx.blood_type,3) = 'ABN' then 'Group AB'
when left(sbhx.blood_type,3) = 'ABP' then 'Group AB'
when left(sbhx.blood_type,3) = 'AWB' then 'Group AB'
when left(sbhx.blood_type,3) = 'INC' then 'Other'
when left(sbhx.blood_type,3) = 'NAT' then 'Other'
when left(sbhx.blood_type,3) = 'UNK' then 'Other'
else 'Unk Group'
end,
Blood_Type = coalesce(
case
when sbhx.blood_type <> '' then sbhx.blood_type
else 'No Group'
end, 'No Group')
from
zcusIss (nolock)
left join sabts_lab_b_hx_main sbhx (nolock)
on hx_urn = sbhx.urn
where
Blood_Group is null
The join columns in both tables are indexed.
Any help in this regard would be greatly appreciated.
September 12, 2006 at 4:02 am
Have you looked at the query execution plan? That should point you in the right direction. Do you have an index on blood_type?
September 12, 2006 at 6:29 am
Put an index on blood_type, even if it's only temporary, to get your update done. You might then try updating everything to 'Group A' first, then test and update the other groups. You'd have to experiment with your updates; once the column is indexed, separate updates for each group might be quicker than one big CASE statement.
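As a sketch of that per-group idea (table and column names come from the original post; the prefix list mirrors the Group A branches of the CASE, and LIKE predicates are used instead of left() so an index on blood_type can actually seek):

```sql
-- Sketch only: one UPDATE per blood group instead of one big CASE.
update zcusIss
set Blood_Group = 'Group A'
from zcusIss
   join sabts_lab_b_hx_main sbhx on hx_urn = sbhx.urn
where zcusIss.Blood_Group is null
  and (sbhx.blood_type like 'APO%' or sbhx.blood_type like 'ANE%'
       or sbhx.blood_type like 'A1N%' or sbhx.blood_type like 'A1P%'
       or sbhx.blood_type like 'A2N%' or sbhx.blood_type like 'A2P%'
       or sbhx.blood_type like 'AWP%' or sbhx.blood_type like 'AWN%')

-- Repeat with the matching prefixes for 'Group B', 'Group O', 'Group AB'
-- and 'Other', then one final pass setting the leftovers to 'Unk Group'.
```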
September 12, 2006 at 7:39 am
I have an index set up on blood_group. As for the query execution plan, I have run it, but it does not make any sense to me; it just gives a breakdown with a cost allocated to each step.
September 13, 2006 at 12:10 am
Hi
Do you have any foreign key relationships on the table? If so, try removing the relationship for the duration of the update.
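For example, something along these lines (the constraint name here is a placeholder, not from this thread):

```sql
-- Hypothetical foreign key name; substitute the real one.
ALTER TABLE zcusIss NOCHECK CONSTRAINT FK_zcusIss_SomeRef

-- ... run the slow update here ...

-- WITH CHECK makes SQL Server revalidate existing rows on re-enable.
ALTER TABLE zcusIss WITH CHECK CHECK CONSTRAINT FK_zcusIss_SomeRef
```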
Mike
September 13, 2006 at 12:34 am
Hi there,
Below is the table structure and the indexes used on the table:
Name | Value
QUOTED_IDENTIFIER | ON
ANSI_NULLS | ON
ANSI_PADDING | OFF
Columns
Name | Data Type | Max Length
unit_number | varchar | 20
Unit_Urn | varchar | 20
Donor_Urn | varchar | 20
hx_urn | varchar | 20
root_unit | varchar | 20
patient | varchar | 50
Collection_Date | datetime | 8
Product_Group | varchar | 27
Product_Type | varchar | 22
Product_Category | varchar | 25
Product | varchar | 25
Status | varchar | 13
Issued_to_Spec | varchar | 20
Iss_Time_Frame | varchar | 7
Issue_Hour | varchar | 7
Issue_Time | varchar | 14
Blood_Group | varchar | 15
Blood_Type | varchar | 15
IPGroup | varchar | 11
IP | varchar | 15
Issue_User | varchar | 20
Issue_Type | varchar | 9
Specimen_Priority | varchar | 25
Collection_Zone | varchar | 21
Collection_Zone_Branch | varchar | 17
Collection_Zone_Site | varchar | 20
Collection_Province | varchar | 14
Collection_Branch | varchar | 17
Collection_Clinic | varchar | 20
Issue_Date | datetime | 8
Issue_Province | varchar | 16
Issue_Branch | varchar | 17
Issue_Site | varchar | 20
Issue_Zone | varchar | 21
Issue_Zone_Branch | varchar | 19
Issue_Zone_Site | varchar | 20
Processing_Province | varchar | 17
Processing_Branch | varchar | 21
Processing_Site | varchar | 20
Hospital_Province | varchar | 25
Hospital_Branch | varchar | 25
Hospital | varchar | 15
Ward | varchar | 15
Classification | varchar | 10
Insurance_Group | varchar | 10
Medical_Aid | varchar | 15
ICD10 | varchar | 15
Unit_Count | int | 4
Indexes (none is marked primary or unique)
blood_group_idx
collection_date_idx
donor_urn_idx
hx_urn_idx
issue_date_idx
issued_to_spec_idx
patient_idx
root_unit_idx
unit_urn_idx
September 13, 2006 at 4:12 am
You could try updating from a table variable rather than using the lengthy CASE statement:
-- CREATE TABLE VAR
DECLARE @Blood_Groups TABLE (Blood_Type varchar(3), Blood_Group varchar(8) )
-- Insert updatable values
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('APO', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('ANE', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('A1N', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('A1P', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('A2N', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('A2P', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('AWP', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('AWN', 'Group A')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('BNE', 'Group B')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('BPO', 'Group B')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('OPO', 'Group O')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('ONE', 'Group O')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('A1B', 'Group AB')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('A2B', 'Group AB')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('ABN', 'Group AB')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('ABP', 'Group AB')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('AWB', 'Group AB')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('INC', 'Other')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('NAT', 'Other')
INSERT INTO @Blood_Groups (Blood_Type, Blood_Group) VALUES ('UNK', 'Other')
-- Update Blood Group Information of each Unit in zcusIss
-- Left join to table var to retrieve blood group names
update zcusIss
set
Blood_Group = IsNull(BG.Blood_Group,'Unk Group'),
Blood_Type = coalesce(
case
when sbhx.blood_type <> '' then sbhx.blood_type
else 'No Group'
end, 'No Group')
from
zcusIss (nolock)
left join sabts_lab_b_hx_main sbhx (nolock)
on hx_urn = sbhx.urn
LEFT JOIN @Blood_Groups BG
ON left(sbhx.blood_type, 3) = BG.Blood_Type -- match on the 3-char prefix, as in the original CASE
where
zcusIss.Blood_Group is null
September 13, 2006 at 4:43 am
Ah, darn it. Somehow lost my post.
krs has a good idea, although I wonder about using left([fieldname], 3) repeatedly in a case statement. That's assuming that blood_type varies beyond those 3 characters.
You don't show an index on blood_type. What if you created one and, rather than using the case statement, rewrote this as 5 different update statements? Doing that, you could add predicates to your where clause to filter on blood_type, which would actually use an index rather than brute-forcing your way through the table:
"and sbhx.blood_type like 'APO%' and sbhx.blood_type like 'ANE%'" etc.
Another thought is to run this as an INSERT INTO a temp table and then UPDATE back into zcusIss based on the primary key. That would keep you from SELECTing from and UPDATEing the same table at the same time.
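As a rough sketch of that staging idea (assuming unit_urn is usable as the key, which this thread doesn't confirm; the Blood_Group mapping would be staged the same way):

```sql
-- Stage the computed values once...
select zi.unit_urn, sbhx.blood_type
into #staged
from zcusIss zi
   left join sabts_lab_b_hx_main sbhx on zi.hx_urn = sbhx.urn
where zi.Blood_Group is null

-- ...then write them back in a narrow, key-based update.
update zi
set Blood_Type = coalesce(nullif(s.blood_type, ''), 'No Group')
from zcusIss zi
   join #staged s on zi.unit_urn = s.unit_urn

drop table #staged
```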
My biggest concern, though, is that a change in hardware caused this problem. If you still have the old hardware around, you might want to compare your old configuration to the new one to see what's different. Any time a hardware change makes something take 130x longer, you can bet there's something else going on as well.
September 13, 2006 at 8:35 am
What about using LEFT in the CASE only once? You are comparing the same expression against a list of values, so there is no need to repeat the full evaluation in each WHEN.
Blood_Group = Case left(sbhx.blood_type,3)
when 'APO' then 'Group A' -- first 'APO'
when 'ANE' then 'Group A'
when 'A1N' then 'Group A'
when 'A1P' then 'Group A'
when 'A2N' then 'Group A'
when 'A2P' then 'Group A'
when 'APO' then 'Group A' -- second 'APO' (duplicate)
when 'AWP' then 'Group A'
when 'AWN' then 'Group A'
when 'BNE' then 'Group B'
when 'BPO' then 'Group B'
when 'OPO' then 'Group O'
when 'ONE' then 'Group O'
when 'A1B' then 'Group AB'
when 'A2B' then 'Group AB'
when 'ABN' then 'Group AB'
when 'ABP' then 'Group AB'
when 'AWB' then 'Group AB'
when 'INC' then 'Other'
when 'NAT' then 'Other'
when 'UNK' then 'Other'
else 'Unk Group'
end
This is, of course, only if you do not use KRS's solution, which is better than the suggestion above.
September 13, 2006 at 9:39 am
Can you post the query plan?
September 13, 2006 at 1:18 pm
How would I go about posting the query plan?
The old hardware was a server with 4 dual Xeon processors and 8GB RAM. The new hardware is an IBM 345 with 2 dual processors and 4GB RAM. I ran the same query on the old server and it completes in anywhere from 30 minutes to 1.5 hours. The reason for moving to the IBM was that we suffered a db crash and had to recover the db from the mdf files. With the data recovered, we are facing a new problem where the dts job takes forever to complete.
The table I am creating is in turn used as the fact table for an Analysis Services cube. The dts package consists of 5 steps. The first is to populate the majority of the data using pk's; this takes 15 minutes. The other steps then use custom indexes created to speed up the update: blood group, patient, collection date and everything below hospital province. Once the skeleton is created, I use the indexes on this table (zcusIss) to join to other master data tables.
A full blood group will look something like OPOSL, APOSL, AwBPOS, A2BPOS etc. With the left() I am trying to group the blood types together for the cube to use as a dimension.
Please let me know how to go about posting the query plan.
Thanks for all the suggestions so far.
September 13, 2006 at 1:24 pm
How would I go about posting the query plan?
The old hardware was a 4 dual xeon processor with 8gb RAM. The new hardware is a IBM 345 2 dual processors and 4gb RAM. I ran the same query on the old server and it completes in 30 - 1.5 Hours. The reason for moving it to the IBM was that we suffered a db crash and had to recover the db from the mdf files. With the data recovered we are facing a new problem where the dts jobs takes forever to complete.
The table I am creating is in turn used as the fact table for an Analsys Server Cube. The dts package consists out of 5 steps. The first is to populate the majority of data from table using pk's. This takes 15 minutes. Then the other jobs uses custom indexes created to speed up the update, these are blood group, patient, collection date and everything below hospital province. Once the skeleton is created I use the indexes created in this table (zcusIss) to join to other master data tables.
The full blood group will look something like OPOSL, APOSL, AwBPOS, A2BPOS etc. With the left I am trying to group each blood group together for the cube to use as a dimension.
Please let me know how to go about posting the query plan.
Thanks for all the suggestions so far.
September 13, 2006 at 9:10 pm
Take the SQL statement that is taking forever to process and place it into Query Analyzer. Generate the "estimated execution plan" and post the results.
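If the graphical plan is hard to share, SET SHOWPLAN_TEXT returns the plan as a result set you can copy and paste:

```sql
SET SHOWPLAN_TEXT ON
GO
-- Paste the slow UPDATE statement here; it is compiled but not
-- executed, and the plan comes back as copyable text rows.
GO
SET SHOWPLAN_TEXT OFF
GO
```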
September 13, 2006 at 11:16 pm
I know how to obtain the query stats; the problem is that in Query Analyzer you can't copy the query stats results. How do I get the results to post here?