August 8, 2008 at 2:47 pm
Hi All,
Please help/teach me how to tune this query. I have a SQL2k5 server on SAN. The tables in the query had proper clustered indexexes. There are 3 tables in the query have a huge amount of data since I'm in data warehouse environment:
Invoice_transaction = 94M of records
Invoice = 23M of records
Regular_invoice_transactions = 15M of records
The rest of the tables have less than 10K of records.
It took about 15min for the query to complete and returned about 578K of records. I turned on the query execution plan and did not see anything that could cause the slowness of the query.
I'd like to cut down the running time to half if possible, but I have no hope.
Please help! I really appreciate for any inputs. Below is the query:
select T1.INV_NUM,
T2.TRANSACTION_ZZ,
T3.FSC_NUMBER,
T4.PROCEDURE_CODE,
T2.ZGW_MOD_1,
T2.ZGW_MOD_2,
T2.ZGW_MOD_3,
T2.ZGW_MOD_4,
T2.CHARGE_AMOUNT,
T5.NUMBER,
T6.NUMERIC_CODE,
T7.FSC_NUMBER as FSC_NUMBER2,
T2.UNITS,
T2.UNITS_TOT,
T8.NUMBER as NUMBER2,
T2.BATCH_NUM,
convert(datetime, isnull(T2.POST_DT_1,T2.BANK_DEP_DT)) as POST_DATE,
convert(datetime,T2.BANK_DEP_DT) as BANK_DEP_DT,
T9.EXTERNAL_PERIOD,
T2.CR_AMT,
T2.DEBIT_AMT,
T2.ADJ_AMT,
T2.PAY_CODE_NUM,
T5.MNEMONIC,
T1.GUARANTOR,
T1.PATIENT,
T1.INVOICE,
T2.FSC,
T2.PROC_ZZ,
T1. BILLING_AREA,
T1.PROV,
T1.ZGW_ORIG_FSC,
T2.POST_PD,
Reg_PATNM = substring(t10.pat_nm, patindex('%,%', T10.PAT_NM)+1, 100),
T9_PatNM = case when patindex('%,%', T10.PAT_NM)>1
then substring (T10.PAT_NM, 1, patindex('%,%', T10.PAT_NM) -1)
else null end,
T2.UNITS_TIME, T2.UNITS_BASE,
T4.PLUS_SVC_SUMMARY
from invoice T1 with (nolock),
invoice_transaction T2 with (nolock),
DN19 T3,
DN1 T4,
DN202 T5,
DN3 T6,
DN19 T7,
DN100 T8,
PDMAP T9,
regular_invoice_transactionT10 with (nolock)
where T1.GROUP_ZZ = T2.GROUP_ZZ and T1.GUARANTOR = T2.GUARANTOR and T1.PATIENT = T2.PATIENT
and T1.INVOICE = T2.INVOICE and T2.FSC = T3.Record_Number
and T2.PROC_ZZ = T4.Record_Number
and T1.BILLING_AREA = T5.Record_Number
and T1.PROV = T6.Record_Number and T1.ZGW_ORIG_FSC = T7.Record_Number
and T1.LOC = T8.Record_Number and T2.POST_PD = T9.INTERNAL_PERIOD
and T9.EXTERNAL_PERIOD = 200807 and T2.group_zz = 4
and T1.patient = T10.patient
Minh Vu
August 8, 2008 at 2:58 pm
Is it possible to add covering indexes to the tables? I see you mention clustered indexes, but I don't know what (if any) other indexes you have.
It would also be helpful if you could post the table and index definitions, and some small amount of sample data for them. Also, the execution plan for the query.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 10, 2008 at 5:25 pm
I'd also recommend you rewrite this as
select x
from table a
inner join table b
on a.x = b.x
rather than your syntax. It's possible you included a cross join in there somewhere. Also, are you expecting 578k records? That seems like a lot of data, perhaps too much to even work with.
August 11, 2008 at 1:25 am
Without table structure and indexes, difficult to say much more than has been said.
What are you planning to do with those half a million records?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 11, 2008 at 3:38 pm
I'm sure the query plan will give you a percentage at each checkpoint
so based on the percentages you can determine which operation is taking more time
If it is evenly distributed then you shud look at reorganizing ur query.
But the result set is too huge for a query.
August 12, 2008 at 8:29 am
Pay a pro to remote into your box and tune the query for you, and then teach you how they did it. You will be going back and forth with questions and answers for days otherwise and almost certainly will not get sufficient results due to the complexity of the scenario. We really need to be able to get details that can be most effectively garnered only while connected to the server.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
August 12, 2008 at 7:40 pm
Thank you all for all of the inputs. I took SSChampion's suggestion and it did work for me. It ran a lot faster by doing that and put the query into stored procedure.
Minh
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply