May 31, 2006 at 4:39 pm
I have a stored procedure which executes the following SQL statement:
INSERT INTO CONTRACT_TO_IND_PARTY (COMP_CODE, LESSOR_NUM, MAIN_CONTRACT_NUM
, SUB_CONTRACT_NUM, PARTY_ID, IND_TITLE)
SELECT DISTINCT
CT_COMP_CODE
, CT_LESSOR_NUM
, CT_MAIN_CONTRACT_NUM
, CT_SUB_CONTRACT_NUM
, PTY_Party_Id
, PARSENAME(PTY_Party_Comments, 4) --Title
FROM PT_PTY_Party_View, RAW_CT_Contract_Display_View
WHERE
CAST(SUBSTRING(PTY_Party_Code, 1, 10) AS int) = CT_COMP_CODE
AND CAST(SUBSTRING(PTY_Party_Code, 11, 10) AS int) = CT_LESSOR_NUM
AND CAST(SUBSTRING(PTY_Party_Code, 21, 10) AS int) = CT_MAIN_CONTRACT_NUM
AND PTY_Party_Code IS NOT NULL
AND PTY_Party_Comments IS NOT NULL
ORDER BY CT_COMP_CODE, CT_LESSOR_NUM, CT_MAIN_CONTRACT_NUM, CT_SUB_CONTRACT_NUM
This query takes on the order of six hours to complete. If I run it manually, it takes about a minute. The explain plan for the SELECT statement isn't terrible: it overestimates the number of rows being returned by about 4x, but, like I said, the statement normally completes within a minute. None of the tables seems to have any locks on them.
May 31, 2006 at 4:58 pm
A couple of things:
How many rows are in each table? How many are being inserted?
How are you running it when it takes 6 hours to run? When it takes 1 minute to run?
Remove 'AND PTY_Party_Code IS NOT NULL' because when parsing the field, NULL will never equal CT_COMP_CODE, CT_LESSOR_NUM or CT_MAIN_CONTRACT_NUM
Change the join syntax to:
FROM PT_PTY_Party_View
INNER JOIN RAW_CT_Contract_Display_View
ON CAST(SUBSTRING(PTY_Party_Code, 1, 10) AS int) = CT_COMP_CODE
AND CAST(SUBSTRING(PTY_Party_Code, 11, 10) AS int) = CT_LESSOR_NUM
AND CAST(SUBSTRING(PTY_Party_Code, 21, 10) AS int) = CT_MAIN_CONTRACT_NUM
AND PTY_Party_Comments IS NOT NULL
Why are you using SELECT DISTINCT? The need for DISTINCT normally indicates incomplete join qualification.
Why are you ordering the INSERT? There is no need for this.
Does the show plan include 'parallelism'? If so, try it with OPTION (MAXDOP 1) at the end.
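For example, the hint goes at the very end of the statement. A sketch of the whole thing, folding in the other suggestions above (join rewrite, no ORDER BY, redundant NULL check dropped):
INSERT INTO CONTRACT_TO_IND_PARTY (COMP_CODE, LESSOR_NUM, MAIN_CONTRACT_NUM
, SUB_CONTRACT_NUM, PARTY_ID, IND_TITLE)
SELECT DISTINCT
CT_COMP_CODE
, CT_LESSOR_NUM
, CT_MAIN_CONTRACT_NUM
, CT_SUB_CONTRACT_NUM
, PTY_Party_Id
, PARSENAME(PTY_Party_Comments, 4) --Title
FROM PT_PTY_Party_View
INNER JOIN RAW_CT_Contract_Display_View
ON CAST(SUBSTRING(PTY_Party_Code, 1, 10) AS int) = CT_COMP_CODE
AND CAST(SUBSTRING(PTY_Party_Code, 11, 10) AS int) = CT_LESSOR_NUM
AND CAST(SUBSTRING(PTY_Party_Code, 21, 10) AS int) = CT_MAIN_CONTRACT_NUM
WHERE PTY_Party_Comments IS NOT NULL
OPTION (MAXDOP 1) --force a serial plan for this statement only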
Parsing PTY_Party_Code is a terrible idea. Consider storing it as 3 separate integer columns so that indexes can be used.
May 31, 2006 at 5:22 pm
Functions applied to a column in the WHERE clause cause a full table scan. Such queries are slow by design.
You need to dismantle PTY_Party_Code into the parts you need and keep them in separate indexed integer columns.
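Roughly this kind of thing (PT_PTY_Party as the name of the base table under the view is a guess, and so is the NULLability; adjust to your schema):
--Hypothetical base table name; parsed key parts stored as real, indexable columns
ALTER TABLE PT_PTY_Party ADD
Pty_Comp_Code int NULL,
Pty_Lessor_Num int NULL,
Pty_Main_Contract_Num int NULL

--One-off backfill from the existing concatenated code
UPDATE PT_PTY_Party SET
Pty_Comp_Code = CAST(SUBSTRING(PTY_Party_Code, 1, 10) AS int),
Pty_Lessor_Num = CAST(SUBSTRING(PTY_Party_Code, 11, 10) AS int),
Pty_Main_Contract_Num = CAST(SUBSTRING(PTY_Party_Code, 21, 10) AS int)
WHERE PTY_Party_Code IS NOT NULL

CREATE INDEX IX_PT_PTY_Party_Contract
ON PT_PTY_Party (Pty_Comp_Code, Pty_Lessor_Num, Pty_Main_Contract_Num)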
I know, database design is a boring subject, but sometimes it's really helpful. Read some rules about it.
_____________
Code for TallyGenerator
May 31, 2006 at 5:32 pm
How many rows are in each table? How many are being inserted?
About 267,000 rows are being inserted.
How are you running it when it takes 6 hours to run?
Inside a stored procedure which does a bunch of similar stuff.
When it takes 1 minute to run?
In a Query Analyzer window.
Remove 'AND PTY_Party_Code IS NOT NULL' because when parsing the field, NULL will never equal CT_COMP_CODE, CT_LESSOR_NUM or CT_MAIN_CONTRACT_NUM
Good point.
Change the join syntax to:
FROM PT_PTY_Party_View
INNER JOIN RAW_CT_Contract_Display_View
ON CAST(SUBSTRING(PTY_Party_Code, 1, 10) AS int) = CT_COMP_CODE
AND CAST(SUBSTRING(PTY_Party_Code, 11, 10) AS int) = CT_LESSOR_NUM
AND CAST(SUBSTRING(PTY_Party_Code, 21, 10) AS int) = CT_MAIN_CONTRACT_NUM
AND PTY_Party_Comments IS NOT NULL
When I did this the explain plans were exactly the same. The explain plan expects to end up with about 850k records; could it be throttling?
Why are you using SELECT DISTINCT? The need for DISTINCT normally indicates incomplete join qualification.
This is inserting into a temp table. After that, we look up various IDs for each combination of comp, lessor and main. You may be right, though. Generally there's a fourth element of the key.
Why are you ordering the INSERT? There is no need for this.
I wondered that too, but since I got handed this code this afternoon, I figured there might be a reason. We do have some stuff that cares about the order of inserts. And removing the ORDER BY doesn't change the one-minute plan by much, since it still has to do the DISTINCT.
Does the show plan include 'parallelism'? If so, try it with OPTION (MAXDOP 1) at the end.
Parsing PTY_Party_Code is a terrible idea. Consider storing it as 3 separate integer columns so that indexes can be used.
Agreed.
I figured that since my main problem was that it had such drastically different run times inside and outside of a stored procedure, niggling stuff like the ORDER BY and parsing PTY_Party_Code would come out in the wash.
May 31, 2006 at 5:48 pm
Try adding "WITH RECOMPILE" to the stored procedure definition, and see if that corrects the issue.
May 31, 2006 at 6:16 pm
[EDIT: THIS FIRST PARA WAS WRITTEN WITHOUT READING THE PREVIOUS POSTS PROPERLY] If that doesn't do it, I'd be surprised, but if not, can you see any difference between the showplan outputs, and if so post them? A theoretical possibility, especially if your views are complex, is that your free sql is reusing a 'lucky' query plan that somehow the sp doesn't pick up on, even when looking to generate a fresh execution plan. The optimiser does not always work very well, even with fresh stats, especially with very complex queries. You could clear the plan buffers with DBCC FREEPROCCACHE to create a level playing field, but only after you save the two different versions of the query plan.
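To make that concrete, the sequence in Query Analyzer might look like this (the statement you paste in where the comment says is your own INSERT...SELECT):
--Capture the estimated plan for the standalone statement (with SHOWPLAN on, it is not executed)
SET SHOWPLAN_TEXT ON
GO
--paste the INSERT ... SELECT here and run it to get its plan
GO
SET SHOWPLAN_TEXT OFF
GO
--Once both plans are saved, level the playing field by flushing every cached plan (server-wide)
DBCC FREEPROCCACHE
GO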
Alternatively, any possibility of schema ambiguities in your table names being resolved differently? The stored proc may have an owner other than you, possibly with different permissions. Relatedly, I seem to remember there is also the issue of recompiles being caused by non-schema-qualified table names - so perhaps the problem is not that the sp is reusing an old plan, but that it is recompiling every time? You could run a trace for recompiles, or just try schema-qualifying all your object references (I don't think referenced views have to do the same internally for the sp to reuse plans, but I may be wrong).
The apparently independent problem with the non-1NF data could perhaps be temporarily resolved with indexed computed columns or indexed views in v8+ (SQL Server 2000 and later). Even 1 minute is quite a long time, after all. Of course I don't know how many tables of what size are in your views, and what optimisation they might bear, or their parents, or their parents' parents...
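A sketch of the computed-column route, again guessing that the base table under PT_PTY_Party_View is called PT_PTY_Party - unlike adding real columns, no backfill is needed:
--Hypothetical base table name; the computed columns expose the parsed key parts
ALTER TABLE PT_PTY_Party ADD
Comp_Code_Calc AS CAST(SUBSTRING(PTY_Party_Code, 1, 10) AS int),
Lessor_Num_Calc AS CAST(SUBSTRING(PTY_Party_Code, 11, 10) AS int),
Main_Contract_Num_Calc AS CAST(SUBSTRING(PTY_Party_Code, 21, 10) AS int)

--Indexing computed columns needs the expressions to be deterministic and precise,
--and the session SET options (ARITHABORT, QUOTED_IDENTIFIER, etc.) to be right
CREATE INDEX IX_PT_PTY_Party_ParsedKey
ON PT_PTY_Party (Comp_Code_Calc, Lessor_Num_Calc, Main_Contract_Num_Calc)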
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
May 31, 2006 at 7:04 pm
Jeffrey, because you have 3 "Full Table" joins in your query, you've got 267,000 * 3 ≈ 800k lines in the Hash Join. Plus some other minor duplicates you must have, which gives you those 850k. SQL Server has to record them to a table in tempdb and work out the data in that table. Take into consideration that there are no indexes and no statistics on those 850k rows, and your DISTINCT and your ORDER BY go with a full table scan.
You cannot do anything with it unless you have those 3 indexed columns.
And get rid of that ORDER BY. INSERT applies its own order without paying any attention to the order of the data being inserted, unless the target table does not have any index on it.
_____________
Code for TallyGenerator
May 31, 2006 at 7:52 pm
You'd only be interested in the INSERT order if there was an identity column. Sergiy, are you saying that this won't work and new records might be 'incorrectly' numbered?
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 1, 2006 at 3:33 am
If you use an identity column, you don't really care which row is assigned which number.
Identity is just a "shortcut" for the natural key of the data in the table. And the columns making up this key must be included in a unique clustered index.
If you need to use identity as the only unique constraint, then your data is a mess.
_____________
Code for TallyGenerator
June 1, 2006 at 4:49 am
Sergiy, some data is 'a mess'. Some data refers to its own processing. Someone somewhere has a legitimate reason for doing what I describe. So the answer to the question is...?
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 1, 2006 at 5:22 am
Answer is "There is no point".
_____________
Code for TallyGenerator
June 1, 2006 at 5:36 am
No point in unique clustered index on composite key with a surrogate? Don't be too hard on yourself, there might be a point sometimes. It's just not necessarily a good idea, and certainly not mandatory.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 1, 2006 at 11:49 am
I ended up making a temp table and adding an index to it. It sped the query up enormously.
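Something like this - the types here are approximations rather than the actual definitions:
--Temp table shaped like the insert list; column types are guesses
CREATE TABLE #Contract_To_Ind_Party (
COMP_CODE int,
LESSOR_NUM int,
MAIN_CONTRACT_NUM int,
SUB_CONTRACT_NUM int,
PARTY_ID int,
IND_TITLE varchar(100)
)

--Index on the key parts used by the later lookups
CREATE INDEX IX_Contract_To_Ind_Party
ON #Contract_To_Ind_Party (COMP_CODE, LESSOR_NUM, MAIN_CONTRACT_NUM)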
It doesn't quite solve my underlying problem, though. What's the best way to see explain plans for the middle of a stored procedure? Run explain on the whole thing?
I think part of the problem was using pt_party_view with old stats information.
The MAXDOP 1 thing worked extremely well, too. Thanks for the help.
June 1, 2006 at 12:22 pm
Also, check the definition of the views. I've worked on systems with views nesting views many levels deep, with each level using the same tables, causing them to appear in the plan many times.
June 1, 2006 at 1:30 pm
Ah yes, nested views:
>Of course I don't know how many tables of what size are in your views, and what optimisation they might bear, or their parents, or their parents' parents...
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant