December 21, 2016 at 3:30 am
Hi all,
A colleague of mine has just asked me if I can explain why a SQL statement he's written that uses UNION to join together 5 different SQL statements is running quicker than if he runs each of the separate statements individually.
Is there a general reason for this, or would it be specific to his query?
The SQL he's written isn't great (he's got a CASE statement in the JOINS in each of the 5 statements he's unioning together), but fundamentally I cannot see how under any circumstance why one part of an UNION statement would run slower than the whole thing because it's got to run that one part as part of the whole thing anyway.
Or am I missing something?
I can get details of the SQL code and tables being used if necessary.
Thanks
Steve
Regards
Steve
December 21, 2016 at 3:37 am
There could be a number of reasons. Have you had a look at the query plan? This will more than likely give you the answer quite promptly 😎
If you're not sure after looking at them, you can save and upload the query plans in a reply, and then many of us here will be more than happy to see where the performance changes are.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
December 21, 2016 at 3:41 am
Steve
DDL, code and execution plans would indeed be helpful, please.
When you do a UNION, the result sets all have to be sorted together to eliminate duplicates, so it does seem surprising that the UNIONed query goes faster. Maybe the query optimizer is making some efficiency savings in the background somewhere. Hard to say without seeing the plans.
John
December 21, 2016 at 3:48 am
Thanks guys, I've asked him to get the execution plan.
If I struggle to see the issue, I'll get some DDL together.
Regards
Steve
December 21, 2016 at 4:36 am
I'd love to see the execution plan for this.
Guessing. If he's doing UNION and not UNION ALL, the UNION acts as an aggregator and it might work to filter differently than when running each of the queries separately.
Get us the plans, please.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 21, 2016 at 4:51 am
I think i've figured this out. The execution plan suggested the JOINs were the issue so I asked for the TSQL code.
He's got the following no no's as far as I can see.
Mix of LEFT and RIGHT JOINS
He's converting the datatypes in the joins
He has a CASE statement in one of the joins.
And I've also found out that 2 of the 4 tables being referenced have no indexes and have no foreign key link back to the tables they are being joined to!!! (hence the data type conversion taking place)
SELECT
__clients.id [XPLAN CLIENT ID]
, Clients.ClientRef AS [AO CLIENT ID]
, Policies.PolicyRef AS [AO POLICY REF]
, __entity_asset.id AS [XPLAN POLICY ID]
, 'Asset' AS [POLICY TYPE]
FROM
__clients
LEFT JOIN Clients ON CONVERT(VARCHAR, Clients.ClientRef) + '_100' = otherid OR CONVERT(VARCHAR, Clients.ClientRef) + '_0' = __clients.otherid
RIGHT JOIN Policies ON
CASE
WHEN Policies.Owner <> 50 THEN CONVERT(Varchar,Policies.ClientRef) + '_' + CONVERT(VARCHAR, Policies.Owner)
ELSE CONVERT(VARCHAR, Policies.ClientRef) + '_100'
END = __clients.otherid
LEFT JOIN __entity_asset ON (__entity_asset.section like '%Policies%') AND (CONVERT(VARCHAR,__entity_asset.otherid) = CONVERT(VARCHAR,Policies.PolicyRef))
WHERE
__entity_asset.id IS NOT NULL
:crazy:
Regards
Steve
December 21, 2016 at 5:00 am
smw147 (12/21/2016)
I think i've figured this out. The execution plan suggested the JOINs were the issue so I asked for the TSQL code.He's got the following no no's as far as I can see.
Mix of LEFT and RIGHT JOINS
He's converting the datatypes in the joins
He has a CASE statement in one of the joins.
And I've also found out that 2 of the 4 tables being referenced have no indexes and have no foreign key link back to the tables they are being joined to!!! (hence the data type conversion taking place)
SELECT
__clients.id [XPLAN CLIENT ID],
, Clients.ClientRef AS [AO CLIENT ID],
, Policies.PolicyRef AS [AO POLICY REF],
, __entity_asset.id AS [XPLAN POLICY ID],
,'Asset' AS [POLICY TYPE]
FROM
__clients
LEFT JOIN Clients ON CONVERT(VARCHAR, Clients.ClientRef) + '_100' = otherid OR CONVERT(VARCHAR, Clients.ClientRef) + '_0' = __clients.otherid
RIGHT JOIN Policies ON
CASE
WHEN Policies.Owner <> 50 THEN CONVERT(Varchar,Policies.ClientRef) + '_' + CONVERT(VARCHAR, Policies.Owner)
ELSE CONVERT(VARCHAR, Policies.ClientRef) + '_100'
END = __clients.otherid
LEFT JOIN __entity_asset ON (__entity_asset.section like '%Policies%') AND (CONVERT(VARCHAR,__entity_asset.otherid) = CONVERT(VARCHAR,Policies.PolicyRef))
WHERE
__entity_asset.id IS NOT NULL
:crazy:
This query isn't viable, the SELECT list isn't properly formed. Just as a matter of interest, what is the table source __clients derived from?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 21, 2016 at 5:06 am
OOPs, that's my fault, ignore the commas at the end (i've edited my above post now)
The __clients and __entity_asset tables have no indexes. They get created by a python tool which inserts CSVs into the SQL database as tables.
Regards
Steve
December 21, 2016 at 5:14 am
smw147 (12/21/2016)
OOPs, that's my fault, ignore the commas at the end (i've edited my above post now)The __clients and __entity_asset tables have no indexes. They get created by a python tool which inserts CSVs into the SQL database as tables.
Ask your python to create these tables with ClientRef and '_100' or whatever in different columns and with the same datatype as the native tables, so that __clients can join to clients on two columns, ClientRef and something else.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 21, 2016 at 5:20 am
That's exactly what I'm going to ask the Python developers to do.
Essentially, the whole process that gets us to this stage is a SQL databse being prepared and migrated from one system to another. The python program outputs the results of records it creates in the target database as CSV files. It then inserts these CSVs into the source databse as tables. The trouble is the outpu CSVs have no ClientRef column to link back to the client, or a PolicyRef column to link back to the Policies table.
It's a legacy system that we're slowly trying to "fix" and improve.
Regards
Steve
December 21, 2016 at 5:38 am
Joy. That's going to run great... For certain values of great.
Still, curious about how the UNION resolves vs. the individual queries.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply