nested subqueries vs separate queries performance difference

Question

nested subqueries vs separate queries performance difference

water490

SSC Eights!

Points: 983
More actions
March 14, 2024 at 8:43 pm

#4395488
Hi everyone
I am looking at improving the performance of my queries. There is an approach I am considering but I don't know if it will actually improve things or not. I won't know until I try (which I will) but I am interested in getting some feedback from the community.
Query 1
```
select *
from (
select field1,field2
from table1
     )
```
Query 2
```
select field1,field2
into #temp1
from table1

select *
from @temp1
```
Query 1 has a nested sub-query but Query 2 does not. Query 2 uses a #temp table to store the results. Which one should perform faster?
I have a lot of Query 1 type of queries that I am considering to moving to Query 2 format. I would like to get some feedback from the community to see if either one is "better".
Thank you

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply

ratbak Ten Centuries Points: 1385 More actions · Answer 1

For some reason, you didn't ask -- perhaps deliberately -- about the simplest and probably the best option, which is just:

SELECT field1,field2
FROM table1

Because there is no where clause, none of the queries are likely to benefit disproportionately, if at all, from an index.

For this simple example, Query 1 will probably perform as well as the simple select w/o subquery.

Query 2 has the overhead of writes plus extra reads vs. Query 1.

water490 SSC Eights! Points: 983 More actions · Answer 2

ratbak wrote:

For some reason, you didn't ask -- perhaps deliberately -- about the simplest and probably the best option, which is just:
SELECT field1,field2
FROM table1
Because there is no where clause, none of the queries are likely to benefit disproportionately, if at all, from an index.
For this simple example, Query 1 will probably perform as well as the simple select w/o subquery. Query 2 has the overhead of writes plus extra reads vs. Query 1.

I see what you mean. I wasn't clear so my apologies for that.

The examples I provided are just simple examples to illustrate the two types of queries. One is nested (with no use of #temp) and other isn't nested (but uses #temp). The actual query has many layers of nested queries (some have close to 4 to 5 layers of nested queries). I am thinking of moving them all out of the nested logic and then put them into separate #temp tables. The end result will be much simpler to read. The only thing is that its going to take a bit of time (which I can definitely do no problem). I just want to see if one approach is theoretically more faster than the other or not that way I know what to expect. Members here have alot of experience so any guidance would be much appreciated. In theory, which approach should be faster? or is there no real performance gain?

Ideally, I only want each sub-query to be run once only. If the current layout of having many layers is forcing a particular sub-query to be re-run then that would be a major performance issue which needs to be addressed.

I am fairly new to SS. I write queries for a hobby so I don't have that kind of professional experience using real-life DB that alot of members here have. I learn by doing (which is ok).

Does that help explain my question? Again, my apologies for not being clear earlier in my post.

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 3

The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.

One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.

water490 SSC Eights! Points: 983 More actions · Answer 4

Jonathan AC Roberts wrote:

The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.

I can try the index approach to see how it impacts performance. What type of index do you suggest?

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 5

water490 wrote:

Jonathan AC Roberts wrote:
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?

As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.

I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.

water490 SSC Eights! Points: 983 More actions · Answer 6

Jonathan AC Roberts wrote:

water490 wrote:
Jonathan AC Roberts wrote:
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.

Thanks for that. I completely understand.

From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 7

water490 wrote:

Jonathan AC Roberts wrote:
water490 wrote:
Jonathan AC Roberts wrote:
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?

If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.

water490 SSC Eights! Points: 983 More actions · Answer 8

Jonathan AC Roberts wrote:

water490 wrote:
Jonathan AC Roberts wrote:
water490 wrote:
Jonathan AC Roberts wrote:
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.

This is very useful information. I didn't know that. Thank you!

The temp tables will have the following format:

Col1 = company

Col2 = date

Col3 etc will be a result from a calculation that is used elsewhere

The various tables are always joined on company and date. The rest of the columns are either used in a select statement(most of the time) or sometimes in the where clause. The where clause is quite limited.

Does this help to narrow down on how the index should be setup? Or do you need to see the actual queries?

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 9

water490 wrote:

Jonathan AC Roberts wrote:
water490 wrote:
Jonathan AC Roberts wrote:
water490 wrote:
Jonathan AC Roberts wrote:
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.
This is very useful information. I didn't know that. Thank you!
The temp tables will have the following format:
Col1 = company
Col2 = date
Col3 etc will be a result from a calculation that is used elsewhere
The various tables are always joined on company and date. The rest of the columns are either used in a select statement(most of the time) or sometimes in the where clause. The where clause is quite limited.
Does this help to narrow down on how the index should be setup? Or do you need to see the actual queries?

Clustered index, ideally unique if possible on (company, date)

water490 SSC Eights! Points: 983 More actions · Answer 10

Jonathan AC Roberts wrote:

water490 wrote:
Jonathan AC Roberts wrote:
water490 wrote:
Jonathan AC Roberts wrote:
water490 wrote:
Jonathan AC Roberts wrote:
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.
This is very useful information. I didn't know that. Thank you!
The temp tables will have the following format:
Col1 = company
Col2 = date
Col3 etc will be a result from a calculation that is used elsewhere
The various tables are always joined on company and date. The rest of the columns are either used in a select statement(most of the time) or sometimes in the where clause. The where clause is quite limited.
Does this help to narrow down on how the index should be setup? Or do you need to see the actual queries?
Clustered index, ideally unique if possible on (company, date)

Thank you for this. Very helpful.

water490 SSC Eights! Points: 983 More actions · Answer 11

posting error

This reply was modified 10 months ago by water490.
This reply was modified 10 months ago by water490.
This reply was modified 10 months ago by water490.
This reply was modified 10 months ago by water490.

Jonathan AC Roberts SSCoach Points: 18380 More actions · Answer 12

You need to remove the drop existing bit, as there is no existing index on a new table.

water490 SSC Eights! Points: 983 More actions · Answer 13

Jonathan AC Roberts wrote:

You need to remove the drop existing bit, as there is no existing index on a new table.

Thank you!

water490 SSC Eights! Points: 983 More actions · Answer 14

I added the clustered indexes to the temp tables. the run time is close to 2.5 minutes. what! before the change the query was running in 20 seconds. what happened??

CREATE TABLE #NextTermStrikeContribution(
[UNDERLYING_SYMBOL] [nvarchar](10) COLLATE Latin1_General_CI_AS NOT NULL,
[QUOTE_DATE] [date] NOT NULL,
[STRIKE_CONTRIBUTION] [numeric](28, 27) NULL
) ON [PRIMARY]

CREATE UNIQUE CLUSTERED INDEX [ClusteredIndexNextTermStrikeContribution] ON #NextTermStrikeContribution
(
[UNDERLYING_SYMBOL] ASC,
[QUOTE_DATE] ASC
)