March 14, 2024 at 8:43 pm
Hi everyone
I am looking at improving the performance of my queries. There is an approach I am considering but I don't know if it will actually improve things or not. I won't know until I try (which I will) but I am interested in getting some feedback from the community.
Query 1
select *
from (
select field1,field2
from table1
)
Query 2
select field1,field2
into #temp1
from table1
select *
from @temp1
Query 1 has a nested sub-query but Query 2 does not. Query 2 uses a #temp table to store the results. Which one should perform faster?
I have a lot of Query 1 type of queries that I am considering to moving to Query 2 format. I would like to get some feedback from the community to see if either one is "better".
Thank you
March 14, 2024 at 9:38 pm
For some reason, you didn't ask -- perhaps deliberately -- about the simplest and probably the best option, which is just:
SELECT field1,field2
FROM table1
Because there is no where clause, none of the queries are likely to benefit disproportionately, if at all, from an index.
For this simple example, Query 1 will probably perform as well as the simple select w/o subquery.
Query 2 has the overhead of writes plus extra reads vs. Query 1.
March 14, 2024 at 9:52 pm
For some reason, you didn't ask -- perhaps deliberately -- about the simplest and probably the best option, which is just:
SELECT field1,field2
FROM table1Because there is no where clause, none of the queries are likely to benefit disproportionately, if at all, from an index.
For this simple example, Query 1 will probably perform as well as the simple select w/o subquery. Query 2 has the overhead of writes plus extra reads vs. Query 1.
I see what you mean. I wasn't clear so my apologies for that.
The examples I provided are just simple examples to illustrate the two types of queries. One is nested (with no use of #temp) and other isn't nested (but uses #temp). The actual query has many layers of nested queries (some have close to 4 to 5 layers of nested queries). I am thinking of moving them all out of the nested logic and then put them into separate #temp tables. The end result will be much simpler to read. The only thing is that its going to take a bit of time (which I can definitely do no problem). I just want to see if one approach is theoretically more faster than the other or not that way I know what to expect. Members here have alot of experience so any guidance would be much appreciated. In theory, which approach should be faster? or is there no real performance gain?
Ideally, I only want each sub-query to be run once only. If the current layout of having many layers is forcing a particular sub-query to be re-run then that would be a major performance issue which needs to be addressed.
I am fairly new to SS. I write queries for a hobby so I don't have that kind of professional experience using real-life DB that alot of members here have. I learn by doing (which is ok).
Does that help explain my question? Again, my apologies for not being clear earlier in my post.
March 14, 2024 at 11:01 pm
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
March 14, 2024 at 11:51 pm
The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
March 15, 2024 at 12:04 am
Jonathan AC Roberts wrote:The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
March 15, 2024 at 12:10 am
water490 wrote:Jonathan AC Roberts wrote:The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
March 15, 2024 at 1:00 am
Jonathan AC Roberts wrote:water490 wrote:Jonathan AC Roberts wrote:The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.
March 15, 2024 at 2:23 am
water490 wrote:Jonathan AC Roberts wrote:water490 wrote:Jonathan AC Roberts wrote:The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.
This is very useful information. I didn't know that. Thank you!
The temp tables will have the following format:
Col1 = company
Col2 = date
Col3 etc will be a result from a calculation that is used elsewhere
The various tables are always joined on company and date. The rest of the columns are either used in a select statement(most of the time) or sometimes in the where clause. The where clause is quite limited.
Does this help to narrow down on how the index should be setup? Or do you need to see the actual queries?
March 15, 2024 at 2:51 am
Jonathan AC Roberts wrote:water490 wrote:Jonathan AC Roberts wrote:water490 wrote:Jonathan AC Roberts wrote:The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.
This is very useful information. I didn't know that. Thank you!
The temp tables will have the following format:
Col1 = company
Col2 = date
Col3 etc will be a result from a calculation that is used elsewhere
The various tables are always joined on company and date. The rest of the columns are either used in a select statement(most of the time) or sometimes in the where clause. The where clause is quite limited.
Does this help to narrow down on how the index should be setup? Or do you need to see the actual queries?
Clustered index, ideally unique if possible on (company, date)
March 15, 2024 at 2:53 am
water490 wrote:Jonathan AC Roberts wrote:water490 wrote:Jonathan AC Roberts wrote:water490 wrote:Jonathan AC Roberts wrote:The optimiser should be able to simplify the query to remove the nesting so it might not make much difference.
One of the big advantages of putting you results into a temporary table, if you have a lot of rows, and then using that within another query is that you can index the temporary table to improve performance.
I can try the index approach to see how it impacts performance. What type of index do you suggest?
As I haven't seen the table I can't say more than one which will help the next query the temporary table is used in.
I normally create the temporary table with a clustered index on before rows are inserted into it. If you want more information you will have to provide your queries.
Thanks for that. I completely understand.
From a performance perspective, does it matter if the temp table is explicitly created using the create table command or implicitly created using the INTO command?
If you are using INTO you cannot create the index before rows are inserted. It is generally much slower to create a clustered index after rows have been inserted into the table.
This is very useful information. I didn't know that. Thank you!
The temp tables will have the following format:
Col1 = company
Col2 = date
Col3 etc will be a result from a calculation that is used elsewhere
The various tables are always joined on company and date. The rest of the columns are either used in a select statement(most of the time) or sometimes in the where clause. The where clause is quite limited.
Does this help to narrow down on how the index should be setup? Or do you need to see the actual queries?
Clustered index, ideally unique if possible on (company, date)
Thank you for this. Very helpful.
March 15, 2024 at 4:39 am
You need to remove the drop existing bit, as there is no existing index on a new table.
March 15, 2024 at 5:26 am
I added the clustered indexes to the temp tables. the run time is close to 2.5 minutes. what! before the change the query was running in 20 seconds. what happened??
CREATE TABLE #NextTermStrikeContribution(
[UNDERLYING_SYMBOL] [nvarchar](10) COLLATE Latin1_General_CI_AS NOT NULL,
[QUOTE_DATE] [date] NOT NULL,
[STRIKE_CONTRIBUTION] [numeric](28, 27) NULL
) ON [PRIMARY]
CREATE UNIQUE CLUSTERED INDEX [ClusteredIndexNextTermStrikeContribution] ON #NextTermStrikeContribution
(
[UNDERLYING_SYMBOL] ASC,
[QUOTE_DATE] ASC
)
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply