July 7, 2010 at 9:02 am
I'm trying to write a script that returns rows in one table that are not in another table. One table is supplier data plstat and some example for one supplier for the year 2008 is:
[pls_supplier] [pls_year] [pls_period]
A1S001 2008 1
A1S001 2008 2
A1S001 2008 3
A1S001 2008 4
A1S001 2008 5
A1S001 2008 6
A1S001 2008 7
A1S001 2008 8
A1S001 2008 9
A1S001 2008 10
A1S001 2008 11
A1S001 2008 12
My other table is a table variable called @pltransactions and has data like this in for the same supplier for 2008:
[supplier] [year] [month] [type] [purchases] [others] [cash]
A1S001 20083c178.400031.2200.0000
A1S001 20087c.0000 .0000 -209.6200
The best code I have come up with so far is:
SELECT DISTINCT pls_supplier, pls_year, pls_period
FROM plstat LEFT OUTER JOIN @pltransactions
ON [supplier] = pls_supplier AND [year] = pls_year AND [month] = pls_period
WHERE [purchases] IS NULL AND [others] IS NULL AND [cash] IS NULL
ORDER BY pls_supplier, pls_year, pls_period
In other words, I want to know which periods in plstat are not found in the table variable for the supplier. The above code gives me the right answer which is:
pls_supplier pls_year pls_period
A1S001 2008 1
A1S001 2008 2
A1S001 2008 4
A1S001 2008 5
A1S001 2008 6
A1S001 2008 8
A1S001 2008 9
A1S001 2008 10
A1S001 2008 11
A1S001 2008 12
However! On the data I have, that code takes nearly 100 seconds to run, which is far too long for a production report. The full data is around 21,000 rows in plstat and 16,000 rows in the table variable. plstat has one row per year/period per supplier, but the table variable can have several rows per year/period per supplier.
It's not an enormous amount of data, so it must be my poor coding skills that's holding it back. Any great ideas out there for improving this?
Regards,
Mark
July 7, 2010 at 9:17 am
Is there any way you could replace the table variable with a temp table? The table variable is no good for so much data. It would not be able to use statistics and this can result in bad performance.
Also what (if any) indexes do you have on your table?
July 7, 2010 at 9:35 am
Have a look at the actual execution plan, this will show you where the performance bottlenecks are. Post them here if you need help analysing them.
Also you should consider using a temp table or perm table so that you can add indexes.
July 7, 2010 at 9:39 am
steveb. (7/7/2010)
Have a look at the actual execution plan, this will show you where the performance bottlenecks are. Post them here if you need help analysing them.Also you should consider using a temp table or perm table so that you can add indexes.
Hi Rihojal and Steveb,
I was hoping to use a table variable because a) I was a bit temp table-happy when I first started with SQL and realised later that it's best to avoid them; and b) the data in the table variable has quite a few duplicates.
I don't have an index on the table variable as even with an index that includes every field in the table, it's still not unique.
I'll do as you suggest and look at the execution plan and also compare the speed with a temp table.
Cheers for your help.
Regards,
Mark
July 7, 2010 at 9:47 am
Why did you think it was best to avoid temp tables and use table variables instead?
July 7, 2010 at 10:23 am
mmcardle (7/7/2010)
I was hoping to use a table variable because a) I was a bit temp table-happy when I first started with SQL and realised later that it's best to avoid them; and b) the data in the table variable has quite a few duplicates.
You might want to check out this article on Comparing Table Variables to Temporary Tables[/url]
Just a point to make: many of the gurus here only use temp tables - mainly because of the statistics issue on table variables causing inadequate execution plans that are avoided by the use of temporary tables.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 7, 2010 at 2:59 pm
If your supplier data is unique the you don't need distinct when looking for records that don't match since there should only be one of each to begin with. And if your transaction data does not allow nulls then you only need to look for one field to be null instead of all 3. Putting the transaction data in a temp table with an index on the supplier code,year and month would help with the query but may be a trade off for the time taken to build the index on a temp table in the first place. Also all tables , temp or not, should have a unique key on them.
July 8, 2010 at 2:12 am
mrpolecat (7/7/2010)
If your supplier data is unique the you don't need distinct when looking for records that don't match since there should only be one of each to begin with. And if your transaction data does not allow nulls then you only need to look for one field to be null instead of all 3. Putting the transaction data in a temp table with an index on the supplier code,year and month would help with the query but may be a trade off for the time taken to build the index on a temp table in the first place. Also all tables , temp or not, should have a unique key on them.
Many thanks for your and everyone's input above. I've decided to go down the temp table route. I find table variables useful, but I can see now that they are best suited to small amounts of data. This is the biggest table variable I've ever used; most of them have far fewer than 100 rows in them, so that is ideal.
A lot of the books, training courses, and online articles I have read have said that temp tables should be a last resort. I'd been using them for years on Informix SQL. I guess there are times when it could be deemed to be lazy coding to use temp tables instead of thinking through a requirement, but temp tables do have their uses.
Thanks again for all your help and for helping me continue my journey of learning.......
July 8, 2010 at 2:38 am
Just reporting back......
OK, the temp table route runs in less than 2 seconds......an acceptable speed given the amount of data and processing involved (of which I only showed you a snippet of because it was the main time-lag). Brilliant!
I used a temp table with an index on the relevant columns for the query links and adding an identity column for uniqueness.
It works great and I have learned my lesson about table variables Vs. temp tables. Both have their uses depending on circumstances.
Thanks for all your help.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply