SQL - Derived Tables

Question

SQL - Derived Tables

Prashant Pandey

SSChasing Mays

Points: 657
More actions
January 15, 2008 at 9:44 pm

#182800

Comments posted to this topic are about the item SQL - Derived Tables
Thanks,
Prashant

Viewing 15 posts - 1 through 15 (of 35 total)

You must be logged in to reply to this topic. Login to reply

Rob Hershfield SSC-Addicted Points: 490 More actions · Answer 1

I try to use derived tables whenever possible, but I recently improved performance by changing a derived table to a temp table. It was large and joined on a non-index column so I found that a temp table with an index on the join fields.

David A. Long SSCrazy Eights Points: 8421 More actions · Answer 2

Great article.

We have been using the derived table as a great complex query design tool:

Break the complex query into derived tables, generally by the specific table's restrictions.

Then JOIN the individual derived table queries into the complex query.

BTW here id the fix for the original query:

SELECT C.CustomerID, C.CompanyName

, COUNT(O.OrderID) AS TotalOrders

FROM Customers C

LEFT OUTER JOIN Orders O ON C.CustomerID = O.CustomerID

AND O.OrderDate >= '19960101'

AND O.OrderDate < '19970101'

GROUP BY C.CustomerID, C.CompanyName

The last step is to evaluate for optimization.

A suggestion is no try and not use functions in the where clause as that can cause any available index to be skipped.

Andy

corey lawson Hall of Fame Points: 3773 More actions · Answer 3

"Theta Joins" also can be used.

For the given example in the article, the results should match for this query:

select c.*, o.*

from customers as c

left join orders as o

on c.customerid = o.customerid and year(o.orderdate) = 1996

--where

-- ...

Jeff Moden SSC Guru Points: 1004701 More actions · Answer 4

...and everything happens in memory instead of a combination of memory and disk.
...while a temporary table exists until the server is brought down and also it uses lot of disk space than a derived table in the temdb database.

You wrote a very good introduction to the use of Derived Tables... but you really need to check your resources on the two statements above....

First, Derived Tables are NOT memory only constructs... just like any query, if they run out of room in memory, they WILL make a thing called a "Working" table and you can frequently see them in the text version of execution plans. Guess where those bad boys live... you guessed it... memory if they fit and if they don't, TEMPDB!

Also, your information about Temp Tables living only on disk is dead wrong. Please read the following Microsoft provided URL...

http://support.microsoft.com/default.aspx?scid=kb;en-us;305977&Product=sql2k

... and pay particular attention to where it states "If memory is available, both table variables and temporary tables are created and processed while in memory (data cache). "

You also need to check Books Online for what the persistance of a Temp Table actually is... you DON'T need to bring the server down to get rid of a Temp Table. 😉

Don't let this dampen your writing spirit... you provided a really good intro to derived tables... I just had to correct your statements about temp tables. In fact, there are some occasions where a Temp Table will blow the doors off of derived tables... usually in a multi-table-joined update when aggragates are necessary. They work pretty well as a substitute for CTE's, too!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

tony rogerson SSC Veteran Points: 242 More actions · Answer 5

Derived tables are a great performance optimisation tool but always check because the derived table approach is not always the best and temporary tables are sometimes a necessary approach.

The derived table itself is just expanded into the main query (just like a view) this means that there are no statistics available for the derived table.

Statistics are held for the temporary table; you can also create indexes on temporary tables.

I use derived tables a lot, well, one hell of a lot to be honest and they are a good structure but also look at CTE's which is what I'm using more - they can be referenced more like a virtual table in your queries.

Tony.

pain_killer SSC Enthusiast Points: 142 More actions · Answer 6

David A. Long (1/16/2008)
Great article.
BTW here id the fix for the original query:
SELECT C.CustomerID, C.CompanyName
, COUNT(O.OrderID) AS TotalOrders
FROM Customers C
LEFT OUTER JOIN Orders O ON C.CustomerID = O.CustomerID
AND O.OrderDate >= '19960101'
AND O.OrderDate < '19970101'
GROUP BY C.CustomerID, C.CompanyName
Andy

I think it also do the job:

SELECT C.CustomerID, C.CompanyName

, COUNT(O.OrderID) AS TotalOrders

FROM Customers C

LEFT OUTER JOIN Orders O ON C.CustomerID = O.CustomerID AND

YEAR(O.OrderDate) = 1996

--AND O.OrderDate >= '19960101'

--AND O.OrderDate < '19970101'

GROUP BY C.CustomerID, C.CompanyName

OR:

SELECT C.CustomerID, C.CompanyName, COUNT(O.OrderID) AS TotalOrders

FROM Customers C LEFT OUTER JOIN Orders O ON

C.CustomerID = O.CustomerID

WHERE YEAR(O.OrderDate) = 1996

GROUP BY ALL C.CustomerID, C.CompanyName

daninmanchester Ten Centuries Points: 1204 More actions · Answer 7

I would tend to put some of the criteria in the FROM clause in a lot of cases rather than move straight to derived tables. Although in complex situations derived tables can help simplify things. But if the derived table is of particular value and reusable i will often put it in a view anyway.

pain_killer , the point about NOT using the Year() function is that it will make better use of any index.

Dan Bayley
affordable website design UK
Free Google and Yahoo sitemap generator

humbleDBA Hall of Fame Points: 3540 More actions · Answer 8

I'm afraid I'm with Jeff Moden and Tony Rogerson on this.

I think this is a good example of needing to be careful and understand the implications of what you are saying, and to have covered your research well. There is plenty of info out there on this area from 'good' sources to allow you to draw a conclusion(s) that you can back up (right or wrong, subjectively of course 😉 )

That said though, I would like to mirror Jeff's comments of not letting the replys dampen your spirit. You have put yourself forward for peer review and from my own experiences its a great way to gain experience and knowledge, and to improve, though it can seem quite painful along the route :crying: .

Josep SSCommitted Points: 1518 More actions · Answer 9

And why not using this sentence:

SELECT C.CustomerID, C.CompanyName,

COUNT(O.OrderID) AS TotalOrders

FROM Customers C LEFT OUTER JOIN Orders O ON

C.CustomerID = O.CustomerID AND (YEAR(O.OrderDate) = 1996 OR O.OrderDate IS NULL)

GROUP BY C.CustomerID, C.CompanyName

pain_killer SSC Enthusiast Points: 142 More actions · Answer 10

daninmanchester (1/16/2008)
pain_killer , the point about NOT using the Year() function is that it will make better use of any index.

Sorry, my english is very poor, and sometimes I don't understand what is the point of the problem, so please be patient to me:). In this case I had compare IO, TIME statistics and execution plans. YEAR() function and explicit date comparing in this case looks identical. So the question is: why use YEAR() function? I think that this looks clearer to me, and it uses only one parameter (2006 in this example).

tony rogerson SSC Veteran Points: 242 More actions · Answer 11

Using columns in functions means SQL Server can no longer do a seek; instead it must do a scan.

If there was a unique index on CustomerID, orddate.

Say you had 1 million rows per year between 2000 and 2007 for customerid = 1234

Say you wanted the counts for customerid 1234 for year 2007

Using the function Year( orddate ) = 2007 means that all the rows from 2000 for that customerid in the index need to be searched because SQL Server does not know the result until it has read the data and performed the function on the column.

Using >= '20070101' and < '20080101' allows SQL Server to just seek straight to the start 20070101 and read from there.

Jimi Meyer SSCommitted Points: 1764 More actions · Answer 12

As others have said, this is generally a good article and the technique is very valuable, but you have many errors. The characterization of the locking in the temp table example is simply wrong. It implies that access to tempdb is single threaded. It fails to mention that there are locks in the database providing the data (shared, but still locks). "Everything happens in memory" only if the data being referenced is already there.

mkheli@gmail.com SSC Journeyman Points: 81 More actions · Answer 13

Hi

You could update on driven table but in other way,

You were trying :

Update T SET Id=0 FROM (SELECT * FROM tt1) AS T

but you could use this as follow , will work fine :

update t set id =0

from tt1 as T where ...

You could add you where clause too, you could do that with complex driven tables too.

Mnaouar Ben Khelifa

http://www.mnaouar.bestilan.com

Jeff Moden SSC Guru Points: 1004701 More actions · Answer 14

Hey Prashant...

You remember I said "don't let it dampen your writing spirit"? Here's why... like I said, you have some good information in your article and people have already cited your article as an example in at least one of the posts... see the following URL for what I'm talking about...

http://www.sqlservercentral.com/Forums/Topic441707-5-1.aspx#bm443531

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)