December 20, 2004 at 2:51 am
I want to split a SQL-result in two parts, the first 3 rows, and the others.
The first 3 is easy:
SELECT TOP 3 Field1, Field2, Field3
FROM (HugeDerivedTable) Tmp
WHERE Field1 = @Field1
ORDER BY Field2
But finding the others (all the rows after the third row) is not that easy.
I think it's is possible using EXIST or IN, but I can't find out.
And is it possible without repeating the HugeDerivedTable (lots of JOINs and UNIONS)?
I hope someone can give me a hint, how to deal with this problem.
December 20, 2004 at 5:31 am
One way is put the results into a temp table first with an IDENTITY column and then select based on the value of the IDENTITY column.
Far away is close at hand in the images of elsewhere.
Anon.
December 20, 2004 at 5:57 am
Also, by placing the hugeDerivedTable into a temptable, you won't have to repeat the expression for it, which I seriously think would be very difficult to avoid otherwise, since the question was to separate into two different resultsets.
If hugeDerivedTable is the 'base' for the query, then it has to be repeated for each SELECT that should be based on it.
(eg once for TOP 3 and once for the rest)
Whichever is most efficient can't be answered without empirical testing though. Sometimes creating temptables can be better that derived tables, other times not Only way to find out is to test it.
/Kenneth
December 20, 2004 at 7:37 am
Kenneth and David thanks for your answers.
I see that I didn't explain my question well enough.
The point is, that I want two Queries:
1. one giving the top 3 rows (the easy one)
2. one giving the next rows, number 4, 5, etc.
The problem is the sql for the second table.
There must be a way doing this without using a temp table and without repeating the hugeDerivedTable in the WHERE part of the Sql.
I did see the solution in this Forum, but how hard I tried, I can't find it back. There was a trick that I can't remember.
Hoping someone does?
Btw, adding a column and split the sql on that value is a posibility I'll explore.
December 20, 2004 at 12:38 pm
Not sure if I completely understood you or have missed something, but would this help?
use northwind
declare @a datetime
select top 3 @a= shippeddate
from orders
where shippeddate is not null
order by shippeddate
select * from orders where shippeddate<=@a or shippeddate is null
union all
select * from orders where shippeddate>@a
If there are no NULL values, something like this probably will also work:
select * from orders where shippeddate in
(select top 3 shippeddate from orders where shippeddate is not null)
union all
select * from orders where shippeddate not in
(select top 3 shippeddate from orders where shippeddate is not null)
Are you after some kind of paging?
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
December 21, 2004 at 2:39 am
I did see the solution in this Forum |
If there is a solution then I'd be interested.
As I see it, to find the top 3 (to be able to ignore them) you have to do the 'HugeDerivedTable' first, then apply the results against the same 'HugeDerivedTable' to get what you want. Without repeating the query or using temp table, I cannot see how this can be done.
Good luck with your continued search,
Far away is close at hand in the images of elsewhere.
Anon.
December 21, 2004 at 3:34 am
Frank,
Thank you for your answer.
I tried it, but it is not what I wanted. I'm impressed by your sql but it is to difficult for me to understand the way it works.
It is a good idea to use Northwind.
I'll try to explain what I want using this example.
I want two queries:
USE northwind
SELECT TOP 3 * FROM Orders
ORDER BY orderID
SELECT * FROM Orders
WHERE NOT OrderID IN (SELECT TOP 3 OrderID FROM Orders ORDER BY OrderID)
Is this what you call paging?
In stead of Orders I have a HugeDerivedTable.
What I want, is the second sql in a version that does not use twice the Orders table.
I know that it is possible. I saw the solution this month(?), somewhere on this forum. But can't find it any more. I (and David) hope someone can remember where or know how to. 🙂
December 21, 2004 at 3:51 am
Sorry, after posting it I became aware that it might not be the best example to illustrate it. The basic idea is to assign the third highest or lowest value to a variable and then query for the rows below or equal and in a second query above this value. Either way you need to repeat that what you call HugeDerivedTable twice. Off-hand I don't know a workaround for this. But I would be _very_ interested if there is one.
A workaround (at least less to type) might be to do:
SELECT TOP 3 orderid INTO #t
FROM Orders
ORDER BY orderID
SELECT * FROM Orders
WHERE NOT OrderID IN (SELECT * FROM #t)
DROP TABLE #t
However, that way you will have to deal with stored procedure recompiles and since SELECT INTO is a minimally logged operation it will also invalidate your log, forcing you to do a full or differential backup to keep consistency
I was asking for paging, since this is a common requirement and the NOT IN technique is quite common for this.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
December 21, 2004 at 6:10 am
Frank,
Thank you for your clear explanation.
See, if I can use your way of handling this problem.
But, I will look futher on, trying to find the solution again.
Maybe it was just a dream 😉
I will let you know if find it!
December 21, 2004 at 7:28 am
Could you select the TOP COUNT(*)-3 using the reverse order by? Might have to use a variable and an extra query to get the count... I doubt SQL would like "SELECT TOP (SELECT COUNT(*)...." I don't even know if it will like "SELECT TOP @count..." just a WAG.
December 21, 2004 at 9:29 pm
Hi Henk
You can try this option. It should be faster as Join is more efficient then Where condition.
select a.*
from YOURTABLE a
left outer join
(
select top 3 * from YOURTABLE
) b
on a.PK_FIELD = b.PK_FIELD
where
b.PK_FIELD is null
Hope it will help.
Gregory
December 22, 2004 at 2:13 am
@tab Alleman:
Nice trick, but I am sorry to tell you that it doesn't work.
It gives a "Incorrect syntax error" in the Enterprise Manager.
@gregory Grager
I think that this is the way I am searching for.
But as the YOURTABLE is a HugeDerivedTable and used twice,
there is the problem that it has to be executed twice (speed)
and I have to update two tables on each change.
There must be a solution using the table only once.
I have seen it on this forum but can't find it back.
Thank you both for your help
December 23, 2004 at 1:01 am
I'm having a really hard time figuring out how this can be done.
You want two querys (that is two separate select statements) both using the same "table", in this case a derived table which itself is constructed by a select statement, right?
You also want the same derived table to be used by both querys.
I might be missing something here, but this simply cannot be done.
The trick you're looking for might be how to generate identities on the fly? (a column with a 'rowcounter') It is possible to add a column to a select statement that acts like a 'rowid' of sorts, but... if speed is your concern, then those methods are guranteed not to work well for you - they are ok for small sets of data, but terrible on larger volumes.
Perhaps we should take a step back and look again? Can you expand on your reasons why you absolutely cannot repeat the hugederived table twice? How long does it take to do a select hugederived? How many rows does it contain? What is your reason for wanting two selects? (First 3 and then the rest) ..anything else helpful you may think of
/Kenneth
December 23, 2004 at 1:26 am
Kenneth,
Thanks for your explanation, it made clear that it is not an easy problem.
I think that the solution I once saw, was separating the first row from the other rows. Something with a ISNULL and/or NULLIF. And that is not the same as the top 3.
Why I don't want the HugeDerivedTable twice?
Mainly because it has a parameter in it. There is a solution for this: packing the whole in a sub-table and bringing the parameter outside the pack; I have done this before.
And because now and then there are some changes (when these have to be done twice, there is a greater chance for mistakes).
The speed is not a real problem, SQL Server is fast enough.
I think I will try another solution:
Add an extra field (called priority) with for each group the values 1, 2, 3, 4, 5, etc.
So I can use:
WHERE Priority > 3
or
WHERE Priorty <= 3.
December 23, 2004 at 1:43 am
Well, I don't quite get how the priorities would work, so I can't comment on that, but - if speed isn't the problem, why not then put the hugeDerived into a temptable, or a table variable? If it's complex enough to create, I think it may be warranted to do so (even though temptables are evil ) from the reasons you mentioned - easier to maintain and change code, easier to do regression testing etc.
/Kenneth
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply