June 12, 2007 at 7:43 am
Hi, I have a problem with two SQL statements that each work fine on their own, but when I try to union them together, I get the error,
ORDER BY items must appear in the select list if the statement contains a UNION operator.
This error doesn't look right to me since there is no ORDER BY (other than in a correlated subquery that should be processed prior to the UNION). It does though seem to be the subquery that is causing the problem, since if I remove the subquery, the UNION works fine (other than giving me the wrong data coz of the subquery being missing )
The two SQL statements both look like this:
select s.date, p.code
from products p
inner join sales s on s.productcode = p.code
where
s.productcode in
(select top 1 s_inner.productcode from sales s_inner
where s_inner.productcode = s.productcode
order by s_inner.date desc)
So they are both basically using the subquery to filter the data that goes into the JOIN (to make sure only the record with the latest date goes in). The two queries differ in some how they filter for additional columns, which I haven't shown here to keep things simple.
June 12, 2007 at 9:10 am
I don't see the problem in the statement you provided. If I take that statement and duplicate it with a "union" between them it still compiles (syntax check anyway) so I can't help based on what you've provided. Maybe someone else will spot the problem but I think I will need the complete statement (both queries and the union clause).
James.
June 12, 2007 at 10:29 am
Thanks for the help. I've done some more playing around and it looks like it's a SQL Server 2005/2000 issue. (I've also realized that although I was accessing the original database through Management Studio, it was a 2000 database).
I've set up a small test database in SQL Server 2000 that would allow me to run the query exactly as I quoted it in my first post, UNION'ed with itself, and got the same error message. Then I imported the database to SQL Server 2005 and ran the exact same UNION'ed query and it worked fine.
So I guess my question now is, what is it about that query that stops it working in 2000? (Which possibly means I'm now in the wrong forum - sorry!) Unfortunately, upgrading the live database to 2005 isn't an option.
June 12, 2007 at 11:46 am
Hmm, I was able to duplicate the problem, though I can not explain why. It appears to be a parsing error by the 2000 engine. Here is code that will duplicate your problem using a common database (in case anyone else would like to play with it):
select o.orderdate, p1.productid
from orders o join [order details] od on (o.orderid = od.orderid)
join products p1 on (od.productid = p1.productid)
where p1.productid in (select top 1 p2.productid from products p2 where p2.productid = p1.productid order by p2.productname) and
orderdate between '07/01/1996' and '7/30/1996'
union
select o.orderdate, p1.productid
from orders o join [order details] od on (o.orderid = od.orderid)
join products p1 on (od.productid = p1.productid)
where p1.productid in (select top 1 p2.productid from products p2 where p2.productid = p1.productid order by p2.productname)and
orderdate between '07/01/1996' and '7/30/1996'
order by o.orderdate, p1.productid
While the subquery in my example above does not actually accomplish anything it does cause the error you reported and by my reconning should work in 2000 as the order by is in the sub-query and not part of the unioned query.
You may have to consider rewritting the query with possibly an intermediate table that would allow you eliminate the order by in the sub-query.
--James.
If you would like an extra pair of eyes to see if a rewrite of the query will work post the definition of the tables involved and the complete statement or procedure involved. Also if there is a parent table for sales, post that also as your sales table (from the little I've seen) doesn't appear to be normalized fully. A related parent or child table might help make the query easier to write.
June 12, 2007 at 12:23 pm
Maybe I am missing something... but why even have that order by in there?
June 12, 2007 at 12:51 pm
Anders: Without knowing the full design or his reasoning I can't answer for the original post, in mine I included it just to create the error. From the data given in the original post I would think the where clause could be changed to "where s.productcode is not null" and the result set would be the same since the subquery will not result in the elimination of any rows from the outer-query result set. So the sub-query "order by" problem is really moot.
I was more curious about the error reported by SQL 2000 as it should be possible to have a corrolated sub-query using TOP and order by with a union clause.
James.
June 13, 2007 at 8:06 am
Thanks for the offer, Enthusiast. You are correct that the sales table is not normalized, but that's outside my control; I can only query the database, not change it.
In terms of what I'm doing, essentially I'm trying to join the sales and product tables (Well, really I'm trying to do more than that but the additional stuff isn't relevent to this problem). However, there are multiple sales for each product but I only want the resultset to include the latest sale for each product. That's what the ORDER By thing in the subquery is for: It's there to exclude all but the sales row with the latest date for each product (I guess that should answer Anders' question too). If you know of another way of achieving that without using the correlated subquery (and that'll work on SQL Server 2000), I'd be interested. Offhand I can't think of one.
The tables I put together to reproduce the problem for this forum with the code from my first post are:
CREATE TABLE [dbo].[Products](
[Code] [int] IDENTITY(1,1) NOT NULL
)
CREATE TABLE [dbo].[Sales](
[Code] [int] NULL,
[ProductCode] [int] NULL,
[Date] [datetime] NULL
)
Sales.ProductCode is intended to be a foreign key that links to the Products.Code primary key (though as it happens I didn't set it up as a primary key in the test database)
And - parsing error in SQL Server 2000? Hmmm, that would certainly explain the issues I've been having with the query 🙂
June 13, 2007 at 9:28 am
Unless I'm missing something the following should do exactly what you asked for "sales row with the latest date for each product":
select productcode, max(date)
from sales
group by productcode
order by productcode
HTH,
James.
June 13, 2007 at 9:57 am
After rereading your last post I saw that you wanted to be able to join the sales and product tables, I'm assuming you want the get additionall information about the product after you determin the last sales date. Here I'm using the same query I provided above but as a computed table that is then used in the query that returns the final result set.
I thought I would through in some test code also, but the final query is the one you are interested in.
if object_id('products') is not null drop table products;
if object_id('sales') is not null drop table sales;
--CREATE TABLE [dbo].[Products](
-- [Code] [int] IDENTITY(1,1) NOT NULL
--);
--CREATE TABLE [dbo].[Sales](
-- [Code] [int] NULL,
-- [ProductCode] [int] NULL,
-- [Date] [datetime] NULL
--);
--CREATE some random product test data (100 products with random names)
SELECT TOP 100
Code = IDENTITY(INT,1,1),
Name = CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))
INTO dbo.products
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2 ;
--CREATE some random sales data (1000 sales limited to the 100 products)
SELECT TOP 1000
RowNum = IDENTITY(INT,1,1),
ProductCode = CAST(RAND(CAST(NEWID() AS VARBINARY))*100+1 AS INT),
[Date] = CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME)
INTO dbo.sales
FROM dbo.products p,
Master.dbo.SysColumns sc ;
--select * from products;
--select * from sales;
--select distinct productcode from sales
--Now get the code, name and lastest sale date for every product ever sold
--NOTE: If a product is not sold it won't appear here.
select p.code, p.name, t.lastsaledate
from products p join (
select productcode, max(date) lastsaledate
from sales
group by productcode
) t on p.code = t.productcode
order by p.code
;
--James
June 15, 2007 at 9:42 am
Wow, thanks James! That's very useful. And yep, I'd forgotten you could use max() in conjunction with group by to get the set of latest dates. Much neater a way of doing it.
June 15, 2007 at 9:53 am
Your welcome. Glad I could help.
James.
June 22, 2007 at 9:41 am
I'm running into the same problem... only I can't use Max because I need the 15 most recent records for each EmployeeId. The following query works in SQL 2005, not in SQL 2000: Obviously it is just an example.
SELECT NULL CompanyName, NULL OrderDate
UNION ALL
SELECT c.CompanyName, o.OrderDate
FROM Orders o INNER JOIN Customers c ON o.CustomerId = c.CustomerId
WHERE c.CustomerID = 'VINET' AND o.OrderId IN
(SELECT TOP 3 o2.OrderID
FROM Orders o2
WHERE o2.CustomerID = o.CustomerID
ORDER BY o2.OrderDate DESC)
To get around this right now, I've created a table variable to get the information I need using the correlated subquery. Later I select from the table variable and UNION the results to other Select Statements needed for my query. If anyone knows of an easier way to do this that allows UNION I'd like to know.
Thanks!
June 22, 2007 at 11:07 am
Ryan: I don't have a better solution of the top of my head. However you would be better off posting this as a new question. Most people won't read this deep into an old message unless they were already invested (like me) in the solution of the original question.
I suggest reposting this as a new question similiar to the following:
Title: "Help with error involving UNION operator and ORDER BY clause"
Describe the problem and the solution you came up with and ask if anyone knows how to do it as a single SQL statement that will work in SQL 2000.
Good Luck
James.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply