January 4, 2010 at 8:53 am
I have written a query, but im not particularly happy with it. I was hoping that i could get some ideas as to how it could be written differently from this forum.
The requirements are as follows:
Find the OrderID, BillType, BillingFlag and count of OrderLines for that BillType, for all orders where the order has both orderlines with a BillType of 'INI' AND 'SCH', and any orderLines with a BillType of 'INI' has a billingFlag of 'F' AND any OrderLines with a BillType of 'SCH' has a billingFlag of 'F'. Only orderLines with an OrderLineStatus of 'INV' should be returned.
I have supplied a sample table with sample data, and my query for returning the required data. The correct query results are supplied also.
Im just looking to see if it could be written more efficiently.
--sample table and data
create table orderLines
(
orderlineID int,
orderID int,
BillingType char(3),
BillingFlag char(1),
orderlineStatus char(3)
)
insert into orderlines
values(1,1,'INI','F','INV')
insert into orderlines
values(2,1,'SCH','T','INV')
insert into orderlines
values(3,1,'SCH','T','INV')
insert into orderlines
values(4,2,'INI','T','INV')
insert into orderlines
values(5,2,'INI','T','INV')
insert into orderlines
values(6,2,'SCH','F','INV')
insert into orderlines
values(7,3,'INI','F','INV')
insert into orderlines
values(8,3,'SCH','F','INV')
insert into orderlines
values(9,3,'SCH','T','INV')
insert into orderlines
values(10,3,'SCH','T','HIS')
select * from orderlines
--my query
select distinct ol.orderid,ol.billingType,ol.BillingFlag,count(ol.billingType) as 'Num order Lines'
from orderlines ol
join
(
select distinct ol1.orderid
from orderlines ol1 join orderlines ol2 on ol1.orderid = ol2.orderid
and ol1.billingtype = 'INI'
and ol1.BillingFlag = 'F'
and ol1.OrderlineStatus <> 'HIS'
and ol2.billingType = 'SCH'
and ol2.BillingFlag = 'T'
and ol2.OrderLineStatus <> 'HIS'
) as olist on ol.orderid = olist.orderid
where ol.OrderlineStatus <> 'HIS'
and ((ol.billingtype = 'INI' and ol.BillingFlag = 'F') or (ol.billingtype = 'SCH' and ol.BillingFlag = 'T'))
group by ol.orderid,ol.billingtype,ol.BillingFlag
order by ol.orderid
--correct results
orderidbillingTypeBillingFlagNumorder Lines
1INIF1
1SCHT2
3INIF1
3SCHT1
thanks in advance for any help.
January 4, 2010 at 9:02 am
Something like this?
SELECT ol.orderid,ol.billingType,ol.BillingFlag,count(ol.billingType) as 'Num order Lines'
FROM orderlines ol
WHERE OrderLineStatus<>'HIS'
AND (
(ol.billingtype = 'INI'
and ol.BillingFlag = 'F'
)
OR
(
ol.billingType = 'SCH'
and ol.BillingFlag = 'T'
)
)
GROUP BY ol.orderid,ol.billingType,ol.BillingFlag
January 4, 2010 at 9:03 am
You don't need the distinct in the outer query.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 4, 2010 at 9:15 am
lmu92 (1/4/2010)
Something like this?
SELECT ol.orderid,ol.billingType,ol.BillingFlag,count(ol.billingType) as 'Num order Lines'
FROM orderlines ol
WHERE OrderLineStatus<>'HIS'
AND (
(ol.billingtype = 'INI'
and ol.BillingFlag = 'F'
)
OR
(
ol.billingType = 'SCH'
and ol.BillingFlag = 'T'
)
)
GROUP BY ol.orderid,ol.billingType,ol.BillingFlag
hi.
Thanks for the response, if there is an order with only orderlines with a billingtype of INI, or an order only with orderlines of billingType SCH, it will return them. it must only return lines from an order that has both orderlines with a billingtype of SCH and also lines with a billingtype of INI.
Sorry for not specifying that and providing test data for that scenario. adding the following lines will add order 4 to the table, and although order 4 has lines with a BillType of INI and a flag of F, it has no lines in SCH billType. no lines for order 4 should be returned by the query:
insert into orderlines
values(11,4,'INI','F','INV')
insert into orderlines
values(12,4,'INI','F','INV')
January 4, 2010 at 10:53 am
Here's another version using a CTE to check that both billing types exist.
Assuming the required indexes are available, it is slightly faster than your solution (about 20% as per execution plan). Assuming no indexing at all, it's about 30% faster.
;with cte as
(
SELECT orderid FROM orderlines WHERE OrderLineStatus<>'HIS'
AND (
(billingtype = 'INI' and BillingFlag = 'F')
OR
(billingType = 'SCH' and BillingFlag = 'T')
)
GROUP BY orderid
HAVING count(distinct billingtype)=2
)
SELECT ol.orderid,billingType,BillingFlag,count(ol.billingType) as 'Num order Lines'
FROM orderlines ol INNER JOIN cte ON cte.orderID=ol.orderID
WHERE ol.OrderLineStatus<>'HIS'
AND (
(ol.billingtype = 'INI' and ol.BillingFlag = 'F')
OR
(ol.billingType = 'SCH' and ol.BillingFlag = 'T')
)
GROUP BY ol.orderid,ol.billingType,ol.BillingFlag
January 4, 2010 at 1:40 pm
awesome! Thank you for the help on this. I like to see alternative ways of writing queries as I have a problem that once i come up with a solution, i have a difficult time thinking outside the boundaries of the solution i have devised, so seeing alternatives answers to the same problem will hopefully help me with thinking a little more broadly.
Your solution is much more elegant than my own, and although i regularly use CTE's, it hadnt crossed my mind to use one here. thanks again.
January 4, 2010 at 1:58 pm
You're very welcome!
Would you mind sharing the results for both versions when applied to your "real data" for both solutions? (including total number of rows in table and indexes on that table, too). I'm just curious...
January 5, 2010 at 3:05 am
lmu92 (1/4/2010)
You're very welcome!Would you mind sharing the results for both versions when applied to your "real data" for both solutions? (including total number of rows in table and indexes on that table, too). I'm just curious...
So, on running both versions my query consumes 75% and lutz's query takes 25% of the batch. The combined total time for both queries(entire batch) is 34 seconds.
The table in question contains 36914654 orderlines in total, with 12392964 distinct orderID's.
For both queries, the query analyzer recommended a non clustered index on the OrderLineID (which is part of the clustered index of the table) and the OrderStatus, and includes the OrderID, BillingType and BillingFlag.
The table also has a non clustered index on the OrderID, OrderLineStatus and a few other columns that were not mentioned or relivant to this exercise.
Thanks again for your help and tuition.
January 5, 2010 at 3:30 am
Do you have an index on your table that is comparable to the one I would use?
And if so, are the timings taken before or after applying the index?
Important side note:
Since we're talking about a rather large table: If you don't have a similar index like below, you shouldn't just create it right away. It needs to be considered how often this query will be used, how often inserts/updates will take place and if the current duration is acceptable. The addtl. index will consume disk space and will influence (slow down) inserts/updates since one more index needs to be updated.
So: be careful!
CREATE nonclustered INDEX ix_orderLines1 ON orderLines (orderid,billingtype,billingflag) include (orderlinestatus)
January 5, 2010 at 3:38 am
No, there is no index similar to the one you created. I couldnt create this index straight away, regardless of how much it boosts performance as there are strict rules about changes that go into production and how they have to go through Test systems first, so they are also included in the code source safe.
January 5, 2010 at 4:02 am
winston Smith (1/5/2010)
... as there are strict rules about changes that go into production and how they have to go through Test systems first, so they are also included in the code source safe.
Sounds like an appropriate business rule for a production environment.
But we've been talking about the test system anyway, didn't we? 😉
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply