March 30, 2015 at 3:01 pm
We sell & ship packages that contain multiple items within them. The actual package (we call it the "parent item") is in the same table as the items within it ("child items"). If the record is a child item within a package, its "ParentId" field will contain the ItemId of the package.
So some sample records of a complete package would look like this:
ItemId | ParentId | Name | QtyAvailable
----------------------------------------
1 | NULL | Package A | 10
2 | 1 | Item 1 | 2
3 | 1 | Item 2 | 3
ItemId's 2 & 3 are items contained within the ItemId 1 package.
Now however, the client wants us to build a report showing all packages (all items where ParentId is NULL) however, they want to see the QtyAvailable of not only the package but the items as well (a total of 15 when using the example above), all grouped into a single line. So a sample report line would look like this:
Name | Available Qty
--------------------------
Package A | 15
Package B | 100
How can I do a SELECT statement that SUMS the "QtyAvailable" of both the parent & child items and displays them along with the package name?
Thanks
March 30, 2015 at 3:29 pm
I would probably do it this way:
Package---(1,M)--Contains--(M,1)--Items
CREATE TABLE Contents (
PackageID INT NOT NULL,
ItemID INT NOT NULL,
Qty INT
FOREIGN KEY fkItemID ItemID REFERENCES Item(ItemID),
FOREIGN KEY fkPkgID PackageID REFERENCES Package(PackageID));
I'm assuming that the Items are generic/non-unique. If you do it this way, getting counts is stupid easy.
March 30, 2015 at 3:43 pm
Quick question.
Can the packages contain packages?
March 30, 2015 at 3:47 pm
This is one query that should at least get the data as you want it.
It's pretty inefficient, as it requires a couple table scans, so you might want to look for alternate ways to implement the logic that would perform better.
At any rate, this should give you something to start with (I put about 20 million rows of data similar to yours in a temp table to test; without the exact DDL for your table and indexes that's about as much as I can do):
SELECT Packages.Name, Packages.ItemID, Item.TotalQtyAvailable
FROM #tmp Packages
INNER JOIN (SELECT COALESCE(ParentID,ItemID) AS PackageID,
SUM(QtyAvailable) as TotalQtyAvailable
FROM #tmp
GROUP BY COALESCE(ParentID,ItemID)) Item
ON Packages.ItemID=Item.PackageID
WHERE Packages.ParentID IS NULL
That took a couple minutes to run against 20 million rows on my machine. Your mileage may vary, as always 🙂
I hope this helps!
Cheers!
EDIT: Luis posted his important question while I was typing this up. That does indeed matter, as this solution assumes all packages have a null ParentID. If that's not true, then the solution would have to change.
March 30, 2015 at 4:03 pm
create table tab(
ItemId int,
ParentId int,
Name varchar(10),
QtyAvailable int);
insert into tab (
ItemId,
ParentId,
Name,
QtyAvailable)
select
1 as ItemId,
null as ParentId,
'Package A' as Name,
10 as QtyAvailable
union all
select
2,
1,
'Item 1',
2
union all
select
3,
1,
'Item 2',
3;
with
subquery_cte as (
select
coalesce(ParentId, ItemId) as Id,
case
when
ParentId is null
then
Name
else
null
end as Name,
sum(QtyAvailable) as QtyAvailable
from
tab
group by
coalesce(ParentId, ItemId),
case
when
ParentId is null
then
Name
else
null
end)
select
min(Name) as Name,
sum(QtyAvailable) as QtyAvailable
from
subquery_cte
group by
Id;
drop table tab;
Assuming, you mean simple parent-child relation, this is just a bit different solution. I believe it's rather efficient because references the table only once.
March 30, 2015 at 6:49 pm
I'd have to go with something similar to Rafal here but no need for two GROUP BYs:
WITH SampleData AS
(
SELECT ItemID=1, ParentID=NULL, Name='Package A', QtyAvailable=10
UNION ALL SELECT 2, 1, 'Item 1', 2
UNION ALL SELECT 3, 1, 'Item 2', 3
UNION ALL SELECT 4, NULL, 'Package B', 80
UNION ALL SELECT 5, 4, 'Item 1', 10
UNION ALL SELECT 6, 4, 'Item 2', 10
)
SELECT ItemID=ISNULL(ParentID, ItemID)
,QtyAvailable=SUM(QtyAvailable)
,Name=MIN(CASE WHEN ParentID IS NULL THEN Name END)
FROM SampleData
GROUP BY ISNULL(ParentID, ItemID);
Edit: Assuming of course that the answer to Luis's question is no.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 31, 2015 at 8:28 am
Thanks for all of the responses. I went with your solution Dwain. Worked great!
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply