June 2, 2009 at 2:20 pm
Hello. I'm working on a problem where I need to find the unique combinations of suppliers that supply various items.
In my data I have Supplier, Item, Orders. An Item can be supplied by 1 or more suppliers. I'd like to find all combinations of suppliers that supply the same item(s) happens in my data.
For example: Item 1 is supplied by A and B. Item 2 is supplied by B only and Item 3 is supplied by A only. In my results I'd like to see SupplierCombinationID 1 = A, SupplierCombinationID 2 = A and B (2 records with ID value 2). SupplierCombinationID 3 = B.
It's easy enough to see for a particular item all of the suppliers that supply that item. I can't seem to figure out how to then see the distinct combination of suppliers regardless of the specific item they're supplying.
Thanks for any help.
June 2, 2009 at 3:46 pm
I'm not sure if you are looking to see multiple rows per item for each unique supplier or a distinct list of suppliers that have supplied any item or 1 row per item displaying all suppliers of that item.
The first and second ones are pretty straight forward. The last one, 1 row per item displaying all suppliers of that item, is a little more advanced.
Below are all three hopefully one of these are what you are looking for.
For each of the three examples the table preperation is the same and is next:
------------------------------------------------------------------------------------------
-- Create a test table named tblOrder
------------------------------------------------------------------------------------------
IF NOT(OBJECT_ID(N'tblOrder') is null)
DROP TABLE tblOrder
CREATE TABLE tblOrder (iOrderId INT IDENTITY(1,1),iItemId INT,vcSupplierId VARCHAR(1))
CREATE CLUSTERED INDEX idx1 ON tblOrder (iItemId,vcSupplierId)
------------------------------------------------------------------------------------------
-- Query that returns unique list of suppliers by supplied item
------------------------------------------------------------------------------------------
INSERT INTO tblOrder
SELECT 1,'A'
UNION SELECT 1,'B'
UNION SELECT 2,'B'
UNION SELECT 3,'A'
UNION SELECT 4,'C'
UNION SELECT 4,'A'
UNION SELECT 4,'B'
UNION SELECT 5,'B'
UNION SELECT 5,'D'
The first, multiple rows per item for each unique supplier, is achieved using "group by"
SELECT iItemId,vcSupplierId
FROM tblOrder
GROUP BY iItemId,vcSupplierId
ORDER BY iItemId,vcSupplierId
The results of this query are:[font="Courier New"]
iItemId vcSupplierId
----------- ------------
1 A
1 B
2 B
3 A
4 A
4 B
4 C
5 B
5 D
(9 row(s) affected)[/font]
The second, list of unique suppliers that have supplied any item, is also achieved using "Group By"
------------------------------------------------------------------------------------------
-- Query that returns unique list of suppliers)
------------------------------------------------------------------------------------------
SELECT vcSupplierId
FROM tblOrders
GROUP BY vcSupplierId
ORDER BY vcSupplierId
The results of this query are:[font="Courier New"]
vcSupplierId
------------
A
B
C
D
(4 row(s) affected)[/font]
The last, 1 row per item displaying all suppliers of that item, is achieved using the FOR XML PATH along with "Group By" to concatenate the individual rows into a single column on their respective row for the same iItemId and is a little more advanced.
------------------------------------------------------------------------------------------
-- Query that returns items with the concatenated list of suppliers (delimited by ',')
------------------------------------------------------------------------------------------
SELECT iItemId,LEFT(vcItemSuppliers,len(vcItemSuppliers)-1) vcItemSuppliers
FROM (
SELECT oq.iItemId
,(
SELECT t1.vcSupplierId+','
FROM tblOrder t1
WHERE t1.iItemId = oq.iItemId
GROUP BY t1.vcSupplierId
FOR XML PATH('')
) vcItemSuppliers
FROM tblOrder oq
GROUP BY oq.iItemId
) SQ
GROUP BY iItemId,vcItemSuppliers
-- NOTE: The subquery named SQ is responsible for removing the trailing commas, the necessity of encapsulating these into a subquery depends upon the desired results.
The results are:[font="Courier New"]
iItemId vcItemSuppliers
----------- ----------------
1 A,B
2 B
3 A
4 A,B,C
5 B,D
(5 row(s) affected)[/font]
-
June 2, 2009 at 4:18 pm
Jason - that's an interesting result, thank you. I've never used For XML Path before, or anything with XML for that matter. From your results, I'm actually looking for a table where each vcItemSuppliers record becomes a group of records (grouped by an arbitrary integer) completely removed from the items (using items only for the basis from which to group the suppliers to begin with).
Using your data, the results would be something like:
Group / SupplierID
1, A
2, B
3, A
3, B
4, A
4, B
4, C
5, B
5, D
The combination of group and supplier is unique. A supplier is included in a group if that entire group of suppliers (A, B and C) for example supply the same product.
June 2, 2009 at 4:23 pm
macwiz (6/2/2009)
Jason - that's an interesting result, thank you. I've never used For XML Path before, or anything with XML for that matter. From your results, I'm actually looking for a table where each vcItemSuppliers record becomes a group of records (grouped by an arbitrary integer) completely removed from the items (using items only for the basis from which to group the suppliers to begin with).
I read your posting in more detail and edited my post accordingly... unfortunately I didn't edit before you saw the first one. Now the original post should have what you're looking for (query 1).
-
June 2, 2009 at 4:39 pm
It's true that a basic group by gives the correct results on the sample data, but I have many more items supplied by the same group of suppliers. If I include the item in the group by I wind up with duplicate combinations of suppliers.
Let's say Item 6 is supplied by B and D. With the group by (query 1) I would end up with 2 more rows returned for item 6 in addition to the two rows for item 5. Because items 5 and 6 are both supplied by B and D (and only B and D), what I'm looking for is one record for each of B and D (2 rows total) in group "Alpha" (to avoid confusion with items).
Group Alpha is comprised of suppliers B and D. Essientially creating a new level of aggregation of common suppliers shared by one or more products.
June 2, 2009 at 5:13 pm
This should do the trick for you... This yields the results you are looking but may not be the most efficient way to get those results. I didn't look into performance or optimizations much here (I just expanded upon the third query from the original post), at the very least this should get you started.
------------------------------------------------------------------------------------------
-- Create a test table named tblOrder
------------------------------------------------------------------------------------------
IF NOT(OBJECT_ID(N'tblOrder') is null)
DROP TABLE tblOrder
CREATE TABLE tblOrder (iOrderId INT IDENTITY(1,1),iItemId INT,vcSupplierId VARCHAR(1))
CREATE CLUSTERED INDEX idx1 ON tblOrder (iItemId,vcSupplierId)
------------------------------------------------------------------------------------------
-- Populate tblOrder with expected data results
------------------------------------------------------------------------------------------
INSERT INTO tblOrder
SELECT 1,'A'
UNION SELECT 1,'B'
UNION SELECT 2,'B'
UNION SELECT 3,'A'
UNION SELECT 4,'C'
UNION SELECT 4,'A'
UNION SELECT 4,'B'
UNION SELECT 5,'B'
UNION SELECT 5,'D'
UNION SELECT 6,'B'
UNION SELECT 6,'D'
------------------------------------------------------------------------------------------
-- Query that returns items with the concatenated list of suppliers (delimited by ',')
------------------------------------------------------------------------------------------
SELECT tbl2.iGroupId,tbl1.vcSupplierId
FROM (
SELECT ROW_NUMBER() OVER(ORDER BY vcItemSuppliers) iGroupId,LEFT(vcItemSuppliers,len(vcItemSuppliers)-1) vcItemSuppliers
FROM (
SELECT oq.iItemId
,(
SELECT t1.vcSupplierId+','
FROM tblOrder t1
WHERE t1.iItemId = oq.iItemId
FOR XML PATH('')
) vcItemSuppliers
FROM tblOrder oq
) SQ
GROUP BY vcItemSuppliers
) tbl2
INNER JOIN tblOrder tbl1
ON tbl2.vcItemSuppliers LIKE '%'+tbl1.vcSupplierId+'%'
GROUP BY tbl2.iGroupId,tbl1.vcSupplierId
The results look like this:[font="Courier New"]
iGroupId vcSupplierId
-------------------- ------------
1 A
2 A
2 B
3 A
3 B
3 C
4 B
5 B
5 D
(9 row(s) affected)[/font]
-
June 2, 2009 at 5:43 pm
Jason - thanks for the help. I'm running now. As long as it works optimization shouldn't be too much of an issue because I should only need to run rarely.
June 2, 2009 at 11:06 pm
I was able to run this, although I had to essientially run the basic group by query as the insert into tblOrder for the actual query to have any chance to finish reasonably.
I'm now stuck on the next step. For each Item at each Supplier some number (N) of units were sold. I'd like to populate the final result with the percent share that the supplier (vcSupplierID) for that record gave to the overall group (iGroupID). Modifying the select into tblOrder I have created:
------------------------------------------------------------------------------------------
-- Create a test table named tblOrder
------------------------------------------------------------------------------------------
IF NOT(OBJECT_ID(N'tblOrder') is null)
DROP TABLE tblOrder
CREATE TABLE tblOrder (iOrderId INT IDENTITY(1,1),iItemId INT,vcSupplierId VARCHAR(1), iUnits INT)
CREATE CLUSTERED INDEX idx1 ON tblOrder (iItemId,vcSupplierId)
------------------------------------------------------------------------------------------
-- Populate tblOrder with expected data results
------------------------------------------------------------------------------------------
INSERT INTO tblOrder
SELECT 1,'A', 2
UNION SELECT 1,'B', 3
UNION SELECT 2,'B', 4
UNION SELECT 3,'A', 5
UNION SELECT 4,'C', 6
UNION SELECT 4,'A', 7
UNION SELECT 4,'B', 8
UNION SELECT 5,'B', 9
UNION SELECT 5,'D', 10
UNION SELECT 6,'B', 11
UNION SELECT 6,'D', 12
The added column in the results would be the decimal values in the column I've manually entered here:
iGroupId vcSupplierId decShare
-------------------- ------------ --------
1 A 2/2=1
2 A 2/5=.4
2 B 3/5=.6
3 A 7/21=.333
3 B 8/21
3 C 6/21
4 B 4/4=1
5 B 20/42
5 D 22/42
Mathematically the third column is the sum of units for that vcSupplierID across all iItemID's included in the iGroupID for that vcSupplierID divided by the sum for the iGroupID for all iItemID's in that iGroupID. I'm not sure how to get back to these units in order to accomplish the division.
Thanks again for all of the assistance.
June 3, 2009 at 9:11 am
Look for OVER() clause with PARTITION BY clause and aggregating with SUM function in BOOKS ONLINE. Something like this should do.
SELECT *, ( Units * 100 ) / SUM( Units ) OVER( PARTITION BY vcSupplierId ) AS Share FROM tblOrder
--Ramesh
June 3, 2009 at 5:15 pm
I think I've managed to get it. Here's the code I ended up using:
SELECTtbl2.iGroupId,
tbl1.vcSupplierId,
sum(tbl1.units) / tbl2.GroupUnits as share
FROM(SELECTROW_NUMBER() OVER(ORDER BY sq.vcItemSuppliers) iGroupId,
LEFT(sq.vcItemSuppliers,len(sq.vcItemSuppliers)-1) vcItemSuppliers,
sum(sq.itemunits) as GroupUnits
FROM(SELECToq.iItemId,
(SELECTt1.vcSupplierId+','
FROMtblOrder t1
WHEREt1.iItemId = oq.iItemId
FOR XML PATH('')) vcItemSuppliers,
cast(sum(units) as float) as itemunits
FROMtblOrder oq
group byoq.iItemID) SQ
GROUP BYsq.vcItemSuppliers) tbl2
INNER JOIN(SELECToq.iItemId,
(SELECTt1.vcSupplierId+','
FROMtblOrder t1
WHEREt1.iItemId = oq.iItemId
FOR XML PATH('')) vcItemSuppliers
FROMtblOrder oq
group byoq.iItemID) SQ2
ONLEFT(sq2.vcItemSuppliers,len(sq2.vcItemSuppliers)-1) = tbl2.vcItemSuppliers
INNER JOINtblOrder tbl1
ONtbl2.vcItemSuppliers LIKE '%'+tbl1.vcSupplierId+'%'
andtbl1.iItemID = SQ2.iItemID
GROUP BYtbl2.iGroupId,
tbl1.vcSupplierId,
tbl2.GroupUnits
order bytbl2.iGroupId,
tbl1.vcSupplierId
Thanks again!
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply