October 25, 2006 at 9:13 pm
Does anyone have any recommendation for a good way to code the capture of product affinities? I'd like to do it without using a cursor, but I'm not sure that it's possible.
Here's the scenario: A sales order may contain 1 to N lines. In our example, we'll say that the sales order contains lines for the sale of items 'A', 'B', and 'C'. Product affinities are defined as "When someone bought item 'A', they also bought item 'B', AND when someone bought item 'A', they also bought item 'C', AND the reverse of these is also true. That is, when someone bought 'B', they also bought 'A', for example."
We want to capture the follow data elements: 1) The date on which the affinity occurred, 2) The sum quantity of each item that sold, and 3) The number of 'hits' for the affinity pair.
I'm thinking the columns in the table might look something like:
The quantity columns would the sum quantities sold for each item in the affinity pair for the date, and AffinityHits column would be incremented by one for each occurrence of the affinity pair for the date. AffinityKey-Date would be a unique constraint on the table.
Any ideas would be appreciated. Especially if you've done something like this before.
Thanks.
Richard D. Cushing
Sr. Consultant
SYNGERGISTIC SOFTWARE SOLUTIONS, LLC
October 25, 2006 at 10:30 pm
It'll take me a bit to setup some test data and write an example (an I'm going to bed for the night so not happening tonight), but I'm thinking a "self join" of the source table where the Invoice number = the Invoice number and the ItemID <> ItemID... of course Affinity Hits would be a SUM aggragate, Date would come from either the left or right table (no outer join, though... this would be more of a semi-cross-join). Seems like the AffinityKey should actually be a combination fo the two ItemID's.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 26, 2006 at 10:24 am
Assuming the structure of the Invoice & InvoiceDetail to be as follows with the sample test data
create table Invoice (InvNum int NOT NULL identity(1,1), InvDate datetime NOT NULL)
create table InvDetl (InvNum int NOT NULL, ItemID varchar(5) NOT NULL, Qty int NOT NULL)
insert into Invoice (InvDate) select '01/01/2006'
insert into Invoice (InvDate) select '01/02/2006'
insert into Invoice (InvDate) select '01/02/2006'
insert into Invoice (InvDate) select '01/03/2006'
insert into Invoice (InvDate) select '01/03/2006'
insert into Invoice (InvDate) select '01/03/2006'
insert into Invoice (InvDate) select '01/03/2006'
insert into InvDetl select 1, 'A', 3
insert into InvDetl select 1, 'B', 2
insert into InvDetl select 1, 'C', 1
insert into InvDetl select 2, 'A', 4
insert into InvDetl select 2, 'B', 6
insert into InvDetl select 2, 'D', 2
insert into InvDetl select 2, 'E', 2
insert into InvDetl select 3, 'C', 5
insert into InvDetl select 3, 'B', 2
insert into InvDetl select 3, 'A', 1
insert into InvDetl select 4, 'A', 9
insert into InvDetl select 5, 'C', 4
insert into InvDetl select 5, 'B', 6
insert into InvDetl select 5, 'D', 2
insert into InvDetl select 5, 'E', 2
insert into InvDetl select 6, 'A', 4
insert into InvDetl select 6, 'B', 6
insert into InvDetl select 6, 'D', 3
insert into InvDetl select 6, 'E', 1
insert into InvDetl select 6, 'G', 2
insert into InvDetl select 6, 'H', 8
insert into InvDetl select 7, 'A', 4
insert into InvDetl select 7, 'B', 6
insert into InvDetl select 7, 'C', 3
insert into InvDetl select 7, 'D', 1
insert into InvDetl select 7, 'E', 5
insert into InvDetl select 7, 'F', 7
insert into InvDetl select 7, 'G', 2
insert into InvDetl select 7, 'H', 9
You can use the following query, that I have attempted as best as I can to validate & test. If you do find errors - please let me know.
select d.AffDate, d.Item1, d.Item2, sum(Qty1), sum(Qty2), count(*) as AffHits
from
(select d1.InvNum, d1.ItemID as Item1, d2.ItemID as Item2, d1.Qty as Qty1, d2.Qty as Qty2,
(select InvDate from Invoice i with (nolock)
where i.InvNum = d1.InvNum) as AffDate
from InvDetl d1 with (nolock), InvDetl d2 with (nolock)
where d1.InvNum = d2.InvNum
and d1.ItemID <> d2.ItemID
  as d
/*
where d.AffDate = '01/03/2006' --For Verification purposes
*/
group by d.AffDate, d.Item1, d.Item2
order by 2,3
I would suggest doing this in a batch process once every day, since one would assume that you have a large amount of data.
October 26, 2006 at 6:07 pm
Bingo!... Nice job, CK...
--Jeff Moden
Change is inevitable... Change for the better is not.
October 27, 2006 at 5:30 am
Yes, nice indeed. But I was surprised to see this strange difference in results between using the old full join syntax vs the new full join syntax. The new full join syntax produces 2 extra records with nulls in the invoice number and the second item column. And it's all caused by invoice 4 with its lonely single item. I think there is something subtle I'm missing.
Old style
select d1.InvNum,d1.ItemID,d2.ItemID
from InvDetl d1,InvDetl d2
where d1.InvNum=d2.InvNum and d1.ItemID<>d2.ItemID
New style
select d1.InvNum,d1.ItemID,d2.ItemID
from InvDetl d1
full join InvDetl d2 on d1.InvNum=d2.InvNum and d1.ItemID<>d2.ItemID
October 27, 2006 at 6:26 am
Old style
select d1.InvNum,d1.ItemID,d2.ItemID
from InvDetl d1,InvDetl d2
where d1.InvNum=d2.InvNum and d1.ItemID<>d2.ItemID
New style
select d1.InvNum,d1.ItemID,d2.ItemID
from InvDetl d1
INNER join InvDetl d2 on d1.InvNum=d2.InvNum and d1.ItemID<>d2.ItemID
or
select d1.InvNum,d1.ItemID,d2.ItemID
from InvDetl d1
CROSS join InvDetl d2
WHERE d1.InvNum=d2.InvNum and d1.ItemID<>d2.ItemID
October 27, 2006 at 6:36 am
Right! My confusion about 'full joins'.
October 27, 2006 at 6:59 am
Not sure Michael is set on this... FULL joins return all rows from both tables 1 time, Cross joins will produce either a partial or full cartesian join (ex. if two tables have 10 rows each, will produce 10*10 or 100 rows if full cartesian join is realized) depending on the cirteria. INNER join and old style equi-joins must have matches in both tables.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply