April 29, 2024 at 3:19 pm
That is correct Phil, Yes, I change my mind about customer being in more than one group. based on it can you please help. Thank you
April 29, 2024 at 3:29 pm
OK, then ordering of the tests becomes important. Please specify the order in which the tests should be applied.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 29, 2024 at 3:48 pm
Ok great. please with email first then phone nos second then addr&city (combined) last
please let me know if you need any more info. Thank you
April 29, 2024 at 3:52 pm
I mean in addr & city the combination has to match.
also please not all columns values are nullable so some times there is missed values in fields
Thank you Phil.
April 29, 2024 at 4:25 pm
I've done the emails, the phone numbers and unmatched in the code that follows. You can add the rest yourself.
--Rule 1 - E-mail match
WITH AllEmails
AS (SELECT c.Custno
,Email = c.email1
FROM #Customers1 c
UNION --Use instead of UNION ALL to make the results distinct
SELECT c.Custno
,Email = c.email2
FROM #Customers1 c)
,matches
AS (SELECT a.Email
,ct = COUNT (*)
FROM AllEmails a
GROUP BY a.Email
HAVING COUNT (*) > 1)
INSERT #Grouped
(
GroupId
,CustomerId
)
SELECT GroupId = 1
,c.Custno
FROM #Customers1 c
CROSS JOIN matches m
WHERE c.email1 = m.Email
OR c.email2 = m.Email;
--Rule 2 - any phone number matches
WITH AllPhones
AS (SELECT c.Custno
,Phone = c.Res_Phone
FROM #Customers1 c
UNION --Use instead of UNION ALL to make the results distinct
SELECT c.Custno
,Phone = c.Bus_Phone
FROM #Customers1 c
UNION
SELECT c.Custno
,Phone = c.Fax_Phone
FROM #Customers1 c
UNION
SELECT c.Custno
,Phone = c.Marine_Phone
FROM #Customers1 c
UNION
SELECT c.Custno
,Phone = c.Pager_Phone
FROM #Customers1 c
UNION
SELECT c.Custno
,Phone = c.Other_Phone
FROM #Customers1 c)
,matches
AS (SELECT a.Phone
,ct = COUNT (*)
FROM AllPhones a
GROUP BY a.Phone
HAVING COUNT (*) > 1)
INSERT #Grouped
(
GroupId
,CustomerId
)
SELECT GroupId = 2
,c.Custno
FROM #Customers1 c
CROSS JOIN matches m
WHERE(
c.Res_Phone = m.Phone
OR c.Bus_Phone = m.Phone
OR c.Fax_Phone = m.Phone
OR c.Marine_Phone = m.Phone
OR c.Pager_Phone = m.Phone
OR c.Other_Phone = m.Phone
)
AND NOT EXISTS
(
SELECT 1 FROM #Grouped g WHERE g.CustomerId = c.Custno
);
--Add other validation rules here
--And finally, assign group numbers to any customers who did not match any of the rules
INSERT #Grouped
(
GroupId
,CustomerId
)
--Edit the 10 in the following line to assign unused GroupIds
SELECT GroupId = 10 + ROW_NUMBER () OVER (ORDER BY c.Custno)
,CustomerId = c.Custno
FROM #Customers1 c
WHERE NOT EXISTS
(
SELECT 1 FROM #Grouped g WHERE g.CustomerId = c.Custno
);
SELECT *
FROM #Grouped g;
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 29, 2024 at 5:29 pm
Thank you Phil. You the best one quick Q for you, I have data about 1 mill plus so in this case any ideas how to reduce effect of it on processing? please
Thank you again
sita
April 29, 2024 at 5:31 pm
Actually let me rephrase the question is there any way can I do with temp tables by splitting it each set I think union taking to much for my process for more than an hour and I am cancelling it 🙁 it would be good if I can break it up insert / update each step ? can you please split it if possible. Thank you very much Phil
April 30, 2024 at 4:34 am
Dear Phil Morning,
Here I split the code loaded in temp tables but here in output I see groupid getting 1 for all customers.
What I am trying to do is combined groupid (just a number) for only matched customers,
for example cust 203454 has email1 as abc@gmail.com and cust 494033 has email2 as abc@gmaill.com then these two customers group id number is same even though customer no are different (as already exist)
so expected output is
CustID GroupID
203454 1001
494033 1001
---------- 1002
---------- 1002
---------- 1002
here groupid 1002 had three customers that shares same email.
Sorry for the confusion it may have caused in my initial request, can you please help
Thank you
April 30, 2024 at 8:41 am
Are you saying that my code does not work this way?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 1, 2024 at 12:21 pm
Good Morning Phil.
That is correct phil, it is assigning unique groupid for each customer. that has matching emails customers as well. Apologize for delay in response. Thank you for helping out
May 1, 2024 at 12:23 pm
I am trying this one Phil, as I have 1 million some how the OR clause taking too much time
---------------------------------------------------
DROP TABLE IF EXISTS #Grouped;
DROP TABLE IF EXISTS #AllEmails;
DROP TABLE IF EXISTS #Matches;
CREATE TABLE #Grouped
(
GroupId INT NOT NULL,
CustomerId varchar(50) NOT NULL,
PRIMARY KEY CLUSTERED(GroupId, CustomerId)
WITH(IGNORE_DUP_KEY = ON)
);
-- Create temporary table to store all emails
CREATE TABLE #AllEmails (Custno varchar(50), Email VARCHAR(500));
-- Populate the temporary table with emails from #Customers1
INSERT INTO #AllEmails (Custno, Email)
SELECT Custno, Email
FROM #Customers1 WHERE EMAIL LIKE '%@%';
INSERT INTO #AllEmails (Custno, Email)
SELECT Custno, Email2
FROM #Customers1 WHERE EMail2 LIKE '%@%';
-- Rule 1 - E-mail match
WITH Matches AS (
SELECT Email, COUNT(*) AS ct
FROM #AllEmails
GROUP BY Email
HAVING COUNT(*) > 1
)
INSERT INTO #Grouped (GroupId, CustomerId)
SELECT 1, c.Custno
FROM #Customers1 c
JOIN Matches m ON c.Email = m.Email
UNION
SELECT 1, c.Custno
FROM #Customers1 c
JOIN Matches m ON c.Email2 = m.Email;
-- Select data from #Grouped
SELECT GroupId, count(*) FROM #Grouped where groupid <> 1 group by groupid order by 2 desc;
INSERT #Grouped
(
GroupId
,CustomerId
)
--Edit the 10 in the following line to assign unused GroupIds
SELECT GroupId = 10 + ROW_NUMBER () OVER (ORDER BY c.Custno)
,CustomerId = c.Custno
FROM #Customers1 c
WHERE NOT EXISTS
(
SELECT 1 FROM #Grouped g WHERE g.CustomerId = c.Custno
);
-- Drop temporary tables
DROP TABLE IF EXISTS #AllEmails;
DROP TABLE IF EXISTS #Matches;
DROP TABLE IF EXISTS #Grouped;
Viewing 11 posts - 16 through 25 (of 25 total)
You must be logged in to reply to this topic. Login to reply