I need to exclude data from a select query where the value of column A is used in other rows and the values of column B are different.
Column A | ColumnB
PhoneNumber | FirstName
------------|----------
02038245161 | Jon
02038245161 | Henry
02038881121 | Jamie
02038881121 | Jamie
From the above table I want to select all records apart from the first two as the phone number has been used for more than one person. I should only get the last two rows back as the number is used twice but the person is the same.
I am currently trying the following:
SELECT cust.PhoneNumber, cust.FirstName, (SELECT COUNT(PhoneNumber) FROM Customers WHERE PhoneNumber = cust.PhoneNumber AND FirstName <> cust.FirstName)
FROM Customers cust
Group By cust.PhoneNumber, cust.FirstName
This for some reason is giving me an error stating that the sub query is returning more than one results however i'm not sure why as it should be returning a single count.
September 8, 2021 at 8:47 am
If you would like a coded solution, please provide your sample data in the form of CREATE TABLE/INSERT statements.
Please also confirm the logic for which row should be returned in the case of a duplicate phone number, remembering that there is no concept of 'first' in SQL Server without an ORDER BY.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 8, 2021 at 9:20 am
what do you want return if you had these rows
02038881121 | Jamie
02038881121 | Jamie
02038881121 | John
02038881121 | Jack
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
September 8, 2021 at 9:26 am
Hi Phil,
Thanks for the reply.
The create/ insert would be as follows:
CREATE TABLE CustomerPolicies
(
[Mobile] NVARCHAR(20),
[WorkTelephone] NVARCHAR(20),
[HomeTelephone] NVARCHAR(20),
[FirstName] NVARCHAR(30),
[LastName] NVARCHAR(30),
[PolicyNumber] INT
)
INSERT INTO CustomerPolicies VALUES ('07972234564', 'N/A', 'N/A', 'Jon', 'Evans', 32323232)
INSERT INTO CustomerPolicies VALUES ('07972234564', 'N/A', 'N/A', 'Tom', 'Davies', 34343434)
INSERT INTO CustomerPolicies VALUES ('07234322322', 'N/A', 'N/A', 'Will', 'Jones', 12121212)
INSERT INTO CustomerPolicies VALUES ('07234322322', 'N/A', 'N/A', 'Will', 'Jones', 13131313)
The SELECT query would need to return the second two rows only and not the first two as the first two have a phone number that is used for two different customers.
The second two rows have the same phone number but also the same customer so I would want them both returned as they are for two different product policies,
September 8, 2021 at 9:28 am
what do you want return if you had these rows
02038881121 | Jamie
02038881121 | Jamie
02038881121 | John
02038881121 | Jack
I would want the first two rows returned as they are the same number AND same customer.
The second two rows are the same number but different customers so I would want to exclude them.
The underlying data that I am working with has some flaws in that product resellers sometimes put their phone number against the customer data and as such I am currently getting multiple matches to a phone number for different customers.
If I can filter those records out then I don;t have to worry as much about matching the wrong customer to a phone number.
September 8, 2021 at 9:34 am
J Livingston SQL wrote:what do you want return if you had these rows
02038881121 | Jamie
02038881121 | Jamie
02038881121 | John
02038881121 | JackI would want the first two rows returned as they are the same number AND same customer.
The second two rows are the same number but different customers so I would want to exclude them.
The underlying data that I am working with has some flaws in that product resellers sometimes put their phone number against the customer data and as such I am currently getting multiple matches to a phone number for different customers.
If I can filter those records out then I don;t have to worry as much about matching the wrong customer to a phone number.
Once again, there is no such thing as FIRST here, unless there is some column we can use to ORDER BY. Just because they appear first in a SELECT * does not imply any sort of implicit ordering.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
September 8, 2021 at 2:39 pm
If I am understanding your request properly, something like this should work, right:
WITH [cte]
AS
(
SELECT
[Mobile]
, [WorkTelephone]
, [HomeTelephone]
, [FirstName]
, [LastName]
, [PolicyNumber]
, COUNT(1) OVER (PARTITION BY
[Mobile]
, [FirstName]
, [LastName]
) AS [DistinctCount]
FROM[CustomerPolicies]
)
SELECT
[cte].[Mobile]
, [cte].[WorkTelephone]
, [cte].[HomeTelephone]
, [cte].[FirstName]
, [cte].[LastName]
, [cte].[PolicyNumber]
, [cte].[DistinctCount]
FROM[cte]
WHERE [cte].[DistinctCount] > 1;
MAY need to tweak that windowing function to cover all of the phone numbers; I was just writing it to work with the data. The other problem you run into is when you have data like:
Column A | ColumnB
PhoneNumber | FirstName
------------|----------
02038245161 | Jon
02038245162 | Henry
02038881121 | Jamie
02038881121 | Jamie
Here Jon and Henry have different phone numbers, but my above query would exclude them and ONLY give you Jamie. But that is easy to correct with a second windowing function that is a count on JUST the mobile number and the final where gets that added to it when its count is 1.
I'd recommend renaming the cte to something that makes more sense to you though. I am not a big fan of CTE's named "cte" except for demo purposes.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
September 11, 2021 at 6:46 pm
If I am understanding your request properly, something like this should work, right:
WITH [cte]
AS
(
SELECT
[Mobile]
, [WorkTelephone]
, [HomeTelephone]
, [FirstName]
, [LastName]
, [PolicyNumber]
, COUNT(1) OVER (PARTITION BY
[Mobile]
, [FirstName]
, [LastName]
) AS [DistinctCount]
FROM[CustomerPolicies]
)
SELECT
[cte].[Mobile]
, [cte].[WorkTelephone]
, [cte].[HomeTelephone]
, [cte].[FirstName]
, [cte].[LastName]
, [cte].[PolicyNumber]
, [cte].[DistinctCount]
FROM[cte]
WHERE [cte].[DistinctCount] > 1;MAY need to tweak that windowing function to cover all of the phone numbers; I was just writing it to work with the data. The other problem you run into is when you have data like:
Column A | ColumnB
PhoneNumber | FirstName
------------|----------
02038245161 | Jon
02038245162 | Henry
02038881121 | Jamie
02038881121 | JamieHere Jon and Henry have different phone numbers, but my above query would exclude them and ONLY give you Jamie. But that is easy to correct with a second windowing function that is a count on JUST the mobile number and the final where gets that added to it when its count is 1.
I'd recommend renaming the cte to something that makes more sense to you though. I am not a big fan of CTE's named "cte" except for demo purposes.
Hi Brian,
Thanks for you for the example.
This works for the most part however it excludes records where there is a single unique match for mobile number, first name and last name.
I am only trying to exclude records where the same number appears against multiple different people.
Regards
Matt
Something like this? (I may be oversimplifying something...)
-- exclude when phone # refers to multiple people.
-- exclude when phone # refers to multiple people.
SELECT * FROM CustomerPolicies cp
WHERE cp.Mobile NOT IN
(SELECT Mobile
FROM CustomerPolicies
GROUP BY Mobile
HAVING COUNT(DISTINCT(FirstName))>1);
September 13, 2021 at 10:19 am
Something like this? (I may be oversimplifying something...)
-- exclude when phone # refers to multiple people.
-- exclude when phone # refers to multiple people.
SELECT * FROM CustomerPolicies cp
WHERE cp.Mobile NOT IN
(SELECT Mobile
FROM CustomerPolicies
GROUP BY Mobile
HAVING COUNT(DISTINCT(FirstName))>1);
Thanks pietlinden,
Nice and simple and does exactly what I need.
September 15, 2021 at 12:40 am
Hi Phil,
Thanks for the reply.
The create/ insert would be as follows:
CREATE TABLE CustomerPolicies
(
[Mobile] NVARCHAR(20),
[WorkTelephone] NVARCHAR(20),
[HomeTelephone] NVARCHAR(20),
[FirstName] NVARCHAR(30),
[LastName] NVARCHAR(30),
[PolicyNumber] INT
)
INSERT INTO CustomerPolicies VALUES ('07972234564', 'N/A', 'N/A', 'Jon', 'Evans', 32323232)
INSERT INTO CustomerPolicies VALUES ('07972234564', 'N/A', 'N/A', 'Tom', 'Davies', 34343434)
INSERT INTO CustomerPolicies VALUES ('07234322322', 'N/A', 'N/A', 'Will', 'Jones', 12121212)
INSERT INTO CustomerPolicies VALUES ('07234322322', 'N/A', 'N/A', 'Will', 'Jones', 13131313)The SELECT query would need to return the second two rows only and not the first two as the first two have a phone number that is used for two different customers.
The second two rows have the same phone number but also the same customer so I would want them both returned as they are for two different product policies,
pietlinden wrote:Something like this? (I may be oversimplifying something...)
-- exclude when phone # refers to multiple people.
-- exclude when phone # refers to multiple people.
SELECT * FROM CustomerPolicies cp
WHERE cp.Mobile NOT IN
(SELECT Mobile
FROM CustomerPolicies
GROUP BY Mobile
HAVING COUNT(DISTINCT(FirstName))>1);Thanks pietlinden,
Nice and simple and does exactly what I need.
Here's another possibility. The difference is that it only does a single table scan and some "Lazy Spools". The Actual Execution Plan indicates that should be about twice as fast but I never trust the execution plan to be correct there. I just don't have enough data for a performance test so, if you have the time to do so, please try it out and let us know how they both worked for you.
WITH cteEnumerate AS
(
SELECT *,NameCnt = COUNT(*) OVER (PARTITION BY Mobile,FirstName,LastName)
FROM dbo.CustomerPolicies
)
SELECT * FROM cteEnumerate WHERE NameCnt > 1
;
--Jeff Moden
Change is inevitable... Change for the better is not.
September 15, 2021 at 1:04 am
Looking a bit closer at the code, there' may be a fly in the ointment of pietlinden's code. Give the following a run and see.
[EDIT] I DISCOVERED THE CODE THAT I POSTED BELOW DOES NOT WORK FOR ALL SITUATIONS BUT THE MOD TO PIETLINDEN'S CODE DOES HANDLE ALL SITUATIONS. PLEASE SEE THE POST FOLLOWING THIS ONE!
I left my code here as an example of a mistake to avoid.
--===== Create the modified test data
DROP TABLE IF EXISTS dbo.CustomerPolicies;
CREATE TABLE dbo.CustomerPolicies
(
Mobile NVARCHAR(20)
,WorkTelephone NVARCHAR(20)
,HomeTelephone NVARCHAR(20)
,FirstName NVARCHAR(30)
,LastName NVARCHAR(30)
,PolicyNumber INT
)
;
INSERT INTO CustomerPolicies
(Mobile,WorkTelephone,HomeTelephone,FirstName,LastName,PolicyNumber)
VALUES ('07972234564','N/A','N/A','Jon' ,'Evans' ,32323232)
,('07972234564','N/A','N/A','Jon' ,'Jones' ,98755432) --Added this row
--,('07972234564','N/A','N/A','Tom' ,'Davies',34343434) --Removed this
,('07234322322','N/A','N/A','Will','Jones' ,12121212)
,('07234322322','N/A','N/A','Will','Jones' ,13131313)
;
--===== This code only looks at the first name and so produces
-- the wrong answer according to your request.
SELECT * FROM CustomerPolicies cp
WHERE cp.Mobile NOT IN
(SELECT Mobile
FROM CustomerPolicies
GROUP BY Mobile
HAVING COUNT(DISTINCT(FirstName))>1)
;
PRINT 'EDIT. UPON FURTHER TESTING, THIS CODE DOES NOT WORK FOR ALL SITUATIONS';
PRINT ' SEE THE POST BELOW FOR PROOF OF THAT AND THAT THE MOD TO PIETLINDEN"S CODE DOES!';
WITH cteEnumerate AS
(
SELECT *,NameCnt = COUNT(*) OVER (PARTITION BY Mobile,FirstName,LastName)
FROM dbo.CustomerPolicies
)
SELECT * FROM cteEnumerate WHERE NameCnt > 1
;
If it turns out that pietlinden's is faster, then you need to change his "Having" clause to include a concatenation of FirstName and LastName and it'll work just fine. Like this. Just remember to test it for performance again because concatenation can be a little expensive.
SELECT * FROM CustomerPolicies cp
WHERE cp.Mobile NOT IN
(SELECT Mobile
FROM CustomerPolicies
GROUP BY Mobile
HAVING COUNT(DISTINCT(FirstName+LastName))>1)
;
--Jeff Moden
Change is inevitable... Change for the better is not.
September 15, 2021 at 1:28 am
AND, looking a bit closer, my code doesn't work correctly using the following data but the mod to pietlinden's code does, so use his code with the mod.
Here's the code I used to discover my error and that the mod to pietlinden's code works. Use the mod to HIS code and not mine.
--===== Create the modified test data
DROP TABLE IF EXISTS dbo.CustomerPolicies;
CREATE TABLE dbo.CustomerPolicies
(
Mobile NVARCHAR(20)
,WorkTelephone NVARCHAR(20)
,HomeTelephone NVARCHAR(20)
,FirstName NVARCHAR(30)
,LastName NVARCHAR(30)
,PolicyNumber INT
)
;
INSERT INTO CustomerPolicies
(Mobile,WorkTelephone,HomeTelephone,FirstName,LastName,PolicyNumber)
VALUES ('07972234564','N/A','N/A','Jon' ,'Evans' ,32323232)
,('07972234564','N/A','N/A','Jon' ,'Jones' ,98755432) --Added this row
,('07972234564','N/A','N/A','Tom' ,'Davies',34343434)
,('07972234564','N/A','N/A','Tom' ,'Davies',34343434) --Added this row
,('07234322322','N/A','N/A','Will','Jones' ,12121212)
,('07234322322','N/A','N/A','Will','Jones' ,13131313)
;
--===== This code does NOT work correctly for this example so don't use it.
-- And my apologies for thinking that it worked before.
WITH cteEnumerate AS
(
SELECT *,NameCnt = COUNT(*) OVER (PARTITION BY Mobile,FirstName,LastName)
FROM dbo.CustomerPolicies
)
SELECT * FROM cteEnumerate WHERE NameCnt > 1
;
--===== The mod to pietlindens' code DOES work so use it instead.
SELECT * FROM CustomerPolicies cp
WHERE cp.Mobile NOT IN
(SELECT Mobile
FROM CustomerPolicies
GROUP BY Mobile
HAVING COUNT(DISTINCT(FirstName+LastName))>1)
;
--Jeff Moden
Change is inevitable... Change for the better is not.
September 15, 2021 at 3:36 am
AND, looking a bit closer, my code doesn't work correctly using the following data but the mod to pietlinden's code does, so use his code with the mod.
Here's the code I used to discover my error and that the mod to pietlinden's code works. Use the mod to HIS code and not mine.
--===== Create the modified test data
DROP TABLE IF EXISTS dbo.CustomerPolicies;
CREATE TABLE dbo.CustomerPolicies
(
Mobile NVARCHAR(20)
,WorkTelephone NVARCHAR(20)
,HomeTelephone NVARCHAR(20)
,FirstName NVARCHAR(30)
,LastName NVARCHAR(30)
,PolicyNumber INT
)
;
INSERT INTO CustomerPolicies
(Mobile,WorkTelephone,HomeTelephone,FirstName,LastName,PolicyNumber)
VALUES ('07972234564','N/A','N/A','Jon' ,'Evans' ,32323232)
,('07972234564','N/A','N/A','Jon' ,'Jones' ,98755432) --Added this row
,('07972234564','N/A','N/A','Tom' ,'Davies',34343434)
,('07972234564','N/A','N/A','Tom' ,'Davies',34343434) --Added this row
,('07234322322','N/A','N/A','Will','Jones' ,12121212)
,('07234322322','N/A','N/A','Will','Jones' ,13131313)
;
--===== This code does NOT work correctly for this example so don't use it.
-- And my apologies for thinking that it worked before.
WITH cteEnumerate AS
(
SELECT *,NameCnt = COUNT(*) OVER (PARTITION BY Mobile,FirstName,LastName)
FROM dbo.CustomerPolicies
)
SELECT * FROM cteEnumerate WHERE NameCnt > 1
;
--===== The mod to pietlindens' code DOES work so use it instead.
SELECT * FROM CustomerPolicies cp
WHERE cp.Mobile NOT IN
(SELECT Mobile
FROM CustomerPolicies
GROUP BY Mobile
HAVING COUNT(DISTINCT(FirstName+LastName))>1)
;
That code is not safe either. That's because two (or more) different FirstName and LastName separately could be the same once they are concatenated.
Some examples:
Ric Edelman / Rice Delman
Hank Nudson / Han Knudson
etc.
I suggest adding a vertical bar, say, between the names:
...DISTINCT(FirstName+'|'+LastName)...
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
September 23, 2021 at 10:39 am
This was removed by the editor as SPAM
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply