September 17, 2008 at 7:40 am
How would I go about finding certain records with transposed data (typeos):
Recordset
Custid Id
500 1234
500 1234
600 1233
600 1232
700 1000
700 1000
800 2001
800 2002
900 1500
900 1500
In this case I would want the custids of 600 and 800 returned because these maybe typos
Make sense?
Warm Regards,
Arthur Lorenzini
Sioux Falls, SD
September 17, 2008 at 7:53 am
Arthur, try this
CREATE TABLE #test
(
CustID int,
id int
)
INSERT INTO #test
SELECT 500,1234 UNION ALL
SELECT 500,1234 UNION ALL
SELECT 600,1233 UNION ALL
SELECT 600,1232 UNION ALL
SELECT 700,1000 UNION ALL
SELECT 700,1000 UNION ALL
SELECT 800,2001 UNION ALL
SELECT 800,2002 UNION ALL
SELECT 900,1500 UNION ALL
SELECT 900,1500
SELECT
t1.CustID,
t1.ID
FROM #test t1,
#test t2
WHERE t1.CustID = t2.CustID
AND t1.id <> t2.id
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
September 17, 2008 at 8:00 am
That's not quite what I am after but close. The problem with this is if I have the following record:
Custid id
1000 5000
1000 6500
This will get picked up by the query but this is probably not a typo so I wouldn't not want it returned. I just need the records that may have been (fat fingered if you will), and are maybe just one number off. (Eg. 2002, 2001)
Warm Regards,
Arthur Lorenzini
Sioux Falls, SD
September 17, 2008 at 8:33 am
While I really think that you need to clarify your definition a little better Arthur, here is a hack at it, using a modification of Greg's code:
SELECT
t1.CustID,
t1.ID
FROM #test t1,
#test t2
WHERE t1.CustID = t2.CustID
AND (t1.id - t2.id) IN (1,-1,10,-10,100,-100,1000,-1000)
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
September 17, 2008 at 9:26 am
I think that will work except for a small issue with a conversion. In my actual table, the ID column is a CHAR(9)
it actually holds SSN without the dashes. I get the following error:
Msg 403, Level 16, State 1, Line 1
Invalid operator for data type. Operator equals subtract, type equals char.
Warm Regards,
Arthur Lorenzini
Sioux Falls, SD
September 17, 2008 at 9:44 am
Barry, what is the significance of the numbers in the IN group? If the ids are random, then couldn't the possibilies be endless? In light of the fact that we now know they are SSN's, I am still curious.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
September 17, 2008 at 9:49 am
Arthur.Lorenzini (9/17/2008)
I think that will work except for a small issue with a conversion.
Arthur, I think there is another issue. If you have a customer id, say 600, showing up with two different id's, which we now know are SSNs, then you need to be able to determine which id(SSN) is the correct one. And for the one that is determined to be incorrect for the customer id in question, what is the corrrect corresponding customer id. Just a thought.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
September 17, 2008 at 10:02 am
This is what the code is turn out to be:
SELECT
t1.PrimaryCustID,
t1.PrimarySSN
FROM CBDTransposeData t1,
CBDTransposeData t2
WHERE t1.PrimaryCustID = t2.PrimaryCustID
AND (CONVERT(INT,t1.PrimarySSN) - CONVERT(INT,t2.PrimarySSN))
IN (1,-1,10,-10,100,-100,1000,-1000,10000,-10000,100000,-100000,1000000,-1000000,10000000,-1000000,10000000,-10000000,100000000,-100000000)
The second SSN could be a co-customer (joint applications) put we go from the thought the first entry is the Customer SSN ( this comes from the customer table).
Warm Regards,
Arthur Lorenzini
Sioux Falls, SD
September 17, 2008 at 10:28 am
What is someone fat fingers the first digit instead of the last, will it still work?
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
September 17, 2008 at 10:34 am
yes , I just tested it and it works.
Warm Regards,
Arthur Lorenzini
Sioux Falls, SD
September 17, 2008 at 10:50 am
Greg Snidow (9/17/2008)
Barry, what is the significance of the numbers in the IN group? If the ids are random, then couldn't the possibilies be endless? In light of the fact that we now know they are SSN's, I am still curious.
The significance is this spec from the OP:
I just need the records that may have been (fat fingered if you will), and are maybe just one number off. (Eg. 2002, 2001)
Mind you, I don't really trust that analysis either, which is why I said that he needs to think this through a little bit more.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
September 17, 2008 at 10:52 am
Arthur.Lorenzini (9/17/2008)
I think that will work except for a small issue with a conversion. In my actual table, the ID column is a CHAR(9)it actually holds SSN without the dashes. I get the following error:
Msg 403, Level 16, State 1, Line 1
Invalid operator for data type. Operator equals subtract, type equals char.
Cast them as INT's:
AND (Cast(t1.id as int) - Cast(t2.id as int)) IN (1,-1,10,-10,100,-100,1000,-1000)
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
September 17, 2008 at 10:54 am
... CONVERT() works as well. 🙂
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
September 17, 2008 at 10:59 am
rbarryyoung (9/17/2008)
Greg Snidow (9/17/2008)
Barry, what is the significance of the numbers in the IN group? If the ids are random, then couldn't the possibilies be endless? In light of the fact that we now know they are SSN's, I am still curious.The significance is this spec from the OP:
I just need the records that may have been (fat fingered if you will), and are maybe just one number off. (Eg. 2002, 2001)
I may not have explained this as well as I could have... This list (1,-1,10,-10,...) works because if you take any whole positive number and change just one digit by only +1 or -1, then the resulting difference between the correct value and the modified value is:
(+1/-1)*10^n
where "n" is what digit-place was changed (counting from the right, starting at 0).
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
September 17, 2008 at 11:55 am
Yep, I finally got it. I was thinking that it would only work if the last digit were off by one, but I see now it will work regardless of which place it is. As long as it is only off by one.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply