March 29, 2013 at 12:40 pm
Hi Guys,
I have a table with a column which is VARCHAR. This column has text and potentially Credit Card # and also has couple of dates. What I need to do is find all the records that may potentially have CC# in this column... I was thinking of using something like
WHERE patindex('%[0-9][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -]%',Memo) > 0
patindex('%[0-9][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -]%',Memo) > 0
Is there a better way to look for numbers in the string that may potentially be CC#. It is fine if some of them are not CC#. Maybe someone has already worked on a function like this where you find CC# from a string... Any help is appreciated. I am not looking for CLR function though as I have to do this using Query Analyzer.
Thanks,
Laura
March 29, 2013 at 12:58 pm
Didn't you post something just like this a day or two ago? I know I saw some other thread trying to do this exact same thing. I can't however find it now. :hehe:
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 29, 2013 at 12:59 pm
Laura i think it would be easier to delete dashes, and maybe also spaces from the string, and then look for 16 [0-9] digits in a row in the remaining string. (or 15 amex number?);
i saw your previous post, but now that it has percolated a bit, i'm thinking you need to manip the comment bit first to make the search easier
Lowell
March 29, 2013 at 1:57 pm
I saw something similar also, but I'm not sure it was from this OP. I can't find the post either.
March 29, 2013 at 3:40 pm
That seems like a reasonable approach. Using REPLICATE() will make it easier to see how many numbers are in the pattern and/or adjust it later:
WHERE PATINDEX('%[0-9]' + REPLICATE('[0-9 -]', 12) + '%', Memo) ...
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 30, 2013 at 11:36 am
Laura_SqlNovice (3/29/2013)
Hi Guys,I have a table with a column which is VARCHAR. This column has text and potentially Credit Card # and also has couple of dates. What I need to do is find all the records that may potentially have CC# in this column... I was thinking of using something like
WHERE patindex('%[0-9][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -][0-9 -]%',Memo) > 0
Is there a better way to look for numbers in the string that may potentially be CC#. It is fine if some of them are not CC#. Maybe someone has already worked on a function like this where you find CC# from a string... Any help is appreciated. I am not looking for CLR function though as I have to do this using Query Analyzer.
Thanks,
Laura
Do you actually want to extract the numbers (even if more than one in a row) or are you just trying to isolate the rows using the WHERE clause?
--Jeff Moden
Change is inevitable... Change for the better is not.
March 30, 2013 at 6:24 pm
i'm afraid this task isn't as easy as it may sound. The difficulty comes from the complexity of credit card number formulation. Now let's assume we just want to find the most common cards AMEX, MasterCard, Visa, Discover, Carte Blanche/Diners Club and perhaps a few others. Even among just these major carriers the rules are different. For example, AMEX numbers are 15 chars, MasterCard 16, Visa 13 or 16, Discover 16, and CBDC is 14.
Then there is the Issuer Identification Number (IIN) which varies from one digit (Visa) to as many as 6 digits (Discover). These IINs have changed over the years so there are many different ranges of digits for just about every type of card. These ranges can be looked up and put into a table to be used for CC validation.
And there's more...the last number of every card number is a checksum digit calculated by the Luhn Algorithm which is public domain. Virtually all cards use the Luhn Algorithm for the checksum.
So...just finding sets of 15 or 16 digits won't tell you whether it's likely to be a credit card number at all, much less if it's might be a valid card. A number can be well-formed and have a proper checksum and still not be valid if it's been retired or never issues for example. So all we can do is weed out numbers that break the rules. The rest can only be confirmed by the card issuer and that's another can of worms.
I've dug out some functions I had for validating card numbers and made a few adaptations to make them suitable for using as an example.
Function 1 is just a version of DelimitedSplit8K that splits EVERY character. Jeff Moden deserves most of the credit.
CREATE FUNCTION [dbo].[DelimitedSplit8KByChar]
(@pString VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
cteTally(N) AS (SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E2
)
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY t.N),
Item = SUBSTRING(@pString,t.N+1,1)
FROM cteTally t
WHERE NULLIF(SUBSTRING(@pString,t.N+1,1),'') IS NOT NULL
GO
Function 2 parses the CC number and compares it to a table of Issuer Identification Number ranges.
THIS IS ONLY A SAMPLE AND MUST BE UPDATED WITH OFFICIAL ISSUER DATA BEFORE USING IN PRODUCTION!!
CREATE FUNCTION [dbo].[itvfGetCCIIN]
(
@CCNum VARCHAR(50),
@CCLen INT
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
cteIINValue(ID,CCType,StartIIN,EndIIN,CCLen) AS (
SELECT ID,CCType,StartIIN,EndIIN,CCLen FROM
(VALUES
(1,'American Express',34,34,15)
,(2,'American Express',37,37,15)
,(3,'Diners Club',300,305,14)
,(4,'Carte Blanche',300,305,14)
,(5,'enRoute',2014,2014,15)
,(6,'enRoute',2149,2149,15)
,(7,'MasterCard',51,55,16)
,(8,'Visa',4,4,13)
,(9,'Visa',4,4,16)
,(10,'Discover',6011,6011,16)
,(11,'Discover',622126,622925,16)
,(12,'Discover',644,649,16)
,(13,'Discover',65,65,16)
,(14,'JCB',3528,3589,16)
) AS Data (ID,CCType,StartIIN,EndIIN,CCLen)
),
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
E8(N) AS (SELECT 1 FROM E4 a, E4 b), --10E+6 or 100,000,000 rows max
cteTally(N) AS (SELECT 0 UNION ALL
SELECT TOP (SELECT ISNULL(MAX(EndIIN),10000) FROM cteIINValue)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E8
)
SELECT
ID
,IINType
,CCIIN
,IINLen
,StartIIN
,EndIIN
,CCLen
,Prefix
FROM
(
SELECT
ROW_NUMBER() OVER (ORDER BY t.N) AS ID
,cte.ID AS IINID
,CCType AS IINType
,N AS CCIIN
,CCLen AS IINLen
,StartIIN
,EndIIN
,@CCLen AS CCLen
,(CASE
WHEN ISNUMERIC(@CCNum) = 1
AND CONVERT(BIGINT,LEFT(@CCNum,LEN(t.N))) BETWEEN StartIIN and EndIIN
THEN CONVERT(BIGINT,LEFT(@CCNum,LEN(t.N)))
ELSE 0
END)
AS Prefix
FROM
cteIINValue cte
CROSS APPLY
cteTally t
WHERE
t.N BETWEEN StartIIN and EndIIN
) r
WHERE
1=1
AND r.CCIIN = r.Prefix
AND r.Prefix > 0
AND r.IINLen = r.CCLen
GO
Function 3 is the Luhn Algorithm for getting the checksum.
CREATE FUNCTION [dbo].[tvfLuhnValidation]
(
@CCStr VARCHAR(100)
)
RETURNS
@CheckSumValidation TABLE
(
ID INT IDENTITY(1,1) NOT NULL,
CCNum VARCHAR(20) NULL,
CkSumRemainder INT NULL,
PRIMARY KEY (ID)
)
WITH SCHEMABINDING
AS
BEGIN
DECLARE
@CCNum BIGINT
,@CheckIIN BIT
,@AllDigits BIGINT
,@ReverseDigits BIGINT
,@CheckSum BIGINT
,@CheckSumRemainder INT
SET @CCStr = REPLACE(REPLACE(@CCStr,' ',''),'-','')
IF PATINDEX('%[^0-9]%',@CCStr) > 0
BEGIN
INSERT INTO @CheckSumValidation
(CCNum,CkSumRemainder)
SELECT 0,99
END
ELSE
BEGIN
SET @CCNum = CONVERT(BIGINT,@CCStr)
SET @AllDigits = @CCNum
SET @checksum = CAST(RIGHT(@AllDigits,1) AS BIGINT)
SET @ReverseDigits = RIGHT(REVERSE(@AllDigits),LEN(@AllDigits)-1)
/* Get the check digit using the Luhn Algorithm */
;WITH cteCheckSum
AS
(
SELECT
(SUM(Item)+@CheckSum)%10 AS CheckSumRemainder
FROM
(
SELECT
s2.ItemNumber
,CAST(s2.Item AS BIGINT) AS Item
FROM
dbo.DelimitedSplit8KByChar(@ReverseDigits) AS s1
OUTER APPLY
dbo.DelimitedSplit8KByChar(s1.Item*2) AS s2
WHERE
s1.ItemNumber%2 <> 0
UNION ALL
SELECT
ItemNumber
,CAST(Item AS BIGINT) AS Item
FROM
dbo.DelimitedSplit8KByChar(@ReverseDigits)
WHERE
ItemNumber%2 = 0
) d
)
INSERT INTO @CheckSumValidation
(CCNum,CkSumRemainder)
SELECT
@CCNum AS CCNum
,CheckSumRemainder
FROM
cteCheckSum cs
END
RETURN
END
GO
Last but not least is some script to tie all these functions together. It's not the swiftest code because a lot is going on internally and I'm sure someone (as always!) will be able to offer improvements.
WITH cteSampleData --replace this with your real data
AS
(
SELECT * FROM
(VALUES
(1,'VISA','4012888888881881','10/14')
,(2,'MasterCard','5269924854210552','06/15')
,(3,'Voyager','869994992762272','08/14')
,(4,'VISA','4539390243132435','12/15')
,(5,'enRoute','214992938007085','09/13')
,(6,'VISA','4485983356242218','11/14')
,(7,'JCB','3088518677707770','01/14')
,(8,'VISA','4532254137583730','07/14')
,(9,'JCB','3560777438925512','12/15')
,(10,'Discover','6011618612311087','11/14')
,(11,'VISA','4417123456789113','07/15')
,(12,'Diners Club','3022329080952x','12/13')
) AS Data (ID,CCType,CCNum,CCExp)
)
--SELECT * FROM cteSampleData
SELECT
r2.ID
,(CASE
WHEN r2.CCNum IN (0,1) OR r2.CCNum IS NULL
THEN CAST(r2.OrigCCNum AS VARCHAR(50))
ELSE CAST(r2.CCNum AS VARCHAR(50))
END) AS CCNum
,r2.IINType AS ProbableCardType
,(CASE
WHEN v.CkSumRemainder = 0 THEN 'OK'
WHEN r2.CCNum IN (0,1) OR r2.CCNum IS NULL THEN 'Invalid Number'
ELSE 'Invalid CheckSum'
END) AS CardNumberStatus
FROM
(
SELECT
cte1.ID
,(CASE
WHEN cte1.CCNum IS NULL THEN 0
WHEN PATINDEX('%[^0-9]%',cte1.CCNum) > 0 THEN 1
ELSE CAST(cte1.CCNum AS BIGINT)
END)
AS CCNum
,cte1.CCNum AS OrigCCNum
,(CASE
WHEN r1.IINType IS NOT NULL THEN r1.IINType
ELSE 'Unknown'
END)
AS IINType
FROM
cteSampleData cte1
LEFT OUTER JOIN
(
SELECT DISTINCT
ROW_NUMBER() OVER (PARTITION BY r.ID ORDER BY r.ID) AS CCNumGroup
,r.ID
,r.CCNum
,iin.IINType
FROM
(
SELECT
cte.ID
,cte.CCNum
,LEN(cte.CCNum) AS CCLen
FROM
cteSampleData cte
) r
CROSS APPLY
dbo.itvfGetCCIIN(r.CCNum,r.CCLen) iin
) r1
ON cte1.ID = r1.ID
WHERE
CCNumGroup = 1 OR CCNumGroup IS NULL
) r2
CROSS APPLY
dbo.tvfLuhnValidation(r2.CCNum) AS v
ORDER BY
ID
April 1, 2013 at 8:33 am
Yes Jeff I just need to isolate those row... I do not need to find the numbers. I will go through the response from everyone now. Thanks a lot to everyone in taking to respond to this. Yeah somehow my earlier post was deleted... may be they thought I had put actual CC# in the script I had... Or I might have done something wrong while creating the post. Thanks Steven for the scripts.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply