October 29, 2012 at 9:26 am
Hi,
I've been given the task of creating a single unique numerical value from two composite unique alphanumeric values. It needs to be possible to convert in both directions.
So, in the following example, the Result column has to be populated with a numerical value which is derived from the first two values. It should be possible to calculate the first two values from the result.
Value1 Value 2 Result
------ ------- ------
AXH32CTB 22124587 <some unique numeric value>
544DX88DGT 21457751 <some unique numeric value>
My first attempts have either resulted in a number so large that it doesn't fit into a bigint, or cannot be calculated back to find Value1 and Value2?
I have NO control over the two given values!
My maths isn't great - is this even possible?
Yours hopefully,
Martin
October 29, 2012 at 10:13 am
webtekkie (10/29/2012)
Hi,I've been given the task of creating a single unique numerical value from two composite unique alphanumeric values. It needs to be possible to convert in both directions.
So, in the following example, the Result column has to be populated with a numerical value which is derived from the first two values. It should be possible to calculate the first two values from the result.
Value1 Value 2 Result
------ ------- ------
AXH32CTB 22124587 <some unique numeric value>
544DX88DGT 21457751 <some unique numeric value>
My first attempts have either resulted in a number so large that it doesn't fit into a bigint, or cannot be calculated back to find Value1 and Value2?
I have NO control over the two given values!
My maths isn't great - is this even possible?
Yours hopefully,
Martin
What you are describing is a two encryption that only produces numbers. What is the purpose of this? Not only does it sound extremely difficult but it also sounds extremely pointless. If you want to encrypt and decrypt your data like that I would look at RC4 or some other type of two way encryption. Then you can just concatenate your 2 values together with an underscore or something.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
October 29, 2012 at 10:32 am
Hi Sean,
Yeah, it does sound like a strange request, but I'm working for a bank and we have historic data in all sorts of wierd formats. It's not an encryption exercise, I'm simply trying to marry up two third party applications where one of them uses the two alphanumeric values and the other requires a single numeric value.
If it isn't possible, that's fine - I just want to avoid going back to a bunch of people to tell them it can't be done only to find some spotty graduate who's got a 1st in Advanced Maths from Oxford tell me I'm wrong - it is a bank, after all 😛
Thanks,
Martin
October 29, 2012 at 10:37 am
webtekkie (10/29/2012)
Hi Sean,Yeah, it does sound like a strange request, but I'm working for a bank and we have historic data in all sorts of wierd formats. It's not an encryption exercise, I'm simply trying to marry up two third party applications where one of them uses the two alphanumeric values and the other requires a single numeric value.
If it isn't possible, that's fine - I just want to avoid going back to a bunch of people to tell them it can't be done only to find some spotty graduate who's got a 1st in Advanced Maths from Oxford tell me I'm wrong - it is a bank, after all 😛
Thanks,
Martin
I am by no mean a mathematics expert so I can't say that it is not possible.
This sounds more like a mapping issue. Can you create a mapping table. This would let you have a table with your single numeric values and where appropriate has the other two columns populated? Something along those lines.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
October 29, 2012 at 10:41 am
I think I can see a way to do this. Let me iron some stuff out and I'll get back to you.
I've been using this link and a spreadsheet. I'll take another look tomorrow, but I've got to head off now.
October 29, 2012 at 10:56 am
Cadavre (10/29/2012)
I think I can see a way to do this. Let me iron some stuff out and I'll get back to you.
Decoding values such as '544DX88DGT' (and let say 'ZZZZZZZZZZ') into decimal-numeric representation, will end-up with quiet large number which will not fit even into BIGINT (as ZZZZZZZZZZ will be decoded to 32^9).
What is the longest alpha-numeric string in Value1 and what is the datatype of Value2?
October 29, 2012 at 11:12 am
Hi Eugene,
Thanks for taking the time to reply.
You are right, it is too big for a bigint if I use that method as the two values can be quite long.
Sean - yes, a mapping table is the best option I can come up with to do this if I can't find this utopian solution that I'm dreaming of.
Thanks,
Martin
October 29, 2012 at 11:14 am
Cadavre,
Sounds intriguing! Look forward to seeing what you can come up with tomorrow. FYI, the numerical input is always a positive absolute number.
Thanks,
Martin
October 29, 2012 at 11:25 am
Could you answer the questions I've asked?
What is the maximum size of Value1 and datatype of Value2?
There is only one method to get alpha-numeric sequence into decimal - it's called n-base encoding (or should be called n-base decoding). Basically it's the same as converting HEX numbers (which are 16-based numerics) into decimal numbers (which are 10-based numerics).
So, in case of all English letters used (together with digits), you will have 32-based "numerics".
October 29, 2012 at 11:46 am
Eugene - Max size for Value1 is 7 chars and for Value2 is 10 chars.
October 29, 2012 at 12:12 pm
Is there a reason why you can't just create a table with an identity column and add all your unique value pairs to it? Then you could simply look it up the integer value whenever you needed it.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
October 29, 2012 at 7:42 pm
webtekkie (10/29/2012)
Eugene - Max size for Value1 is 7 chars and for Value2 is 10 chars.
With those lengths, it is possible to use a DECIMAL type to store the encoded string. While the following is super-cheesy, it should give you a start.
First set up the table and some sample data:
DECLARE @Values TABLE
(ID INT IDENTITY, Value1 VARCHAR(7)COLLATE SQL_Latin1_General_CP1_CS_AS
, Value2 VARCHAR(10) COLLATE SQL_Latin1_General_CP1_CS_AS
,v DECIMAL(37,0))
DECLARE @Alphanumerics CHAR(62) =
'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
-- Scramble the encryption key
;WITH Tally (n) AS (
SELECT n=number
FROM [master].dbo.spt_values Tally
WHERE [Type] = 'P' AND Number BETWEEN 1 AND 62)
SELECT @Alphanumerics = (
SELECT SUBSTRING(@Alphanumerics, n, 1)
FROM Tally a
ORDER BY NEWID()
FOR XML PATH(''))
SELECT EncryptionKey=@Alphanumerics
-- Sample data
INSERT INTO @Values (Value1, Value2)
SELECT 'AXH32CT','22124587'
UNION ALL SELECT '544DX88','21457751'
Now put the encoded value into the "v" column and show the results:
-- Encode
;WITH Tally (n) AS (
SELECT n=number
FROM [master].dbo.spt_values Tally
WHERE [Type] = 'P' AND Number BETWEEN 1 AND 10),
Encode1 AS (
SELECT ID, Value1, Value2, a.v1, n
FROM @Values
CROSS APPLY (
SELECT n, RIGHT('0' +
CAST(CHARINDEX(SUBSTRING(Value1,n,1), @Alphanumerics) AS VARCHAR),2)
FROM Tally
WHERE n BETWEEN 1 AND LEN(Value1)) a(n, v1)),
Catenate1 AS (
SELECT ID, Value1, Value2
,v1=CAST(LEN(Value1) AS VARCHAR) + (
SELECT v1 + ''
FROM Encode1 b
WHERE a.ID = b.ID
ORDER BY n
FOR XML PATH(''))
FROM Encode1 a
GROUP BY ID, Value1, Value2),
Encode2 AS (
SELECT ID, Value1, Value2, v1, a.v2, n
FROM Catenate1
CROSS APPLY (
SELECT n, RIGHT('0' +
CAST(CHARINDEX(SUBSTRING(Value2,n,1), @Alphanumerics) AS VARCHAR),2)
FROM Tally
WHERE n BETWEEN 1 AND LEN(Value2)) a(n, v2)),
Catenate2 AS (
SELECT ID, Value1, Value2, v1
,v2=RIGHT('0' + CAST(LEN(Value2) AS VARCHAR), 2) + (
SELECT v2 + ''
FROM Encode2 b
WHERE a.ID = b.ID
ORDER BY n
FOR XML PATH(''))
FROM Encode2 a
GROUP BY ID, Value1, Value2, v1)
UPDATE a
SET v=v1+v2
-- Display the encoded values
OUTPUT INSERTED.ID, INSERTED.Value1, INSERTED.Value2, INSERTED.v
FROM @Values a
INNER JOIN Catenate2 b
ON a.ID = b.ID
Finally, decode v:
-- Decode
;WITH Tally (n) AS (
SELECT n=number
FROM [master].dbo.spt_values Tally
WHERE [Type] = 'P' AND Number BETWEEN 1 AND 37),
Decode1 AS (
SELECT ID, Value1, Value2, v, n, v1
FROM @Values
CROSS APPLY (
SELECT n, SUBSTRING(@Alphanumerics
,CAST(SUBSTRING(CAST(v AS VARCHAR(37)), 2*n, 2) AS INT), 1)
FROM Tally
WHERE n BETWEEN 1 AND SUBSTRING(CAST(v AS VARCHAR(37)), 1, 1)) a(n, v1)),
Catenate1 AS (
SELECT ID, Value1, Value2, v
,v1=( SELECT v1 + ''
FROM Decode1 b
WHERE a.ID = b.ID
ORDER BY n
FOR XML PATH(''))
FROM Decode1 a
GROUP BY ID, Value1, Value2, v),
Decode2 AS (
SELECT ID, Value1, Value2, v, v1, n, v2
FROM Catenate1
CROSS APPLY (
SELECT 1+2*SUBSTRING(CAST(v AS VARCHAR(37)), 1, 1)) a (offset)
CROSS APPLY (
SELECT n, SUBSTRING(@Alphanumerics
,CAST(SUBSTRING(CAST(v AS VARCHAR(37)), (1+offset)+2*n, 2) AS INT), 1)
FROM Tally
WHERE n BETWEEN 1 AND SUBSTRING(CAST(v AS VARCHAR(37)), offset+1, 2)) b(n, v2)),
Catenate2 AS (
SELECT ID, Value1, Value2, v, v1
,v2=( SELECT v2 + ''
FROM Decode2 b
WHERE a.ID = b.ID
ORDER BY n
FOR XML PATH(''))
FROM Decode2 a
GROUP BY ID, Value1, Value2, v, v1)
SELECT ID, Value1, Value2, v, v1, v2
FROM Catenate2
This is most certainly not any sophisticated encoding but you can improve it slightly by scrambling the order of the alphanumerics in @Alphanumerics as I have done.
Cool and fun problem!
Just be sure to save the scrambled @Alphanumerics value so you can later decode!
Edit: Had to add the case sensitive collation to Value1 and Value2 in the table variable because otherwise the decode may not work properly.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 29, 2012 at 10:10 pm
webtekkie (10/29/2012)
Eugene - Max size for Value1 is 7 chars and for Value2 is 10 chars.
Just to be sure, Value 1 will only contain 1 to 7 characters consisting of numeric digits and upper case letters and Value 2 will consist of only 1 to 10 numeric digits. Is that correct?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 29, 2012 at 10:18 pm
Jeff Moden (10/29/2012)
webtekkie (10/29/2012)
Eugene - Max size for Value1 is 7 chars and for Value2 is 10 chars.Just to be sure, Value 1 will only contain 1 to 7 characters consisting of numeric digits and upper case letters and Value 2 will consist of only 1 to 10 numeric digits. Is that correct?
Why do I smell a super-fast, super-efficient solution in the works that will put my rough cut to shame?
I figured with pages of responses someone should have suggested something before I came along.
For the record, I realize that my encoding solution could be improved by reducing to 3 (instead of 5) cascading CTEs by combining the encoding of Value1 and Value2 at one time, but I don't have the bandwidth at the moment to do that. Not that this would help it once Jeff puts on his puzzle-solving hat.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
October 30, 2012 at 4:44 am
Dwain - that's impressive! I'm going through it now to see understand what's going on here.
Jeff - I've checked with the data sourcing team - they have told me that there will ONLY be 7 chars for Value1 and 10 chars for Value2 - never more and never less.
Viewing 15 posts - 1 through 15 (of 33 total)
You must be logged in to reply to this topic. Login to reply