July 27, 2011 at 10:21 pm
Anyone has Procedure to scramble data like addresses, but the new data should not be any characters... address1 should look like address1 field and so on? Thanks!
July 28, 2011 at 1:29 am
Ghanta (7/27/2011)
Anyone has Procedure to scramble data like addresses
Do you mean column encryption? http://msdn.microsoft.com/en-us/library/ms179331(v=SQL.100).aspx
Ghanta (7/27/2011)
but the new data should not be any characters... address1 should look like address1 field and so on?
Or do you mean replacing a real address with a dummy address, in order to use a real database for testing, so the Developers don't see real addresses?
BrainDonor
July 28, 2011 at 7:09 am
Ghanta (7/27/2011)
Anyone has Procedure to scramble data like addresses, but the new data should not be any characters... address1 should look like address1 field and so on? Thanks!
Will using one customer's address in place of another's do? In other words, row-to-row scrambling?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 28, 2011 at 7:32 am
a lot of scrambling of data is specific to the tables and lookups invloved;
I've got something that replaces all existing data with something like this (street names are a random html color) :
addr1 addr2 addr3
7312 RosyBrown Street Building 384 Unit 1029
1363 DarkSlateBlue Street Building 208 Unit 1746
1094 DarkGoldenrod Street Building 468 Unit 1651
3770 Navy Street Building 172 Unit 2049
2443 MintCream Street Building 201 Unit 7170
but the code assumes a lot of stuff about specific columns and foreign keys, since it's designed around my data.let me see if i can't dump it into a proc to make it more universal,
but here's a prototype:
UPDATE cmcontact
set busemail = 'notreal@email.com',
editedby = 'Administrator',
busfax = '9542179598',
busphone = '8883267680',
buszipcode = '40000',
title = ''
--random first land last names, based on colors and 6 first names:
;With MyColorCTE AS
( SELECT 'AliceBlue' AS TheColor UNION ALL
SELECT 'AntiqueWhite' UNION ALL SELECT 'Aqua' UNION ALL SELECT 'Aquamarine' UNION ALL
SELECT 'Azure' UNION ALL SELECT 'Beige' UNION ALL SELECT 'Bisque' UNION ALL SELECT 'Black' UNION ALL
SELECT 'BlanchedAlmond' UNION ALL SELECT 'Blue' UNION ALL SELECT 'BlueViolet' UNION ALL SELECT 'Brown' UNION ALL
SELECT 'BurlyWood' UNION ALL SELECT 'CadetBlue' UNION ALL SELECT 'Chartreuse' UNION ALL SELECT 'Chocolate' UNION ALL
SELECT 'Coral' UNION ALL SELECT 'CornFlowerBlue' UNION ALL SELECT 'Cornsilk' UNION ALL SELECT 'Crimson' UNION ALL
SELECT 'Cyan' UNION ALL SELECT 'DarkBlue' UNION ALL SELECT 'DarkCyan' UNION ALL SELECT 'DarkGoldenrod' UNION ALL
SELECT 'DarkGray' UNION ALL SELECT 'DarkGreen' UNION ALL SELECT 'DarkKhaki' UNION ALL SELECT 'DarkMagenta' UNION ALL
SELECT 'DarkOliveGreen' UNION ALL SELECT 'DarkOrange' UNION ALL SELECT 'DarkOrchid' UNION ALL SELECT 'DarkRed' UNION ALL
SELECT 'DarkSalmon' UNION ALL SELECT 'DarkSeaGreen' UNION ALL SELECT 'DarkSlateBlue' UNION ALL SELECT 'DarkSlateGray' UNION ALL
SELECT 'DarkTurquoise' UNION ALL SELECT 'DarkViolet' UNION ALL SELECT 'DeepPink' UNION ALL SELECT 'DeepSkyBlue' UNION ALL
SELECT 'DimGray' UNION ALL SELECT 'DodgerBlue' UNION ALL SELECT 'FireBrick' UNION ALL SELECT 'FloralWhite' UNION ALL
SELECT 'ForestGreen' UNION ALL SELECT 'Fuchsia' UNION ALL SELECT 'Gainsboro' UNION ALL SELECT 'GhostWhite' UNION ALL
SELECT 'Gold' UNION ALL SELECT 'Goldenrod' UNION ALL SELECT 'Gray' UNION ALL SELECT 'Green' UNION ALL
SELECT 'GreenYellow' UNION ALL SELECT 'Honeydew' UNION ALL SELECT 'HotPink' UNION ALL SELECT 'IndianRed' UNION ALL
SELECT 'Indigo' UNION ALL SELECT 'Ivory' UNION ALL SELECT 'Khaki' UNION ALL SELECT 'Lavender' UNION ALL
SELECT 'LavenderBlush' UNION ALL SELECT 'LawnGreen' UNION ALL SELECT 'LemonChiffon' UNION ALL SELECT 'LightBlue' UNION ALL
SELECT 'LightCoral' UNION ALL SELECT 'LightCyan' UNION ALL SELECT 'LightGray' UNION ALL SELECT 'LightGreen' UNION ALL
SELECT 'LightPink' UNION ALL SELECT 'LightSalmon' UNION ALL SELECT 'LightSeaGreen' UNION ALL SELECT 'LightSkyBlue' UNION ALL
SELECT 'LightSlateGray' UNION ALL SELECT 'LightSteelBlue' UNION ALL SELECT 'LightYellow' UNION ALL SELECT 'Lime' UNION ALL
SELECT 'LimeGreen' UNION ALL SELECT 'Linen' UNION ALL SELECT 'Magenta' UNION ALL SELECT 'MidnightBlue' UNION ALL
SELECT 'Maroon' UNION ALL SELECT 'MediumAquamarine' UNION ALL SELECT 'MediumBlue' UNION ALL SELECT 'MediumVioletRed' UNION ALL
SELECT 'MediumOrchid' UNION ALL SELECT 'MediumTurquoise' UNION ALL SELECT 'MediumPurple' UNION ALL SELECT 'MediumSpringGreen' UNION ALL
SELECT 'MediumSeaGreen' UNION ALL SELECT 'MediumSlateBlue' UNION ALL SELECT 'MintCream' UNION ALL SELECT 'MistyRose' UNION ALL
SELECT 'Moccasin' UNION ALL SELECT 'NavajoWhite' UNION ALL SELECT 'Navy' UNION ALL SELECT 'OldLace' UNION ALL
SELECT 'Olive' UNION ALL SELECT 'OliveDrab' UNION ALL SELECT 'Orange' UNION ALL SELECT 'OrangeRed' UNION ALL
SELECT 'Orchid' UNION ALL SELECT 'PaleGoldenrod' UNION ALL SELECT 'PaleGreen' UNION ALL SELECT 'PaleTurquoise' UNION ALL
SELECT 'PaleVioletRed' UNION ALL SELECT 'PapayaWhip' UNION ALL SELECT 'PeachPuff' UNION ALL SELECT 'Peru' UNION ALL
SELECT 'Pink' UNION ALL SELECT 'Plum' UNION ALL SELECT 'PowderBlue' UNION ALL SELECT 'Purple' UNION ALL
SELECT 'Red' UNION ALL SELECT 'RosyBrown' UNION ALL SELECT 'RoyalBlue' UNION ALL SELECT 'SaddleBrown' UNION ALL
SELECT 'Salmon' UNION ALL SELECT 'SandyBrown' UNION ALL SELECT 'SeaGreen' UNION ALL SELECT 'Seashell' UNION ALL
SELECT 'Sienna' UNION ALL SELECT 'Silver' UNION ALL SELECT 'SkyBlue' UNION ALL SELECT 'SlateBlue' UNION ALL
SELECT 'SlateGray' UNION ALL SELECT 'Snow' UNION ALL SELECT 'SpringGreen' UNION ALL SELECT 'SteelBlue' UNION ALL
SELECT 'Tan' UNION ALL SELECT 'Teal' UNION ALL SELECT 'Thistle' UNION ALL SELECT 'Tomato' UNION ALL
SELECT 'Turquoise' UNION ALL SELECT 'Violet' UNION ALL SELECT 'Wheat' UNION ALL SELECT 'White' UNION ALL
SELECT 'WhiteSmoke' UNION ALL SELECT 'Yellow' UNION ALL SELECT 'YellowGreen' )
,CTERANDOMLYNUMBER AS
(
SELECT CONTACTTBLKEY,
TheColor,
ROWNUM = ROW_NUMBER() OVER (PARTITION BY CONTACTTBLKEY ORDER BY NEWID())
FROM CMCONTACT --THE TABLE WITH THE KEY
CROSS JOIN MyColorCTE --THE TABLE WITH THE RANDOM VALUE
)
UPDATE CMCONTACT --THE JOINING TABLE
SET CMCONTACT.LASTNAME = CTERANDOMLYNUMBER.TheColor --THE RANDOM VALUE
FROM CTERANDOMLYNUMBER
WHERE CMCONTACT.CONTACTTBLKEY=CTERANDOMLYNUMBER.CONTACTTBLKEY --FOR EACH KEY
AND ROWNUM = 1 --LIMITS TO ONE ARBITRARY VALUE DUE TO THE ORDER BY NEWID
Lowell
July 28, 2011 at 7:39 am
Jeff Moden (7/28/2011)
Ghanta (7/27/2011)
Anyone has Procedure to scramble data like addresses, but the new data should not be any characters... address1 should look like address1 field and so on? Thanks!Will using one customer's address in place of another's do? In other words, row-to-row scrambling?
Thanks Jeff. Yes that will work and I am thinking I will change the numbers on that addresses...
July 28, 2011 at 7:49 am
Yeah Lowell thanks. I was checking if someone has something generic for this.
July 28, 2011 at 12:04 pm
BrainDonor (7/28/2011)
Ghanta (7/27/2011)
Anyone has Procedure to scramble data like addressesDo you mean column encryption? http://msdn.microsoft.com/en-us/library/ms179331(v=SQL.100).aspx
Ghanta (7/27/2011)
but the new data should not be any characters... address1 should look like address1 field and so on?Or do you mean replacing a real address with a dummy address, in order to use a real database for testing, so the Developers don't see real addresses?
BrainDonor
BrainDonor I meant replacing a real address with a dummy address like you said.
July 28, 2011 at 6:46 pm
Ghanta (7/28/2011)
Jeff Moden (7/28/2011)
Ghanta (7/27/2011)
Anyone has Procedure to scramble data like addresses, but the new data should not be any characters... address1 should look like address1 field and so on? Thanks!Will using one customer's address in place of another's do? In other words, row-to-row scrambling?
Thanks Jeff. Yes that will work and I am thinking I will change the numbers on that addresses...
That works for me. This is about as generic a method as you can get. As always, most of the details are in the code.
--=====================================================================================================================
-- Setup some "burnable" test data to demonstrate on.
-- Nothing in this section is a part of the solution. We're just building test data.
--=====================================================================================================================
--===== Work in a nice safe place that everyone has
USE TempDB
;
--===== Conditionally drop the test table to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL
DROP TABLE #TestTable
;
--===== Create and populate the test table on-the-fly.
-- We use a well know source for this demonstration.
SELECT AddressID, AddressLine1
INTO #TestTable
FROM AdventureWorks.Person.Address
;
--===== Add the likely Primary Key to the test table.
-- Don't use named constraints on a temp table because of concurrency issues.
-- No problem on tests like this but will cause problems in Production.
ALTER TABLE #TestTable
ADD PRIMARY KEY CLUSTERED (AddressID)
;
--=====================================================================================================================
-- The section solves the problem by randomly changing all the addresses to some other address in the table.
--=====================================================================================================================
--===== Randomly "Scramble" the addresses to another ID
WITH
cteEnumerateStraight AS
(--==== This creates the original non-randomized "sort" of AddressLine1
SELECT RowNum = ROW_NUMBER() OVER (ORDER BY AddressID),
AddressID,
AddressLine1
FROM #TestTable
),
cteEnumerateRandom AS
(--==== This creates a random "sort" of AddressLine1
SELECT RowNum = ROW_NUMBER() OVER (ORDER BY NEWID()),
AddressLine1
FROM #TestTable
)--==== This changes the "straight" address to one of the "random" ones.
-- All addresses are affected because of the uniqueness of ROW_NUMBER()
-- and the fact both RowNum columns were built from the same table.
-- I say "all addresses are affected" but there is the random chance
-- that the random address will be the original. You could check for
-- such a thing by comparing the original table against the Dev table.
UPDATE tgt
SET AddressLine1 = rnd.AddressLine1
FROM cteEnumerateStraight tgt
INNER JOIN cteEnumerateRandom rnd
ON tgt.RowNum = rnd.RowNum
;
--Jeff Moden
Change is inevitable... Change for the better is not.
July 28, 2011 at 6:47 pm
p.s. Please post back if you have any difficulty in randomizing the numerics in each address.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 28, 2011 at 6:52 pm
Ah... almost forgot. I got your email. I know you were probably "under the gun" for this task but I don't normally have the time to respond with code while I'm at work. Ya gotta be patient with me. 😉 Sometimes, it's even worth the wait. 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
July 28, 2011 at 6:58 pm
Thanks a lot Jeff. This will help me save a lot of time. Thanks to Lovell as well as I can use his approach for names and other fields. I will be able to generate random numbers for numeric data. Thanks again.
July 28, 2011 at 7:03 pm
Jeff Moden (7/28/2011)
Ah... almost forgot. I got your email. I know you were probably "under the gun" for this task but I don't normally have the time to respond with code while I'm at work. Ya gotta be patient with me. 😉 Sometimes, it's even worth the wait. 😛
Yes it is worth a wait. Thanks! I had worked on a script that would place the addresses into a temporary table and then I would randomly assign the addresses, but that did not guarantee new addresses for everyone. Your script helps me out rite away. Thanks again.
July 28, 2011 at 7:17 pm
Ghanta (7/28/2011)
... but that did not guarantee new addresses for everyone. Your script helps me out rite away. Thanks again.
You're welcome and thanks for the feedback but, like I said in the code, my method doesn't "guarantee" new addresses for everyone. There's the random chance that an address or two could randomly be assigned back to its original spot. You need to write a check against the original data to ensure that hasn't happened. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
July 28, 2011 at 7:40 pm
Sure Jeff I will do that. Thanks.
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply