June 11, 2015 at 3:27 pm
My goal is to deidentify the following hypothetical data set (provided by Vanderbilt University http://biostat.mc.vanderbilt.edu/wiki/pub/Main/XuleiLiu/HowtoDe-identifyData_LXL_20080307.pdf):
SSN
111223333
111223333
123456789
123456789
222441045
222441045
289761125
343551104
343551104
343551104
343551104
638621207
638621207
746921198
and make an ID field that will look like the following: (I want to de identify the data and not use ssn)
STUDY_ID SSN
1 111223333
1 111223333
2 123456789
2 123456789
3 222441045
3 222441045
4 289761125
5 343551104
5 343551104
5 343551104
5 343551104
6 638621207
6 638621207
6 638621207
7 746921198
If someone could please provide me the code to do this in Microsoft Access (or a simpler way to do this using access, I would really appreciate it. Thanks.
June 11, 2015 at 3:41 pm
Create another table of just the unique SSNs. Add an autonumber to the table. It will populate with unique numbers (might have gaps). Add another column to the Patient table (long integer). Update with the value of the autonumber. Remove the SSN from the database. Or put it in another database that's secure (SQL Server)... then you can encrypt the SSN column and have a view that decrypts the SSN and assigns permissions so that only the people that are allowed to see the SSN have permission to it.
June 11, 2015 at 3:56 pm
Thanks for such a fast reply. I have very little knowledge in access and SQL, so please explain the following process 'Update with the value of the autonumber.' I can make a separate table with unique ssns and autonumbers. But how do I update the first table with multiple ssns so that the autonumbers correspond with the multiple entries for ssns? I do not come from a CS background so please dumb it down.
June 11, 2015 at 4:22 pm
mwill172 (6/11/2015)
Thanks for such a fast reply. I have very little knowledge in access and SQL, so please explain the following process 'Update with the value of the autonumber.' I can make a separate table with unique ssns and autonumbers. But how do I update the first table with multiple ssns so that the autonumbers correspond with the multiple entries for ssns? I do not come from a CS background so please dumb it down.
Okay. let me take a giant step backwards. Whoever stored people's SSNs like that is taking serious risks. SSNs, if they need to be stored should be in a secure database, not in Access. Had the same argument at Sarah Cannon. Just plain dangerous in the wrong hands.
Create a new table with an autonumber field and the SSN field.
CREATE TABLE UniqueSSN (SSN CHAR(10) PRIMARY KEY, SSN_ID Autonumber);
INSERT INTO UniqueSSN ( SSN )
SELECT DISTINCT TableWithDuplicateSSNs.SSN
FROM TableWithDuplicateSSNs;
Then you'd have to update your original table.
UPDATE UniqueSSN
INNER JOIN TableWithDuplicateSSNs ON UniqueSSN.SSN = TableWithDuplicateSSNs.SSN
SET TableWithDuplicateSSNs.UseThisID = [UniqueSSN].[ID];
Make a backup before you do this stuff, though. If it goes sideways, you've just destroyed your table(s).
June 25, 2015 at 11:35 am
1. Create new table (Example: Table1) with ID as Field1 and SSN as Field2. Save and close the new table.
2. Open Query design:
3. right click and select show tables:
4. select "Table1" and the original table that holds your info.
5. click and drag SSN from "Table1" to SSN of your original table. (it will create a relationship between SSN on the 2 tables.)
6. In the 1st field box in lower window input something like this:
Student IDSSN: [ID] & " " & [SSN] & "32" <- the & "32" I placed in there to add more numbers to mask what the numbers mean, you can remove if you like.
what it should look like when you run the query is 1 11122333332
Once your done with that click on "design" tab and "make table!" in the ribbon and it will prompt you make or select a table to create or update. type in "table2" (or whatever you choose to call it) and that will make your table when you run that query.
7. Save and name the query you've just created
8. Assign it to a command button in whatever form you feel it would best belong to.
This is a read only table and you will not be able to add or remove data from this table. to add or remove you'd need to go to the original DB. This is just to mask the data for front end report use.
Hope this helps and have a good one.
June 25, 2015 at 11:49 am
Instead of the & "32" I would sub that out for & "=Date(ddd)" which will give you a 3 digit Julian date. that would give you a reference to when the query was last ran.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply