unpersonalize a SQL database

  • Hello,

    I have an question about how to unpersonalize a SQL database

    The reason is I work in an hospital and we want to develop an new application on an SQL database, but in this database are patients with all their information like medication and other things

    I don’t want to give the development team a copy of this database without atliest changing the name of the patient but there are a lot of patients

    Is there a way to do this automatically ?

    Greetings

  • I've had to do this a few times, and couldn't find any tools to help (I haven't looked recently, so there may be some now).

    In the end I wrote a few functions to scramble names, dates of birth etc and then created a script to do the scrambling, table by table, something like this...

    UPDATE patient set PatientName = dbo.ScrambleName(PatientName),

    PatientDOB = dbo.ScrambleDate(PatientDOB) etc etc

    I don’t want to give the development team a copy of this database...

    It may not be a question of WANT, there are quite probably laws that say you CAN'T give the development team a copy, without scrambling the data.

  • Something like this?

    WITH PatientNos AS (

    SELECT ROW_NUMBER () OVER (ORDER BY NEWID()) AS PNo

    , *

    FROM Patients

    )

    UPDATE PatientNos

    SET PatientName = 'Patient' + CAST (PNo as varchar(7))

    John

  • There may be a lot more than name you need to scramble. SSN. MRN. DOB. When you're done there shouldn't be any way to track back to the actual patient. I wasn't part of the project so I don't have the code but we pulled a client's DB so we had an internal system for testing that was full of data and had to scramble all patient identifying information. Instead of doing something linear like John posted we went for completely random names and numbers. We ended up with long, incomprehensible names but it worked.

  • Thanx for your replays

    I started with a script that replaces the name, address, and a couple of other fields to other information

    So address is than address +IDnumber of the record.

    So thanx for your help :w00t:

    Greetings.

  • That might work for address but if you're planning on doing name, SSN and MRN like that it may not meet the requirements to be considered anonymous. Are you planning on using that method for those too or a different one?

  • Yes, i do this for all the fields where there is only a trace of personal information

  • You may have other, less-obvious considerations. I work at a law school, and sometimes I have to be careful not give out lists of sanitized students and their courses and grades for a term, if the assemblage of courses could be used to identify a particular student. You need to know your business and what information (other than the obvious) could be used to identify an individual.

    Rich

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply