March 26, 2010 at 3:20 pm
Management wants a report of columns that are personal data, so that we can make sure we are protecting that data from misuse. I need to find columns within the entire database that might possibly contain personal data.
I have a working query, but is there a better way to write this?
SELECT 'DOB' AS NPI, o.name AS TableName, c.name AS ColumnName FROM sys.columns c JOIN sys.objects o ON c.object_ID = o.object_id WHERE o.type = 'u' AND (c.name LIKE '%DOB%' OR c.name like '%birth%')
UNION
SELECT 'Email' AS NPI, o.name AS TableName, c.name AS ColumnName FROM sys.columns c JOIN sys.objects o ON c.object_ID = o.object_id WHERE o.type = 'u' AND Replace(c.name,'_','') LIKE '%email%'
UNION
SELECT 'Address' AS NPI, o.name AS TableName, c.name AS ColumnName FROM sys.columns c JOIN sys.objects o ON c.object_ID = o.object_id WHERE o.type = 'u' AND c.name like '%address%' AND Replace(c.name,'_','') not like '%emailaddress%'
UNION
SELECT 'Zip' AS NPI, o.name AS TableName, c.name AS ColumnName FROM sys.columns c JOIN sys.objects o ON c.object_ID = o.object_id WHERE o.type = 'u' AND c.name LIKE '%ZIP%' AND c.name not LIKE '%suffix%' and c.name not LIKE '%sfx%'
UNION
SELECT 'LastName' AS NPI, o.name AS TableName, c.name AS ColumnName FROM sys.columns c JOIN sys.objects o ON c.object_ID = o.object_id WHERE o.type = 'u' AND (Replace(c.name,'_','') LIKE '%lastname%' or Replace(c.name,'_','') like '%lname%')
BTW - Yes, I realize I do not have SSN in my check. We do not collect SSN.
March 27, 2010 at 8:22 am
😉
Kindly used sql code tag for writing any sql query.
And also post same data for batter view.
March 27, 2010 at 11:13 am
Marcia Q (3/26/2010)
Management wants a report of columns that are personal data, so that we can make sure we are protecting that data from misuse. I need to find columns within the entire database that might possibly contain personal data.I have a working query, but is there a better way to write this?
...
Well, here's the query that I would write:
To all Application managers and Development Leads:
Please be aware that as part of GloboCorp's Data Protection Initiative,
we require the identification (application name, database, table and column
name) of all production database columns that may hold personal user,
employee, customer or vendor information. It is of highest important that
we receive this information as soon as possible as without it we cannot
guarantee the protection of this personal information within your application
and it may even put GloboCorp in jeopardy of violating the applicable legal
regulations and potentially subject to corresponding penalties.
sincerely, Your database Team
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 27, 2010 at 11:16 am
A bit tongue in cheek perhaps, but my point is serious: database administration should not be doing something like this own their own and independent of the Applications or the business lines. Their is just no way that any number of uninformed SQL Queries can reliably determine this without guidance (or really good documentation) from the applications. You will miss things and then you could well be liable...
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 27, 2010 at 1:11 pm
I agree... something like an SSN could be couched in a column called GlobalReferenceNumber, etc. There's too much of a chance of missing something. Make the creators and users of the data identify the personal data by server, instance, DB, table, and column name. They should also identify files that they may be importing, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 29, 2010 at 2:24 pm
Vijay - ah yes, I should have used sql coding. My apologies. Nothing special with the data for this since these queries just run against the system tables. It can be run against AdventureWorks or any handy db.
Jeff/Barry - thanks for the warning. There is waaay more analysis/effort going into the project than what I wrote, including work by application and business teams. Sorry to mislead, but I was just trying to keep my post short. This is not an enterprise wide activity. All of our other databases have already been thoroughly analyzed (ad nauseum), fought over, documented, locked down, etc. We will be acquiring this particular db from outside channels soon, and I'm just trying to help the app team with their work.
All I really wanted to know is if there is a better and/or simpler way to write my query.
BTW Barry, I got a kick out of your first post.
March 29, 2010 at 9:49 pm
Heh, thanks Marcia. Sometimes we forget that there's more than on kind of query. 🙂
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply