May 16, 2012 at 6:40 am
L' Eomot Inversé (5/15/2012)
Stefan Krzywicki (5/15/2012)
L' Eomot Inversé (5/15/2012)
I guess working with EU data protection law would give you fits! 😛 There is an absolute necessity to be able to delete things for real (that means irreversibly render them invisible/inaccessible, even by physical analysis of the media with full knowledge of all formats and full access to any keys necessary for decryption of the data) if the data is personally identifiable (i.e. refers to a specific person, who may be identified from the data, perhaps in conjunction with other data).
Does obscuring the data so it is no longer attributable to a specific individual satisfy the requirements?
Yes, but in fact it is much harder to do that successfully than you might think. Several people have successfully identified people from what was claimed to be thoroughly anonymised data. I think Ross Anderson has some papers on the topic, and I know there are several Americans who have published on this. But I no longer keep track of the literature (all I do now is read what turns up on Outlaw or on The Reg, and follow the UK Cryptography list), because I no longer have to worry about data protection; so I can't provide links to any of the research.
Like when my company sends out a survey, which is supposedly anonymous, but asks male/female, years of service, dept, etc.
Male, over 30 years, IT - they can pretty much pick me out. :Whistling:
May 16, 2012 at 6:40 am
jcrawf02 (5/16/2012)
Brandie Tarvin (5/16/2012)
L' Eomot Inversé (5/15/2012)
We have idiocies like proposals for an "absolute right to be forgotten" which means, for example, that someone serving a prison sentence for rape or for murder of juveniles can demand that the fact that he is doing so is not recorded anywhere the ordinary public could access it, and that the information should not be available to any potential employer - not even to a school board.
We have the "right to be forgotten" movement going on in the States too, but I don't think it's gotten to that extreme. Of course, I've been too busy with other things to pay attention to all the nitty-gritty details.
So...you forgot about it? :hehe:
I forget, what are we forgetting?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
May 16, 2012 at 6:52 am
SQLRNNR (5/16/2012)
jcrawf02 (5/16/2012)
Brandie Tarvin (5/16/2012)
L' Eomot Inversé (5/15/2012)
We have idiocies like proposals for an "absolute right to be forgotten" which means, for example, that someone serving a prison sentence for rape or for murder of juveniles can demand that the fact that he is doing so is not recorded anywhere the ordinary public could access it, and that the information should not be available to any potential employer - not even to a school board.
We have the "right to be forgotten" movement going on in the States too, but I don't think it's gotten to that extreme. Of course, I've been too busy with other things to pay attention to all the nitty-gritty details.
So...you forgot about it? :hehe:
I forget, what are we forgetting?
There are three sure signs of aging and memory is always the second thing to go.
I can't remember what the first sign is.
May 17, 2012 at 9:01 am
L' Eomot Inversé (5/15/2012)
Stefan Krzywicki (5/15/2012)
L' Eomot Inversé (5/15/2012)
I guess working with EU data protection law would give you fits! 😛 There is an absolute necessity to be able to delete things for real (that means irreversibly render them invisible/inaccessible, even by physical analysis of the media with full knowledge of all formats and full access to any keys necessary for decryption of the data) if the data is personally identifiable (i.e. refers to a specific person, who may be identified from the data, perhaps in conjunction with other data).
Does obscuring the data so it is no longer attributable to a specific individual satisfy the requirements?
Yes, but in fact it is much harder to do that successfully than you might think. Several people have successfully identified people from what was claimed to be thoroughly anonymised data. I think Ross Anderson has some papers on the topic, and I know there are several Americans who have published on this. But I no longer keep track of the literature (all I do now is read what turns up on Outlaw or on The Reg, and follow the UK Cryptography list), because I no longer have to worry about data protection; so I can't provide links to any of the research.
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 17, 2012 at 9:36 am
Stefan Krzywicki (5/17/2012)
L' Eomot Inversé (5/15/2012)
Stefan Krzywicki (5/15/2012)
L' Eomot Inversé (5/15/2012)
I guess working with EU data protection law would give you fits! 😛 There is an absolute necessity to be able to delete things for real (that means irreversibly render them invisible/inaccessible, even by physical analysis of the media with full knowledge of all formats and full access to any keys necessary for decryption of the data) if the data is personally identifiable (i.e. refers to a specific person, who may be identified from the data, perhaps in conjunction with other data).
Does obscuring the data so it is no longer attributable to a specific individual satisfy the requirements?
Yes, but in fact it is much harder to do that successfully than you might think. Several people have successfully identified people from what was claimed to be thoroughly anonymised data. I think Ross Anderson has some papers on the topic, and I know there are several Americans who have published on this. But I no longer keep track of the literature (all I do now is read what turns up on Outlaw or on The Reg, and follow the UK Cryptography list), because I no longer have to worry about data protection; so I can't provide links to any of the research.
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
I believe that is exaggerated. Several years ago I developed a pharmacy system for Eckerd Drugs and ascertaining identity of customers was a really tough problem, even when you had a name, date of birth and address. We had two Maria Martinezes living in the same building in Austin. By a freak of coincidence, they had the same birthday and no social insurance numbers because they were living with their immigrated children and they themselves were not expected to earn any income.
May 17, 2012 at 9:41 am
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
May 17, 2012 at 10:20 am
Steve Jones - SSC Editor (5/17/2012)
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
Address? 😛
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 17, 2012 at 10:21 am
SQL Kiwi (5/17/2012)
Steve Jones - SSC Editor (5/17/2012)
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
Address? 😛
dental records
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
May 17, 2012 at 10:40 am
SQL Kiwi (5/17/2012)
Steve Jones - SSC Editor (5/17/2012)
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
Address? 😛
Smart-*** Kiwi.
I think it's actually favorite liquor.
May 17, 2012 at 10:46 am
Steve Jones - SSC Editor (5/17/2012)
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
Here we go: This paper says 87%. And that's just zip code, gender and birthdate. I'd bet it hits the mid to high 90s if you add in age.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
May 17, 2012 at 10:58 am
Revenant (5/17/2012)
Stefan Krzywicki (5/17/2012)
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
I believe that is exaggerated. Several years ago I developed a pharmacy system for Eckerd Drugs and ascertaining identity of customers was a really tough problem, even when you had a name, date of birth and address. We had two Maria Martinezes living in the same building in Austin. By a freak of coincidence, they had the same birthday and no social insurance numbers because they were living with their immigrated children and they themselves were not expected to earn any income.
But how often does it happen that when you look at a name, gender, address, and birthdate you find your data fits more than one person? Hardly ever. So that data will identify an individual almost all the time - much closer to 100% of the time than it is to 99%.
If you substitute zip for address, it will still usually identify an individual - Stefan's "75% or 90% or something like that" is in the right ballpark. And Stefan has indeed specified too many parameters, since he had age and also specified birthdate, from which age is deducible.
It's even worse in the UK with postcode instead of ZIP, because most postcodes cover only a small number of buildings. In Spain postcode isn't much help - my postcode there covers many thousands of dwellings.
Tom
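For anyone who wants to try the "does this combination fit more than one person?" test on their own data, here is a minimal sketch in Python. The rows and the `fraction_unique` helper are made up for illustration; nothing here comes from the pharmacy system or the paper mentioned in the thread. It simply counts how many records are the only one with their combination of values.

```python
from collections import Counter


def fraction_unique(records, keys):
    """Share of records whose values on `keys` match no other record."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Hypothetical sample rows, invented for this example.
people = [
    {"zip": "78701", "gender": "F", "birthdate": "1950-03-14"},
    {"zip": "78701", "gender": "F", "birthdate": "1950-03-14"},  # the "two Marias" case
    {"zip": "78701", "gender": "M", "birthdate": "1971-08-02"},
    {"zip": "78702", "gender": "F", "birthdate": "1950-03-14"},
    {"zip": "78702", "gender": "M", "birthdate": "1971-08-02"},
]

with_zip = fraction_unique(people, ["zip", "gender", "birthdate"])
without_zip = fraction_unique(people, ["gender", "birthdate"])
print(with_zip)     # 0.6: three of the five rows are pinned down exactly
print(without_zip)  # 0.0: every row shares its combination with someone else
```

On real data the interesting part is how quickly the unique share climbs as you add columns, which is why a supposedly anonymous data set with a few harmless-looking fields can still single people out.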
May 17, 2012 at 11:01 am
Stefan Krzywicki (5/17/2012)
Steve Jones - SSC Editor (5/17/2012)
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
Here we go: This paper says 87%. And that's just zip code, gender and birthdate. I'd bet it hits the mid to high 90s if you add in age.
Well, birthdate or age, they denote the same thing, birthdate just more accurately.
May 17, 2012 at 11:04 am
Revenant (5/17/2012)
Stefan Krzywicki (5/17/2012)
Steve Jones - SSC Editor (5/17/2012)
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
Here we go: This paper says 87%. And that's just zip code, gender and birthdate. I'd bet it hits the mid to high 90s if you add in age.
Well, birthdate or age, they denote the same thing, birthdate just more accurately.
Heh, yeah. I just realized that by "birthdate" I was thinking MM/DD only.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
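The MM/DD versus full birthdate distinction matters a lot, and a rough back-of-the-envelope estimate shows why. The sketch below is only a toy model: it assumes everyone in a zip code is spread evenly over the possible gender and birthdate combinations, and the zip population and span of birth years are invented numbers. Real populations are not uniform, so treat the output as an order-of-magnitude illustration, not as the method behind the 87% figure.

```python
def expected_unique_fraction(population, bins):
    """Chance that none of the other people lands in your bin,
    assuming each person falls into one of `bins` equally likely bins."""
    return (1 - 1 / bins) ** (population - 1)


ZIP_POPULATION = 10_000  # hypothetical number of residents in one zip code
GENDERS = 2
DAYS_PER_YEAR = 365
BIRTH_YEARS = 80         # hypothetical span of birth years among residents

# Birthdate known only as month and day.
month_day_only = expected_unique_fraction(ZIP_POPULATION, GENDERS * DAYS_PER_YEAR)

# Birthdate known as full year, month and day.
full_birthdate = expected_unique_fraction(
    ZIP_POPULATION, GENDERS * DAYS_PER_YEAR * BIRTH_YEARS
)

print(f"month/day only: {month_day_only:.6f}")
print(f"full birthdate: {full_birthdate:.2f}")
```

Under these assumptions almost nobody is unique on zip, gender and month/day alone, while the large majority are unique once the birth year is included. That is also why adding age on top of a full birthdate changes nothing: the year already carries it.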
May 17, 2012 at 11:05 am
L' Eomot Inversé (5/17/2012)
Revenant (5/17/2012)
Stefan Krzywicki (5/17/2012)
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
I believe that is exaggerated. Several years ago I developed a pharmacy system for Eckerd Drugs and ascertaining identity of customers was a really tough problem, even when you had a name, date of birth and address. We had two Maria Martinezes living in the same building in Austin. By a freak of coincidence, they had the same birthday and no social insurance numbers because they were living with their immigrated children and they themselves were not expected to earn any income.
But how often does it happen that when you look at a name, gender, address, and birthdate you find your data fits more than one person? Hardly ever. So that data will identify an individual almost all the time - much closer to 100% of the time than it is to 99%.
If you substitute zip for address, it will still usually identify an individual - Stefan's "75% or 90% or something like that" is in the right ballpark. And Stefan has indeed specified too many parameters, since he had age and also specified birthdate, from which age is deducible.
It's even worse in the UK with postcode instead of ZIP, because most postcodes cover only a small number of buildings. In Spain postcode isn't much help - my postcode there covers many thousands of dwellings.
ZIP+4 here in the USA will narrow it down even further.
May 17, 2012 at 11:06 am
Revenant (5/17/2012)
Stefan Krzywicki (5/17/2012)
Steve Jones - SSC Editor (5/17/2012)
Thanks to new data mining abilities I think I read that they have the ability, to 75% or 90% or something like that, to find an individual in the States with nothing more than your Zip Code, gender, age and birthdate. I may have even specified too many parameters there.
Zip, gender, birthdate gets some crazy % identified, or closely identified. Add in one other thing and I think it gets to like 98% likelihood.
Here we go: This paper says 87%. And that's just zip code, gender and birthdate. I'd bet it hits the mid to high 90s if you add in age.
Well, birthdate or age, they denote the same thing, birthdate just more accurately.
Yet another reason to "lie" about things like birth year to websites and subscription lists that don't have a legal need for that information.