The Dangers of Algorithms

  • Kyrilluk (3/2/2016)


    I don't want things public. That creates issues. But they need to be able to be audited, independently.

    Can two experts disagree? Sure. We see this all the time in all sorts of engineering disputes, medical issues, financial ones, etc. The point is there can be some sort of group to weigh both sides. Jury, arbitrator, judge, whatever.

    I'm not asking for things to be 100% correct, but rather that they can be judged.

    If we are still speaking about data mining algorithms, the issue is that some algorithms - such as Neural Networks - cannot be audited. This is why they are called "black box" algorithms. When I say that they cannot be audited, I mean that although it is possible in theory, in practice you would need an expert perusing your models for months before being able to work out why one particular decision was taken (for example, why your loan was refused). And by then, the algorithm would have produced a different model anyway.

    Trying to prove that an algorithm is biased against you or made a deliberate mistake (or that you are being discriminated against) is too costly. As I stated earlier on, the only way is to judge the result (you didn't get the loan, so the bank manager or the algorithm must therefore be racist/sexist/Islamophobic/homophobic/minority-o-phobic/etc.) and not the algorithm.

    I have been seriously thinking about this over the last couple of days and believe that a Neural Network can be audited. Not directly, as both Kyrilluk and I have explained. However, the training sets can be analysed and evaluated to see if there is any bias trained in.
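    A rough sketch of what that kind of training-set audit could look like (Python with pandas; the data and column names here are invented for illustration, not from any real lender):

    import pandas as pd

    # Invented training set: a protected attribute ("sex") plus the
    # outcome the model would be trained on ("approved").
    df = pd.DataFrame({
        "sex":      ["F", "F", "M", "M", "M", "F", "M", "F"],
        "income":   [42, 55, 38, 61, 47, 52, 33, 48],
        "approved": [0, 1, 0, 1, 1, 0, 1, 0],
    })

    # Approval rate per group: a large gap is a first hint that the
    # training data encode a skew the model will learn to reproduce.
    rates = df.groupby("sex")["approved"].mean()
    print(rates)

    # Disparate-impact ratio (the "four-fifths" rule of thumb): values
    # well below 0.8 are commonly treated as a red flag worth auditing.
    print("impact ratio:", rates.min() / rates.max())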

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Gary Varga (3/2/2016)


    I have been seriously thinking about this over the last couple of days and believe that a Neural Network can be audited. Not directly, as both Kyrilluk and I have explained. However, the training sets can be analysed and evaluated to see if there is any bias trained in.

    But then, what constitutes bias? If we think about the famous Titanic dataset: if we want to predict who is going to survive the Titanic and our training dataset shows that the people who survived were mostly wealthy women, do we conclude that our dataset is biased? Do we need to drop the sex and the wealth columns from our dataset because they make for uncomfortable reading?
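    To make the question concrete, the skew is easy to surface. A minimal sketch using the copy of the Titanic data that ships with seaborn (load_dataset fetches it from seaborn's online repository on first use):

    import seaborn as sns

    # "sex" and "pclass" stand in for the sex and wealth columns
    # discussed above.
    titanic = sns.load_dataset("titanic")

    # Survival rate by sex and passenger class. The skew towards
    # wealthy women is in the historical record itself, not something
    # introduced by whoever prepared the dataset.
    print(titanic.pivot_table(values="survived", index="sex",
                              columns="pclass", aggfunc="mean"))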

  • Kyrilluk (3/2/2016)


    Gary Varga (3/2/2016)


    I have been seriously thinking about this over the last couple of days and believe that a Neural Network can be audited. Not directly, as both Kyrilluk and I have explained. However, the training sets can be analysed and evaluated to see if there is any bias trained in.

    But then, what constitutes bias? If we think about the famous Titanic dataset: if we want to predict who is going to survive the Titanic and our training dataset shows that the people who survived were mostly wealthy women, do we conclude that our dataset is biased? Do we need to drop the sex and the wealth columns from our dataset because they make for uncomfortable reading?

    FWIW, and this is off-topic, I've seen datasets drop columns, or even the whole data collection dropped, because the reality it revealed wasn't something the funding source wanted known.

    Kindest Regards, Rod. Connect with me on LinkedIn.

  • Just because the outcome leans towards some categorisation of people doesn't mean that the training of the algorithm is biased. Maybe the situation it is modelling is biased, so that all the algorithm is doing is reflecting reality accurately.

    Bias, as both you and Rod have suggested, lies in the manipulation of the data and/or the algorithm, NOT in the result showing a bias, which is what the Titanic analysis illustrates.

    Results illustrating bias <> biased modelling.
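    One way to make that distinction testable (a toy sketch; the arrays are invented for illustration): compare the model's per-group prediction rate with the per-group rate in the ground truth.

    import numpy as np

    # Invented arrays: y_true is what actually happened, y_pred is what
    # the model said, group marks a categorisation of people.
    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    y_pred = np.array([1, 1, 0, 1, 0, 1, 0, 0])
    group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

    for g in ("A", "B"):
        m = group == g
        print(g, "actual:", y_true[m].mean(), "predicted:", y_pred[m].mean())

    # Here the predicted rate tracks the actual rate within each group
    # (0.75 vs 0.25), so the disparity between groups comes from the
    # data, not from the modelling. If the two rates diverged, the
    # model itself would be adding bias on top of reality.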

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Kyrilluk (3/2/2016)


    Gary Varga (3/2/2016)


    I have been seriously thinking about this over the last couple of days and believe that a Neural Network can be audited. Not directly, as both Kyrilluk and I have explained. However, the training sets can be analysed and evaluated to see if there is any bias trained in.

    But then, what constitutes bias? If we think about the famous Titanic dataset: if we want to predict who is going to survive the Titanic and our training dataset shows that the people who survived were mostly wealthy women, do we conclude that our dataset is biased? Do we need to drop the sex and the wealth columns from our dataset because they make for uncomfortable reading?

    This isn't about the dataset; it's about how the choices are made. If I exclude sex, that's fine, but not if the company is doing so to create a bias.

    The need for examining algorithms isn't the actual algorithm, but the use to which it's put.
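    It is also worth noting that excluding the sex column may not even remove the information: another column can act as a proxy for it. A minimal sketch (pandas; the data and column names are invented):

    import pandas as pd

    # Invented applicant data. The sex column is about to be dropped.
    df = pd.DataFrame({
        "sex":        ["F", "F", "F", "M", "M", "M"],
        "occupation": ["nurse", "nurse", "teacher",
                       "engineer", "engineer", "driver"],
        "income":     [40, 42, 39, 60, 58, 45],
    })

    # How well does a remaining column predict the dropped one? For
    # each occupation, take the share of its majority sex, then average.
    proxy = (df.groupby("occupation")["sex"]
               .agg(lambda s: s.value_counts(normalize=True).max())
               .mean())
    print("average within-occupation majority share:", proxy)

    # A value near 1.0 means occupation almost determines sex, so
    # dropping the sex column does little to rule out a biased use.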

  • Kyrilluk (3/2/2016)


    I don't want things public. That creates issues. But they need to be able to be audited, independently.

    Can two experts disagree? Sure. We see this all the time in all sorts of engineering disputes, medical issues, financial ones, etc. The point is there can be some sort of group to weigh both sides. Jury, arbitrator, judge, whatever.

    I'm not asking for things to be 100% correct, but rather that they can be judged.

    If we are still speaking about data mining algorithms, the issue is that some algorithms - such as Neural Networks - cannot be audited. This is why they are called "black box" algorithms. When I say that they cannot be audited, I mean that although it is possible in theory, in practice you would need an expert perusing your models for months before being able to work out why one particular decision was taken (for example, why your loan was refused). And by then, the algorithm would have produced a different model anyway.

    Trying to prove that an algorithm is biased against you or made a deliberate mistake (or that you are being discriminated against) is too costly. As I stated earlier on, the only way is to judge the result (you didn't get the loan, so the bank manager or the algorithm must therefore be racist/sexist/Islamophobic/homophobic/minority-o-phobic/etc.) and not the algorithm.

    This is where the usage of algorithms gets out of hand. If you cannot explain what the algorithm is doing (what it used and how it used it), then you're essentially violating the basic definition of an algorithm. In other words, once you lose the ability to explain (in detail) how it came to its conclusion, it's kind of no longer an algorithm.

    I do think there's a serious legal issue here, since you otherwise have to be able to explain what influenced your decision. If you can't, that ultimately means you can't prove that you didn't discriminate.
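    For contrast, a white-box model can itemise a single decision. A toy sketch with scikit-learn's logistic regression (the loan data is invented):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Invented loan data: columns are [income, debt]; 1 = approved.
    X = np.array([[50, 5], [80, 10], [30, 20], [90, 2], [40, 25], [70, 8]])
    y = np.array([1, 1, 0, 1, 0, 1])

    model = LogisticRegression(max_iter=1000).fit(X, y)

    # With a linear model, the contribution of each input to one
    # specific decision is just coefficient * value, so a refusal can
    # be itemised feature by feature.
    applicant = np.array([35, 22])
    for name, c in zip(["income", "debt"], model.coef_[0] * applicant):
        print(f"{name}: {c:+.2f}")
    print(f"intercept: {model.intercept_[0]:+.2f}")

    # A neural network offers no such per-feature ledger out of the
    # box, which is the auditability gap argued about above.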

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • David.Poole (3/2/2016)


    Humans are good at spotting patterns even when there isn't one.

    An algorithm will be applied consistently and without bias or prejudice.

    Can you fool them if you understand them? Google Derren Brown.

    Remember humans are susceptible to the HiPPO principle: the Highest Paid Person's Opinion.

    Just have to be careful even here. There are many cases where using the pattern is precisely what gets you in trouble. Think profiling: at an aggregate level you might see behaviors that are more prevalent in certain groups, and a pattern matcher might decide that that's the quickest way to arrive at the decision (i.e. you're in that group, therefore you are in or out). Using that macro observation to rule on each individual case will land you in court. Just because I am more likely to perform a certain behavior doesn't automatically grant you the right to assume I will perform such behaviors.

    There are many cases here in the US where blindly (i.e. "consistently and without prejudice") applying what otherwise might be a statistically significant observation at the aggregate level to an individual level is precisely what lands you in hot water.
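    The base-rate arithmetic makes the point starkly (the numbers are invented for illustration):

    # Suppose a behaviour is twice as prevalent in group X: 2% vs 1%.
    # Flagging every member of group X on that basis is still wrong
    # about 98% of the individuals flagged.
    group_size = 100_000
    prevalence = 0.02

    flagged = group_size                  # "in the group, therefore flagged"
    actually_exhibit = int(group_size * prevalence)
    wrongly_flagged = flagged - actually_exhibit

    print(f"wrongly flagged: {wrongly_flagged:,} of {flagged:,} "
          f"({wrongly_flagged / flagged:.0%})")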

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Here's the upcoming danger of "algorithms"...

    http://www.imdb.com/title/tt0119177/plotsummary

    Too far-fetched for you? How about a dose of reality, then.

    http://www.eyeondna.com/2007/05/19/want-a-job-submit-your-dna/

    http://www.eyeondna.com/2009/10/29/getting-a-job-at-the-university-of-akron-could-require-a-dna-sample/

    As idiotic as they normally are ...

    http://www.eyeondna.com/2007/04/26/victory-for-genetic-information-nondiscrimination-act-gina/

    Unfortunately, it's not airtight and genetic discrimination does have a foothold in the U.S....

    http://www.eyeondna.com/2008/04/23/genetic-information-nondiscrimination-act-gina-nears-unanimous-consent-passage-in-us-senate/

    ... and from that link, I quote with some emphasis on the problem that prevails...

    Under GINA:

    • It would be illegal for insurance companies to raise premiums or deny coverage based on genetic information (see Genetic Testing and Health Insurance in the New York Times)

    • Employers are prohibited from using genetic information to hire, fire, promote, or assign jobs (see Want a job? Submit your DNA)

    Life and long-term care insurance coverages, however, are NOT part of GINA.

    I won't get into the totally ridiculous ways companies try to determine good candidates for any particular job, except to say that they miss a lot of diamonds in the rough and hire a lot of total jackasses because of their "algorithms".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The more the database can do on its own, the better. Less chance for the programmer to mess it up.
