Trying to determine fraudulent activity based on numbers dialed

  • Hello, I am in an industry that uses phone systems for detained individuals...what I have found is some of these individuals will dial numbers in sequence looking for the next victim (basically dialing trying to find someone to steal from). I am trying to figure out a way to find this dialing pattern so I can assist the facility in stopping them. For instance they will dial a phone number, then that number + 1, etc., and continue for dozens of iterations (not a typical scenario for a phone to be used like this). Here is a script to create a sample temp table:

    CREATE TABLE #btnsdialed

    (

    btn varchar(15)

    )

    INSERT INTO #btnsdialed VALUES (8172229001)

    INSERT INTO #btnsdialed VALUES (8172229002)

    INSERT INTO #btnsdialed VALUES (8172229003)

    INSERT INTO #btnsdialed VALUES (8172229004)

    INSERT INTO #btnsdialed VALUES (8172229004)

    INSERT INTO #btnsdialed VALUES (8172229005)

    INSERT INTO #btnsdialed VALUES (8172229006)

    INSERT INTO #btnsdialed VALUES (8172229017)

    INSERT INTO #btnsdialed VALUES (8172229008)

    INSERT INTO #btnsdialed VALUES (8172229009)

    INSERT INTO #btnsdialed VALUES (8172229005)

    INSERT INTO #btnsdialed VALUES (8172229016)

    INSERT INTO #btnsdialed VALUES (8172229007)

    INSERT INTO #btnsdialed VALUES (8172229008)

    INSERT INTO #btnsdialed VALUES (8172229039)

    INSERT INTO #btnsdialed VALUES (8172229010)

    INSERT INTO #btnsdialed VALUES (8172229011)

    What I am wanting to be able to do is run this on the entire facility and have it show me any phone that has consecutive dialing patterns where the numbers being dialed are consecutive more than X times (probably 5 times)...yes, the database has the number dialed stored as a varchar, adding to the complexity....

    I have found a few posts close, but could not get anything working.....thanks! BTW, if you can assist with this you may very well add time to a bad person on the inside.. 🙂

  • Here goes the pseudo-code...

    FOR EACH SourceNumber on DialedNumbersTable(

    LOAD CURSOR DialedNumbersCursor from DialedNumbersTable

    SET SeedNumber = DialedNumbersCursor(1)

    SET Matches = 1

    SET Pointer = 1

    LOOP CURSOR DialedNumbersCursor

    SET Pointer = Pointer + 1

    IF DialedNumbersCursor(Pointer) = SeedNumber + 1

    then SET Matches = Matches + 1

    ELSE

    SET Matches = 1

    SET SeedNumber = DialedNumbersCursor(Pointer)

    ENDIF

    IF Matches > 5

    then GOTCHA_ROUTINE() <<== Report Suspected Pattern

    ENDIF

    ENDLOOP

    END

    )

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • you probvably need to add more columns to the sample data;

    then you could use something like rownumber;

    here's a rough guess to get us started:

    SELECT * FROM (

    SELECT

    row_number() over(PARTITION BY USERID,EVENTDATE,LEFT(BUTTONSDIALED,7) ORDER BY USERID AS RW,

    USERID,

    EVENTDATE,

    BUTTONSDIALED

    From SampleData) MyAlias

    WHERE RW > 3 --more than 3 "randomish attmpts at # in teh same area, exchange, and first # of the last 4 digits

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I'm wondering if we are over thinking this a bit.... What about using "quirky update" to determine the mathematical difference between phone numbers in each row and then look for a pattern there? This would allow you to not only look for consecutive differences, but also possible patterns.

    Thanks,

    Jared

    Jared
    CE - Microsoft

  • The simple way to look for that kind of patterns in rows is to join a table to itself with the rules you want it to follow.

    For example:

    SELECT DISTINCT

    *

    FROM #btnsdialed AS B1

    INNER JOIN #btnsdialed AS B2

    ON LEFT(B1.btn, 9) = LEFT(B2.btn, 9)

    AND RIGHT(B1.btn, 1) = RIGHT(B2.btn, 1) - 1

    INNER JOIN #btnsdialed AS B3

    ON LEFT(B1.btn, 9) = LEFT(B3.btn, 9)

    AND RIGHT(B2.btn, 1) = RIGHT(B3.btn, 1) - 1

    INNER JOIN #btnsdialed AS B4

    ON LEFT(B1.btn, 9) = LEFT(B4.btn, 9)

    AND RIGHT(B3.btn, 1) = RIGHT(B4.btn, 1) - 1

    INNER JOIN #btnsdialed AS B5

    ON LEFT(B1.btn, 9) = LEFT(B5.btn, 9)

    AND RIGHT(B4.btn, 1) = RIGHT(B5.btn, 1) - 1 ;

    That will find 5 in a row, any place it appears in the table.

    It won't catch an iteration from 09 to 10 in the last two digits. If you want that, you need to just modify the Left() and Right() function calls to accommodate 2 digits being compared.

    Because it's doing math on strings, performance will be quite poor on large datasets. I'm not sure how poor, but it won't be good. There might be ways around that, I don't have time to play with that right now.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/22/2011)


    It won't catch an iteration from 09 to 10 in the last two digits. If you want that, you need to just modify the Left() and Right() function calls to accommodate 2 digits being compared.

    Because it's doing math on strings, performance will be quite poor on large datasets. I'm not sure how poor, but it won't be good. There might be ways around that, I don't have time to play with that right now.

    Phone numbers in the united states do not start with 0, so why not cast as an int and use math? OP: I assume there are no international calls allowed?

    Jared

    Jared
    CE - Microsoft

  • Here is an actual sample table with real data from the person committing the fraud: (I did change the area code to protect the innocent..)

    This is a specific phone from yesterday during a specific time frame, its obvious the guy is dialing for dollars.....

    CREATE TABLE #cd_sitea

    (

    btn varchar(25) ,

    phone int,

    cdate char(8),

    ctime char(4)

    )

    insert into #cd_sitea values (4447262104,605,20111121,1900)

    insert into #cd_sitea values (4447262105,605,20111121,1902)

    insert into #cd_sitea values (4447262106,605,20111121,1903)

    insert into #cd_sitea values (4447262107,605,20111121,1905)

    insert into #cd_sitea values (4447262092,605,20111121,1906)

    insert into #cd_sitea values (4447262092,605,20111121,1907)

    insert into #cd_sitea values (4447262108,605,20111121,1908)

    insert into #cd_sitea values (4447262109,605,20111121,1909)

    insert into #cd_sitea values (4447262110,605,20111121,1911)

    insert into #cd_sitea values (4447262111,605,20111121,1912)

    insert into #cd_sitea values (4447262092,605,20111121,1914)

    insert into #cd_sitea values (4447262092,605,20111121,1915)

    insert into #cd_sitea values (4447262112,605,20111121,1917)

    insert into #cd_sitea values (4447262113,605,20111121,1917)

    insert into #cd_sitea values (4447262114,605,20111121,1918)

    insert into #cd_sitea values (4447262066,605,20111121,1919)

    insert into #cd_sitea values (4447262065,605,20111121,1922)

    insert into #cd_sitea values (4447262067,605,20111121,1923)

    insert into #cd_sitea values (4447262066,605,20111121,1925)

    insert into #cd_sitea values (4447262115,605,20111121,1926)

    insert into #cd_sitea values (4447262117,605,20111121,1927)

    insert into #cd_sitea values (4447262118,605,20111121,1928)

    insert into #cd_sitea values (4447262119,605,20111121,1930)

    insert into #cd_sitea values (4447262120,605,20111121,1931)

    insert into #cd_sitea values (4447262121,605,20111121,1933)

    insert into #cd_sitea values (4447262122,605,20111121,1934)

    insert into #cd_sitea values (4447262123,605,20111121,1935)

    insert into #cd_sitea values (4447262066,605,20111121,1936)

    insert into #cd_sitea values (4447262125,605,20111121,1937)

    insert into #cd_sitea values (4443735708,605,20111121,1938)

    insert into #cd_sitea values (4447262126,605,20111121,1940)

    insert into #cd_sitea values (4447262127,605,20111121,1942)

    insert into #cd_sitea values (4447262128,605,20111121,1943)

    insert into #cd_sitea values (4447262129,605,20111121,1944)

    insert into #cd_sitea values (4447262130,605,20111121,1945)

    I just added 4000 plus rows in the zip attachment that you can put into a temp table, it is for all phones at one facility, again, I changed the area codes so they are not real numbers, however, the logic will apply, as you can see several phones where they are dialing to defraud people...the last ones we listend to the detained person claims to be from law enforcement and is trying to get an older person to forward their phone for "testing"....these guys will never learn!

    I really appreciate all the replies so far, this may actually get me somewhere, I was at a dead end!

  • They are allowed to dial international, but this routine can ignore those, as they do not typically dial international, so yes, all North American numbers start with a 2 thru 9. I have been using and len(btn) = 10 to restrict it to just North American numbers. The samples I have provided are the same...

  • You're going to find this a little odd but what it will do is give you indicators if those types of dialers are found. First, we need a little core table modification though. If that's completely out of the question take the phone numbers and dump them to a temp table for your script, basically like you've done above.

    Now, the table adjustments:

    ALTER TABLE #cd_sitea

    ADD NumericColumn BIGINT NULL

    UPDATE #cd_sitea

    SETNumericColumn = CASE WHEN isnumeric(btn) = 0 THEN NULL ELSE CONVERT( BIGINT, btn) END

    CREATE INDEX idx_cdsitea_NumCol ON #cd_sitea (phone, NumericColumn)

    And now for the code, using a little math and some sorting:

    ;WITH RowNumInc AS

    (SELECT

    phone,

    NumericColumn,

    ROW_NUMBER() OVER (PARTITION BY phone ORDER BY NumericColumn) AS rn

    FROM

    #cd_sitea

    )

    SELECT

    phone,

    NumericColumn - rn AS GroupingValue,

    COUNT(*) AS NumOccur

    FROM

    RowNumInc

    GROUP BY

    phone,

    NumericColumn - rn

    HAVING

    COUNT(*) > 5

    If you have any confusion about how that works just give a holler. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • p-nut (11/22/2011)


    GSquared (11/22/2011)


    It won't catch an iteration from 09 to 10 in the last two digits. If you want that, you need to just modify the Left() and Right() function calls to accommodate 2 digits being compared.

    Because it's doing math on strings, performance will be quite poor on large datasets. I'm not sure how poor, but it won't be good. There might be ways around that, I don't have time to play with that right now.

    Phone numbers in the united states do not start with 0, so why not cast as an int and use math? OP: I assume there are no international calls allowed?

    Jared

    Has nothing to do with starting with 0. "Last two digits", not "first two".

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Craig,

    This is woking really well! I am going to present a report to the facility today based on last nights activities, and hopefully get these guys in line. Working on adding in the description fields, etc now...thanks again!

  • gmcnitt (11/23/2011)


    Craig,

    This is woking really well! I am going to present a report to the facility today based on last nights activities, and hopefully get these guys in line. Working on adding in the description fields, etc now...thanks again!

    Glad to hear. 🙂 There's a few other things you can do to make that better long term but it'll get you runnin' for now.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • GSquared (11/23/2011)


    p-nut (11/22/2011)


    GSquared (11/22/2011)


    It won't catch an iteration from 09 to 10 in the last two digits. If you want that, you need to just modify the Left() and Right() function calls to accommodate 2 digits being compared.

    Because it's doing math on strings, performance will be quite poor on large datasets. I'm not sure how poor, but it won't be good. There might be ways around that, I don't have time to play with that right now.

    Phone numbers in the united states do not start with 0, so why not cast as an int and use math? OP: I assume there are no international calls allowed?

    Jared

    Has nothing to do with starting with 0. "Last two digits", not "first two".

    I see that, but then you have the full phone number being treated as an integer it does not matter. I think they have it figured out now anyway 🙂

    Jared

    Jared
    CE - Microsoft

  • It may be worth the effort to expand on this to look for other regular patterns. I would imagine eventually someone will figure out the sequential dialing is being detected, and alter it a bit. For example, phone# + 2 rather than 1, or same final four digits, but different area codes.

    Odds are, the patterns will not be complex, determining the next number, or where you left off would be a problem for most people to track themselves.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply