Count patterns of behaviour in a certain order

  • Hi,

    I'm looking for a method to count the number of people in a dataset of Peoples Actions who have completed these Actions in a certain order?

    Let's say there are 4 unique Actions and the people complete these actions in a chronological date order.

    Actions are as follows;

    1. Go to work

    2. Go shopping

    3. Go fishing

    4. Go home

    So Person 'A' does Action 1, then Action 2, then Action 3 and finally Action 4 in date order

    Person 'B' does Action 2, then Action 4, then Action 1 and then Action 3

    There are 1,000s of records so I want to know who did the following;

    How many People did their Actions in order of;

    1,2,3 & then 4

    4,2,1 & then 3

    2,3,1 & then 4

    etc etc.

    Thanks for any suggestions/help

  • Do you already have an existing database, or are you now looking at how best to design the database?

    If you already have the database and cannot change the design, then please post the structure of the tables (as CREATE TABLE statements, focusing only on columns that are relevant to the question - but do include constraints and indexes, please), along with a small selection of sample data (as INSERT statements) to illustrate the problem, and the expected results from that sample data.

    Please run your script in an empty database before posting so that we can focus on solving your problem instead of fixing typos.

    If you are designing the database, then I would say that this specific requirement could be a good reason to make an exception to the normal rule of avoiding repeating groups. For this query, a table that has an identifier for the person and four columns Action1, Action2, Action3, and Action4 would be ideal; the query then becomes extremely simple. But in most cases, you need to design a database for more than a single query, and most other parts of the application will probably not be happy with this design. When designing the database, you have to look at all the parts of the application, and if that makes this specific query harder, then so be it.

    There are also some things that are unclear in the requirements. Does every person always complete all four actions, or can they choose to do just two or three of them, or even do more than four by repeating an action (e.g. work - shop - work - fish - home)? Also is a person recorded just once, or can a person be recorded multiple times (e.g. yesterday I did the actions in order ABCD, but today I did them in order ACDB)? The answers to those questions are relevant for your design.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thanks for the response,

    Unfortunately the data is what it is, & not an error.

    I'm trying to count patterns of behavior, so those Actions can change.

    Currently, I'm working on four Actions to get this method right, so all records will have 4 Actions.

    So this question could be rephrased as;

    A Group of 10K People perform 4 Actions, they can perform them in any order bt must perform all of them.

    With this data, I want to know how many people completed the Actions in the following orders;

    1,2,3,4

    2,4,3,1

    4,2,3,1

    etc etc ...

    Thanks

  • You are repeating the question, but you are not giving the additional information we need. Without the CREATE TABLE and INSERT statements plus expected results I requested, I cannot help you.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Thanks Hugo,

    I'm just looking for the logic to count patterns of action in a sequence.

    If you're not aware of any, then thanks for your contribution.

    I'm thinking someone has come across this problem before & found a solution so it's a Method question rather than one that can be answered by having some data.

    Cheers Peter

  • Hi Forum,

    I have found an Excel VBA solution as an example of what I need.

    It's a function that uses some strings of what I want to find & Counts them.

    Does anyone have a similar function in SQL Server?

    Thanks

  • pbo71465 (4/3/2016)


    Thanks Hugo,

    I'm just looking for the logic to count patterns of action in a sequence.

    If you're not aware of any, then thanks for your contribution.

    I'm thinking someone has come across this problem before & found a solution so it's a Method question rather than one that can be answered by having some data.

    Cheers Peter

    Hi Peter,

    I do know ways to solve this issue. The generic pattern is SELECT (something) FROM (something) GROUP BY (something). The second something is the hardest to get right, and how to do that depends on the structure of the data you need to process.

    If you insist on asking for help without revealing sufficient details, then I can only wish you good luck - perhaps someone will be able to give you what you need. But I think you will find that you can get much better help if you tell us more about your situation.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • pbo71465 (4/3/2016)


    Thanks Hugo,

    I'm just looking for the logic to count patterns of action in a sequence.

    If you're not aware of any, then thanks for your contribution.

    I'm thinking someone has come across this problem before & found a solution so it's a Method question rather than one that can be answered by having some data.

    Cheers Peter

    find it a bit odd that you wont give some sample data....but heyho, thats your choice.

    as a "method" heres my thoughts

    assuming you have a table with three columns (persons and actions and dates) then I would add a ROW_NUMBER to the table based on person and date.

    Then create a cross tab of persons and above ROW_NUMBER() with the actiontype as output

    Then using the crosstab, perform a COUNT of persons and GROUP BY

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Here’s how I would do this. First, number your cases 1,2,3,4 using a cross apply block or similar with CASE statements (if they’re not already numbered in the appropriate sequence). Second, use ROW_NUMBER against a (possibly distincted) set filtered for only these cases. Partition by individual and order by date. Then filter out rows where row number is not equal to case number, and finally count what you have left, discarding counts less than 4. All this - in a single query - is trivially easy for most of the regulars who lurk around here.

    This isn’t the only method at your disposal but I think it would probably be the first method most of us would try.

    But wait – “try”? Try against what exactly? With only a description of the problem domain and no sample data to code against, we cannot even “try”. So no code, sorry.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • as a starter.....given the following what results would you expect?

    if this isnt a basic interpretation of your table, then please post a representative script that does explain your table structure

    USE Tempdb

    GO

    CREATE TABLE PersonAction(

    PersonID VARCHAR(1) NOT NULL

    ,actiondate DATETIME NOT NULL

    ,actiontype INTEGER NOT NULL

    );

    INSERT INTO PersonAction(PersonID,actiondate,actiontype) VALUES

    ('A','01/01/2016',1)

    ,('B','01/01/2016',1)

    ,('C','01/01/2016',1)

    ,('D','01/01/2016',2)

    ,('A','02/01/2016',2)

    ,('B','02/01/2016',2)

    ,('C','02/01/2016',2)

    ,('D','02/01/2016',4)

    ,('A','03/01/2016',3)

    ,('B','03/01/2016',4)

    ,('C','03/01/2016',3)

    ,('D','03/01/2016',2)

    ,('A','04/01/2016',4)

    ,('B','04/01/2016',3)

    ,('C','04/01/2016',4)

    ,('D','04/01/2016',1)

    ,('E','04/01/2016',3)

    ,('E','05/01/2016',4)

    ,('E','06/01/2016',1);

    SELECT PersonID,

    actiondate,

    actiontype

    FROM PersonAction

    ORDER BY PersonID,

    actiondate;

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • J Livingston SQL (4/4/2016)


    as a starter.....given the following what results would you expect?

    if this isnt a basic interpretation of your table, then please post a representative script that does explain your table structure

    USE Tempdb

    GO

    CREATE TABLE PersonAction(

    PersonID VARCHAR(1) NOT NULL

    ,actiondate DATETIME NOT NULL

    ,actiontype INTEGER NOT NULL

    );

    INSERT INTO PersonAction(PersonID,actiondate,actiontype) VALUES

    ('A','01/01/2016',1)

    ,('B','01/01/2016',1)

    ,('C','01/01/2016',1)

    ,('D','01/01/2016',2)

    ,('A','02/01/2016',2)

    ,('B','02/01/2016',2)

    ,('C','02/01/2016',2)

    ,('D','02/01/2016',4)

    ,('A','03/01/2016',3)

    ,('B','03/01/2016',4)

    ,('C','03/01/2016',3)

    ,('D','03/01/2016',2)

    ,('A','04/01/2016',4)

    ,('B','04/01/2016',3)

    ,('C','04/01/2016',4)

    ,('D','04/01/2016',1)

    ,('E','04/01/2016',3)

    ,('E','05/01/2016',4)

    ,('E','06/01/2016',1);

    SELECT PersonID,

    actiondate,

    actiontype

    FROM PersonAction

    ORDER BY PersonID,

    actiondate;

    See, if you would have posted a simple script setting up a small test with your layout and test data right away, someone would have come up with a solution very quickly. Now that J has done this job for you, here's a first attempt at what you've asked for:

    with ctePersons as (

    select distinct PersonID

    from PersonAction

    )

    select a.actions, count(*) as [count]

    from (

    select

    p.PersonID,

    stuff((

    select ',' + convert(varchar(36),pa.actionType) as [text()]

    from PersonAction pa

    where pa.PersonID = p.PersonID

    order by pa.actionDate

    for xml path(''), type

    ).value('.','varchar(2048)'), 1, 1, '') as actions

    from ctePersons p

    ) a

    group by a.actions;

    The results are as follows:

    actions(No column name)

    1,2,3,42

    1,2,4,31

    2,4,2,11

    3,4,11



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • would be appreciated if the OP would reply, rather than logging on and not making any further comment.

    heyho!

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • J Livingston SQL (4/6/2016)


    would be appreciated if the OP would reply, rather than logging on and not making any further comment.

    heyho!

    Counting patterns of unexpected behaviour in a certain order, Graham? 😉

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (4/6/2016)


    J Livingston SQL (4/6/2016)


    would be appreciated if the OP would reply, rather than logging on and not making any further comment.

    heyho!

    Counting patterns of unexpected behaviour in a certain order, Graham? 😉

    Chris ...dont think it is "unexpected behaviour"...beginning to see more and more of these types of questions where OP is given a solution and never responds...until the next question. All I can assume is that they are passing it off as their own code....just hope that they understand and can manage any subsequent issues.

    So...is it "expected behaviour" ?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • J Livingston SQL (4/6/2016)


    Chris ...dont think it is "unexpected behaviour"...beginning to see more and more of these types of questions where OP is given a solution and never responds...until the next question. All I can assume is that they are passing it off as their own code....just hope that they understand and can manage any subsequent issues.

    I frankly don't care if they pass it of as their own code. I use code posted by others very often, sometimes unchanged, sometimes modified, and sometimes I only use the idea of the solution and apply it to a very different problem. More often than not, I don't even remember if I came up with an idea myself or if I copied it (and whome I copied it from).

    What bothers me is when people don't respond with additional information when requested. I deiberately have not posted any solution because the problem was unclear. I could have made assumptions and posted something, but what if the assumptions are wrong? Far too often, the OP will still copy/paste the code and never notice that they are introducing bugs in their code.

    I don't mind helping people solve their problem, but it's their task to know what the problem is. If they are unable to define it, then any solution posted by anyone is a wild shot in the dark, and every poster who copies and uses such a solution is taking a huge risk - often without even being aware of the risk.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply