Query to group and count data elements from a single column

  • In records 1, 2 and 3 (plus 3000 more rows), we have the following sample data from table SAMPLE_DATA, COL01:

    The brown cat jumped over the fence
    The red fox dug under the fence
    The white rabbit ran past the fence

    I need to list the elements of each Col1 value and compare them to the rest of the elements from other Col1 records.
    A SQL statement that produces the following output listed in order from most occurring to least occurring, then alphabetically
    The number is the count of how many times the element appears in the data set and the text is the element.
    I only need the top 1000 rows.  Capitals and lower case should be treated as the same letter.

    (6) the
    (3) fence
    (1) brown
    (1) cat
    (1) dug
    (1) fox
    (1) jumped
    (1) over
    (1) past
    (1) rabbit
    (1) ran
    (1) red
    (1) under
    (1) white

    Any suggestions would be very helpful.  Thank you.

  • paulgq - Wednesday, March 6, 2019 12:15 AM

    In Col1, records 1, 2 and 3 we have the following data:

    The brown cat jumped over the fence
    The red fox dug under the fence
    The white rabbit ran past the fence

    I need to list the elements of each Col1 value and compare them to the rest of the elements from other Col1 records.
    A SQL statement that produces the following output listed in order from most occurring to least occurring, then alphabetically
    The number is the count of how many times the element appears in the data set and the text is the element.
    I only need the top 1000 rows.  Capitals and lower case should be treated as the same letter.

    (6) the
    (3) fence
    (1) brown
    (1) cat
    (1) dug
    (1) fox
    (1) jumped
    (1) over
    (1) past
    (1) rabbit
    (1) ran
    (1) red
    (1) under
    (1) white

    Any suggestions would be very helpful.  Thank you.

    Here is one way of doing this
    😎


    USE TEEST;
    GO
    SET NOCOUNT ON;

    ;WITH SAMPLE_DATA(COL01) AS
    (
      SELECT 'The brown cat jumped over the fence'  UNION ALL
      SELECT 'The red fox dug under the fence'   UNION ALL
      SELECT 'The white rabbit ran past the fence'
    )
    SELECT
      WRD.[value] AS WORD
     ,COUNT(*) AS WCNT
    FROM SAMPLE_DATA  SD
    CROSS APPLY string_split(SD.COL01,CHAR(32)) WRD
    GROUP BY WRD.[value]
    ORDER BY WCNT DESC;

    Output

    WORD    WCNT
    ------- -----
    The     6
    fence   3
    fox     1
    jumped  1
    over    1
    past    1
    rabbit  1
    ran     1
    red     1
    under   1
    white   1
    brown   1
    cat     1

  • that seems elegant.  but I'm confused about this part

    ;WITH SAMPLE_DATA(COL01) AS
    (
    SELECT 'The brown cat jumped over the fence' UNION ALL
    SELECT 'The red fox dug under the fence'  UNION ALL
    SELECT 'The white rabbit ran past the fence'
    )

    There are hundreds of rows that have all different values.
    How do I select all of the rows from SAMPLE_DATA COL01 ?

    Thanks,

  • paulgq - Wednesday, March 6, 2019 1:45 AM

    that seems elegant.  but I'm confused about this part

    ;WITH SAMPLE_DATA(COL01) AS
    (
    SELECT 'The brown cat jumped over the fence' UNION ALL
    SELECT 'The red fox dug under the fence'  UNION ALL
    SELECT 'The white rabbit ran past the fence'
    )

    There are hundreds of rows that have all different values.
    How do I select all of the rows from SAMPLE_DATA COL01 ?

    Thanks,

    The SAMPLE_DATA CTE is simply there to provide sample data for the method, you will simply skip that and point to the table/column you are going to use.
    😎

    What version of SQL Server are you using?
    If you are on SQL Server < 2012 then use this function.
    If you are on SQL Server 2012 or 2014 then use this function.
    For SQL Server > 2014, use the string_split function.

  • ok, think that I understand.  my statement will be

    SELECT WRD.[value] AS WORD, COUNT(*) AS WCNT
    FROM SAMPLE_DATA
    CROSS APPLY string_split(SAMPLE_DATA.COL01,CHAR(32)) WRD
    GROUP BY WRD.[value]
    ORDER BY WCNT DESC;

    I am using MS Access so I am unsure if the above syntax will work correctly or not.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply