Count Data in Large Separated Text

  • How do I count part of the data in a field that contains large separated text, like the following?

    I have this table structure:

    ID, Name, Location

    1 Ronaldo Country:Spain, City:Madrid

    2 Messi Country:Spain, City:Barcelona

    3 Rooney Country:England, City:Manchester

    4 Robben Country:Netherlands, City:Amsterdam

    5 Del Piero Country:Italy, City:Milan

    6 Totti Country:Italy, City:Milan

    7 Benzema Country:Spain, City:Madrid

    8 Ozil Country:Spain, City:Madrid

    9 KAKA Country:Spain, City:Madrid

    10 Beckham Country:England, City:Manchester

    I want to write two SELECT statements against the previous table: one to get the count of countries and the other to get the count of cities.

    I want to get the country counts like this:

    Country Count

    Spain 5

    England 2

    Italy 2

    Netherlands 1

    And the SELECT statement for the city counts should return this:

    City Count

    Madrid 4

    Manchester 2

    Milan 2

    Amsterdam 1

    Barcelona 1

    So how can I write these two SELECT statements?

  • Will all the rows follow the same pattern?

    Country:<Countryname>, City:<cityname>

    There won't be any extra data in that Location column, will there?

  • No. Some rows may have a country only, without any city (no key and no value for the city), like this:

    Country:<Country>

    And we may add more details in the future, so it would look like this:

    Country:<Country>, City:<City>, StreetName:<StreetName>, PostalCode:<PostalCode>

  • Aw no!!! Don't concatenate values into a single column. Normalize and separate them out into columns, which will save you time and money!
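
    For instance, a minimal sketch of what a normalized layout could look like here (the table and column names below are just illustrative, not anything from the original post):

    -- Hypothetical normalized table: one row per player, location parts in their own columns
    CREATE TABLE dbo.Players
    (
        ID      INT         NOT NULL PRIMARY KEY,
        [Name]  VARCHAR(30) NOT NULL,
        Country VARCHAR(50) NOT NULL,
        City    VARCHAR(50) NULL    -- NULL when only the country is known
    )

    -- With that in place, both requested counts become plain GROUP BY queries
    SELECT Country, COUNT(*) AS [Count]
    FROM dbo.Players
    GROUP BY Country
    ORDER BY [Count] DESC

    SELECT City, COUNT(*) AS [Count]
    FROM dbo.Players
    WHERE City IS NOT NULL
    GROUP BY City
    ORDER BY [Count] DESC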

  • ColdCoffee (4/4/2012)


    Aw no!!! Don't concatenate values into a single column. Normalize and separate them out into columns, which will save you time and money!

    +1. You are seeing why normalizing is such a great thing and not just some academic fuddy-duddy rule that everyone thinks you should follow. Most of this stuff makes total common sense once you start to think about getting the data back out.


    For faster help in answering any problems, please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

    For performance issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

    Need to split some strings? Jeff Moden's DelimitedSplit8K
    Jeff Moden's Cross tab and Pivots Part 1
    Jeff Moden's Cross tab and Pivots Part 2

  • You really need to NORMALIZE that table. Other than that, this was the closest I could get with that data.

    -- note: the fixed offsets (9, 5) only line up for 5-character country names in the sample data
    SELECT DISTINCT
           SUBSTRING(Location, 9, 5) AS Country,
           COUNT(SUBSTRING(Location, 9, 5)) OVER (PARTITION BY SUBSTRING(Location, 9, 5)) AS [Count]
    FROM Ex
    ORDER BY [Count] DESC

    Vinu Vijayan

    For better and faster solutions please check... "How to post data/code on a forum to get the best help" - Jeff Moden 😉

  • Sure, I know normalization is the best solution 🙂

    but that is not an option in my situation... the requirement is for the data to stay the way it is... I can't change it.

  • I got what I wanted by doing the following.

    In my example the Location data looks like this:

    [Country] Egypt [City] Alexandria [Country] France [City] Alexandria

    [Country] USA [City] New York [Country] France [City] Cairo

    [Country] Spain [City] Madrid [Country] France [City] Cairo

    [Country] England [City] London

    ---- Count Separated Text Code --------

    DECLARE @res NVARCHAR(MAX)
    SET @res = ''
    DECLARE @x NVARCHAR(MAX)
    SET @x = ''

    CREATE TABLE #Result (item NVARCHAR(MAX))

    -- Replace the '[Country]' markers with commas so every value becomes comma-delimited
    INSERT INTO #Result
    SELECT REPLACE(CONVERT(NVARCHAR(MAX), Location), '[Country]', ',')
    FROM dbo.Locations

    -- Concatenate all rows into one big delimited string
    SELECT @res = @res + item + ', ' FROM #Result
    SELECT @x = SUBSTRING(@res, 1, LEN(@res) - 1)

    -- Split the string on commas and count the leading portion of each item
    -- (the text before the last '[' in that item)
    SELECT SUBSTRING(items, 0, (LEN(items) - CHARINDEX('[', REVERSE(items))) + 1),
           COUNT(items)
    FROM dbo.Split(@x, ',')
    WHERE SUBSTRING(items, 0, (LEN(items) - CHARINDEX('[', REVERSE(items))) + 1) != ''
    GROUP BY SUBSTRING(items, 0, (LEN(items) - CHARINDEX('[', REVERSE(items))) + 1)
    ORDER BY COUNT(items)

    ---- Count Separated Text Code --------

  • Split Function Code

    USE [Test]
    GO
    /****** Object: UserDefinedFunction [dbo].[Split] Script Date: 04/07/2012 19:03:11 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    ALTER FUNCTION [dbo].[Split] (@String VARCHAR(8000), @Delimiter CHAR(1))
    RETURNS @temptable TABLE (items VARCHAR(8000))
    AS
    BEGIN
        DECLARE @idx INT
        DECLARE @slice VARCHAR(8000)

        SELECT @idx = 1
        IF LEN(@String) < 1 OR @String IS NULL RETURN

        WHILE @idx != 0
        BEGIN
            -- Find the next delimiter and take everything before it as one item
            SET @idx = CHARINDEX(@Delimiter, @String)
            IF @idx != 0
                SET @slice = LEFT(@String, @idx - 1)
            ELSE
                SET @slice = @String

            IF LEN(@slice) > 0
                INSERT INTO @temptable (items) VALUES (@slice)

            -- Chop off the piece just processed and keep going
            SET @String = RIGHT(@String, LEN(@String) - @idx)
            IF LEN(@String) = 0 BREAK
        END
        RETURN
    END
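
    For reference, a quick usage example of the function above (the sample call is mine, not from the original post):

    -- Split one of the sample Location values on commas
    SELECT items
    FROM dbo.Split('Country:Spain, City:Madrid', ',')

    That call returns two rows, 'Country:Spain' and ' City:Madrid' (note the leading space on the second item).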

  • ahmedhussein787 (4/7/2012)


    Split Function Code


    Please see the following article for why you might want to stop using a split function with a WHILE loop in it.

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • OK, do you have any alternative?

  • ahmedhussein787 (4/8/2012)


    OK, do you have any alternative?

    The "alternative" high performance splitter code is attached to the end of that article in the "resources" section.

    --Jeff Moden



  • I must be missing something here... Like why is a split string function needed at all?

    DECLARE @l TABLE (ID INT, [Name] VARCHAR(30), Location VARCHAR(100))

    INSERT INTO @l (ID, [Name], Location)
    SELECT 1, 'Ronaldo', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 2, 'Messi', 'Country:Spain, City:Barcelona'
    UNION ALL SELECT 3, 'Rooney', 'Country:England, City:Manchester'
    UNION ALL SELECT 4, 'Robben', 'Country:Netherlands, City:Amsterdam'
    UNION ALL SELECT 5, 'Del Piero', 'Country:Italy, City:Milan'
    UNION ALL SELECT 6, 'Totti', 'Country:Italy, City:Milan'
    UNION ALL SELECT 7, 'Benzema', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 8, 'Ozil', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 9, 'KAKA', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 10, 'Beckham', 'Country:England, City:Manchester'
    UNION ALL SELECT 11, 'Dwain', 'Country:Thailand'

    ;WITH Splits AS (
        SELECT ID, [Name]
            -- Country: everything after 'Country:' up to the comma (or to the end of the string)
            ,SUBSTRING(Location, 8 + CHARINDEX('Country:', Location),
                CASE WHEN 0 = CHARINDEX(',', Location)
                     THEN LEN(Location)
                     ELSE CHARINDEX(',', Location) - 9 END) AS Country
            -- City: everything after 'City:' (empty string when there is no comma, i.e. no city)
            ,CASE WHEN 0 = CHARINDEX(',', Location)
                  THEN ''
                  ELSE SUBSTRING(Location, 5 + CHARINDEX('City:', Location), LEN(Location)) END AS City
        FROM @l
    )
    SELECT Country, COUNT(Country) AS [Count]
    FROM Splits
    GROUP BY Country

    --SELECT City, COUNT(City) AS [Count]
    --FROM Splits
    --GROUP BY City

    If only one country and one city, or one country (with no city), is present, you can just parse that straight out of the strings!

    Again, forgive me if I'm missing something.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
    Since random numbers are too important to be left to chance, let's generate some!
    Learn to understand recursive CTEs by example.

  • True enough, Dwain. :-) I was more concerned about the OP's bad split function than anything else.

    --Jeff Moden



  • Thanks Jeff. Thought I had a brain freeze there.

    If you like truly cryptic solutions that can lead to job security, you may want to consider this:

    DECLARE @l TABLE (ID INT, [Name] VARCHAR(30), Location VARCHAR(100))

    INSERT INTO @l (ID, [Name], Location)
    SELECT 1, 'Ronaldo', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 2, 'Messi', 'Country:Spain, City:Barcelona'
    UNION ALL SELECT 3, 'Rooney', 'Country:England, City:Manchester'
    UNION ALL SELECT 4, 'Robben', 'Country:Netherlands, City:Amsterdam'
    UNION ALL SELECT 5, 'Del Piero', 'Country:Italy, City:Milan'
    UNION ALL SELECT 6, 'Totti', 'Country:Italy, City:Milan'
    UNION ALL SELECT 7, 'Benzema', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 8, 'Ozil', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 9, 'KAKA', 'Country:Spain, City:Madrid'
    UNION ALL SELECT 10, 'Beckham', 'Country:England, City:Manchester'
    UNION ALL SELECT 11, 'Dwain', 'Country:Thailand'

    -- Turn each Location string into an XML fragment so that every Key:Value pair
    -- becomes an attribute that the value() method can read back out
    ;WITH CTE AS (
        SELECT CAST('<R><L ' + REPLACE(REPLACE(Location, ':', '="'), ',', '" ') + '"/></R>' AS XML) AS Y
        FROM @l)
    SELECT Country, COUNT(Country)
    FROM (
        SELECT x.I.value('@Country[1]', 'VARCHAR(MAX)') AS Country
        FROM (SELECT * FROM CTE) z
        CROSS APPLY Y.nodes('//R/L') AS x(I)) w
    GROUP BY Country

    ;WITH CTE AS (
        SELECT CAST('<R><L ' + REPLACE(REPLACE(Location, ':', '="'), ',', '" ') + '"/></R>' AS XML) AS Y
        FROM @l)
    SELECT City, COUNT(City)
    FROM (
        SELECT x.I.value('@City[1]', 'VARCHAR(MAX)') AS City
        FROM (SELECT * FROM CTE) z
        CROSS APPLY Y.nodes('//R/L') AS x(I)) w
    WHERE City IS NOT NULL
    GROUP BY City

    It turns out this will be extensible if, despite the recommendations of others not to do this, you extend your strings to contain other values besides City and Country. 😛
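
    For example, if the Location strings do later grow to include the extra keys mentioned earlier, the same REPLACE/CAST trick exposes them as additional attributes with no change to the parsing shape. The value below is made up purely for illustration:

    -- Hypothetical extended Location value with the extra keys the OP mentioned
    DECLARE @Location VARCHAR(200)
    SET @Location = 'Country:Spain, City:Madrid, StreetName:Gran Via, PostalCode:28013'

    SELECT x.I.value('@PostalCode[1]', 'VARCHAR(MAX)') AS PostalCode
    FROM (
        SELECT CAST('<R><L ' + REPLACE(REPLACE(@Location, ':', '="'), ',', '" ') + '"/></R>' AS XML) AS Y
    ) z
    CROSS APPLY z.Y.nodes('//R/L') AS x(I)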


