Group the common word

  • Hey,

    Can you please provide some on idea on group the common word?

    Example:

    Code Desc

    A1 AAAA BBBB CCCC

    A1.1 AAAA BBBB CCCC DDDDD

    A1.2 AAAA BBBB CCCC DDDDD EEEEE

    A2 XXXX

    A2.1 XXXX YYYY

    A2.3 XXXX YYYY ZZZZ

    Output will be

    Code Desc Desc2 Desc3 Desc4

    A1 AAAA BBBB CCCC AAAA BBBB CCCC

    A1.1 AAAA BBBB CCCC DDDDD AAAA BBBB CCCC DDDDD

    A1.2 AAAA BBBB CCCC DDDDD EEEEE AAAA BBBB CCCC DDDDD EEEEE

    A2 XXXX XXXX

    A2.1 XXXX YYYY XXXX YYYY

    A2.3 XXXX YYYY ZZZZ XXXX YYYY ZZZZ

    Thanks,

    Karthik

  • You said grouping, but the output shows replicating the existing information? Did you really mean that or am I missing something?

  • Krtyknm (8/6/2012)


    Hey,

    Can you please provide some on idea on group the common word?

    Example:

    Code Desc

    A1 AAAA BBBB CCCC

    A1.1 AAAA BBBB CCCC DDDDD

    A1.2 AAAA BBBB CCCC DDDDD EEEEE

    A2 XXXX

    A2.1 XXXX YYYY

    A2.3 XXXX YYYY ZZZZ

    Output will be

    Code Desc Desc2 Desc3 Desc4

    A1 AAAA BBBB CCCC AAAA BBBB CCCC

    A1.1 AAAA BBBB CCCC DDDDD AAAA BBBB CCCC DDDDD

    A1.2 AAAA BBBB CCCC DDDDD EEEEE AAAA BBBB CCCC DDDDD EEEEE

    A2 XXXX XXXX

    A2.1 XXXX YYYY XXXX YYYY

    A2.3 XXXX YYYY ZZZZ XXXX YYYY ZZZZ

    Thanks,

    Karthik

    Your input and output seem to contain random numbers of columns. Please provide sample data and results in a more consumable format, as per the link in my signature.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • well this kind of gets you the data chopped up into the pieces youa r elooking for;

    i'm using the famous DelimitedSplit8K you'll find links to in many posters signitures, and you can search teh site as well:

    With MyCTE( Code,[Desc])

    AS

    (

    SELECT 'A1','AAAA BBBB CCCC' UNION ALL

    SELECT 'A1.1','AAAA BBBB CCCC DDDDD' UNION ALL

    SELECT 'A1.2','AAAA BBBB CCCC DDDDD EEEEE' UNION ALL

    SELECT 'A2','XXXX' UNION ALL

    SELECT 'A2.1','XXXX YYYY' UNION ALL

    SELECT 'A2.3','XXXX YYYY ZZZZ'

    )

    select * From MyCTE

    cross apply dbo.DelimitedSplit8K([Desc],' ') x

    i have no idea what you wanted for your final outoput; it was just not intuitive what you are after.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • It looks to me like you're just repeating the contents of each row. "A B C" becomes "A B C A B C" (I just took out the repeating letters to make it a bit easier to read). Each row seems to do exactly that.

    But that doesn't match the columns you listed in your output.

    So, I really can't tell what you're trying to do here. If it's just repeat a value, then just add the value (as a string) to itself (as a string), and you'll have it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hey Guys,

    Sorry for providing in incorrect format.

    Please find the below.

    With MyCTE( Code,[Desc])

    AS

    (

    SELECT 'A1','AAAA BBBB CCCC' UNION ALL

    SELECT 'A1.1','AAAA BBBB CCCC DDDDD' UNION ALL

    SELECT 'A1.2','AAAA BBBB CCCC DDDDD EEEEE' UNION ALL

    SELECT 'A2','XXXX' UNION ALL

    SELECT 'A2.1','XXXX YYYY' UNION ALL

    SELECT 'A2.3','XXXX YYYY ZZZZ'

    )

    select * From MyCTE

    ;With MyCTEOutput( Code,[Desc],[Desc1],Desc2,Desc3)

    AS

    (

    SELECT 'A1','AAAA BBBB CCCC','AAAA BBBB CCCC','','' UNION ALL

    SELECT 'A1.1','AAAA BBBB CCCC DDDDD','AAAA BBBB CCCC','DDDDD','' UNION ALL

    SELECT 'A1.2','AAAA BBBB CCCC DDDDD EEEEE', 'AAAA BBBB CCCC','DDDDD','EEEEE' UNION ALL

    SELECT 'A2','XXXX','','','' UNION ALL

    SELECT 'A2.1','XXXX YYYY','XXXX','YYYY','' UNION ALL

    SELECT 'A2.3','XXXX YYYY ZZZZ','XXXX','YYYY','ZZZZ'

    )

    SELECT * FROM MyCTEOutput

    Basically, we need to keep the common word in all the rows.

    Thanks,

    Karthik

  • Do you know before-hand how many columns you'll have in the final output? If not, you'll need some sort of dynamic SQL to write the query for you. That'll get complex fast!

    Alternatively, can the output be in XML format? If so, then the variable "columns" doesn't matter. A recursive For XML query will do what's needed for that.

    If it's a fixed number of columns, then you can do this with a recursive CTE. It's just a hierarchy crawl, with variable data in the "repeating sets" columns.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Ok. Here the columns will be added dynamically based on the first row and second row description value. If the first description value is AAA BBB CCC and second description value os AAA BBB CCC DDD EEE, then two coulmn will be added for DDD and EEE.

    So we need to separate the common words in the column desc for the code A1 (group).

    Let me know if it is not make sense.

    Thanks,

    Karthik

  • That makes sense.

    That means a variable number of columns, from what I can see. That's fairly complex to code, unless XML is an option for the output.

    Will the application that reads this data accept variable columns? Most won't, but it can be built to do so in some cases.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Yes, variable columns are accepted in the code.

  • Okay. So what you'll need to do is break the data up by word, so you have the row ID (A, A1) and each "word" as a separate row. Look at Jeff Moden's article on string parsing, on this site, to get an idea of how to do that.

    Then you'll crawl the hierarchy (if it's more than one level deep), or join to the parent directly (if the hierarchy is only one level deep), by row, parent, and "word". That will give you the common words.

    Then take the rest of the words and run a dynamic pivot on them. If you Bing/Google "T-SQL dynamic pivot", you'll find good articles on that subject. I think Jeff Moden has one on that, too. If not, I think Phil Factor and Robyn Page have one on Simple-Talk.com.

    It'll be a little complex, but if you do what those articles say, you'll be able to get it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I have parsed the string using space as a delimmiter, however not sure how to make it as a hierarchy. Also i need to the keep the first row value in all the rows within a group (A0). Can you please provide some guidance.

    Thanks,

    Karthik

  • Guys,

    Thanks for your help.

    I have worked out using Recursive CTE to get the desired results.

    Thanks,

    Karthik

  • Krtyknm (8/7/2012)


    Guys,

    Thanks for your help.

    I have worked out using Recursive CTE to get the desired results.

    Thanks,

    Karthik

    This is a two way street, Karthik. If you came up with a solution, the right thing to do would be to share it here with the folks that tried to help. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply