Retrieving First Word, First + Second Word, First + Second + Third Word, First

  • Hi All,

    I have the company names in one column. Need a sql server query to  fetch in separate column like

    Example:

    ABC private limited company

    First Word

    ABC

    First + Second Word

    ABC private

    First + Second + Third Word

    ABC private limited

    First + Second + Third + Fourth Word

    ABC private limited company

    Please find the attached excel sheet for reference.

     

    Attachments:
    You must be logged in to view attached files.
  • Here is something to get you started:

    WITH CompanyData AS 
    (
    SELECT * FROM (VALUES
    ('[24]7 Customer, Inc.'),
    ('137 Pillars House, Chiang Mai'),
    ('139 Holdings , Ltd.'),
    ('20Cube Logistics Solutions Pvt Ltd (MUMBAI, IN)'),
    ('24/7, Inc'),
    ('2GO Express, Inc.'),
    ('2Go Group, Inc.'),
    ('2s2, Inc.'),
    ('2Way World Co , Ltd.'),
    ('314 Fitness, Inc.'),
    ('3A Pharmaceutical Company, Ltd'),
    ('3d Fitness, LLC'),
    ('3D3 Com Pty, Ltd'),
    ('3dfit, Inc.'),
    ('3t Fitness, LLC'),
    ('4 Better Health, Inc'),
    ('42 Intertrade Co.,Ltd.'),
    ('4DX Imaging Technology Co., Ltd'),
    ('4FRONT Solutions, LLC')
    ) AS CompanyList (company)
    ),
    CTE2 AS
    (
    SELECT company,
    value AS word,
    ROW_NUMBER() OVER (PARTITION BY company ORDER BY (SELECT NULL)) AS rownum
    FROM CompanyData
    CROSS APPLY STRING_SPLIT(CompanyData.company, ' ')
    ),
    RecursiveCTE AS
    (
    SELECT company,
    word,
    rownum,
    CAST(word AS NVARCHAR(MAX)) AS accumulated
    FROM CTE2
    WHERE rownum = 1
    UNION ALL
    SELECT r.company,
    c.word,
    c.rownum,
    r.accumulated + ' ' + c.word
    FROM RecursiveCTE r
    JOIN CTE2 c ON r.company = c.company AND r.rownum + 1 = c.rownum
    )
    SELECT company,
    MAX(CASE WHEN rownum = 1 THEN word END) AS [1 Word],
    MAX(CASE WHEN rownum = 2 THEN accumulated END) AS [2 Words],
    MAX(CASE WHEN rownum = 3 THEN accumulated END) AS [3 Words],
    MAX(CASE WHEN rownum = 4 THEN accumulated END) AS [4 Words],
    MAX(CASE WHEN rownum = 5 THEN accumulated END) AS [5 Words],
    MAX(CASE WHEN rownum = 6 THEN accumulated END) AS [6 Words],
    MAX(CASE WHEN rownum = 7 THEN accumulated END) AS [7 Words]
    FROM RecursiveCTE
    GROUP BY company;
  • What is the Primary key column name and datatype for the table where these company names are stored?

    And, are you really using SQL Server 2022?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • And, no... I wouldn't use a Recursive CTE for this because they're just too expensive for Reads.  A well written WHILE loop is more effective.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • @sathishkm-2 - Please explain the business reason for why you need to do this.  Is it in hopes of making a more viable search for a company name?

    Ok... shifting gears to the solution and in the absence of a PK and IF you're really using SQL Server 2022...

    Here's how you should submit test data to help folks that will help you...

    --===== Create the test table. This is NOT a part of the solution!
    DROP TABLE IF EXISTS #TestTable;
    GO
    SELECT v1.CompanyName
    INTO #TestTable
    FROM (VALUES
    ('[24]7 Customer, Inc.')
    ,('137 Pillars House, Chiang Mai')
    ,('139 Holdings , Ltd.')
    ,('20Cube Logistics Solutions Pvt Ltd (MUMBAI, IN)')
    ,('24/7, Inc')
    ,('2GO Express, Inc.')
    ,('2Go Group, Inc.')
    ,('2s2, Inc.')
    ,('2Way World Co , Ltd.')
    ,('314 Fitness, Inc.')
    ,('3A Pharmaceutical Company, Ltd')
    ,('3d Fitness, LLC')
    ,('3D3 Com Pty, Ltd')
    ,('3dfit, Inc.')
    ,('3t Fitness, LLC')
    ,('4 Better Health, Inc')
    ,('42 Intertrade Co.,Ltd.')
    ,('4DX Imaging Technology Co., Ltd')
    ,('4FRONT Solutions, LLC')
    )v1(CompanyName)

    ;
    GO

    Here's some code that will only work in SQL Server 2022.  There's a healthy alternative if your not but we need to know if you're NOT using 2022 to provide that.  This code could be "DRYed" out a bit more using CROSS APPLY to "cascade" putting the pieces back together but probably not worth it here.  Details of what each section of the code does are in the comments.

       WITH 
    cteEnumerateCompanyName AS
    (--==== In the absence of a PK, enumerate each row to put things back together later.
    SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    ,CompanyName
    FROM #TestTable
    )
    ,cteSplit AS
    (--==== Split each rows on spaces and only keep up to the first 4 "words"
    -- and put each part on the same row using a "CROSSTAB", which is frequently "better" than a PIVOT
    -- https://www.sqlservercentral.com/articles/cross-tabs-and-pivots-part-1-converting-rows-to-columns-1
    SELECT ecn.RN
    ,CompanyName
    ,Word_1 =MAX(IIF(splt.ordinal = 1, splt.value,''))
    ,Word_2 =MAX(IIF(splt.ordinal = 2, splt.value,''))
    ,Word_3 =MAX(IIF(splt.ordinal = 3, splt.value,''))
    ,Word_4 =MAX(IIF(splt.ordinal = 4, splt.value,''))
    FROM cteEnumerateCompanyName ecn
    CROSS APPLY STRING_SPLIT(CompanyName,' ',1) splt --This is the part that limits the code to 2022.
    GROUP BY ecn.RN, CompanyName
    )--==== Put the first 4 words back together in stages as requested.
    SELECT CompanyName
    ,Word_1
    ,Word_1_2 = CONCAT_WS(' ', Word_1, Word_2)
    ,Word_1_2_3 = CONCAT_WS(' ', Word_1, Word_2, Word_3)
    ,Word_1_2_3_4 = CONCAT_WS(' ', Word_1, Word_2, Word_3, Word_4)
    FROM cteSplit
    ;

    And here are the results...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Or maybe just this:

    SELECT CompanyName, 
    LEFT(CompanyName, EndOfWord1) AS [First Word],
    LEFT(CompanyName, EndOfWord2) AS [First + Second Word],
    LEFT(CompanyName, EndOfWord3) AS [First + Second + Third Word],
    LEFT(CompanyName, EndOfWord4) AS [First + Second + Third + Fourth Word]
    FROM #TestTable
    CROSS APPLY (
    SELECT CHARINDEX(' ', CompanyName + ' ') - 1 AS EndOfWord1
    ) AS ca1
    CROSS APPLY (
    SELECT CASE WHEN CHARINDEX(' ', CompanyName, EndOfWord1 + 2) = 0 THEN LEN(CompanyName)
    ELSE CHARINDEX(' ', CompanyName, EndOFWord1 + 2) - 1 END AS EndOfWord2
    ) AS ca2
    CROSS APPLY (
    SELECT CASE WHEN CHARINDEX(' ', CompanyName, EndOfWord2 + 2) = 0 THEN LEN(CompanyName)
    ELSE CHARINDEX(' ', CompanyName, EndOFWord2 + 2) - 1 END AS EndOfWord3
    ) AS ca3
    CROSS APPLY (
    SELECT CASE WHEN CHARINDEX(' ', CompanyName, EndOfWord3 + 2) = 0 THEN LEN(CompanyName)
    ELSE CHARINDEX(' ', CompanyName, EndOFWord3 + 2) - 1 END AS EndOfWord4
    ) AS ca4

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    Or maybe just this:

    NNNNNNNNNIIIIIIIIICCCCCCEEEEEEE!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • An alternative

    select v1.CompanyName,
    left(v1.CompanyName, sum(iif(ss.ordinal<=1, v.str_len, 0))) _1_word,
    left(v1.CompanyName, sum(iif(ss.ordinal<=2, v.str_len, 0))+1) _2_word,
    left(v1.CompanyName, sum(iif(ss.ordinal<=3, v.str_len, 0))+2) _3_word,
    left(v1.CompanyName, sum(iif(ss.ordinal<=4, v.str_len, 0))+3) _4_word
    from #TestTable v1
    cross apply string_split(v1.CompanyName, ' ', 1) ss
    cross apply (values (len(ss.[value]))) v(str_len)
    where ss.ordinal<=4
    group by v1.CompanyName
    order by v1.CompanyName;

    • This reply was modified 11 months, 2 weeks ago by  Steve Collins.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Also NNNNIIIIIICCCCCCCEEEEEE!

    If I get some time, I try to do some performance tests on the lot tonight.  This is a part of why I love this community!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This was removed by the editor as SPAM

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply