how to split following string

  • hi all,

    i have a string as 'a;b;cde;f;ghi;d' or 'a;b;cde;f;ghi;defg'

    now i want to split the given string as 'a;b;cde;f;ghi;' or 'a;b;cde;f;ghi;'

    i tried using substring,left,charindex functions but i did not get the required output

    can anybody help me out to solve this issue

    thanks in advance

    regards

    Durgesh J

  • seem easy or maybe I don't understand very well your question here:

    there is one of the solution:

    CREATE TABLE TTT

    (

    ID INT IDENTITY (1,1) NOT NULL,

    CHARS varchar(100)

    )

    INSERT INTO TTT

    SELECT 'a;b;cde;f;ghi;d' UNION ALL

    SELECT 'a;b;cde;f;ghi;defg'

    SELECT CHARS, LEFT(CHARS,14), SUBSTRING(CHARS,15,15)

    FROM TTT;

    Hope it helps!

    :hehe:

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • DURGESH (2/13/2009)


    hi all,

    i have a string as 'a;b;cde;f;ghi;d' or 'a;b;cde;f;ghi;defg'

    now i want to split the given string as 'a;b;cde;f;ghi;' or 'a;b;cde;f;ghi;'

    i tried using substring,left,charindex functions but i did not get the required output

    can anybody help me out to solve this issue

    thanks in advance

    regards

    Durgesh J

    I believe you're going to get a lot of answers that might not be what you want because of the sample data you've shown. If you want some really good answers, you have to ask a really good question... see the link in my signature below. To summarize, if you provide a CREATE TABLE statment and some real test data in the form of insert statements, you probably do a lot better.

    For example, you already have one solution that uses a fixed length to find the 4 delimiter... but is that the way your data really is? Read the link, post the data like it says how to.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, I hear a non-RBAR solution coming... anyway, thanks for the awesome presentation last night at the SEMSUG at Microsoft!

    I had a RBAR "discussion" with a CTO once. He still didn't believe my solution even after my solution took 60 seconds compared to his RBAR solution that took 3 hours.

    _______________
    bkDBA
    0.175 seconds -- 10 year average margin of victory at the Daytona 500
    212 Degrees

  • bkDBA (2/13/2009)


    Jeff, I hear a non-RBAR solution coming... anyway, thanks for the awesome presentation last night at the SEMSUG at Microsoft!

    I had a RBAR "discussion" with a CTO once. He still didn't believe my solution even after my solution took 60 seconds compared to his RBAR solution that took 3 hours.

    Thanks for the great feedback, bk... especially about the SEMSUG presentation... that was a whole lot of fun and it was a great group of people.

    Have you a name to go with the handle of bkDBA that you wouldn't mind sharing?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Brian Karcher

    I'm the IT Manager at Progessive Metal.

    _______________
    bkDBA
    0.175 seconds -- 10 year average margin of victory at the Daytona 500
    212 Degrees

  • bkDBA (2/14/2009)


    Brian Karcher

    I'm the IT Manager at Progessive Metal.

    Thanks, Brian. My short term memory is pretty well shot... correct me if I'm wrong, please... you weren't one of the folks that came up and introduced yourself after the presentation, were you?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Another version, assumes that data should be removed after last delimiter

    CREATE TABLE TTT

    (

    ID INT IDENTITY (1,1) NOT NULL,

    CHARS varchar(100)

    )

    INSERT INTO TTT

    SELECT 'a;b;cde;f;ghi;d' UNION ALL

    SELECT 'a;b;cde;f;ghi;defg'

    SELECT CHARS, REVERSE(SUBSTRING( REVERSE(CHARS),CHARINDEX(';',REVERSE(CHARS)),LEN(REVERSE(CHARS))))

    FROM TTT;

  • Gopi Nath Muluka (2/17/2009)


    Another version, assumes that data should be removed after last delimiter

    CREATE TABLE TTT

    (

    ID INT IDENTITY (1,1) NOT NULL,

    CHARS varchar(100)

    )

    INSERT INTO TTT

    SELECT 'a;b;cde;f;ghi;d' UNION ALL

    SELECT 'a;b;cde;f;ghi;defg'

    SELECT CHARS, REVERSE(SUBSTRING( REVERSE(CHARS),CHARINDEX(';',REVERSE(CHARS)),LEN(REVERSE(CHARS))))

    FROM TTT;

    Be careful! 😉 REVERSE is one of the slowest functions there is.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/13/2009)


    DURGESH (2/13/2009)


    hi all,

    i have a string as 'a;b;cde;f;ghi;d' or 'a;b;cde;f;ghi;defg'

    now i want to split the given string as 'a;b;cde;f;ghi;' or 'a;b;cde;f;ghi;'

    i tried using substring,left,charindex functions but i did not get the required output

    can anybody help me out to solve this issue

    thanks in advance

    regards

    Durgesh J

    I believe you're going to get a lot of answers that might not be what you want because of the sample data you've shown. If you want some really good answers, you have to ask a really good question... see the link in my signature below. To summarize, if you provide a CREATE TABLE statment and some real test data in the form of insert statements, you probably do a lot better.

    For example, you already have one solution that uses a fixed length to find the 4 delimiter... but is that the way your data really is? Read the link, post the data like it says how to.

    Apparently, the op has left the building... and so have I. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Give him or her time, might be on the other side of the world.

    If you compare REVERSE with SUBSTRING or other string functions, how much slower is it? Any benchmarks?

  • Steve Jones - Editor (2/17/2009)


    Give him or her time, might be on the other side of the world.

    If you compare REVERSE with SUBSTRING or other string functions, how much slower is it? Any benchmarks?

    Heh... I asked the op a question on the first day of this thread... that was 5 full days before I gave up on the op. How much time are you supposed to give someone in a world connected by the internet? I thought 5 days was more than enough.

    For your question, I don't have a bench mark... But, now that someone has asked, I will tonight... I'll use the code given with the original example given and compare it to what some of the answers given without reverse are.

    I'd provide my own answer, but the op hasn't answer my simple question, yet.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 5 days is plenty. Sorry, missed the OP date.

    I'm interested to see the benchmarks. I know doing this in C would be complicated, but not tremendously more than a search like charindex or substring.

  • Jeff Moden (2/14/2009)


    Thanks, Brian. My short term memory is pretty well shot... correct me if I'm wrong, please... you weren't one of the folks that came up and introduced yourself after the presentation, were you?

    No I wasn't... so your memory is still intact! I'd be very interested in some more statistics with your approaches. I love proving to people I can do something in 5 seconds when they swear it's not possible and will take at least 1 hour in .NET, unless we add more processors...

    _______________
    bkDBA
    0.175 seconds -- 10 year average margin of victory at the Daytona 500
    212 Degrees

  • Steve Jones - Editor (2/18/2009)


    5 days is plenty. Sorry, missed the OP date.

    I'm interested to see the benchmarks. I know doing this in C would be complicated, but not tremendously more than a search like charindex or substring.

    Ok... here we go... first, some test data (a piddly million rows). As usual, read the comments in the code to see what's going on...

    --===== Create and populate a 1,000,000 row test table.

    -- Column "RowNum" has a range of 1 to 1,000,000 unique numbers

    -- Column "SomeString" consists of 6 delimited groups of 1 to 3 characters each

    -- Jeff Moden

    SELECT TOP 1000000

    RowNum = IDENTITY(INT,1,1),

    SomeString = CAST(RIGHT(NEWID(),ABS(CHECKSUM(NEWID()))%3+1) +';'

    + RIGHT(NEWID(),ABS(CHECKSUM(NEWID()))%3+1) +';'

    + RIGHT(NEWID(),ABS(CHECKSUM(NEWID()))%3+1) +';'

    + RIGHT(NEWID(),ABS(CHECKSUM(NEWID()))%3+1) +';'

    + RIGHT(NEWID(),ABS(CHECKSUM(NEWID()))%3+1) +';'

    + RIGHT(NEWID(),ABS(CHECKSUM(NEWID()))%3+1)

    AS VARCHAR(50))

    INTO dbo.JBMTest

    FROM Master.dbo.SysColumns t1

    CROSS JOIN Master.dbo.SysColumns t2

    --===== Add a clustered primary key just because

    ALTER TABLE dbo.JBMTest

    ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Display the

    SELECT TOP 100 * FROM dbo.JBMTest

    Ok... now some benchmark code. Notice that I short circuit the display by inserting all output into "bit bucket" variables so we're not measuring any display durations. Also notice that SUBSTRING beats LEFT most of the time. ALSO notice that a single REVERSE almost doubles the time and 4 REVERSES almost quadruples the time when compared to SUBSTRING...

    --===== Declare the bit bucket variables used to short circuit the display

    DECLARE @BitBucketCHAR VARCHAR(50)

    DECLARE @BitBucketINT INT

    --===== Start the CPU and duration timers

    SET STATISTICS TIME ON

    --===== Run the tests... the PRINT statements identify each test in the output

    PRINT '==================== No Mod''s (Baseline) ===================='

    SELECT @BitBucketCHAR = SomeString

    FROM dbo.JBMTest

    PRINT '==================== SUBSTRING(1,15) ===================='

    SELECT @BitBucketCHAR = SUBSTRING(SomeString,1,15)

    FROM dbo.JBMTest

    PRINT '==================== LEFT(15) ===================='

    SELECT @BitBucketCHAR = LEFT(SomeString,15)

    FROM dbo.JBMTest

    PRINT '==================== Reverse ===================='

    SELECT @BitBucketCHAR = REVERSE(SomeString)

    FROM dbo.JBMTest

    PRINT '==================== 4 Reverse''s ===================='

    SELECT @BitBucketCHAR = REVERSE(REVERSE(REVERSE(REVERSE(SomeString))))

    FROM dbo.JBMTest

    PRINT '==================== CharIndex ===================='

    SELECT @BitBucketINT = CHARINDEX(';',SomeString)

    FROM dbo.JBMTest

    PRINT '==================== 5 CharIndex''s ===================='

    SELECT @BitBucketINT = CHARINDEX(';',SomeString,CHARINDEX(';',SomeString,CHARINDEX(';',SomeString,CHARINDEX(';',SomeString,CHARINDEX(';',SomeString)+1)+1)+1)+1)

    FROM dbo.JBMTest

    PRINT '==================== LEN() ===================='

    SELECT @BitBucketINT = LEN(SomeString)

    FROM dbo.JBMTest

    --===== Turn of the CPU and duration timers

    SET STATISTICS TIME OFF

    So, like I warned, be careful because REVERSE is one of the most expensive functions there is.

    We also see that if the data is exactly as the OP posted and the combined length of the first five segments is always 15, a simple SUBSTRING(aaaa,1,15) will win the race hands down. BUT, if that's NOT the case and the length of the first five segments can vary AND there are always just six segments, THEN you have to find the 5th and final delmiter to do the substring. 4 REVERSES just isn't necessary especially since it's so comparatively expensive. And why anyone would need to do a REVERSE on a LEN() is a bit beyond me! 😉

    So, here's a test of the 4 REVERSES compared to the way I think it should be... and, no, a Tally table solution would NOT be the best thing for this because the combination of CHARINDEX and a single REVERSE is faster than what a Tally can find the last delimiter in...

    --===== Declare the bit bucket variable used to short circuit the display

    DECLARE @BitBucketCHAR VARCHAR(50)

    --===== Start the CPU and duration timers

    SET STATISTICS TIME ON

    --===== Run the tests... the PRINT statements identify each test in the output

    PRINT '==================== 4 Reverse Method ===================='

    SELECT @BitBucketCHAR = REVERSE(SUBSTRING(REVERSE(SomeString),CHARINDEX(';',REVERSE(SomeString)),LEN(REVERSE(SomeString))))

    FROM dbo.JBMTest

    PRINT '==================== A more direct approach ===================='

    SELECT @BitBucketCHAR = SUBSTRING(SomeString,1,LEN(SomeString)-CHARINDEX(';',REVERSE(SomeString))+1)

    FROM dbo.JBMTest

    --===== Turn of the CPU and duration timers

    SET STATISTICS TIME OFF

    I know what you're thinking... "BIG DEAL! 3.4 seconds (less on faster machines) over a million rows! What the heck is your problem?"

    Well, if you need to do that same thing over a hundred million rows, like I frequently need to, that suddenly becomes a difference of 340 seconds (5.6 minutes on my machine). And, that's just for one such calculation... what if you need to do 5 such calculations? Now you're up to 28 minutes and 20 seconds of... wasted time because it could be done better.

    Let's look at the human side of things, while we're at it... which method is easier to read? This...

    REVERSE(SUBSTRING(REVERSE(SomeString),CHARINDEX(';',REVERSE(SomeString)),LEN(REVERSE(SomeString))))

    ... or this...

    SUBSTRING(SomeString,1,LEN(SomeString)-CHARINDEX(';',REVERSE(SomeString))+1)

    Heh... shoot, just the character count of each should tell you that.

    So, like I said... be careful... REVERSE is one of the most expensive functions you can use and you probably shouldn't use 4 of them. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply