September 2, 2010 at 2:29 pm
Any idea how I could trim this out to only return the W51247?
W~W51247~0~1~0
September 2, 2010 at 2:35 pm
bpowers (9/2/2010)
Any idea how I could trim this out to only return the W51247? W~W51247~0~1~0
Can you give us some clearer requirements?
I mean, substring(@fieldname,3,6) would work for that example, but maybe not all examples.
So what are we really doing? Stripping away characters of a certain profile? Grabbing the second field in a list delimited by '~'?
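For the posted sample, at least, that guess checks out. A quick sketch (@fieldname is just a stand-in variable):
declare @fieldname varchar(100);
set @fieldname = 'W~W51247~0~1~0';
select substring(@fieldname, 3, 6);  -- W51247 (positions 3 through 8 hold the ID here)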
September 2, 2010 at 2:46 pm
I am trying to link two tables based on the Work Order ID in a report. However, one of the tables stores the Work Order type, ID, Sub ID, Lot ID, and Split ID in one string (W~W51247~0~1~0). I need to strip out the ID (W51247) so that I can link the two tables on the ID.
I appreciate all the help. I think I got it.
September 2, 2010 at 2:53 pm
SELECT SUBSTRING(@field, CHARINDEX('~', @field) + 1, CHARINDEX('~', @field, CHARINDEX('~', @field) + 1) - CHARINDEX('~', @field) - 1)
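A quick sanity check of that expression against the posted sample (@field is a stand-in variable):
DECLARE @field varchar(100);
SET @field = 'W~W51247~0~1~0';
-- first '~' is at position 2, second at position 9, so we take positions 3 through 8
SELECT SUBSTRING(@field, CHARINDEX('~', @field) + 1,
       CHARINDEX('~', @field, CHARINDEX('~', @field) + 1) - CHARINDEX('~', @field) - 1);  -- W51247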
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
September 2, 2010 at 2:54 pm
Assuming Work Order type will always be char(1) (based on the column definition) and ID will always be char(6) I'd go with the substring method Cory suggested.
Depending on the number of rows I might consider adding an indexed persisted computed column to join the two tables.
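A minimal sketch of that idea; dbo.WorkOrderDetail and CombinedID are made-up names, and the extraction expression assumes the fixed-width layout discussed above (type = 1 char, ID = 6 chars):
-- hypothetical table/column names, for illustration only
ALTER TABLE dbo.WorkOrderDetail
    ADD WorkOrderID AS SUBSTRING(CombinedID, 3, 6) PERSISTED;

CREATE NONCLUSTERED INDEX IX_WorkOrderDetail_WorkOrderID
    ON dbo.WorkOrderDetail (WorkOrderID);
The join can then reference WorkOrderID directly and use the index, instead of computing the substring row by row.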
September 2, 2010 at 2:56 pm
If the position and length of the ID within the string are fixed, then the SUBSTRING function will work, e.g. SUBSTRING(yourstring, 3, 6).
September 2, 2010 at 7:50 pm
bpowers (9/2/2010)
I am trying to link two tables based on the Work Order ID in a report. However, one of the tables stores the Work Order type, ID, Sub ID, Lot ID, and Split ID in one string (W~W51247~0~1~0). I need to strip out the ID (W51247) so that I can link the two tables on the ID. I appreciate all the help. I think I got it.
Based on this, where the work order ID is the 2nd "field" in this string, I would suggest :
declare @test varchar(100);
set @test = 'W~W51247~0~1~0';
select Item
from dbo.DelimitedSplit8K(@test, '~')
where ItemNumber = 2;
Here is the latest version of the Delimited Split Function.
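If you don't have the link handy, here's a minimal sketch of the general shape of an inline, tally-based splitter. This is my sketch only, under the assumption that the published function returns ItemNumber/Item columns; use the linked, tuned version for real work:
CREATE FUNCTION dbo.DelimitedSplit8K_Sketch (@pString varchar(8000), @pDelimiter char(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
-- E1 x E1 x E1 x E1 = 10,000 rows, enough to cover varchar(8000)
WITH E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
     UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
     E4(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 c CROSS JOIN E1 d),
     Tally(N) AS (SELECT TOP (ISNULL(DATALENGTH(@pString), 0))
                         ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4),
     -- every element starts at position 1 or immediately after a delimiter
     Starts(N1) AS (SELECT 1 UNION ALL
                    SELECT t.N + 1 FROM Tally t WHERE SUBSTRING(@pString, t.N, 1) = @pDelimiter)
SELECT ItemNumber = ROW_NUMBER() OVER (ORDER BY s.N1),
       Item       = SUBSTRING(@pString, s.N1,
                      ISNULL(NULLIF(CHARINDEX(@pDelimiter, @pString, s.N1), 0) - s.N1, 8000))
FROM Starts s;
-- e.g. SELECT Item FROM dbo.DelimitedSplit8K_Sketch('W~W51247~0~1~0', '~') WHERE ItemNumber = 2;  -- W51247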
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
September 3, 2010 at 8:30 am
WayneS (9/2/2010)
...Based on this, where the work order ID is the 2nd "field" in this string, I would suggest :
declare @test varchar(100);
set @test = 'W~W51247~0~1~0';
select Item
from dbo.DelimitedSplit8K(@test, '~')
where ItemNumber = 2;
Here is the latest version of the Delimited Split Function.
As much as I like the split string function, I don't think it's the best-performing solution for this scenario.
Reason:
1) We know the position of the delimiter (at least it has not been stated otherwise) and
2) We only need to select one element from that list (at a known position).
Therefore, I think the multiple SUBSTRING calls the function has to perform will be less efficient than the single SUBSTRING call. That's just theory, but I'm confident it can be proved. 😉
September 3, 2010 at 9:16 am
LutzM (9/3/2010)
WayneS (9/2/2010)
...Based on this, where the work order ID is the 2nd "field" in this string, I would suggest :
declare @test varchar(100);
set @test = 'W~W51247~0~1~0';
select Item
from dbo.DelimitedSplit8K(@test, '~')
where ItemNumber = 2;
Here is the latest version of the Delimited Split Function.
As much as I like the split string function, I don't think it's the best-performing solution for this scenario.
Reason:
1) We know the position of the delimiter (at least it has not been stated otherwise) and
2) We only need to select one element from that list (at a known position).
Therefore, I think the multiple SUBSTRING calls the function has to perform will be less efficient than the single SUBSTRING call. That's just theory, but I'm confident it can be proved. 😉
I had thought about this. The issue I have is that there are two assumptions being made:
1. That the preceding "field" will always be one character, and
2. That this "field" will always be 6 characters.
Since the "field" being extracted is composed of a letter and 5 digits, and those numbers are already more than halfway to rolling over to needing a 6th digit, it seemed to me that using the Delimited Split function would be the safest way to handle things, just in case either of those assumptions proves false in the future. So, assuming that either of those "fields" may grow in the future, the only given we really have is that we are extracting the second delimited item in the string.
Now, if both of those assumptions turn out to be true, including in the future, then I absolutely agree with you. I just felt it was safer to go this way.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
September 3, 2010 at 12:33 pm
Absolutely agreed.
Now the OP can decide which scenario he's dealing with and he has the "best-fit solution" for either one. Can't be any better, right?
September 3, 2010 at 12:59 pm
WayneS (9/3/2010)
LutzM (9/3/2010)
WayneS (9/2/2010)
...Based on this, where the work order ID is the 2nd "field" in this string, I would suggest :
declare @test varchar(100);
set @test = 'W~W51247~0~1~0';
select Item
from dbo.DelimitedSplit8K(@test, '~')
where ItemNumber = 2;
Here is the latest version of the Delimited Split Function.
As much as I like the split string function, I don't think it's the best-performing solution for this scenario.
Reason:
1) We know the position of the delimiter (at least it has not been stated otherwise) and
2) We only need to select one element from that list (at a known position).
Therefore, I think the multiple SUBSTRING calls the function has to perform will be less efficient than the single SUBSTRING call. That's just theory, but I'm confident it can be proved. 😉
I had thought about this. The issue I have is that there are two assumptions being made:
1. That the preceding "field" will always be one character, and
2. That this "field" will always be 6 characters.
Since the "field" being extracted is composed of a letter and 5 digits, and those numbers are already more than halfway to rolling over to needing a 6th digit, it seemed to me that using the Delimited Split function would be the safest way to handle things, just in case either of those assumptions proves false in the future. So, assuming that either of those "fields" may grow in the future, the only given we really have is that we are extracting the second delimited item in the string.
Now, if both of those assumptions turn out to be true, including in the future, then I absolutely agree with you. I just felt it was safer to go this way.
How good is the performance on the split function?
As Lutz pointed out, we only need to grab one thing from the list (always the second item) so I wonder if
select substring(substring(@str,charindex('~',@str)+1,len(@str)),1,charindex('~',substring(@str,charindex('~',@str)+1,len(@str)))-1)
would perform better (despite being hideously ugly).
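If the repetition bothers you, the inner SUBSTRING can be computed once in a derived table; just a sketch of the same logic:
declare @str varchar(100);
set @str = 'W~W51247~0~1~0';
-- compute the tail after the first '~' once, then slice up to the next '~'
select Item = substring(x.Rest, 1, charindex('~', x.Rest) - 1)
from (select Rest = substring(@str, charindex('~', @str) + 1, len(@str))) as x;  -- W51247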
September 3, 2010 at 7:09 pm
Nevyn (9/3/2010)
How good is the performance on the split function? As Lutz pointed out, we only need to grab one thing from the list (always the second item), so I wonder if
select substring(substring(@str,charindex('~',@str)+1,len(@str)),1,charindex('~',substring(@str,charindex('~',@str)+1,len(@str)))-1)
would perform better (despite being hideously ugly).
Fair question.
I created a large table of random characters, with delimiters placed at random locations in each row. (Actually, these are "dual-delimited" strings: 3 major (#) delimiters and 10 minor (,) delimiters.) The data in each row ends up 513 characters long.
Processing 5000 rows:
Your all-inline expression: 1st run: 203ms; subsequent runs ~33ms.
I have 4 versions of the delimited split function:
DelimitedSplit (uses a physical tally table; can handle varchar(max)) : 7771ms
DelimitedSplit_v (uses a virtual (in-line) tally table; can handle varchar(max)): 18487ms
DelimitedSplit8k (uses a physical tally table; can handle varchar(8000)): 863ms
DelimitedSplit8k_v (uses a virtual (in-line) tally table; can handle varchar(8000)): 2063ms
So, your all-inline code soundly whooped up on all of the DelimitedSplit functions.
Using a physical tally table was about 2.5 times faster than a virtual tally table.
Using the 8k version is about 9 times faster than the varchar(max) version.
Test data generation:
-- First, build the test data
IF OBJECT_ID('dbo.MultiDelimitedDataTest') IS NOT NULL DROP TABLE dbo.MultiDelimitedDataTest
-- See Jeff Moden's article The "Numbers" or "Tally" Table: What it is and how it replaces a loop.
-- at http://www.sqlservercentral.com/articles/T-SQL/62867/.
;WITH
Tens (N) AS (SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL
SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL
SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0),
Thousands(N) AS (SELECT 1 FROM Tens t1 CROSS JOIN Tens t2 CROSS JOIN Tens t3),
Millions (N) AS (SELECT 1 FROM Thousands t1 CROSS JOIN Thousands t2),
Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM Millions),
RoughData AS
(
SELECT --TOP (100)
N,
-- 50 random characters
SomeLetters50 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
-- 13 random positions
P1 = ABS(CHECKSUM(NEWID()))%250+1,
P2 = ABS(CHECKSUM(NEWID()))%250+1,
P3 = ABS(CHECKSUM(NEWID()))%250+1,
C1 = ABS(CHECKSUM(NEWID()))%500+1,
C2 = ABS(CHECKSUM(NEWID()))%500+1,
C3 = ABS(CHECKSUM(NEWID()))%500+1,
C4 = ABS(CHECKSUM(NEWID()))%500+1,
C5 = ABS(CHECKSUM(NEWID()))%500+1,
C6 = ABS(CHECKSUM(NEWID()))%500+1,
C7 = ABS(CHECKSUM(NEWID()))%500+1,
C8 = ABS(CHECKSUM(NEWID()))%500+1,
C9 = ABS(CHECKSUM(NEWID()))%500+1,
C0 = ABS(CHECKSUM(NEWID()))%500+1
FROM Tally
)
SELECT N,
--*,
-- replicate the 50 character string 10 times
-- stuff 3 '#' in there at random positions
-- stuff 10 ',' in there at random positions
TestData = STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(STUFF(REPLICATE(SomeLetters50,10),
P1, 0, '#'),
P2, 0, '#'),
P3, 0, '#'),
C1, 0, ','),
C2, 0, ','),
C3, 0, ','),
C4, 0, ','),
C5, 0, ','),
C6, 0, ','),
C7, 0, ','),
C8, 0, ','),
C9, 0, ','),
C0, 0, ',')
INTO dbo.MultiDelimitedDataTest
FROM RoughData
ORDER BY N
GO
ALTER TABLE [dbo].[MultiDelimitedDataTest] ADD CONSTRAINT [PK_MultiDelimitedDataTest] PRIMARY KEY CLUSTERED ([N] ASC) WITH (FILLFACTOR=100)
GO
Speed test code:
declare @delimiter char(1);
set @delimiter = ',';
SET STATISTICS IO,TIME ON;
if object_id('tempdb..#temp') IS NOT NULL DROP TABLE #temp;
select Item = substring(substring(TestData,charindex(@delimiter,TestData)+1,len(TestData)),1,charindex(@delimiter,substring(TestData,charindex(@delimiter,TestData)+1,len(TestData)))-1)
INTO #temp
from dbo.MultiDelimitedDataTest
where N < 5000;
GO
--<<<< DelimitedSplit SPEED CHECK
if object_id('tempdb..#temp') IS NOT NULL DROP TABLE #temp;
select Item
INTO #temp
from dbo.MultiDelimitedDataTest
CROSS APPLY dbo.DelimitedSplit8k_v(TestData, ',') -- used DelimitedSplit8k | DelimitedSplit | DelimitedSplit_v
WHERE ItemNumber = 2
AND N < 5000;
GO
SET STATISTICS IO,TIME OFF;
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes