March 14, 2016 at 7:55 am
Hello
I have a simple problem that I can solve easily with cursors, but I want to know if anyone knows a better way.
I have a large piece of text that comes from a text file. The file has several lines (newline separated), each line has its data separated by tabs, and within that data there are two "fields" that I need to split on the character "\".
I have the split function working and, as I said, I can use cursors to do the job, but I was wondering whether I can achieve the same thing with a better approach.
Sample Data, contents of the file:
123	User3A\User3B	TeamA	Main	OtherData1	SerialA\SerialB	01:10:35:256	P1
124	User4A\User4B	TeamB	Main	OtherData2	SerialC\SerialD	01:10:36:256	P1
125	User5A\User5B	TeamC	MainF	OtherData3	SerialE\SerialF	01:10:37:256	P1
126	User6A\User6B	TeamD	Elite	OtherData4	SerialAAA\SerialBBBB	01:10:38:256	P1
The final result should be a table where each line becomes a row, each field becomes a column, and every line is duplicated because of the "\" separator:
123 User3A TeamA Main OtherData1 SerialA 01:10:35:256 P1
123 User3B TeamA Main OtherData1 SerialB 01:10:35:256 P1
...
Thanks
March 14, 2016 at 10:26 am
rootfixxxer (3/14/2016)
I have a simple problem that I can solve easily with cursors, but I want to know if anyone knows a better way. [...]
For the user column, will there always be exactly two users separated by sloshes? Same question for the serial column, but I also need to know for sure that the position of the serials will always match the position of the users.
Also, please post the code that you're using for your splitter. Most folks have one that doesn't perform so well, and we can help there as well.
Last but not least, please post the CREATE TABLE code for your final destination table. It'll help.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 14, 2016 at 11:48 am
I would suggest looking into using a numbers or tally table. This search, https://www.bing.com/search?q=Number+or+Tally+table, returns a bunch of posts about what a numbers or tally table is and how to use it.
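To give a rough idea, here is a minimal sketch (the variable and CTE names are just illustrative) of spinning up an in-line tally on the fly and using it to split a single tab-delimited line:

DECLARE @Line VARCHAR(8000) = '123' + CHAR(9) + 'User3A\User3B' + CHAR(9) + 'TeamA';
-- E1 x E1 x E1 x E1 gives 10,000 rows, more than enough "numbers" for one line
WITH E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(N)),
     E4(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c, E1 d),
     cteTally(N) AS (SELECT TOP (LEN(@Line)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4)
SELECT
    ItemNumber = ROW_NUMBER() OVER (ORDER BY t.N),
    Item       = SUBSTRING(@Line, t.N + 1,
                     ISNULL(NULLIF(CHARINDEX(CHAR(9), @Line, t.N + 1), 0), LEN(@Line) + 1) - t.N - 1)
FROM (SELECT 0 UNION ALL                                   -- a virtual delimiter before position 1
      SELECT N FROM cteTally
      WHERE SUBSTRING(@Line, N, 1) = CHAR(9)) AS t(N);     -- positions of the real tabs

Jeff's DelimitedSplit8K article wraps the same idea up in a reusable function.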
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
March 14, 2016 at 1:24 pm
This "works". It uses Jeff's DelimitedSplit8K function for all the splits (found here[/url]). If the answer to Jeff's question about "sloshes" 😛 is no, then this won't be sufficient.
DECLARE
    @String VARCHAR(1000) = '123' + CHAR(9) + 'User3A\User3B' + CHAR(9) + 'TeamA' + CHAR(9) + 'Main' + CHAR(9) + 'OtherData1' + CHAR(9) + 'SerialA\SerialB' +
        CHAR(9) + '01:10:35:256' + CHAR(9) + 'P1' + CHAR(10) + '124' + CHAR(9) + 'User4A\User4B' + CHAR(9) + 'TeamB' + CHAR(9) + 'Main' + CHAR(9) + 'OtherData2' +
        CHAR(9) + 'SerialC\SerialD' + CHAR(9) + '01:10:36:256' + CHAR(9) + 'P1' + CHAR(10),
    @Delimiter  CHAR(1) = CHAR(10),  -- line separator
    @Delimiter2 CHAR(1) = CHAR(9),   -- field separator (tab)
    @Delimiter3 CHAR(1) = '\';       -- the "slosh" inside the two composite fields
WITH unpivotedData
AS (
    -- Split into lines, then fields, then the "slosh" pairs, numbering every item within its line
    SELECT
        DSK.ItemNumber,
        ROW_NUMBER() OVER(PARTITION BY DSK.ItemNumber ORDER BY DSK2.ItemNumber, DSK3.ItemNumber) AS ID,
        DSK3.Item
    FROM
        master.dbo.DelimitedSplit8K(@String, @Delimiter) AS DSK
        CROSS APPLY master.dbo.DelimitedSplit8K(DSK.Item, @Delimiter2) AS DSK2
        CROSS APPLY master.dbo.DelimitedSplit8K(DSK2.Item, @Delimiter3) AS DSK3
    WHERE
        DSK.Item <> ''
)
-- Pivot the numbered items back into one 10-column row per source line
SELECT
    PivotTable.[1], PivotTable.[2], PivotTable.[3], PivotTable.[4], PivotTable.[5],
    PivotTable.[6], PivotTable.[7], PivotTable.[8], PivotTable.[9], PivotTable.[10]
FROM
(
    SELECT * FROM unpivotedData
) AS P PIVOT( MAX(Item) FOR ID IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10]) ) AS PivotTable
;
There probably is a better way, and I'd bet on Jeff to come back with it when you answer his question about "sloshes".
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
March 14, 2016 at 2:42 pm
I would upload the file using bulk insert or something similar to avoid the pivot and the additional split. But I was waiting for a reply from the OP.
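Something roughly like this might do it (the file path, staging table, and column names are just placeholders, and the row terminator depends on how the file was produced):

-- Sketch only: load the tab-delimited file as-is into a staging table,
-- then split the two "slosh" columns afterwards.
CREATE TABLE #RegTimesStage (
    Num       INT           NOT NULL,
    Names     NVARCHAR(101) NULL,   -- e.g. 'User3A\User3B'
    Team      NVARCHAR(50)  NULL,
    TypeG     NVARCHAR(50)  NULL,
    OtherData NVARCHAR(50)  NULL,
    Serials   NVARCHAR(101) NULL,   -- e.g. 'SerialA\SerialB'
    TimeText  VARCHAR(20)   NULL,
    Location  NCHAR(2)      NULL
);

BULK INSERT #RegTimesStage
FROM 'C:\import\regtimes.txt'        -- placeholder path
WITH (FIELDTERMINATOR = '\t',
      ROWTERMINATOR   = '\n',        -- or '0x0a' / '\r\n', depending on the source file
      TABLOCK);

A bcp format file should let you split the two pairs during the load itself, but the plain statement above keeps the idea visible.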
March 14, 2016 at 7:50 pm
Luis Cazares (3/14/2016)
I would upload the file using bulk insert or something similar to avoid the pivot and the additional split. But I was waiting for a reply from the OP.
There ya go reading my mind again. 😀 Nice little format file and this gets a whole lot simpler. That's why I asked the questions I did... you need a stable format to use a format file on this.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 14, 2016 at 7:53 pm
Jack Corbett (3/14/2016)
This "works". It uses Jeff's DelimitedSplit8K function for all the splits (found here[/url]). If the answer to Jeff's question about "sloshes" 😛 is no, then this won't be sufficient.
DECLARE
@String VARCHAR(1000) = '123' + CHAR(9) + 'User3A\User3B' + CHAR(9) + 'TeamA' + CHAR(9) + 'Main' + CHAR(9) + 'OtherData1' + CHAR(9) + 'SerialA\SerialB' +
CHAR(9) + '01:10:35:256' + CHAR(9) + 'P1' + CHAR(10) + '124' + CHAR(9) + 'User3A\User3B' + CHAR(9) + 'TeamA' + CHAR(9) + 'Main' + CHAR(9) + 'OtherData1' +
CHAR(9) + 'SerialA\SerialB' + CHAR(9) + '01:10:35:256' + CHAR(9) + 'P1' + CHAR(10),
@Delimiter CHAR(1) = CHAR(10),
@Delimiter2 CHAR(1) = CHAR(9),
@Delimiter3 CHAR(1) = '\';
WITH unpivotedData
AS (
SELECT
DSK.ItemNumber,
ROW_NUMBER() OVER(PARTITION BY DSK.ItemNumber ORDER BY DSK.ItemNumber, DSK2.ItemNumber, DSK3.Item) AS ID,
DSK3.Item
FROM
master.dbo.DelimitedSplit8K(@String, @Delimiter) AS DSK
CROSS APPLY master.dbo.DelimitedSplit8K(DSK.Item, @Delimiter2) AS DSK2
CROSS APPLY master.dbo.DelimitedSplit8K(DSK2.Item, @Delimiter3) AS DSK3
WHERE
DSK.Item <> ''
)
SELECT
PivotTable.[1],
PivotTable.[2],
PivotTable.[3],
PivotTable.[4],
PivotTable.[5],
PivotTable.[6],
PivotTable.[7],
PivotTable.[8],
PivotTable.[9],
PivotTable.[10]
FROM
(
SELECT
*
FROM
unpivotedData
) AS P PIVOT( MAX(Item) FOR ID IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10]) ) AS PivotTable
;
Nicely done, Jack!
--Jeff Moden
Change is inevitable... Change for the better is not.
March 15, 2016 at 6:27 am
Posted at the end of the day yesterday...
Thanks everyone for all the replies.
@Jeff Moden
Yes, there are always exactly two names separated by sloshes, and the same goes for the serials.
About the position question, I didn't understand what you mean by "match position". In the file the values are always in the same "columns"; only the size varies, from 3 to 10 characters for each value in a row.
The split function:
CREATE FUNCTION [dbo].[SplitString]
(
    @string NVARCHAR(MAX),
    @delimiter CHAR(1)
)
RETURNS @output TABLE (id INT, splitdata NVARCHAR(MAX))
AS
BEGIN
    -- Walks the string with CHARINDEX and emits one numbered row per delimited item
    DECLARE @count INT = 1;
    DECLARE @start INT = 1,
            @end   INT = CHARINDEX(@delimiter, @string);

    WHILE @start < LEN(@string) + 1
    BEGIN
        IF @end = 0
            SET @end = LEN(@string) + 1;   -- last item: run to the end of the string

        INSERT INTO @output (id, splitdata)
        VALUES (@count, SUBSTRING(@string, @start, @end - @start));

        SET @count = @count + 1;
        SET @start = @end + 1;
        SET @end   = CHARINDEX(@delimiter, @string, @start);
    END;

    RETURN;
END
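For reference, a quick illustrative call against one of the composite values:

-- Split one of the "slosh" pairs into its two parts
SELECT id, splitdata
FROM dbo.SplitString(N'User3A\User3B', '\');
-- id  splitdata
-- 1   User3A
-- 2   User3B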
The destination table is really a temporary/staging table that will be used to update other tables...
CREATE TABLE [dbo].[RegTimesTemp](
[Num] [INT] NOT NULL,
[Name1] [NVARCHAR](50) NULL,
[Name2] [NVARCHAR](50) NULL,
[Team] [NVARCHAR](50) NULL,
[TypeG] [NVARCHAR](50) NULL,
[OtherData] [NVARCHAR](50) NULL,
[Serial1] [NVARCHAR](50) NULL,
[Serial2] [NVARCHAR](50) NULL,
[TimeMS] [BIGINT] NULL,
[Location] [NCHAR](2) NULL,
[CurrentDate] [SMALLDATETIME] NULL
)
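Since the table keeps both halves of each pair as separate columns, the two composite fields could also be split without any function at all. A rough sketch, assuming every pair contains exactly one "\" and reusing the staging column names from the BULK INSERT sketch above:

SELECT
    Name1   = LEFT(s.Names,  CHARINDEX('\', s.Names)  - 1),
    Name2   = STUFF(s.Names,  1, CHARINDEX('\', s.Names),  ''),
    Serial1 = LEFT(s.Serials, CHARINDEX('\', s.Serials) - 1),
    Serial2 = STUFF(s.Serials, 1, CHARINDEX('\', s.Serials), '')
FROM #RegTimesStage AS s;   -- staging table from the earlier sketch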
@Jack Corbett
Thanks for the code.
I'll take a look later.
The files are small, 300-400 lines at most, and the database is only for this problem, so performance isn't "critical"...
I was thinking of doing all the data handling on the database side, but then I thought that maybe that isn't a good path... The files will be posted to a PHP script, and the script takes care of the process, so I'm weighing whether to do the parsing in the script and issue individual or bulk inserts against the database, or just read the file and call a stored procedure in the database that does the parse/insert job...
Thanks
March 15, 2016 at 7:03 am
Jeff Moden (3/14/2016)
Nicely done, Jack!
Thanks Jeff, always nice to hear that from you.
I've never used bcp or BULK INSERT because I'm an SSIS guy, but if Luis and Jeff recommend that route, I'd tend to think they are correct. I could also recommend using SSIS, but you'd have to use the Derived Column transform with SSIS expressions to split the two columns on the "\", and that's something I'd consider advanced SSIS, so I stayed away from it.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question