April 14, 2015 at 1:14 am
Hi All,
I am having one store procedure which use to load data from flat file to staging table dynamically.
everything is working fine.Staging_temp table have single column.All the data stored in that single column
below is the sample Data.
AB¯ALBERTA ¯93¯AI
AI¯ALBERTA INDIRECT ¯94¯AI
AL¯ALABAMA ¯30¯
After the Staging_temp data gets inserted into main table.
There are 2 scenario which I want to handle
1-suppose some flat files where number of columns are more than the actual table.
if you see the sample rows there are 4 column separated by "¯".but actual I am having only 3 columns in my main table.
output should be like below.
AB¯ALBERTA ¯93
AI¯ALBERTA INDIRECT ¯94
AL¯ALABAMA ¯30
I am able to achieve above scenarios using below function.
ALTER FUNCTION [dbo].[fnGetFirstOcurrences](
@Text nvarchar(max),
@delimiter char(1),
@StopAt smallint)
ReturnS nvarchar(max)
BEGIN
Declare @Return nvarchar(max)
,@Finalint = 0
,@times smallint = 1
,@Delimiters smallint
select @Delimiters = len(@Text) - len(replace(@Text, @Delimiter, ''));
If (@Delimiters >= @StopAt)
begin
while @times <= @StopAt
Begin
SET @final = CHARINDEX(@delimiter,@text + @delimiter,@final+1)
END
Set @Final = @Final - 1
END
Else
Set @Final = len(@Text)
select @Return = left(@Text,@Final);
Return isnull(@Return,@Text)
END
2-Second scenarios is what I want to handle is if some column of the table have lesscolumn than actual table in that case how to handle it
for example below is the sample Data.
AB¯ALBERTA
AI¯ALBERTA INDIRECT ¯94¯AI
AL¯ALABAMA ¯30¯
but I am actually having 3 column in my main table.
Please help me to achieve above scenario
Thanks in advance.
Regards,
Vipin Jha
April 14, 2015 at 1:25 am
I've answered this question in the first post you made about this here --> http://www.sqlservercentral.com/Forums/Topic1676026-3077-1.aspx%5B/url%5D.
So, quoting myself: -
Cadavre (4/10/2015)
IF OBJECT_ID('tempdb..#stagingTable') IS NOT NULL
BEGIN
DROP TABLE #stagingTable;
END;
-- 1,000,000 rows of sample data
SELECT IDENTITY( INT,1,1 ) AS [ID],
SUBSTRING(REPLACE(NEWID(), '-', '¯'), 1, CASE WHEN [bool] = 1 THEN 18
ELSE 23
END) AS [DATA]
INTO #stagingTable
FROM ( SELECT TOP 1000000
CRYPT_GEN_RANDOM(1) % 2
FROM master.dbo.syscolumns sc1
CROSS JOIN master.dbo.syscolumns sc2
CROSS JOIN master.dbo.syscolumns sc3
) a ( [bool] );
Solution using the 8K splitter: -
IF EXISTS ( SELECT 1
FROM sys.[objects]
WHERE [name] = 'DelimitedSplit8K'
AND type = 'IF' )
BEGIN;
DROP FUNCTION [dbo].[DelimitedSplit8K];
END;
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
GO
Actual solution: -
SELECT [DATA],
MAX(CASE WHEN ItemNumber = 1 THEN Item ELSE '' END) AS [1],
MAX(CASE WHEN ItemNumber = 2 THEN Item ELSE '' END) AS [2],
MAX(CASE WHEN ItemNumber = 3 THEN Item ELSE '' END) AS [3]
FROM #stagingTable
CROSS APPLY [dbo].[DelimitedSplit8K]([DATA], '¯') AS de
WHERE ItemNumber < 4
GROUP BY [DATA];
Produces: -
DATA 1 2 3
----------------------- ----------------------- ----------------------- -----------------------
00772BD2¯9396¯49ED 00772BD2 9396 49ED
00AB3FDE¯4DE1¯4030 00AB3FDE 4DE1 4030
01476877¯6DCC¯4B06 01476877 6DCC 4B06
014C256A¯930E¯4071¯B70E 014C256A 930E 4071
01F145BF¯371A¯4754 01F145BF 371A 4754
02D4C463¯47C2¯4B75¯863B 02D4C463 47C2 4B75
035C3548¯F07E¯4ED4¯81DB 035C3548 F07E 4ED4[/quote-0]
April 14, 2015 at 4:28 am
The solution posted by Cadavre looks very complete.
One possible change, when looking at the situation where there are fewer columns, in the SELECT clause, I'd probably use COALESCE instead of a CASE statement. But that's a pretty trivial change.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply