August 25, 2017 at 6:56 am
Hello everyone
I'll get right to the point.
I have the following script:
create table #tempData(recid int identity(1,1), EDI_tekst nvarchar(4000), LineNum int, Chunk int not null default 0);
insert into #tempData(Edi_tekst, liNeNum)
values
('A1AA1610W1 C01AA01001160121 WKMHLC81UADU057923 2013052420160927035070100509 ',1),
('A2957603Z002 I11ZZ3 ',2),
('A30000000000000099999999000000 ',3),
('A4AA1610W1 C01AA01001160121 01957603Z002FFF 01010005079000507911895790F00 ',4),
('A51100400408836 0005993000599300035340003534 ',5),
('A1AA1610W1 C01AA01001160125 WKMHLC81UBDU061314 2013052720161011040998100515 ',1),
('A2846503Z100YHG I14ZZ3 ',2),
('A30000000000000099999999000000 ',3),
('A4AA1610W1 C01AA01001160125 01846503Z100YHG 01010004527000452611884650R00 ',4),
('A51100300308836 0005342000534000026500002650 ',5),
('A1AA1610W1 C01AA01015160177 WKMHSU81XDDU073924 20131002201609160831708641 ',1),
('A2568202W050 Q55ZZ4 ',2),
('A30000000000000099999999000000 ',3),
('A4AA1610W1 C01AA01015160177 01568202W050 01010001750000174911856810R0R ',4),
('A51100900908836 0002065000206300079520007952 ',5),
('A4AA1610W1 C01AA01015160177 02577592E000 01010000098000009711856800A00 ',4),
('A51100400408836 0000116000011400035340003534 ',5),
('A4AA1610W1 C01AA01015160177 031430085257B 010100000120000012118 ',4),
('A50000000008836 0000014000001400000000000000 ',5)
Record, that starts with A1 is the starting point of a new "chunk".
I cant't figure out set based approach to update field chunk (using window function or something). I can do it row by row but that's not what I want.
I attached two images explaining what I want.
before
Manual updates (i'll do them just to display the desired result):
update #tempData set chunk = 1 where recid < 6;
update #tempData set chunk = 2 where recid > 5 and recid < 11;
update #tempData set chunk = 3 where recid > 10;
after
any suggesting is appreciated.
Regards, Marko
August 25, 2017 at 7:09 am
Every time LineNo resets to 1 (as ordered by recid), you want the chunk number to increase by 1.
Is that a correct summary of the requirement?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 25, 2017 at 7:16 am
Yes, that is exactly what I want.
thx for reply
August 25, 2017 at 7:21 am
WITH summed
AS
(
SELECT
td.Chunk
, NewChunk = SUM(IIF(td.LineNum = 1, 1, 0)) OVER (ORDER BY td.recid
ROWS UNBOUNDED PRECEDING
)
FROM #tempData td
)
UPDATE summed
SET summed.Chunk = summed.NewChunk;
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
August 25, 2017 at 7:36 am
Another alternative involves "quirky update". It does require, however, that the update occur in PRIMARY KEY order, so I added a clustered index to the table:create table #tempData (
recid int identity(1,1) PRIMARY KEY CLUSTERED,
EDI_tekst nvarchar(4000),
LineNum int,
Chunk int not null default 0
);
insert into #tempData(Edi_tekst, liNeNum)
values ('A1AA1610W1 C01AA01001160121 WKMHLC81UADU057923 2013052420160927035070100509 ',1),
('A2957603Z002 I11ZZ3 ',2),
('A30000000000000099999999000000 ',3),
('A4AA1610W1 C01AA01001160121 01957603Z002FFF 01010005079000507911895790F00 ',4),
('A51100400408836 0005993000599300035340003534 ',5),
('A1AA1610W1 C01AA01001160125 WKMHLC81UBDU061314 2013052720161011040998100515 ',1),
('A2846503Z100YHG I14ZZ3 ',2),
('A30000000000000099999999000000 ',3),
('A4AA1610W1 C01AA01001160125 01846503Z100YHG 01010004527000452611884650R00 ',4),
('A51100300308836 0005342000534000026500002650 ',5),
('A1AA1610W1 C01AA01015160177 WKMHSU81XDDU073924 20131002201609160831708641 ',1),
('A2568202W050 Q55ZZ4 ',2),
('A30000000000000099999999000000 ',3),
('A4AA1610W1 C01AA01015160177 01568202W050 01010001750000174911856810R0R ',4),
('A51100900908836 0002065000206300079520007952 ',5),
('A4AA1610W1 C01AA01015160177 02577592E000 01010000098000009711856800A00 ',4),
('A51100400408836 0000116000011400035340003534 ',5),
('A4AA1610W1 C01AA01015160177 031430085257B 010100000120000012118 ',4),
('A50000000008836 0000014000001400000000000000 ',5);
DECLARE @chunk AS int = 0;
UPDATE T
SET T.Chunk = ISNULL(NULLIF(@CHUNK, 0), 1),
@CHUNK = CASE WHEN T.LineNum = 1 THEN @chunk + 1 ELSE @chunk END
FROM #tempData AS T;
SELECT *
FROM #tempData;
DROP TABLE #tempData;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
August 25, 2017 at 8:32 am
Although I'm a really big fan of the very high performance "Quirky Update", it has some pretty serious rules to follow. For example, you must use WITH (TABLOCKX) to keep anyone from sneaking in on your update, you must use OPTION (MAXDOP 1) to absolutely guarantee that parallelism doesn't occur, and you need a bit of additional code to do a bit of error checking or something that will absolutely guarantee the use of the Clustered Index ( WITH (INDEX(1) has worked well for me).
If you're still using 2005, 2008, or 2008R2, there's still a relatively easy and fast way to do this using two (now fairly old) "Black Arts" techniques;
1. The Difference Between Ascending Sequences
2. cCTEs (Cascading CTEs)
Here's the commented code and it's not much more complex than Phil's 2012 solution:
WITH
cteDetermineGrouping AS
(--==== Creates a guaranteed sequence based on recid (which could be missing values)
-- and creates non-sequencial group values by subtraction of the LineNum from that.
SELECT Grp = ROW_NUMBER() OVER (ORDER BY recid) - LineNum
,Chunk
FROM #tempData
)
,cteSequenceGrp AS
(--==== This converts the non-sequential group numbers to sequential group numbers
SELECT SeqGrp = DENSE_RANK() OVER (ORDER BY GRP)
,Chunk
FROM cteDetermineGrouping
)--==== This does an update all the way back to the root table through the cCTEs (Cascading CTEs)
UPDATE cteSequenceGrp
SET Chunk = SeqGrp
;
--Jeff Moden
Change is inevitable... Change for the better is not.
August 25, 2017 at 10:54 am
Thanks Jeff for reminding me of the rest of the requirements on quirky update. It's been 9 years since I had the chance to actually use it, and I'd long since forgotten about the WITH TABLOCKX and the MAXDOP 1 hints. I still remember the excitement I felt when I read a post you made on that topic that long ago. It just seemed like way too good an idea to let go to waste at the time, but there aren't all that many situations I encounter that could make use of it.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
August 25, 2017 at 1:06 pm
sgmunson - Friday, August 25, 2017 10:54 AMThanks Jeff for reminding me of the rest of the requirements on quirky update. It's been 9 years since I had the chance to actually use it, and I'd long since forgotten about the WITH TABLOCKX and the MAXDOP 1 hints. I still remember the excitement I felt when I read a post you made on that topic that long ago. It just seemed like way too good an idea to let go to waste at the time, but there aren't all that many situations I encounter that could make use of it.
Thank you for the kind words, good Sir!
And, to be sure, yes... I still use it quite often even in >= 2012 environments when I need the extra oomph for performance. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
August 26, 2017 at 1:00 am
Thanks guys for your replies. I appreciate your effort very very much.
Now to the results:
I've tried both Phil's and Jeff's solutions. Phil's solution works fine, Jeff's on the other hand doesn't.
This is the query I ran.
update #tempdata set chunk = 0;
WITH
cteDetermineGrouping AS
(--==== Creates a guaranteed sequence based on recid (which could be missing values)
-- and creates non-sequencial group values by subtraction of the LineNum from that.
SELECT Grp = ROW_NUMBER() OVER (ORDER BY recid) - LineNum
,Chunk
FROM #tempData
)
,cteSequenceGrp AS
(--==== This converts the non-sequential group numbers to sequential group numbers
SELECT SeqGrp = DENSE_RANK() OVER (ORDER BY GRP)
,Chunk
FROM cteDetermineGrouping
)--==== This does an update all the
UPDATE cteSequenceGrp
SET Chunk = SeqGrp
;
select * from #tempData;
RESULT:
update #tempdata set chunk = 0;
WITH summed
AS
(
SELECT
td.Chunk
, NewChunk = SUM(IIF(td.LineNum = 1, 1, 0)) OVER (ORDER BY td.recid
ROWS UNBOUNDED PRECEDING
)
FROM #tempData td
)
UPDATE summed
SET summed.Chunk = summed.NewChunk;
select * from #tempData;
Tthanks again for your time on this one.
regards
Marko
August 26, 2017 at 1:20 pm
marko.celarc - Saturday, August 26, 2017 12:59 AMThanks guys for your replies. I appreciate your effort very very much.
Now to the results:I've tried both Phil's and Jeff's solutions. Phil's solution works fine, Jeff's on the other hand doesn't.
This is the query I ran.update #tempdata set chunk = 0;
WITH
cteDetermineGrouping AS
(--==== Creates a guaranteed sequence based on recid (which could be missing values)
-- and creates non-sequencial group values by subtraction of the LineNum from that.
SELECT Grp = ROW_NUMBER() OVER (ORDER BY recid) - LineNum
,Chunk
FROM #tempData
)
,cteSequenceGrp AS
(--==== This converts the non-sequential group numbers to sequential group numbers
SELECT SeqGrp = DENSE_RANK() OVER (ORDER BY GRP)
,Chunk
FROM cteDetermineGrouping
)--==== This does an update all the
UPDATE cteSequenceGrp
SET Chunk = SeqGrp
;select * from #tempData;
RESULT:
update #tempdata set chunk = 0;
WITH summed
AS
(
SELECT
td.Chunk
, NewChunk = SUM(IIF(td.LineNum = 1, 1, 0)) OVER (ORDER BY td.recid
ROWS UNBOUNDED PRECEDING
)
FROM #tempData td
)
UPDATE summed
SET summed.Chunk = summed.NewChunk;select * from #tempData;
Tthanks again for your time on this one.
regardsMarko
Sorry... you're correct... that one is my fault. I read the original problem as any time the "sequence has a decrease" instead of "any time the sequence restarted at 1" and didn't check the output against yours.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply