November 11, 2008 at 11:10 am
I have a large table that is made up like this
1st Record with header information
2nd - 10th detail record
this repeats for millions of rows.
I need to process the 1st record of every set by inserting it into a lookup table
and the next 9 rows need to be updated a with the record_id
that relates the the header record.
So the T-SQL needs to read the table in sequence - Header records are coded with an 'H' in the Type column.
I think I would do this with the combination of a While and a Cursor.
The Header record can and will appear many times in the course of this table.
Would I do this with two while loops, or a while and a cursor.
Any input you can give me would be greatlly appreciated.
I should add that there is no way to link the records that come after the Header record except that they are the records between each record that is coded with as a header record.
November 11, 2008 at 11:13 am
What's the table structure like? Is there a column that specifically distinguishes between the header record & detail records?
November 11, 2008 at 11:16 am
The column type will be coded with a 'H' to indicate it is a header record. All records that follow the Header record are assumed to be detail until the next header record comes along.
The table structure is the following (i'm omitting the columns that are of no use
Type H or D
Data Null for Header records, populated for details.
Header_ID (I added this to relate the detail records back to the header record)
November 11, 2008 at 11:23 am
What about the rest of the table structure?
November 11, 2008 at 12:01 pm
SQL DBA 808 (11/11/2008)
I think I would do this with the combination of a While and a Cursor.The Header record can and will appear many times in the course of this table.
Would I do this with two while loops, or a while and a cursor.
Any input you can give me would be greatlly appreciated.
I should add that there is no way to link the records that come after the Header record except that they are the records between each record that is coded with as a header record.
None of the above!
Give this a read: http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
If you need help with the implementation, post table structure / sample data / expected results (an example of how to do this is shown in the link in my signature).
November 11, 2008 at 12:05 pm
Sounds like a bank/credit feed file. There's probably no data structure other than 2 columns. 1 for the header record you manually added and the other is probably the actual data, but this is just a guess.
Assuming your data is similar to this...
HTEST1
Record1
Record2
Record3
Record4
Record5
Record6
Record7
Record8
Record9
HTEST2
Record1
Record2
Record3
Record4
Record5
Record6
Record7
Record8
Record9
HTEST3
Record1
Record2
Record3
Record4
Record5
Record6
Record7
Record8
Record9
This is what I did:
CREATE TABLE #InputData
(Cnt INT IDENTITY(1,1), Data VARCHAR(1000))
DECLARE MyCursor CURSOR
FOR SELECT Cnt, Data FROM #InputData ORDER BY Cnt
OPEN MyCursor
DECLARE @Cnt INT, @data VARCHAR(1000), @Header VARCHAR(10)
FETCH NEXT FROM MyCursor
INTO @Cnt, @data
WHILE @@FETCH_STATUS = 0
BEGIN
IF SUBSTRING(@Data, 1, 1) = 'H'
BEGIN
SET @Header = SUBSTRING(@Data, 2, 10)
INSERT INTO #Final
SELECT 'H', NULL, @Header
END
ELSE
BEGIN
INSERT INTO #Final
SELECT 'D', @data, @Header
END
FETCH NEXT FROM MyCursor
INTO @Cnt, @data
END
CLOSE MyCursor
DEALLOCATE MyCursor
SELECT * FROM #Final
November 11, 2008 at 12:39 pm
Steve T: Yes. That's very close to what I need to do. Thank you.
Garadin: I will read that and see if I can apply it to this situation before my management kills me for not having this process completed sooner.
November 11, 2008 at 1:06 pm
I think you can do it without looping through while/cursor. But you need to alter both details and lookup table.
Here is what I suggest:
1. Alter add surrogate_key not null int identity(1,1);
2. Alter add imported_key int;
3. insert into
select * , surrogate_key as imported_key
from
where type ='H'
4. update
set record_id =
select
from a
join
(
select
l1.record_id,
l1.imported_key as rec_from,
l2.imported_key-1 as rec_to
from l2
on l1.record_id = l2.record_id + 1
) b
on a.surrogate_key between b.rec_from and b.rec_to
This algorithm assumes that your record_id is sequential. If not, you will have to createa another surrogate key, in this case rec_id_surrogate in lookup table and link it to both imported_key and record_id.
November 11, 2008 at 1:08 pm
I think you can do it without looping through while/cursor. But you need to alter both details and lookup table.
Here is what I suggest:
1. Alter "your_large_table" add surrogate_key not null int identity(1,1);
2. Alter "your_lookup_table" add imported_key int;
3. insert into "your_lookup_table"
select * , surrogate_key as imported_key
from "your_large_table"
where type ='H'
4. update "your_large_table"
set record_id =
select
from "your_large_table" a
join
(
select
l1.record_id,
l1.imported_key as rec_from,
l2.imported_key-1 as rec_to
from "your_lookup_table" l1 join "your_lookup_table" l2
on l1.record_id = l2.record_id + 1
) b
on a.surrogate_key between b.rec_from and b.rec_to
This algorithm assumes that your record_id is sequential. If not, you will have to createa another surrogate key, in this case rec_id_surrogate in lookup table and link it to both imported_key and record_id.
November 11, 2008 at 1:10 pm
When I pasted my code, this site changed removed all my brackets. I replaced them with quotes (") here.
Here is the code:
1. Alter "your_large_table" add surrogate_key not null int identity(1,1);
2. Alter "your_lookup_table" add imported_key int;
3. insert into "your_lookup_table"
select * , surrogate_key as imported_key
from "your_large_table"
where type ='H'
4. update "your_large_table"
set record_id = b.record_id
from "your_large_table" a
join
(
select
l1.record_id,
l1.imported_key as rec_from,
l2.imported_key-1 as rec_to
from "your_lookup_table" l1 join "your_lookup_table" l2
on l1.record_id = l2.record_id + 1
) b
on a.surrogate_key between b.rec_from and b.rec_to
November 11, 2008 at 1:21 pm
SQL DBA 808 (11/11/2008)
Steve T: Yes. That's very close to what I need to do. Thank you.Garadin: I will read that and see if I can apply it to this situation before my management kills me for not having this process completed sooner.
You may be making a trade off between time to learn the method, and time for the thing to actually run. Using a cursor to evaluate millions of rows is likely going to be very bad.
November 11, 2008 at 3:11 pm
Garadin (11/11/2008)
SQL DBA 808 (11/11/2008)
Steve T: Yes. That's very close to what I need to do. Thank you.Garadin: I will read that and see if I can apply it to this situation before my management kills me for not having this process completed sooner.
You may be making a trade off between time to learn the method, and time for the thing to actually run. Using a cursor to evaluate millions of rows is likely going to be very bad.
I do have to agree with Garadin here regarding the performance on evaluating millions of rows. I got to think he was probably consolidation some data and normally would not be doing this against millions of rows. I haven't read you link, but it does sounds interesting.
November 11, 2008 at 6:22 pm
SQL DBA 808 (11/11/2008)
I should add that there is no way to link the records that come after the Header record except that they are the records between each record that is coded with as a header record.
If there is no way to relate them, then how do you keep them in this order? I ask because SQL Server will not just do this for you, and there is no default or deterministic order on a table or it's returned dataset, unless you specify one.
More specifically: just because they were inserted into some table in that order is NO guarantee that they will come back in that order when you select from the table.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
November 11, 2008 at 7:46 pm
If you would post some sample DDL and DATA, it would be a lot easier to tackle your problem.
Read this article for some pointers: http://www.sqlservercentral.com/articles/Best+Practices/61537/
November 11, 2008 at 8:27 pm
Steve T (11/11/2008)
Garadin (11/11/2008)
SQL DBA 808 (11/11/2008)
Steve T: Yes. That's very close to what I need to do. Thank you.Garadin: I will read that and see if I can apply it to this situation before my management kills me for not having this process completed sooner.
You may be making a trade off between time to learn the method, and time for the thing to actually run. Using a cursor to evaluate millions of rows is likely going to be very bad.
I do have to agree with Garadin here regarding the performance on evaluating millions of rows. I got to think he was probably consolidation some data and normally would not be doing this against millions of rows. I haven't read you link, but it does sounds interesting.
Oh it's worth the read. It's worth many reads.
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply