create merged strings from a source table

Question

Post reply

create merged strings from a source table

Sergei Zarembo

Hall of Fame

Points: 3887
More actions
July 8, 2005 at 8:50 am

#363105

--Source table :
Create table T
(attr_id int,
property_id varchar (255),
value varchar (255))
GO
Insert into T (attr_id, property_id, value)
Select 11, 'ENT' , '-23'
UNION ALL
Select 22, 'HSD' , 'more than 532'
UNION ALL
Select 11, 'ENT' , '>34'
UNION ALL
Select 22, 'HSD' , '-23'
UNION ALL
Select 11, 'ENT' , '<3456'
UNION ALL
Select 22, 'HSD' , '-23'
GO
Select * from T
--the Source table looks like:
attr_id property_id      value [varchar]
11       ENT                -23
22       HSD                more than 532
11       ENT                   >34
22       HSD                -23
11       ENT                <3456
22      HSD                   -23
--and 100000 more records like this
--Task: Create another table in which all values for every unique pair of attr_id and property_id would be MERGED; separate them by '; '; truncate the trailing ';'
--result table needs to look like:
attr_id property_id      valueMerged
11       ENT               -23; <3456; >34
22       HSD               -23; -23; more than 532
--?: does anyone know an efficient way of doing this?
--What I ended up doing was creating a finction:
CREATE FUNCTION GetConcatString_v2(@attr_id int, @property_id varchar(255))
RETURNS varchar(1000)
AS
BEGIN
    DECLARE @s-2 varchar(1000)
    SET @s-2 = ''
SELECT @s-2 = @s-2 + value + '; '
    FROM (select value from T WHERE attr_id = @attr_id and property_id = @property_id
AND value is not null) S1
    ORDER BY value
    if len(@s) = 0 set @s-2 = 'empty string'
    SELECT @s-2 = left(@s, len(@s)-1)
    RETURN @s-2
END
--and then using it in SELECT:
select attr_id, property_id, dbo.getConcatString_v2(attr_id, property_id) as valueMerged
--INTO _D_src_Tindep_merged
from T WHERE value is not null
group by attr_id, property_id
   order by attr_id, property_id
--Could it be done with less code?
--the 'real' source table contains 100000 rows. It took the above code 15 min to precess the source data into merged strings.
Thanx

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

It's pretty much the only way to do this.

Check the execution plan and make sure that the function's query is as optmizied as possible.

I'd also make sure that there's a compound index on those colums :

value, attr_id, property_id (in the best order possible). This will allow to make an index seek each time with having to go to the clustered index, speeding the query greatly (especially if it's doing some sort of scan now).

Sergei Zarembo Hall of Fame Points: 3887 More actions · Answer 2

thanks for yr response. I'm glad I was on a right track after all.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 3

Did you check for the execution plan? Did it use a scan? has the compound index done anything to speed this up?

Sergei Zarembo Hall of Fame Points: 3887 More actions · Answer 4

i haven't had a chance yet; at work now, busy w/ smth else

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 5

K... please post the results.. I'm curious to see if I'm right about this one.

Sergei Zarembo Hall of Fame Points: 3887 More actions · Answer 6

I was able to introduce a clustered key 'ckey' in the source table

chemid prop_id value ckey pkey

203 VC 1.71 1 95359

1655 TPT 160.75 2 24700

1290 TC 552 3 36531

1290 TC 552 3 41925

1347 HSTD -570.9 4 35296

1347 HSTD -569.024 4 37084

for every unique couple (chemid , prop_id&nbsp

when I run the above SELECT with a slightly modified function , I got the performace enhancment of ~300X! Instead of 15 min in ran 3 sec!

thanks for the tip Remi!

______________________

select ckey, dbo.GetConcatString_clusteredTest(ckey) as valueMerged

INTO _D_src_Tindep_clusterTest_del

from _D_src_Tindep_del where value is not null --and ckey < 100

group by ckey

Sergei Zarembo Hall of Fame Points: 3887 More actions · Answer 7

correction: 'chemid' in the previous post sh've been representing 'attr_id'

I slightly misrepresented the data structure , but the bottom line is clustered index helped a lot!

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

The compound index could have had the exact same effect. This would have been the case because all the values needed to generate the select would have been retrieved directly from the index, therefore not reading any data page from the table.

Sergei Zarembo Hall of Fame Points: 3887 More actions · Answer 9

just realized that I should not have introduced that new field 'ckey' (and 'pkey' for that matter). I could simply create a clustered (compound?) index for ('chem_id' and 'property_id) and then run ver1 of the code with the same (can i say 'blazing' ?) speed.

Remi, could u pls explain in brief difference between clustered and compound indexes?

Thanks

Sergei Zarembo Hall of Fame Points: 3887 More actions · Answer 10

in the prevous post I meant ' create clustered index in Enterprise Manager interface'

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 11

Doesn't even need to be clustered (but I'd like to see the speed difference, if any).

Check this for information on covering (compound) index (covering being the best term for this technic).

http://www.sql-server-performance.com/nb_index_intersection.asp