July 8, 2005 at 8:50 am
--Source table :
Create table T
(attr_id int,
property_id varchar (255),
value varchar (255))
GO
Insert into T (attr_id, property_id, value)
Select 11, 'ENT' , '-23'
UNION ALL
Select 22, 'HSD' , 'more than 532'
UNION ALL
Select 11, 'ENT' , '>34'
UNION ALL
Select 22, 'HSD' , '-23'
UNION ALL
Select 11, 'ENT' , '<3456'
UNION ALL
Select 22, 'HSD' , '-23'
GO
Select * from T
--the Source table looks like:
attr_id property_id value [varchar]
11 ENT -23
22 HSD more than 532
11 ENT >34
22 HSD -23
11 ENT <3456
22 HSD -23
--and 100000 more records like this
--Task: Create another table in which all values for every unique pair of attr_id and property_id would be MERGED; separate them by '; '; truncate the trailing ';'
--result table needs to look like:
attr_id property_id valueMerged
11 ENT -23; <3456; >34
22 HSD -23; -23; more than 532
--?: does anyone know an efficient way of doing this?
--What I ended up doing was creating a finction:
CREATE FUNCTION GetConcatString_v2(@attr_id int, @property_id varchar(255))
RETURNS varchar(1000)
AS
BEGIN
DECLARE @s-2 varchar(1000)
SET @s-2 = ''
SELECT @s-2 = @s-2 + value + '; '
FROM (select value from T WHERE attr_id = @attr_id and property_id = @property_id
AND value is not null) S1
ORDER BY value
if len(@s) = 0 set @s-2 = 'empty string'
SELECT @s-2 = left(@s, len(@s)-1)
RETURN @s-2
END
--and then using it in SELECT:
select attr_id, property_id, dbo.getConcatString_v2(attr_id, property_id) as valueMerged
--INTO _D_src_Tindep_merged
from T WHERE value is not null
group by attr_id, property_id
order by attr_id, property_id
--Could it be done with less code?
--the 'real' source table contains 100000 rows. It took the above code 15 min to precess the source data into merged strings.
Thanx
July 8, 2005 at 9:02 am
It's pretty much the only way to do this.
Check the execution plan and make sure that the function's query is as optmizied as possible.
I'd also make sure that there's a compound index on those colums :
value, attr_id, property_id (in the best order possible). This will allow to make an index seek each time with having to go to the clustered index, speeding the query greatly (especially if it's doing some sort of scan now).
July 8, 2005 at 9:14 am
thanks for yr response. I'm glad I was on a right track after all.
July 8, 2005 at 9:18 am
Did you check for the execution plan? Did it use a scan? has the compound index done anything to speed this up?
July 8, 2005 at 9:48 am
i haven't had a chance yet; at work now, busy w/ smth else
July 8, 2005 at 10:05 am
K... please post the results.. I'm curious to see if I'm right about this one.
July 8, 2005 at 12:12 pm
I was able to introduce a clustered key 'ckey' in the source table
chemid prop_id value ckey pkey
203 VC 1.71 1 95359
1655 TPT 160.75 2 24700
1290 TC 552 3 36531
1290 TC 552 3 41925
1347 HSTD -570.9 4 35296
1347 HSTD -569.024 4 37084
for every unique couple (chemid , prop_id 
when I run the above SELECT with a slightly modified function , I got the performace enhancment of ~300X! Instead of 15 min in ran 3 sec!
thanks for the tip Remi!
______________________
select ckey, dbo.GetConcatString_clusteredTest(ckey) as valueMerged
INTO _D_src_Tindep_clusterTest_del
from _D_src_Tindep_del where value is not null --and ckey < 100
group by ckey
July 8, 2005 at 12:14 pm
correction: 'chemid' in the previous post sh've been representing 'attr_id'
I slightly misrepresented the data structure , but the bottom line is clustered index helped a lot!
July 8, 2005 at 12:47 pm
The compound index could have had the exact same effect. This would have been the case because all the values needed to generate the select would have been retrieved directly from the index, therefore not reading any data page from the table.
July 8, 2005 at 1:46 pm
just realized that I should not have introduced that new field 'ckey' (and 'pkey' for that matter). I could simply create a clustered (compound?) index for ('chem_id' and 'property_id) and then run ver1 of the code with the same (can i say 'blazing' ?) speed.
Remi, could u pls explain in brief difference between clustered and compound indexes?
Thanks
July 8, 2005 at 1:50 pm
in the prevous post I meant ' create clustered index in Enterprise Manager interface'
July 8, 2005 at 1:54 pm
Doesn't even need to be clustered (but I'd like to see the speed difference, if any).
Check this for information on covering (compound) index (covering being the best term for this technic).
http://www.sql-server-performance.com/nb_index_intersection.asp
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply