May 17, 2006 at 12:37 pm
I have a table with 3m rows and have 27k dups due to removal of a key field (no longer needed or used). I need to roll up the dups, not just delete them. I was going to select into temp and group to get unique records without losing summed data and then del from main table and insert unique records. This is taking forever. Any other ideas? I thought it would be just as bad to find them and then loop through updates and deletes, but am not sure now.
Dups today looks like this:
Susan, 10
Susan, 12
Needs to look like this when I'm done:
Susan, 22
May 17, 2006 at 12:58 pm
1. Move duplicates to temp tables.
2. Delete values from base tables.
3. Insert from temp tables into base tables and use SUM and GROUP BY to perform your roll up.
May 17, 2006 at 1:00 pm
Are you inserting all 3 million records into a temp table?
If so, how about inserting ONLY records into a temp table having count(*) > 1 (after the grouping on the main table of course)?
E.g. Susan, 22 from your example
select someName, sum(age) into myTempTable
from mainTable
group by someName having count(*) > 1
May 17, 2006 at 1:01 pm
I just started trying that (always think of an answer right after I ask for help) so thanks for confirming I'm on the right track. Much appreciated!
May 17, 2006 at 1:25 pm
Unless I'm missing something, everything about the dupe rows is duped except for the column that you want summed. If that's the case, why not just SELECT INTO a new table, summing the one column, and grouping by the rest of the columns. Unless you have a ton of things tied to this table, such as low level permissions, triggers, indexes, etc.
To use your example:
SELECT
NameField
,Sum(NumberField)
INTO
NewTableName
FROM
OldTableName
GROUP BY
NameField
You'll obviously need to replace your old table with your new one, apply any indexes, triggers, permissions that were affected, but it would do the trick quickly.
If you go down this path, save the old table until you have fully verified the new one.
May 17, 2006 at 3:20 pm
-- It may be a good Idea to create an index on the Grouping/joining Column
-- create index idx_someName on SomeTame(someName)
-- go
select someName, sum(age) X
into #tmp
from mainTable
group by someName
having count(*) > 1
delete d
from SomeTable d
join
#Tmp t on d.someName = t.someName
insert into someTable( SomeName, age )
select somename, X
from #Tmp
drop table #tmp
-- Drop the index if you created it
-- drop index SomeTable.idx_someName
-- go
* Noel
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply