remove duplicates and sum column

  • Hi Professionals.

    I am enquiring if it is possible within SQLServer 2008 to remove duplicate rows in the DB and sum up a column after the removal EG

    ProductName, Product Version, Software Manufacturer, Type, Category, Installation, Licensable

    Project Standard 2007, 12.x, Microsoft, Commercial, Office Productivity, 47, Licensable

    Project Standard 2010, 14.x, Microsoft, Commercial, Not Assigned, 20, Licensable

    Project Standard 2010, 14.x, Microsoft, Commercial, Not Assigned, 37, Licensable

    Project Standard 2010, 14.x, Microsoft, Commercial, Not Assigned, 3, Licensable

    Project Standard 2010, 14.x, Microsoft, Commercial, Not Assigned, 3, Licensable

    Office Fonts, 999.x, Microsoft, unidentified, Not Assigned, 7, Non Licensable

    **********************************************************************************

    So it removes the duplicate rows calculates the installation column and groups them together like so

    **********************************************************************************

    ProductName, Product Version, Software Manufacturer, Type, Category, Installation, Licensable

    Project Standard 2007, 12.x, Microsoft, Commercial, Office Productivity, 47, Licensable

    Project Standard 2010, 14.x, Microsoft, Commercial, Not Assigned, 60, Licensable

    Office Fonts, 999.x, Microsoft, unidentified, Not Assigned, 7, Non Licensable

  • In the first/inner query, use a DISTINCT.

    SELECT DISTINCT [column list]

    and then in the outer query, do your sum

    SELECT Col1, Col2, SUM([numericColumn])

    FROM

    (SELECT DISTINCT [column list] FROM...)

    GROUP BY Col1, Col2;

  • With the use of DISTINCT you can achieve this....

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Sorry but that doesn't work

    I put

    SELECT softwaremanufacturer,productname, productversion, count(installations) as TotalInstalls, Licensable

    FROM

    (SELECT DISTINCT softwaremanufacturer,productname, productversion,Licensable from newtable)

    GROUP BY softwaremanufacturer,productname, productversion, Licensable

    and it says

    Msg 156, Level 15, State 1, Line 5

    Incorrect syntax near the keyword 'GROUP'.

  • A SUM( DISTINCT...) should do the job.

    WITH Products(ProductName, ProductVersion, SoftwareManufacturer, Type, Category, Installation, Licensable) AS( SELECT

    'Project Standard 2007', '12.x', 'Microsoft', 'Commercial', 'Office Productivity', 47, 'Licensable' UNION ALL SELECT

    'Project Standard 2010', '14.x', 'Microsoft', 'Commercial', 'Not Assigned', 20, 'Licensable' UNION ALL SELECT

    'Project Standard 2010', '14.x', 'Microsoft', 'Commercial', 'Not Assigned', 37, 'Licensable' UNION ALL SELECT

    'Project Standard 2010', '14.x', 'Microsoft', 'Commercial', 'Not Assigned', 3, 'Licensable' UNION ALL SELECT

    'Project Standard 2010', '14.x', 'Microsoft', 'Commercial', 'Not Assigned', 3, 'Licensable' UNION ALL SELECT

    'Office Fonts', '999.x', 'Microsoft', 'unidentified', 'Not Assigned', 7, 'Non Licensable')

    SELECT ProductName,

    ProductVersion,

    SoftwareManufacturer,

    Type,

    Category,

    SUM( DISTINCT Installation),

    Licensable

    FROM Products

    GROUP BY ProductName,

    ProductVersion,

    SoftwareManufacturer,

    Type,

    Category,

    Licensable

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Oracle765 (11/7/2013)


    Sorry but that doesn't work

    I put

    SELECT softwaremanufacturer,productname, productversion, count(installations) as TotalInstalls, Licensable

    FROM

    (SELECT DISTINCT softwaremanufacturer,productname, productversion,Licensable from newtable)

    GROUP BY softwaremanufacturer,productname, productversion, Licensable

    and it says

    Msg 156, Level 15, State 1, Line 5

    Incorrect syntax near the keyword 'GROUP'.

    You need to alias your derived table. Your derived table doesn't have installations column and that will come out as an error as well.

    My solution should be better.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Hi Again

    The sum does not work because it is actually a nvarchar column. Is there a way around this

    SELECT ProductName,

    ProductVersion,

    SoftwareManufacturer,

    SUM( DISTINCT Installations),

    Licensable

    FROM newtable where productname like '%project standard%'

    GROUP BY ProductName,

    ProductVersion,

    SoftwareManufacturer,

    Licensable

    Msg 8117, Level 16, State 1, Line 6

    Operand data type nvarchar(max) is invalid for sum operator.

  • Why on earth would someone use an nvarchar(max) to store integer values? :w00t:

    There's a way around this, just change the SUM for

    SUM( DISTINCT CAST( Installations AS int))

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply