'Distinct' vs 'Groupby'

  • Hi all,

    :). I used distinct in most of my queries in order to avoid duplicates from my queries,but i was told not to use distinct instead go for groupby.

    How can i use groupby inplace of distinct?

    can anyone help me in this regard.

    Thanks,

    Kiran

  • I think the first advice that is normally given is - if you know that your data doesn't have duplicates (or it doesn't matter if there are), then you don't need to use commands like distinct as they do have a performance implication.

    The first SQL statement brings up a list of unique customer IDS by using distinct, and the second statement uses Group By instead:

    select distinct custid

    from [2003customers]

    SELECT CustID

    FROM [2003Customers]

    GROUP BY CustID

    The following blog starts to talk about the differences from a technical level:

    http://msmvps.com/blogs/robfarley/archive/2007/03/24/group-by-v-distinct-group-by-wins.aspx

  • If "most" of your queries require DISTINCT or GROUP BY to eliminate duplicates then, in general terms, you're writing your queries incorrectly or the data design is wrong. While there are any number of extenuating circumstances, in most instances, you should be able to query the data from a database without relying on, frankly, a crutch like DISTINCT.

    For what it's worth, you should always check your execution plans and costs because DISTINCT will not always be slower than GROUP BY.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply