Group By query. Probably..

  • Bit tricky to explain this one...

    I have a table similar to the following:

    Ref     Value   Date 

    1       0.002   2004-04-27 00:00:00.000

    1       0.1     2004-04-25 00:00:00.000

    1       1.4     2004-04-02 00:00:00.000

    2       0.0005  2004-04-01 00:00:00.000

    2       1.3     2004-04-30 00:00:00.000 

    2       1.1     2004-04-30 00:00:00.000

    3       0.01    2004-04-23 00:00:00.000 

    3       0.56    2004-04-23 00:00:00.000

    3       1.2     2004-04-23 00:00:00.000

    3       2.8     2004-04-05 00:00:00.000

    And I want a query that returns the minimum "Value" according to the maximum "date".

    In other words, the result set should look like this:

    Ref     Value   Date

    1       0.002  2004-04-27 00:00:00.000

    2       1.1    2004-04-30 00:00:00.000

    3       0.01   2004-04-23 00:00:00.000

    I can do it using a group by or order by clause, but only if I provide a where clause

    using the "ref" column, for example:

    select min(T.value) as value, T1.date, T1.ref

    from TableName T

    join

    (

    Select max(date) as date, ref

    from TableName

    group by ref

    ) T1 on T1.date = t.date and T1.ref = T.ref

    where Ref = 3

    group by T1.date, T1.ref

    or using an order by clause:

    select top 1 value, date, ref

    from TableName

    where Ref = 3

    order by value asc, date desc

    However, This is going to be part of a larger query, so I will be providing the "Ref" via a join to another select query.

    I can get this to return either the minimum Value for a particular Ref, or the maximum Date for a particular ref,

    but not both, which is what I need.  If this is not possible, then I could do with a query that will just provide the result set

    example above.  I've tried this:

    select min(T.value) as value, R.date, R.Ref

    from tableName T

    join

    (

    select max(date) as date, ref

    from tableName

    group by ref

    ) R on R.ref = T.ref

    but as my source table has over 130000 rows, I stopped running the query after 5 minutes.  We have a SQL box with 8Gb of RAM, so I assumed this query was too badly written to bother continuing!

    Any ideas?

    Cheers, Matt

  • It seems that you left out a "GROUP BY R.date, R.ref" in the query. Other than this, the query is not badly written. If it is too slow, consider adding some indexes, for example a compound index on (ref,Date) or on (Date,ref,value) should be useful. Use Index Tuning Wizard from Query Analyzer to find the most useful indexes for your specific workload and data. Also, you may find out that the query would work much faster if used as a subquery in another query that filters the ref column using some other criterias.

    Razvan

  • try writing your sub-select statements as indexed views, then join to the indexed views.

    Corie Curcillo
    MCT, MCDBA, MCSD

  • Corie,

    An indexed view cannot be used in this scenario. You will get an error like the following: "Cannot index the view '<<view name>>'. It contains one or more disallowed constructs.". This is because an indexed view cannot contain the MIN() or MAX() aggregate functions, as documented in BOL in the following topic:

    http://msdn.microsoft.com/library/en-us/createdb/cm_8_des_06_9jnb.asp

    Another alternative would be to use triggers to update another table with the results of the query, but I don't think that's appropriate in this scenario.

    Razvan

  • At the moment, just to get it working, I've resorted to using a couple of cursors to populate temp tables, then joining the temp tables in the main query.  Obviously this has created a few performance issues but I'll look into alternatives - mainly to get away from using the cursors.  Then I'll look into indexing...

    Thanks for the advice

    Matt

  • Razvan,

    I had forgotten about the restriction on aggregate functions in INDEXED VIEWS. So following the natural progression, perhaps a Rowset UDF instead? The query plan would then be precompiled and perhaps give a performance gain. I would agree w/ you, it would still be advisable to consider indexes on the base-table columns being referenced in the aggregate function calls.

    Corie Curcillo
    MCT, MCDBA, MCSD

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply