Group by as part of a where clause?

  • I have a table, which simplified boils down to something like this:

    Year | Data

    1992 string1

    1992 string2

    1992 string3

    1993 stringN

    1993 stringX

    1993 stringZ

    1994 string P

    1994 string X

    1995 stringN

    1996 stringN

    I should point out that the stuff in the data column is insignificant - it will never be null and is simply a varchar, the entries above are just what I felt like putting in :p

    Now, I have a where clause when selecting from this table where by I specify that the other fields in the table must be > a specific value, non null etc to get the universe of data I want, this works great.

    However, I want do do one more thing...

    If I do the following:

    SELECT

    Year, count(data)

    FROM mytable

    Group by year

    Then I get something like

    1992| 4

    1993| 2

    Etc...

    Now I basically want to modify the the where clause for the first table to say that there must be > say 2 entries of data for each year - i.e. something which would with the above data set would only return the rows for 1992 and 1993.

    I've had a go with a few ideas I had but all basically didn't work what so ever...

    Any ideas?!

  • How about using the Having clause to pull only those with a count > 2...

  • I'd need a little more specifics. The table DDL, smaple data (in the form of insert statements that can be cut, pasted, and executed to populate the table), and the expected output based on the sample data.

    😎

  • HAVING is probably what you want, but please post some examples and let us know if that works or doesn't.

  • Lynn Pettis (6/26/2008)


    I'd need a little more specifics. The table DDL, smaple data (in the form of insert statements that can be cut, pasted, and executed to populate the table), and the expected output based on the sample data.

    😎

    Excuse my ignorance, but...

    DDL = ? just the column names?!

    What's the easiest way for me to generate the insert statements which you could copy / paste?!!

  • DDL = data definition language. i.e. the CREATE TABLE statement(s) that would allow us to easily recreate your table structures for testing.

    Easiest way to do that is to right-click on the table, and use the "script table as CREATE" to a new query window, then paste the results here.

    That being said - HAVING is your answer (that's been mentioned before)

    as in -

    ...

    HAVING count(*)>2

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Read this article, it will help you with everything you need to allow us to help you:

    http://www.sqlservercentral.com/articles/Best+Practices/61537/">

    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    😎

  • OK, based on that, here is my attempt! (please excuse me if I am wrong - there's a first time for everything n all that :p!)

    To create the table:

    SET ANSI_NULLS OFF

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[#tmpRR_Testing](

    [fund_vintage] [char](4) NULL,

    [fund_id] [int] NULL,

    [fund_name] [varchar](60) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    Now let's populate it with some data:

    INSERT INTO #tmprr_testing

    (fund_vintage, fund_id, fund_name)

    SELECT '1993','333','XYZ Fund II' UNION ALL

    SELECT '1989','15433','TTT Partners Fund I' UNION ALL

    SELECT '1996','29133', 'ABC Fund' UNION ALL

    SELECT '1998','31133','BGF I' UNION ALL

    SELECT '1989','45833','TRE Partners' UNION ALL

    SELECT '1999','48133','TTYE Partners IV' UNION ALL

    SELECT '2000','50333','CMYK Partners' UNION ALL

    SELECT '1995','55533',' PTYER Fund II' UNION ALL

    SELECT '1999','55833','PSPSDS Partners' UNION ALL

    SELECT '1995','59033','CVCVVVVS Partners I' UNION ALL

    SELECT '1999','59133','BNBNBN Partners II' UNION ALL

    SELECT '1997','63633','BNBNBNBNB Fund I'

    This is a reduced set of data, both in terms of rows of data and fields present, as the others don't come in to the equation really.

    I tried the above and it works, except it creates duplicates, doh! - but a select distinct gets round this... and produces 12 rows.

    You will notice there are 3 entries for fund vintage 1999, the others all have less.

    I basically want to set a value (2 in this example I suppose) which the count of the funds must exceed.

    I.e. if you run this:

    SELECT

    fund_vintage, count(fund_vintage) as NumFunds

    FROM #tmpRR_testing

    GROUP BY fund_vintage

    (OK so you get 6 due to the duplicates, but you get the idea)

    So my results set from the first query would only return the 1999 vintage funds.

    Does that make sense?!

    Thanks!

  • Steve was right, all you need is a HAVING clause as shown below:

    SELECT

    fund_vintage,

    count(fund_vintage) as NumFunds

    FROM

    #tmpRR_Testing

    GROUP BY

    fund_vintage

    HAVING

    count(fund_vintage) > 2

    I should have been able to see that as well. I guess I was having a Missouri moment (show me).

    😎

  • I could see that, it was how exactly I obtained the rest of the data in the table, so instead I did the above and dumped it into another table - joined this table to the main table where the count > X so as to remove the fund_vintages which had less than X funds.

    Perhaps a roundabout way to do it, but it seemed to work!

    Ah well, at least I know how to post table creation and population code for next time I run in to a problem!! 🙂

  • Which is a good thing. I wish more people needing help would read that article and follow the procedures outlined. Jeff Moden did an great job with that article.

    😎

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply