How do you limit records using Distinct and Count - TOP and Rowcount dont work ?

  • ifila (1/12/2010)


    I did get a couple of suggestions.

    Use

    CREATE PARTITION FUNCTION

    or

    CREATE PARTITION SCHEMA

    .....but as you all know i suck at SQL, so i really dont know exactly how to use this to my benefit.

    Lol. Whoever gave you that doesn't have a clue what they're talking about. Those are for table partitioning (enterprise edition only) and is a way of breaking a single table into many, but can still be queried as one.

    How that you're feeling helpful again, I asked you three questions over the weekend. How about answering them?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ifila (1/12/2010)


    I did get a couple of suggestions.

    I bet you did :laugh:

  • Paul White (1/13/2010)


    ifila (1/12/2010)


    I did get a couple of suggestions.

    I bet you did :laugh:

    ROFL! <<unprintable>> ๐Ÿ˜€

    Thanks Paul, that made the long slog into work through the snow very worthwhile!

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • GilaMonster (1/10/2010)


    ifila (1/10/2010)


    BTW all i am trying to do is limit the number of records i am counting....pretty simple really. I just dont know the correct syntax.

    It probably is, but personally I don't fully understand, from reading your posts, exactly what it is that you want. I've very likely done something similar many times but, if I were to tell you how I solved my problems, I may be giving you an answer to a question that you haven't asked. All that does is waste your time, waste my time and make people angry.

    You said you tried several things and the count was wrong. What would a 'correct' count be?

    Without understanding your exact requirements (which, as I said, I don't), any query I give you has got a good chance of also producing a wrong count.

    Please remember that while the problem may appear simple to you, we're not there and we don't know the problem in the detail that you do. Hence the request for some data so that we can see exactly what the correct count that you want is.

    I want to set the record limit to 1000, so the count would be 1000.

    Since i have a large number of records, SQL sometimes 'times out' when doing the Count, so i want to limit how many records it is counting.

    SELECT COUNT(DISTINCT(Email))

    FROM( SELECT TOP 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

    WHERE ((contains (originalresume, '"j2ee" and "java"'))

    AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

    )p

    UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

    WHERE LEN(Email) > 0

    Adding TOP 1000 on line 2 works fine, but obviously give a count of less than 1000.

    I have the same Query using UNION but it is slower than UNPIVOT.

  • Sample data please. Just a few rows that are in the table so we can see what we're dealing with. 3 pages on and I still don't know what you're trying to do, and you still haven't answered the questions that I asked. For the record:

    But a SELECT Count will only return 1 row if there's no grouping, so why the top?

    Do you want to count the distinct values in the top 50, or are you grouping by something, what the counts of the distinct values of everything and return only the top 50 grouped values?

    Expected results. Given the sample data that you list (see point 1), what should the result be. Not an explanation here please, just the actual values that you want to see.

    Why are you making this so hard? If you'd just posted the stuff that we've been asking for this could have been resolved and answered several days ago.

    Oh, btw, if SQL's timing out it usually means your query is inefficient or your indexes are inadequate. Putting row limitations is not the answer (especially since often TOP is the last operator applied on a query). Fixing the indexing or fixing the query is.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ifila (1/13/2010)


    GilaMonster (1/10/2010)


    ifila (1/10/2010)


    BTW all i am trying to do is limit the number of records i am counting....pretty simple really. I just dont know the correct syntax.

    It probably is, but personally I don't fully understand, from reading your posts, exactly what it is that you want. I've very likely done something similar many times but, if I were to tell you how I solved my problems, I may be giving you an answer to a question that you haven't asked. All that does is waste your time, waste my time and make people angry.

    You said you tried several things and the count was wrong. What would a 'correct' count be?

    Without understanding your exact requirements (which, as I said, I don't), any query I give you has got a good chance of also producing a wrong count.

    Please remember that while the problem may appear simple to you, we're not there and we don't know the problem in the detail that you do. Hence the request for some data so that we can see exactly what the correct count that you want is.

    I want to set the record limit to 1000, so the count would be 1000.

    Since i have a large number of records, SQL sometimes 'times out' when doing the Count, so i want to limit how many records it is counting.

    SELECT COUNT(DISTINCT(Email))

    FROM( SELECT TOP 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

    WHERE ((contains (originalresume, '"j2ee" and "java"'))

    AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

    )p

    UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

    WHERE LEN(Email) > 0

    Adding TOP 1000 on line 2 works fine, but obviously give a count of less than 1000.

    I have the same Query using UNION but it is slower than UNPIVOT.

    Please provide us with sample data and expected results. Without this it is difficult to know what you are really trying to accomplish.

  • Something like this ?

    with cteemails

    as(

    SELECT distinct(Email)

    FROM(

    SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

    WHERE ((contains (originalresume, '"j2ee" and "java"'))

    AND

    currentdateout BETWEEN '2000-01-01' AND '2010-01-06'

    ) p

    UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

    WHERE LEN(Email) > 0

    )

    Select top 1000 Email from cteemails

    Cant guarantee performance will be sparkling , it may be better to not unpivot and cross join to a numbers table..



    Clear Sky SQL
    My Blog[/url]

  • I have attached 20 sample records.

    10 are duplicates.

    Total # of unique emails are: 31

    email1 - 10

    email2- 8

    email3 - 6

    email4 - 4

    email5 - 2

    email6 - 1

    I would like only the TOP 15 unique emails.

    SELECT COUNT(DISTINCT(Email))

    FROM( SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

    WHERE ((contains (originalresume, '"test"'))

    AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

    )p

    UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

    WHERE LEN(Email) > 0

    Thanks

  • ifila (1/13/2010)


    I would like only the TOP 15 unique emails.

    Just 15 unique email addresses? No counts involved?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ifila (1/13/2010)


    GilaMonster (1/10/2010)


    ifila (1/10/2010)


    BTW all i am trying to do is limit the number of records i am counting....pretty simple really. I just dont know the correct syntax.

    It probably is, but personally I don't fully understand, from reading your posts, exactly what it is that you want. I've very likely done something similar many times but, if I were to tell you how I solved my problems, I may be giving you an answer to a question that you haven't asked. All that does is waste your time, waste my time and make people angry.

    You said you tried several things and the count was wrong. What would a 'correct' count be?

    Without understanding your exact requirements (which, as I said, I don't), any query I give you has got a good chance of also producing a wrong count.

    Please remember that while the problem may appear simple to you, we're not there and we don't know the problem in the detail that you do. Hence the request for some data so that we can see exactly what the correct count that you want is.

    I want to set the record limit to 1000, so the count would be 1000.

    Since i have a large number of records, SQL sometimes 'times out' when doing the Count, so i want to limit how many records it is counting.

    SELECT COUNT(DISTINCT(Email))

    FROM( SELECT TOP 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

    WHERE ((contains (originalresume, '"j2ee" and "java"'))

    AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

    )p

    UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

    WHERE LEN(Email) > 0

    Adding TOP 1000 on line 2 works fine, but obviously give a count of less than 1000.

    I have the same Query using UNION but it is slower than UNPIVOT.

    I know you say it is slower, but could you also post the query using UNION? I'd like to see it as well.

  • GilaMonster (1/13/2010)


    ifila (1/13/2010)


    I would like only the TOP 15 unique emails.

    Just 15 unique email addresses? No counts involved?

    Just the number '15' as a count. No email addresses necessary.

  • create TABLE #OutputEmails

    (

    Email1 nvarchar(1000),

    Email2 nvarchar(1000),

    Email3 nvarchar(1000),

    Email4 nvarchar(1000),

    Email5 nvarchar(1000),

    Email6 nvarchar(1000),

    Email7 nvarchar(1000),

    Email8 nvarchar(1000),

    Email9 nvarchar(1000),

    Email10 nvarchar(1000)

    );

    INSERT INTO #OutputEmails SELECT Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM OutputResume3

    where (contains (originalresume, '"livelink" and "java"'))

    select count(DISTINCT (Email)) FROM

    (

    SELECT top 1000 Email1 as Email FROM #OutputEmails

    UNION

    SELECT Email2 as Email FROM #OutputEmails

    UNION

    SELECT Email3 as Email From #OutputEmails

    Union

    SELECT Email4 as Email From #OutputEmails

    UNION

    SELECT Email5 as Email FROM #OutputEmails

    UNION

    SELECT Email6 as Email From #OutputEmails

    Union

    SELECT Email7 as Email From #OutputEmails

    UNION

    SELECT Email8 as Email FROM #OutputEmails

    UNION

    SELECT Email9 as Email From #OutputEmails

    Union

    SELECT Email10 as Email From #OutputEmails

    ) as T where Email like '%@%'

  • Try this then

    SELECT COUNT(Email) AS NumberOfEmails -- get the count

    FROM (

    SELECT DISTINCT TOP 15 Email -- get 15 unique emails

    FROM (

    SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

    WHERE ((contains (originalresume, '"j2ee" and "java"'))

    AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

    )p

    UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

    WHERE LEN(Email) > 0 ) sub

    I'm curious as to why. You just want the number 15? For what purpose?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ifila (1/13/2010)


    create TABLE #OutputEmails

    (

    Email1 nvarchar(1000),

    Email2 nvarchar(1000),

    Email3 nvarchar(1000),

    Email4 nvarchar(1000),

    Email5 nvarchar(1000),

    Email6 nvarchar(1000),

    Email7 nvarchar(1000),

    Email8 nvarchar(1000),

    Email9 nvarchar(1000),

    Email10 nvarchar(1000)

    );

    INSERT INTO #OutputEmails SELECT Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM OutputResume3

    where (contains (originalresume, '"livelink" and "java"'))

    select count(DISTINCT (Email)) FROM

    (

    SELECT top 1000 Email1 as Email FROM #OutputEmails

    UNION

    SELECT Email2 as Email FROM #OutputEmails

    UNION

    SELECT Email3 as Email From #OutputEmails

    Union

    SELECT Email4 as Email From #OutputEmails

    UNION

    SELECT Email5 as Email FROM #OutputEmails

    UNION

    SELECT Email6 as Email From #OutputEmails

    Union

    SELECT Email7 as Email From #OutputEmails

    UNION

    SELECT Email8 as Email FROM #OutputEmails

    UNION

    SELECT Email9 as Email From #OutputEmails

    Union

    SELECT Email10 as Email From #OutputEmails

    ) as T where Email like '%@%'

    You do realize that the TOP 1000 only applies to the first query in the union, not all the rest of them, right?

  • Could you please explain what you are trying to achieve? What is the business case for this query (now I'm starting to sound life Jeff Moden)? How are you using it?

Viewing 15 posts - 46 through 60 (of 103 total)

You must be logged in to reply to this topic. Login to reply