How do you limit records using Distinct and Count - TOP and Rowcount dont work ?

Question

How do you limit records using Distinct and Count - TOP and Rowcount dont work ?

Viewing 15 posts - 46 through 60 (of 103 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

ifila (1/12/2010)
I did get a couple of suggestions.
Use
CREATE PARTITION FUNCTION
or
CREATE PARTITION SCHEMA
.....but as you all know i suck at SQL, so i really dont know exactly how to use this to my benefit.

Lol. Whoever gave you that doesn't have a clue what they're talking about. Those are for table partitioning (enterprise edition only) and is a way of breaking a single table into many, but can still be queried as one.

How that you're feeling helpful again, I asked you three questions over the weekend. How about answering them?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Paul White SSC Guru Points: 150468 More actions · Answer 2

ifila (1/12/2010)
I did get a couple of suggestions.

I bet you did :laugh:

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

Paul White (1/13/2010)
ifila (1/12/2010)
I did get a couple of suggestions.
I bet you did :laugh:

ROFL! <<unprintable>> 😀

Thanks Paul, that made the long slog into work through the snow very worthwhile!

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

isuckatsql SSChampion Points: 10381 More actions · Answer 4

GilaMonster (1/10/2010)
ifila (1/10/2010)
BTW all i am trying to do is limit the number of records i am counting....pretty simple really. I just dont know the correct syntax.
It probably is, but personally I don't fully understand, from reading your posts, exactly what it is that you want. I've very likely done something similar many times but, if I were to tell you how I solved my problems, I may be giving you an answer to a question that you haven't asked. All that does is waste your time, waste my time and make people angry.
You said you tried several things and the count was wrong. What would a 'correct' count be?
Without understanding your exact requirements (which, as I said, I don't), any query I give you has got a good chance of also producing a wrong count.
Please remember that while the problem may appear simple to you, we're not there and we don't know the problem in the detail that you do. Hence the request for some data so that we can see exactly what the correct count that you want is.

I want to set the record limit to 1000, so the count would be 1000.

Since i have a large number of records, SQL sometimes 'times out' when doing the Count, so i want to limit how many records it is counting.

SELECT COUNT(DISTINCT(Email))

FROM( SELECT TOP 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

WHERE ((contains (originalresume, '"j2ee" and "java"'))

AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

)p

UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

WHERE LEN(Email) > 0

Adding TOP 1000 on line 2 works fine, but obviously give a count of less than 1000.

I have the same Query using UNION but it is slower than UNPIVOT.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Sample data please. Just a few rows that are in the table so we can see what we're dealing with. 3 pages on and I still don't know what you're trying to do, and you still haven't answered the questions that I asked. For the record:

But a SELECT Count will only return 1 row if there's no grouping, so why the top?
Do you want to count the distinct values in the top 50, or are you grouping by something, what the counts of the distinct values of everything and return only the top 50 grouped values?

Expected results. Given the sample data that you list (see point 1), what should the result be. Not an explanation here please, just the actual values that you want to see.

Why are you making this so hard? If you'd just posted the stuff that we've been asking for this could have been resolved and answered several days ago.

Oh, btw, if SQL's timing out it usually means your query is inefficient or your indexes are inadequate. Putting row limitations is not the answer (especially since often TOP is the last operator applied on a query). Fixing the indexing or fixing the query is.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 6

ifila (1/13/2010)
GilaMonster (1/10/2010)
ifila (1/10/2010)
BTW all i am trying to do is limit the number of records i am counting....pretty simple really. I just dont know the correct syntax.
It probably is, but personally I don't fully understand, from reading your posts, exactly what it is that you want. I've very likely done something similar many times but, if I were to tell you how I solved my problems, I may be giving you an answer to a question that you haven't asked. All that does is waste your time, waste my time and make people angry.
You said you tried several things and the count was wrong. What would a 'correct' count be?
Without understanding your exact requirements (which, as I said, I don't), any query I give you has got a good chance of also producing a wrong count.
Please remember that while the problem may appear simple to you, we're not there and we don't know the problem in the detail that you do. Hence the request for some data so that we can see exactly what the correct count that you want is.
I want to set the record limit to 1000, so the count would be 1000.
Since i have a large number of records, SQL sometimes 'times out' when doing the Count, so i want to limit how many records it is counting.
SELECT COUNT(DISTINCT(Email))
FROM( SELECT TOP 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3
WHERE ((contains (originalresume, '"j2ee" and "java"'))
AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))
)p
UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt
WHERE LEN(Email) > 0
Adding TOP 1000 on line 2 works fine, but obviously give a count of less than 1000.
I have the same Query using UNION but it is slower than UNPIVOT.

Please provide us with sample data and expected results. Without this it is difficult to know what you are really trying to accomplish.

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 7

Something like this ?

with cteemails

as(

SELECT distinct(Email)

FROM(

SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

WHERE ((contains (originalresume, '"j2ee" and "java"'))

AND

currentdateout BETWEEN '2000-01-01' AND '2010-01-06'

) p

UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

WHERE LEN(Email) > 0

)

Select top 1000 Email from cteemails

Cant guarantee performance will be sparkling , it may be better to not unpivot and cross join to a numbers table..

Clear Sky SQL
My Blog[/url]

isuckatsql SSChampion Points: 10381 More actions · Answer 8

I have attached 20 sample records.

10 are duplicates.

Total # of unique emails are: 31

email1 - 10

email2- 8

email3 - 6

email4 - 4

email5 - 2

email6 - 1

I would like only the TOP 15 unique emails.

SELECT COUNT(DISTINCT(Email))

FROM( SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

WHERE ((contains (originalresume, '"test"'))

AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

)p

UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

WHERE LEN(Email) > 0

Thanks

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

ifila (1/13/2010)
I would like only the TOP 15 unique emails.

Just 15 unique email addresses? No counts involved?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 10

ifila (1/13/2010)
GilaMonster (1/10/2010)
ifila (1/10/2010)
BTW all i am trying to do is limit the number of records i am counting....pretty simple really. I just dont know the correct syntax.
It probably is, but personally I don't fully understand, from reading your posts, exactly what it is that you want. I've very likely done something similar many times but, if I were to tell you how I solved my problems, I may be giving you an answer to a question that you haven't asked. All that does is waste your time, waste my time and make people angry.
You said you tried several things and the count was wrong. What would a 'correct' count be?
Without understanding your exact requirements (which, as I said, I don't), any query I give you has got a good chance of also producing a wrong count.
Please remember that while the problem may appear simple to you, we're not there and we don't know the problem in the detail that you do. Hence the request for some data so that we can see exactly what the correct count that you want is.
I want to set the record limit to 1000, so the count would be 1000.
Since i have a large number of records, SQL sometimes 'times out' when doing the Count, so i want to limit how many records it is counting.
SELECT COUNT(DISTINCT(Email))
FROM( SELECT TOP 1000 Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3
WHERE ((contains (originalresume, '"j2ee" and "java"'))
AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))
)p
UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt
WHERE LEN(Email) > 0
Adding TOP 1000 on line 2 works fine, but obviously give a count of less than 1000.
I have the same Query using UNION but it is slower than UNPIVOT.

I know you say it is slower, but could you also post the query using UNION? I'd like to see it as well.

isuckatsql SSChampion Points: 10381 More actions · Answer 11

GilaMonster (1/13/2010)
ifila (1/13/2010)
I would like only the TOP 15 unique emails.
Just 15 unique email addresses? No counts involved?

Just the number '15' as a count. No email addresses necessary.

isuckatsql SSChampion Points: 10381 More actions · Answer 12

create TABLE #OutputEmails

(

Email1 nvarchar(1000),

Email2 nvarchar(1000),

Email3 nvarchar(1000),

Email4 nvarchar(1000),

Email5 nvarchar(1000),

Email6 nvarchar(1000),

Email7 nvarchar(1000),

Email8 nvarchar(1000),

Email9 nvarchar(1000),

Email10 nvarchar(1000)

);

INSERT INTO #OutputEmails SELECT Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM OutputResume3

where (contains (originalresume, '"livelink" and "java"'))

select count(DISTINCT (Email)) FROM

(

SELECT top 1000 Email1 as Email FROM #OutputEmails

UNION

SELECT Email2 as Email FROM #OutputEmails

UNION

SELECT Email3 as Email From #OutputEmails

Union

SELECT Email4 as Email From #OutputEmails

UNION

SELECT Email5 as Email FROM #OutputEmails

UNION

SELECT Email6 as Email From #OutputEmails

Union

SELECT Email7 as Email From #OutputEmails

UNION

SELECT Email8 as Email FROM #OutputEmails

UNION

SELECT Email9 as Email From #OutputEmails

Union

SELECT Email10 as Email From #OutputEmails

) as T where Email like '%@%'

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 13

Try this then

SELECT COUNT(Email) AS NumberOfEmails -- get the count

FROM (

SELECT DISTINCT TOP 15 Email -- get 15 unique emails

FROM (

SELECT Email1,Email2,Email3, Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM outputresume3

WHERE ((contains (originalresume, '"j2ee" and "java"'))

AND (currentdateout BETWEEN '2000-01-01' AND '2010-01-06'))

)p

UNPIVOT (Email FOR Emails IN (Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10)) as unpvt

WHERE LEN(Email) > 0 ) sub

I'm curious as to why. You just want the number 15? For what purpose?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 14

ifila (1/13/2010)
create TABLE #OutputEmails
(
Email1 nvarchar(1000),
Email2 nvarchar(1000),
Email3 nvarchar(1000),
Email4 nvarchar(1000),
Email5 nvarchar(1000),
Email6 nvarchar(1000),
Email7 nvarchar(1000),
Email8 nvarchar(1000),
Email9 nvarchar(1000),
Email10 nvarchar(1000)
);
INSERT INTO #OutputEmails SELECT Email1,Email2,Email3,Email4,Email5,Email6,Email7,Email8,Email9,Email10 FROM OutputResume3
where (contains (originalresume, '"livelink" and "java"'))
select count(DISTINCT (Email)) FROM
(
SELECT top 1000 Email1 as Email FROM #OutputEmails
UNION
SELECT Email2 as Email FROM #OutputEmails
UNION
SELECT Email3 as Email From #OutputEmails
Union
SELECT Email4 as Email From #OutputEmails
UNION
SELECT Email5 as Email FROM #OutputEmails
UNION
SELECT Email6 as Email From #OutputEmails
Union
SELECT Email7 as Email From #OutputEmails
UNION
SELECT Email8 as Email FROM #OutputEmails
UNION
SELECT Email9 as Email From #OutputEmails
Union
SELECT Email10 as Email From #OutputEmails
) as T where Email like '%@%'

You do realize that the TOP 1000 only applies to the first query in the union, not all the rest of them, right?

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 15

Could you please explain what you are trying to achieve? What is the business case for this query (now I'm starting to sound life Jeff Moden)? How are you using it?