Help with Counts

Question

Help with Counts

Viewing 15 posts - 46 through 60 (of 120 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

It looks like you're looking for either one row or one column of aggregated data, right?

Why not write each query on its own, either as inline sub-queries in a single Select statement, or as separate queries with Union operators between them?

Either:

SELECT

(Select count(*) from dbo.MyTable) as TotalRecords,

(Select count(distinct Datakey) from dbo.MyTable) as DistinctDataKeys,

(select count(*) from (select distinct Datakey, Landline from dbo.MyTable) as A) as DistinctLandlineDatakeys,

... etc.

or

select 'Total Records' as [Measure], count(*) as Qty

from dbo.MyTable

union all

select 'Datakeys', count(distinct Datakey)

from dbo.MyTable

... etc.

Can you query each item separately?

On the more complex rules, like landlines, where you want distinct values, and each can only count if it has a datakey assigned, that becomes a simple matter of querying distinct landline values where Datakey is not null, or whatever the complexity of the rules is.

I think the complexity of this task is coming from trying to bite it all off in one query. Break it up and it'll simplify itself.

Or has that already been looked into and I missed it?

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

stephen99999 Ten Centuries Points: 1197 More actions · Answer 2

L' Eomot Inversé (1/13/2012)
stephen99999 (1/13/2012)
I think the best way to do this is to just test on various sample data. Really change it up, and see if you can "break" the code. If not, then I guess it works (even if we don't fully understand it).
Stephen
That's a pretty dangerous approach - trusting code you don't understand to be correct after testing on a set of small samples. If this code doesn't work, how will you ever know it doesn't work, if the data combination that breaks it is so complex that it can only arise with a sample of at least 5000 rows?

Never said "small." Also, I was just merely asking him to thoroughly test it, which in one of his very recent posts, he did, and found an error. So I would say my post was just fine.

Besides, sometimes risks must be taken. I see your point, but at the same time were not testing some quirky code involving deletes, updates, or inserts. His data is safe 🙂

Again... I see your point, I just like to live dangerously lol. hopefully, my comment does not extend this topic to 10 pages..

TomThomson SSC Guru Points: 104773 More actions · Answer 3

That's quite clear as far as it goes, but it leaves some questions open. You seem to think that it's something that can be written as a single SQL statement, but as I read it you have secified a constrained maximisation problem which may not be possible to express efficiently in such a way.

For example, if I have

datakey landline

1 100

1 200

1 300

2 200

2 300

2 100

3 300

3 100

3 200

in the data you presumably want the uniquelandline count to be 3 because it is possible to find 3 (datakey, landline) combinations no two of which have the same langline or the same datakey. And presumably you also want the count to be 3 is what I have is

1 100

1 200

1 300

2 100

2 200

2 300

3 100

because it is still possible to choose 3 combinations which are distinct in both fields: (1,300), (2,200),and (1,300).

But of course if I pick (1,100) and (2,200) there is no landline value I can pick to go wioth datakey 3, because its only lanline has already been counted with datakey 1, so to get the right answer I have to maximise the number of combinations by not choosing (1,100) in this case. And also by not choosing (2,100) in this case.

If that is the requirement (with a simlar requirement for each of the other fields) then none of the code suggested so far comes anywhere near doing what you require, because what you have is a constrained maximisation problem which probably needs trial and backtrack to resolve it, and the code so far makes no attempt at all at doing such maximisation.

If the answer 2 (instead of 3) would be acceptable in the second case, things may be simpler - but you need to find a way of expressing clearly what the requirement is before we can be sure that it is simpler. For example if your count were defined as the smallest number such that a set with that number of combinations satisfying the distinctness rule can be found from your table is not a proper subset of any set obeying the distinctness rule than can be taken from the table you have an equally nasty constrained minimisation (then, of course, 2 would be the only acceptable answer in the second case). If both 2 and 3 are acceptable, maybe the answer can be any number which is the size of some set of comginations which satisfies the uniqueness rule and is not a proper subset of another such set; in that case you genuinely can have simpler code, but you may find that if you run the query twice on the same data you can get two different answers if for example an index has been defragmented between the two runs, or if an extra CPU has been added to the server.

Tom

bicky1980 SSCrazy Points: 2079 More actions · Answer 4

L' Eomot Inversé (1/13/2012)
That's quite clear as far as it goes, but it leaves some questions open. You seem to think that it's something that can be written as a single SQL statement, but as I read it you have secified a constrained maximisation problem which may not be possible to express efficiently in such a way.
For example, if I have
datakey landline
1 100
1 200
1 300
2 200
2 300
2 100
3 300
3 100
3 200
in the data you presumably want the uniquelandline count to be 3 because it is possible to find 3 (datakey, landline) combinations no two of which have the same langline or the same datakey. And presumably you also want the count to be 3 is what I have is
1 100
1 200
1 300
2 100
2 200
2 300
3 100
because it is still possible to choose 3 combinations which are distinct in both fields: (1,300), (2,200),and (1,300).
But of course if I pick (1,100) and (2,200) there is no landline value I can pick to go wioth datakey 3, because its only lanline has already been counted with datakey 1, so to get the right answer I have to maximise the number of combinations by not choosing (1,100) in this case. And also by not choosing (2,100) in this case.
If that is the requirement (with a simlar requirement for each of the other fields) then none of the code suggested so far comes anywhere near doing what you require, because what you have is a constrained maximisation problem which probably needs trial and backtrack to resolve it, and the code so far makes no attempt at all at doing such maximisation.
If the answer 2 (instead of 3) would be acceptable in the second case, things may be simpler - but you need to find a way of expressing clearly what the requirement is before we can be sure that it is simpler. For example if your count were defined as the smallest number such that a set with that number of combinations satisfying the distinctness rule can be found from your table is not a proper subset of any set obeying the distinctness rule than can be taken from the table you have an equally nasty constrained minimisation (then, of course, 2 would be the only acceptable answer in the second case). If both 2 and 3 are acceptable, maybe the answer can be any number which is the size of some set of comginations which satisfies the uniqueness rule and is not a proper subset of another such set; in that case you genuinely can have simpler code, but you may find that if you run the query twice on the same data you can get two different answers if for example an index has been defragmented between the two runs, or if an extra CPU has been added to the server.

Thanks L' Eomot Inversé.

From what you have described , my ideal answer would have been 3. (This may be over complicating the query though)

I do not want a query where if I run it again on the same dataset it could potentially return different results. So what I need to do is define a rule to stop this from happening. What if there was a rule which ordered the results by each field (oder by landline asc)? So looking at your dataset, I would end up with (1,100), then (2,200) as 200 is the next smallest landline available? Then (3,300) - resulting in a count of 3.Would this be possible and would this make the query any easier?

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 5

bicky1980 (1/13/2012)
Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)

We need to know more about the data distribution. What do you want to do if there are two rows (and only two out of 40M) with the same datakey, and each row has a unique (unique in 40M rows) landline value? According to this rule, one or both would have to be excluded.

I've added a single row to your data set to explain:

DROP table test2

create table test2(indkey nvarchar(2),datakey nvarchar(4),landline nvarchar(11),mobile nvarchar(11),email nvarchar(20))

insert into test2 values

('1','0001','01234567890','0712345679','1@test.co.uk'),

('2','0001','01234567890','','1@test.co.uk'),

('3','0002','01234567890','','2@test.co.uk'),

('4','0002','01234567891','','2@test.co.uk'),

('5','0002','','07123456789',''),

('6','0003','01234567892','07123456791',''), -- unique landline, same datakey

('7','0003','01234567897','07123456791',''), -- unique landline, same datakey

('8','0004','01234567893','07123456792',''),

('9','0005','01234567894','07123456793','2@test.co.uk'),

('10','0008','01234567895','07123456793','9@test.co.uk')

SELECT

indkey,

datakey,

drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline),

landline,

lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

FROM test2

WHERE landline <> ''

ORDER BY datakey, landline

The description of the problem is much better - thanks - but there are gaps as Tom has pointed out and as this test shows. Also, a much larger sample data set would go a long way.

Cheers

ChrisM

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

GSquared SSC Guru Points: 260824 More actions · Answer 6

The order of the rows in a dataset doesn't really matter to set-based SQL operations on it.

I just need a clarification on this statement, "Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)"

If a landline exists multiple times in the data, and has multiple datakey values, don't count it all, or count it only one time?

E.g.:

datakey landline pairs:

1 100

2 100

Would that count as 1 or 0? Clarification of "If the same landline is anywhere else in the results its needs to be excluded from the figure". Does that mean exclude it entirely, or exclude the duplicates?

If that should count as 1, then this seems to work:

IF OBJECT_ID(N'tempdb..#test2') IS NOT NULL

DROP TABLE #test2;

CREATE TABLE #test2

(indkey NVARCHAR(2),

datakey NVARCHAR(4),

landline NVARCHAR(11),

mobile NVARCHAR(11),

email NVARCHAR(20))

INSERT INTO #test2

VALUES ('1', '0001', '01234567890', '0712345679', '1@test.co.uk'),

('2', '0001', '01234567890', '', '1@test.co.uk'),

('3', '0002', '01234567890', '', '2@test.co.uk'),

('4', '0002', '01234567891', '', '2@test.co.uk'),

('5', '0002', '', '07123456789', ''),

('6', '0003', '01234567892', '07123456791', ''),

('7', '0004', '01234567893', '07123456792', ''),

('8', '0005', '01234567894', '07123456793', '2@test.co.uk'),

('9', '0008', '01234567895', '07123456793', '9@test.co.uk') ;

SELECT 'Total' AS CountType, COUNT(*) AS Qty

FROM #test2

UNION ALL

SELECT 'Datakeys', COUNT(DISTINCT datakey)

FROM #test2

UNION ALL

SELECT 'Landlines', COUNT(DISTINCT landline)

FROM #test2

WHERE landline > ''

AND datakey > ''

UNION ALL

SELECT 'Mobiles', COUNT(DISTINCT mobile)

FROM #test2

WHERE mobile > ''

AND datakey > ''

UNION ALL

SELECT 'Emails', COUNT(DISTINCT email)

FROM #test2

WHERE email > ''

AND datakey > '';

If 0, then change the Select to this:

SELECT 'Total' AS CountType, COUNT(*) AS Qty

FROM #test2

UNION ALL

SELECT 'Datakeys', COUNT(DISTINCT datakey)

FROM #test2

UNION ALL

SELECT 'Landlines',

COUNT(DISTINCT landline)

FROM (SELECT landline

FROM #test2

WHERE landline > ''

AND datakey > ''

GROUP BY landline

HAVING COUNT(*) = 1) AS Sub

UNION ALL

SELECT 'Mobiles',

COUNT(DISTINCT mobile)

FROM (SELECT mobile

FROM #test2

WHERE mobile > ''

AND datakey > ''

GROUP BY mobile

HAVING COUNT(*) = 1) AS Sub

UNION ALL

SELECT 'Emails',

COUNT(DISTINCT email)

FROM (SELECT email

FROM #test2

WHERE email > ''

AND datakey > ''

GROUP BY email

HAVING COUNT(*) = 1) AS Sub;

(All scripts tested in SQL 2008 R2 Dev Edition.)

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

WayneS SSC Guru Points: 95461 More actions · Answer 7

stephen99999 (1/12/2012)
My question is, how did you figure out to do it this way? I have not used row_number(), over, partition by very much and really don't understand it. Can anyone elaborate? Haven't found a good blog that explains this in "idiot terms."

Ahem... http://www.sqlservercentral.com/articles/T-SQL/69717/[/url]

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

bicky1980 SSCrazy Points: 2079 More actions · Answer 8

ChrisM@home (1/13/2012)
bicky1980 (1/13/2012)
Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)
We need to know more about the data distribution. What do you want to do if there are two rows (and only two out of 40M) with the same datakey, and each row has a unique (unique in 40M rows) landline value? According to this rule, one or both would have to be excluded.
I've added a single row to your data set to explain:
DROP table test2
create table test2(indkey nvarchar(2),datakey nvarchar(4),landline nvarchar(11),mobile nvarchar(11),email nvarchar(20))
insert into test2 values
('1','0001','01234567890','0712345679','1@test.co.uk'),
('2','0001','01234567890','','1@test.co.uk'),
('3','0002','01234567890','','2@test.co.uk'),
('4','0002','01234567891','','2@test.co.uk'),
('5','0002','','07123456789',''),
('6','0003','01234567892','07123456791',''), -- unique landline, same datakey
('7','0003','01234567897','07123456791',''), -- unique landline, same datakey
('8','0004','01234567893','07123456792',''),
('9','0005','01234567894','07123456793','2@test.co.uk'),
('10','0008','01234567895','07123456793','9@test.co.uk')
SELECT
indkey,
datakey,
drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline),
landline,
lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)
FROM test2
WHERE landline <> ''
ORDER BY datakey, landline
The description of the problem is much better - thanks - but there are gaps as Tom has pointed out and as this test shows. Also, a much larger sample data set would go a long way.
Cheers
ChrisM

Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys

Thanks

bicky1980 SSCrazy Points: 2079 More actions · Answer 9

GSquared (1/13/2012)
The order of the rows in a dataset doesn't really matter to set-based SQL operations on it.
I just need a clarification on this statement, "Total number of populated Landline fields (Needs to have a unique datakey as well as be unique itself - If the same landline is anywhere else in the results its needs to be excluded from the figure)"
If a landline exists multiple times in the data, and has multiple datakey values, don't count it all, or count it only one time?
E.g.:
datakey landline pairs:
1 100
2 100
Would that count as 1 or 0? Clarification of "If the same landline is anywhere else in the results its needs to be excluded from the figure". Does that mean exclude it entirely, or exclude the duplicates?
If that should count as 1, then this seems to work:
IF OBJECT_ID(N'tempdb..#test2') IS NOT NULL
DROP TABLE #test2;
CREATE TABLE #test2
(indkey NVARCHAR(2),
datakey NVARCHAR(4),
landline NVARCHAR(11),
mobile NVARCHAR(11),
email NVARCHAR(20))
INSERT INTO #test2
VALUES ('1', '0001', '01234567890', '0712345679', '1@test.co.uk'),
('2', '0001', '01234567890', '', '1@test.co.uk'),
('3', '0002', '01234567890', '', '2@test.co.uk'),
('4', '0002', '01234567891', '', '2@test.co.uk'),
('5', '0002', '', '07123456789', ''),
('6', '0003', '01234567892', '07123456791', ''),
('7', '0004', '01234567893', '07123456792', ''),
('8', '0005', '01234567894', '07123456793', '2@test.co.uk'),
('9', '0008', '01234567895', '07123456793', '9@test.co.uk') ;
SELECT 'Total' AS CountType, COUNT(*) AS Qty
FROM #test2
UNION ALL
SELECT 'Datakeys', COUNT(DISTINCT datakey)
FROM #test2
UNION ALL
SELECT 'Landlines', COUNT(DISTINCT landline)
FROM #test2
WHERE landline > ''
AND datakey > ''
UNION ALL
SELECT 'Mobiles', COUNT(DISTINCT mobile)
FROM #test2
WHERE mobile > ''
AND datakey > ''
UNION ALL
SELECT 'Emails', COUNT(DISTINCT email)
FROM #test2
WHERE email > ''
AND datakey > '';
If 0, then change the Select to this:
SELECT 'Total' AS CountType, COUNT(*) AS Qty
FROM #test2
UNION ALL
SELECT 'Datakeys', COUNT(DISTINCT datakey)
FROM #test2
UNION ALL
SELECT 'Landlines',
COUNT(DISTINCT landline)
FROM (SELECT landline
FROM #test2
WHERE landline > ''
AND datakey > ''
GROUP BY landline
HAVING COUNT(*) = 1) AS Sub
UNION ALL
SELECT 'Mobiles',
COUNT(DISTINCT mobile)
FROM (SELECT mobile
FROM #test2
WHERE mobile > ''
AND datakey > ''
GROUP BY mobile
HAVING COUNT(*) = 1) AS Sub
UNION ALL
SELECT 'Emails',
COUNT(DISTINCT email)
FROM (SELECT email
FROM #test2
WHERE email > ''
AND datakey > ''
GROUP BY email
HAVING COUNT(*) = 1) AS Sub;
(All scripts tested in SQL 2008 R2 Dev Edition.)

I would only require this to be counted as 1 record only...From your solution for this the results would be:

Total9

Datakeys6

Landlines5

Mobiles4

Emails1

The results should be:

Total9

Datakeys6

Landlines6

Mobiles5

Emails3

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 10

bicky1980 (1/13/2012)
...
Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys
Thanks

Try running this query with a larger dataset, and have a look at rows where drn=1 and lrn=1;

SELECT

indkey,

datakey,

drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline DESC),

landline,

lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

FROM test2

WHERE landline <> ''

ORDER BY datakey, landline DESC

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

bicky1980 SSCrazy Points: 2079 More actions · Answer 11

ChrisM@home (1/13/2012)
bicky1980 (1/13/2012)
...
Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys
Thanks
Try running this query with a larger dataset, and have a look at rows where drn=1 and lrn=1;
SELECT
indkey,
datakey,
drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline DESC),
landline,
lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)
FROM test2
WHERE landline <> ''
ORDER BY datakey, landline DESC

Can you tell me what this query should be doing?

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 12

bicky1980 (1/16/2012)
ChrisM@home (1/13/2012)
bicky1980 (1/13/2012)
...
Thanks for the reply, if there were 2 duplicate datakeys with different landline values, I would only want this to be counted as one record, as I dont want duplicate datakeys
Thanks
Try running this query with a larger dataset, and have a look at rows where drn=1 and lrn=1;
SELECT
indkey,
datakey,
drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY landline DESC),
landline,
lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)
FROM test2
WHERE landline <> ''
ORDER BY datakey, landline DESC
Can you tell me what this query should be doing?

Sure - here's an excellent article by Wayne Sheffield explaining the window functions[/url].

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

bicky1980 SSCrazy Points: 2079 More actions · Answer 13

Hi Everyone, Thanks for all your help with this query.

With ChrisM's help, I think I now have the solution:

CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))

insert into test2

values ('1','0001','01234567890','0712345679','1@test.co.uk')

insert into test2

values ('2','0001','01234567890','','1@test.co.uk')

insert into test2

values ('3','0002','01234567890','','2@test.co.uk')

insert into test2

values ('4','0002','01234567891','','2@test.co.uk')

insert into test2

values ('5','0002','','07123456789','')

insert into test2

values ('6','0003','01234567892','07123456791','')

insert into test2

values ('7','0004','01234567893','07123456792','')

insert into test2

values ('8','0005','01234567894','07123456793','2@test.co.uk')

insert into test2

values ('9','0008','01234567895','07123456793','9@test.co.uk')

insert into test2

values ('10','0008','01234567895','','5@test.co.uk')

insert into test2

values ('11','0008','01234567896','','6@test.co.uk')

insert into test2

values ('12','0009','01234567896','','6@test.co.uk')

insert into test2

values ('13','0009','01234567889','','6@test.co.uk')

select 'Total' AS CountType,count(*) as AllRecs

from test2

UNION ALL

select 'Datakeys',sum(drn)

from (SELECT *,

drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY datakey)

FROM test2) a

where drn=1

and drn!=''

UNION ALL

select 'Landlines',sum(lrn)

from (SELECT *,

lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)

FROM test2) a

where lrn=1

and landline!=''

UNION ALL

select 'Mobiles',sum(mrn)

from (SELECT *,

mrn = ROW_NUMBER() OVER(PARTITION BY mobile ORDER BY datakey)

FROM test2) a

where mrn=1

and mobile!=''

UNION ALL

select 'Emails',sum(ern)

from (SELECT *,

ern = ROW_NUMBER() OVER(PARTITION BY email ORDER BY datakey)

FROM test2) a

where ern=1

and email!=''

It does appear to collate the correct results - I will test more to ensure the figure is correct. In the meantime could someone assist me with yet another part of this query. A the moment the results are returned as rows, how would I go about returning the results in columns so rather than:

Total13

Datakeys7

Landlines8

Mobiles5

Emails5

I would have

Total Datakeys Landlines Mobiles Emails

13 7 8 5 5

Thanks

Mark Cowne One Orange Chip Points: 26969 More actions · Answer 14

bicky1980 (1/16/2012)
Hi Everyone, Thanks for all your help with this query.
With ChrisM's help, I think I now have the solution:
CREATE TABLE #test2 (indkey NVARCHAR(2), datakey NVARCHAR(4), landline NVARCHAR(11), mobile NVARCHAR(11), email NVARCHAR(20))
insert into test2
values ('1','0001','01234567890','0712345679','1@test.co.uk')
insert into test2
values ('2','0001','01234567890','','1@test.co.uk')
insert into test2
values ('3','0002','01234567890','','2@test.co.uk')
insert into test2
values ('4','0002','01234567891','','2@test.co.uk')
insert into test2
values ('5','0002','','07123456789','')
insert into test2
values ('6','0003','01234567892','07123456791','')
insert into test2
values ('7','0004','01234567893','07123456792','')
insert into test2
values ('8','0005','01234567894','07123456793','2@test.co.uk')
insert into test2
values ('9','0008','01234567895','07123456793','9@test.co.uk')
insert into test2
values ('10','0008','01234567895','','5@test.co.uk')
insert into test2
values ('11','0008','01234567896','','6@test.co.uk')
insert into test2
values ('12','0009','01234567896','','6@test.co.uk')
insert into test2
values ('13','0009','01234567889','','6@test.co.uk')
select 'Total' AS CountType,count(*) as AllRecs
from test2
UNION ALL
select 'Datakeys',sum(drn)
from (SELECT *,
drn = ROW_NUMBER() OVER(PARTITION BY datakey ORDER BY datakey)
FROM test2) a
where drn=1
and drn!=''
UNION ALL
select 'Landlines',sum(lrn)
from (SELECT *,
lrn = ROW_NUMBER() OVER(PARTITION BY landline ORDER BY datakey)
FROM test2) a
where lrn=1
and landline!=''
UNION ALL
select 'Mobiles',sum(mrn)
from (SELECT *,
mrn = ROW_NUMBER() OVER(PARTITION BY mobile ORDER BY datakey)
FROM test2) a
where mrn=1
and mobile!=''
UNION ALL
select 'Emails',sum(ern)
from (SELECT *,
ern = ROW_NUMBER() OVER(PARTITION BY email ORDER BY datakey)
FROM test2) a
where ern=1
and email!=''
It does appear to collate the correct results - I will test more to ensure the figure is correct. In the meantime could someone assist me with yet another part of this query. A the moment the results are returned as rows, how would I go about returning the results in columns so rather than:
Total13
Datakeys7
Landlines8
Mobiles5
Emails5
I would have
Total Datakeys Landlines Mobiles Emails
13 7 8 5 5
Thanks

My original query

SELECT COUNT(*) AS Total,

COUNT(DISTINCT datakey) AS Uniquedatakey,

COUNT(DISTINCT CASE WHEN landline<>'' THEN landline END) AS Uniquelandline,

COUNT(DISTINCT CASE WHEN mobile<>'' THEN mobile END) AS Uniquemobile,

COUNT(DISTINCT CASE WHEN email<>'' THEN email END) AS Uniqueemail

FROM #test2

give these results as well. Coincidence?

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

DeanORenO SSC Eights! Points: 865 More actions · Answer 15

Instead of using a where clause on the conditional values you need you may want to use a CASE statement incorporated into the select list of the select statement to provide the conditional you need to count the conditional records