Group and Max combination

Question

Group and Max combination

pdanes

SSCrazy Eights

Points: 8504
More actions
September 1, 2020 at 8:33 am

#3785107
I need a sort of odd combination of grouping and max. I have a query that works, but it's ugly and slow. It seems to me this shouldn't be so difficult, but I haven't been able to come up with a good way to do it.

Data is in three fields, with an index on the three fields:
A 1 a
A 1 b
A 1 c
A 2 a
B 1 ''
B 1 a
B 1 b
B 2 ''
C 1 f
C 2 a
C 2 b

I want to group by the initial letter, then get the max of the COMBINATION of number and second letter, i.e., the bold values. When I use a simple max on each field, I naturally get things like A 2 c, B 2 b and C 1 f, which do not reflect an actual record, but get the max number and max second letter separately. In my use, A 2 a is bigger than A 1 c, B 2 is bigger than B 1 b and C 2 b is bigger than C 1 f, and that is what I want, rather than the non-existent A 2 c, B 2 b and C 2 f.

The index should have this information, but I haven't been able to come up with a clean way to get it. My existing query does a nasty join, which results in a table scan joined to an index scan. Making a max of the two fields somehow concatenated would mean a function in the condition clause, and ugly manipulations to account for the combination of number and letter.

The second letter is optional, but I don't need to deal with Nulls – the second letter field is an empty string rather than a null, where the letter is not given.
- This topic was modified 5 years, 5 months ago by pdanes.

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101235 More actions · Answer 1

Would you please provide directly usable data, i.e. CREATE TABLE and INSERT statement(s) for the sample data. [A splat of data on the screen does us no good to try to write code against.]

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

pdanes SSCrazy Eights Points: 8504 More actions · Answer 2

ScottPletcher wrote:

Would you please provide directly usable data, i.e. CREATE TABLE and INSERT statement(s) for the sample data. [A splat of data on the screen does us no good to try to write code against.]

Sure, I can do that. I thought someone might have a quick and easy answer, that I was overlooking something stupidly obvious again, but if it's actually a legitimate problem, I'll happily provide a sample dataset. How much data would be appropriate? The table in question has around 120,000 rows.

This reply was modified 5 years, 5 months ago by pdanes.

pdanes SSCrazy Eights Points: 8504 More actions · Answer 3

Okay, I made up a few sample rows. If you need more, let me know. The code is what I'm using now, and you can see both what it produces, and the bad execution plan.

use tempdb

if exists (select 1 from INFORMATION_SCHEMA.TABLEs where table_name = 'Evid') drop table evid

create table Evid(EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2))
CREATE unique NONCLUSTERED INDEX [ix_AkcDruhaEvidence] ON Evid ([EvidenceLetter] ASC, [EvidenceNumber] ASC, [EvidenceExtra] ASC)

insert into Evid (EvidenceLetter , EvidenceNumber , EvidenceExtra)

values
('A',1,'a'),
('A',2,'a'),
('A',3,''),
('A',4,'a'),
('A',4,'b'),
('A',4,'c'),
('A',4,'d'),
('A',5,''),
('A',6,'a'),
('A',6,'b'),
('A',6,'f'),
('B',1,'c'),
('B',1,'d'),
('B',2,'c'),
('B',2,'h'),
('B',3,'a'),
('B',5,'a'),
('C',1,'q'),
('C',2,'r'),
('C',2,'s'),
('C',2,'t'),
('C',3,'')




select * from evid

declare @tmp table (EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2), OK bit)
insert into @tmp (EvidenceLetter, EvidenceNumber, EvidenceExtra, OK)
SELECT jnk.EvidenceLetter, jnk.MaxOfEvidenceNumber, jnk.MaxOfEvidenceExtra, 'False'
FROM 
(SELECT     MaxCislaPred.EvidenceLetter, MaxCislaPred.MaxOfEvidenceNumber, MAX(P1.EvidenceExtra) MaxOfEvidenceExtra
FROM         (SELECT   EvidenceLetter, MAX(EvidenceNumber) AS MaxOfEvidenceNumber
                       FROM          Evid
                       Where evidenceletter <> '1e'
                       GROUP BY EvidenceLetter
                       ) MaxCislaPred INNER JOIN
                      Evid P1 ON MaxCislaPred.MaxOfEvidenceNumber = P1.EvidenceNumber AND 
                      MaxCislaPred.EvidenceLetter = P1.EvidenceLetter
GROUP BY MaxCislaPred.EvidenceLetter, MaxCislaPred.MaxOfEvidenceNumber) jnk
inner join Evid P2
on jnk.EvidenceLetter = P2.EvidenceLetter
AND jnk.EvidenceLetter = P2.EvidenceLetter
and jnk.MaxOfEvidenceNumber = P2.EvidenceNumber
and jnk.MaxOfEvidenceExtra = P2.EvidenceExtra
group by jnk.EvidenceLetter, jnk.MaxOfEvidenceNumber, jnk.MaxOfEvidenceExtra

update @tmp set OK = 1 From Evid inner join @tmp jnk ON Evid.EvidenceLetter = jnk.EvidenceLetter AND Evid.EvidenceNumber = jnk.EvidenceNumber-1 where jnk.EvidenceExtra = ''

update @tmp set OK = 1 From Evid inner join @tmp jnk ON Evid.EvidenceLetter = jnk.EvidenceLetter AND Evid.EvidenceNumber = jnk.EvidenceNumber and Ascii(Evid.EvidenceExtra) = Ascii(jnk.EvidenceExtra)-1  where jnk.EvidenceExtra <> ''

select * from @tmp order by EvidenceLetter

This reply was modified 5 years, 5 months ago by pdanes.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

SELECT a, MAX(CAST(b AS varchar(2)) + c) 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
GROUP BY a

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

pdanes SSCrazy Eights Points: 8504 More actions · Answer 5

ChrisM@Work wrote:

SELECT a, MAX(CAST(b AS varchar(2)) + c) 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
GROUP BY a

Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).

Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.

This reply was modified 5 years, 5 months ago by pdanes.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 6

pdanes wrote:

ChrisM@Work wrote:
SELECT a, MAX(CAST(b AS varchar(2)) + c) 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
GROUP BY a
Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).
Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.

Two alternatives, the second of these might eliminate a scan if you have an index on a, b, c


;WITH SampleData AS (
SELECT * 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
)
SELECT a, MAX(RIGHT(SPACE(3)+CAST(b AS varchar(3)),3) + c)  
FROM SampleData s
GROUP BY a

;WITH SampleData AS (
SELECT * 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
)
SELECT *
FROM (
SELECT a, maxb = MAX(b)  
FROM SampleData s
GROUP BY a
) d
CROSS APPLY (
SELECT maxc = MAX(c)
FROM SampleData s
WHERE s.a = d.a AND s.b = d.maxb
) x

This reply was modified 5 years, 5 months ago by ChrisM@Work.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ScottPletcher SSC Guru Points: 101235 More actions · Answer 7

The max list is easy enough to produce. Sorry, I don't fully understand the 'OK' part well enough yet to add that to the query. 'OK' just meaning that you got the correct max row, or something else?!

select EvidenceLetter , EvidenceNumber , EvidenceExtra
from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Evid
) as evid_max
where row_num = 1

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

pdanes SSCrazy Eights Points: 8504 More actions · Answer 8

ScottPletcher wrote:

The max list is easy enough to produce. Sorry, I don't fully understand the 'OK' part well enough yet to add that to the query. 'OK' just meaning that you got the correct max row, or something else?!
select EvidenceLetter , EvidenceNumber , EvidenceExtra
from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Evid
) as evid_max
where row_num = 1

Yes, thank you, that partition function looks like what I was after. I've used it a few times, copying stuff I found on the net, but I still don't understand it very well. I'm going to have to do some reading and experimenting to get a good handle on it. I'm at home now, but I will test this tomorrow morning.

The OK stuff is not relevant to the question in my post. It just tells me if that last number is the last in a series, or a singleton, like 21, 22, 23, 40, instead of 21, 22, 23, 24. Such a singleton usually indicates a data entry error, and my app flags such a case with a little orange asterisk. I left it in the copied code, because I wanted to post a working query, and didn't want to take the time to extract it and test the changed version.

pdanes SSCrazy Eights Points: 8504 More actions · Answer 9

ChrisM@Work wrote:

pdanes wrote:
ChrisM@Work wrote:
SELECT a, MAX(CAST(b AS varchar(2)) + c) 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
GROUP BY a
Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).
Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.
Two alternatives, the second of these might eliminate a scan if you have an index on a, b, c
;WITH SampleData AS (
SELECT * 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
)
SELECT a, MAX(RIGHT(SPACE(3)+CAST(b AS varchar(3)),3) + c)  
FROM SampleData s
GROUP BY a

;WITH SampleData AS (
SELECT * 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
)
SELECT *
FROM (
SELECT a, maxb = MAX(b)  
FROM SampleData s
GROUP BY a
) d
CROSS APPLY (
SELECT maxc = MAX(c)
FROM SampleData s
WHERE s.a = d.a AND s.b = d.maxb
) x

Thank you. I do have such an index, and will try this out tomorrow morning. I'm at home now.

pdanes SSCrazy Eights Points: 8504 More actions · Answer 10

pdanes wrote:

ChrisM@Work wrote:
pdanes wrote:
ChrisM@Work wrote:
SELECT a, MAX(CAST(b AS varchar(2)) + c) 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
GROUP BY a
Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).
Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.
Two alternatives, the second of these might eliminate a scan if you have an index on a, b, c
;WITH SampleData AS (
SELECT * 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
)
SELECT a, MAX(RIGHT(SPACE(3)+CAST(b AS varchar(3)),3) + c)  
FROM SampleData s
GROUP BY a

;WITH SampleData AS (
SELECT * 
FROM (VALUES 
('A', 1, 'a'),
('A', 1, 'b'),
('A', 1, 'c'),
('A', 2, 'a'),
('B', 1, ''),
('B', 1, 'a'),
('B', 1, 'b'),
('B', 2, ''),
('C', 1, 'f'),
('C', 2, 'a'),
('C', 2, 'b')
) d (a, b, c)
)
SELECT *
FROM (
SELECT a, maxb = MAX(b)  
FROM SampleData s
GROUP BY a
) d
CROSS APPLY (
SELECT maxc = MAX(c)
FROM SampleData s
WHERE s.a = d.a AND s.b = d.maxb
) x
Thank you. I do have such an index, and will try this out tomorrow morning. I'm at home now.

Nope - both work, but both still result in a table scan. The second version at least goes parallel, but it does two table scans and a hash join. The first does a single scan, but stays on one processor. Neither consults the index. I suppose that's understandable, in the first case, because of the function call, in the second one, because of the Where clause.

pdanes SSCrazy Eights Points: 8504 More actions · Answer 11

pdanes wrote:

ScottPletcher wrote:
The max list is easy enough to produce. Sorry, I don't fully understand the 'OK' part well enough yet to add that to the query. 'OK' just meaning that you got the correct max row, or something else?!
select EvidenceLetter , EvidenceNumber , EvidenceExtra
from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Evid
) as evid_max
where row_num = 1
Yes, thank you, that partition function looks like what I was after. I've used it a few times, copying stuff I found on the net, but I still don't understand it very well. I'm going to have to do some reading and experimenting to get a good handle on it. I'm at home now, but I will test this tomorrow morning.
The OK stuff is not relevant to the question in my post. It just tells me if that last number is the last in a series, or a singleton, like 21, 22, 23, 40, instead of 21, 22, 23, 24. Such a singleton usually indicates a data entry error, and my app flags such a case with a little orange asterisk. I left it in the copied code, because I wanted to post a working query, and didn't want to take the time to extract it and test the changed version.

It works, but it still does a table scan. Only a single scan, and it goes parallel, but it pays no mind to the index, which I don't understand. It seems to me the index should have all the info needed for this query, and I can't see why the optimizer won't use it. I changed the asterisk to only the three fields I want, all of which are in the index, but it still insists on a full table scan.

This reply was modified 5 years, 5 months ago by pdanes.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

I think the optimiser doesn't use the index because there's no benefit in doing so. If you scale up the row count to something sensible and add a column to the table which isn't referenced by the query and isn't included in the index, then the index is chosen.

You're always going to get a table scan for the first column. SQL Server won't use the index to skip from the first row of a partition to the first row of the next partition, which is I think what you're expecting to see, although there is a clever way of forcing SQL Server to do this (bookmark Paul White superfast distinct).

Here's a scaleup version, it's crude and I know Scott will berate me for the clustered index 🙂

IF OBJECT_ID('TempDB..#Evid') IS NOT NULL DROP TABLE #Evid;
create table #Evid(EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2), OtherStuff uniqueidentifier)

;WITH One AS (
SELECT * FROM (values
('A',1,'a', NEWID()),
('A',2,'a', NEWID()),
('A',3,'', NEWID()),
('A',4,'a', NEWID()),
('A',4,'b', NEWID()),
('A',4,'c', NEWID()),
('A',4,'d', NEWID()),
('A',5,'', NEWID()),
('A',6,'a', NEWID()),
('A',6,'b', NEWID()),
('A',6,'f', NEWID()),
('B',1,'c', NEWID()),
('B',1,'d', NEWID()),
('B',2,'c', NEWID()),
('B',2,'h', NEWID()),
('B',3,'a', NEWID()),
('B',5,'a', NEWID()),
('C',1,'q', NEWID()),
('C',2,'r', NEWID()),
('C',2,'s', NEWID()),
('C',2,'t', NEWID()),
('C',3,'', NEWID())
) d (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
)

insert into #Evid (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
SELECT a.EvidenceLetter, a.EvidenceNumber, a.EvidenceExtra, a.OtherStuff 
FROM One a, One b, One c, One d

CREATE UNIQUE CLUSTERED INDEX ucx_Stuff ON #Evid (OtherStuff)
CREATE NONCLUSTERED INDEX [ix_AkcDruhaEvidence] ON #Evid ([EvidenceLetter] ASC, [EvidenceNumber] ASC, [EvidenceExtra] ASC)

------------------------------------------------------------------------------------

SELECT EvidenceLetter, MAX(RIGHT(SPACE(3)+CAST(EvidenceNumber AS varchar(3)),3) + EvidenceExtra)  
FROM #evid s
GROUP BY EvidenceLetter


SELECT *
FROM (
SELECT EvidenceLetter, maxb = MAX(EvidenceNumber)  
FROM #evid s
GROUP BY EvidenceLetter
) d
CROSS APPLY (
SELECT maxc = MAX(EvidenceExtra)
FROM #evid s
WHERE s.EvidenceLetter = d.EvidenceLetter 
AND s.EvidenceNumber = d.maxb
) x


select EvidenceLetter , EvidenceNumber , EvidenceExtra
from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from #Evid
) as evid_max
where row_num = 1



-- No seek:
SELECT DISTINCT EvidenceLetter FROM #Evid

-- No seek:
SELECT EvidenceLetter FROM #Evid GROUP BY EvidenceLetter

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

pdanes SSCrazy Eights Points: 8504 More actions · Answer 13

ChrisM@Work wrote:

I think the optimiser doesn't use the index because there's no benefit in doing so. If you scale up the row count to something sensible and add a column to the table which isn't referenced by the query and isn't included in the index, then the index is chosen.

You're always going to get a table scan for the first column. SQL Server won't use the index to skip from the first row of a partition to the first row of the next partition, which is I think what you're expecting to see, although there is a clever way of forcing SQL Server to do this (bookmark Paul White superfast distinct).

Here's a scaleup version, it's crude and I know Scott will berate me for the clustered index 🙂

IF OBJECT_ID('TempDB..#Evid') IS NOT NULL DROP TABLE #Evid;
create table #Evid(EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2), OtherStuff uniqueidentifier)

;WITH One AS (
SELECT * FROM (values
('A',1,'a', NEWID()),
('A',2,'a', NEWID()),
('A',3,'', NEWID()),
('A',4,'a', NEWID()),
('A',4,'b', NEWID()),
('A',4,'c', NEWID()),
('A',4,'d', NEWID()),
('A',5,'', NEWID()),
('A',6,'a', NEWID()),
('A',6,'b', NEWID()),
('A',6,'f', NEWID()),
('B',1,'c', NEWID()),
('B',1,'d', NEWID()),
('B',2,'c', NEWID()),
('B',2,'h', NEWID()),
('B',3,'a', NEWID()),
('B',5,'a', NEWID()),
('C',1,'q', NEWID()),
('C',2,'r', NEWID()),
('C',2,'s', NEWID()),
('C',2,'t', NEWID()),
('C',3,'', NEWID())
) d (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
)

insert into #Evid (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
SELECT a.EvidenceLetter, a.EvidenceNumber, a.EvidenceExtra, a.OtherStuff 
FROM One a, One b, One c, One d

CREATE UNIQUE CLUSTERED INDEX ucx_Stuff ON #Evid (OtherStuff)
CREATE NONCLUSTERED INDEX [ix_AkcDruhaEvidence] ON #Evid ([EvidenceLetter] ASC, [EvidenceNumber] ASC, [EvidenceExtra] ASC)

------------------------------------------------------------------------------------

SELECT EvidenceLetter, MAX(RIGHT(SPACE(3)+CAST(EvidenceNumber AS varchar(3)),3) + EvidenceExtra)  
FROM #evid s
GROUP BY EvidenceLetter


SELECT *
FROM (
SELECT EvidenceLetter, maxb = MAX(EvidenceNumber)  
FROM #evid s
GROUP BY EvidenceLetter
) d
CROSS APPLY (
SELECT maxc = MAX(EvidenceExtra)
FROM #evid s
WHERE s.EvidenceLetter = d.EvidenceLetter 
AND s.EvidenceNumber = d.maxb
) x


select EvidenceLetter , EvidenceNumber , EvidenceExtra
from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from #Evid
) as evid_max
where row_num = 1



-- No seek:
SELECT DISTINCT EvidenceLetter FROM #Evid

-- No seek:
SELECT EvidenceLetter FROM #Evid GROUP BY EvidenceLetter

But I'm not looking at any other columns. All I'm requesting is the three fields that are all in the index.

This is the query I'm now using:

select EvidenceLetter , EvidenceNumber , EvidenceExtra
from (
    select EvidenceLetter , EvidenceNumber , EvidenceExtra, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Podrobnosti
) as evid_max
where row_num = 1

and here is the index definition:

CREATE NONCLUSTERED INDEX [ix_JenDruhaEvidence] ON [dbo].[Podrobnosti]
(
[EvidenceLetter] ASC,
[EvidenceNumber] ASC,
[EvidenceExtra] ASC
)
INCLUDE ( [AkcesAutoID],
[CountryAutoID],
[TypAutoID],
[PocetKusu],
[Poznamka],
[Description],
[UlozisteDocasne],
[Original],
[PublikovatYN],
[StratigrafieChronoID],
[StratigrafieLitoID],
[TaxonAutoID]) 
WHERE ([EvidenceLetter]<>'1e')
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 14

That index is seriously wide, and filtered. It will only be used by queries which have the same filter.

What proportion of the table columns does it include?

What's the clustered index definition?

This reply was modified 5 years, 5 months ago by ChrisM@Work.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden