Group and Max combination

  • I need a sort of odd combination of grouping and max. I have a query that works, but it's ugly and slow. It seems to me this shouldn't be so difficult, but I haven't been able to come up with a good way to do it.

     

    Data is in three fields, with an index on the three fields:

    A 1 a

    A 1 b

    A 1 c

    A 2 a

    B 1 ''

    B 1 a

    B 1 b

    B 2 ''

    C 1 f

    C 2 a

    C 2 b

     

    I want to group by the initial letter, then get the max of the COMBINATION of number and second letter, i.e., the bold values. When I use a simple max on each field, I naturally get things like A 2 c, B 2 b and C 1 f, which do not reflect an actual record, but get the max number and max second letter separately. In my use, A 2 a is bigger than A 1 c, B 2 is bigger than B 1 b and C 2 b is bigger than C 1 f, and that is what I want, rather than the non-existent A 2 c, B 2 b and C 2 f.

     

    The index should have this information, but I haven't been able to come up with a clean way to get it. My existing query does a nasty join, which results in a table scan joined to an index scan. Making a max of the two fields somehow concatenated would mean a function in the condition clause, and ugly manipulations to account for the combination of number and letter.

     

    The second letter is optional, but I don't need to deal with Nulls – the second letter field is an empty string rather than a null, where the letter is not given.

    • This topic was modified 4 years, 3 months ago by  pdanes.
  • Would you please provide directly usable data, i.e. CREATE TABLE and INSERT statement(s) for the sample data.  [A splat of data on the screen does us no good to try to write code against.]

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    Would you please provide directly usable data, i.e. CREATE TABLE and INSERT statement(s) for the sample data.  [A splat of data on the screen does us no good to try to write code against.]

    Sure, I can do that. I thought someone might have a quick and easy answer, that I was overlooking something stupidly obvious again, but if it's actually a legitimate problem, I'll happily provide a sample dataset. How much data would be appropriate? The table in question has around 120,000 rows.

    • This reply was modified 4 years, 3 months ago by  pdanes.
  • Okay, I made up a few sample rows. If you need more, let me know. The code is what I'm using now, and you can see both what it produces, and the bad execution plan.

    use tempdb

    if exists (select 1 from INFORMATION_SCHEMA.TABLEs where table_name = 'Evid') drop table evid

    create table Evid(EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2))
    CREATE unique NONCLUSTERED INDEX [ix_AkcDruhaEvidence] ON Evid ([EvidenceLetter] ASC, [EvidenceNumber] ASC, [EvidenceExtra] ASC)

    insert into Evid (EvidenceLetter , EvidenceNumber , EvidenceExtra)

    values
    ('A',1,'a'),
    ('A',2,'a'),
    ('A',3,''),
    ('A',4,'a'),
    ('A',4,'b'),
    ('A',4,'c'),
    ('A',4,'d'),
    ('A',5,''),
    ('A',6,'a'),
    ('A',6,'b'),
    ('A',6,'f'),
    ('B',1,'c'),
    ('B',1,'d'),
    ('B',2,'c'),
    ('B',2,'h'),
    ('B',3,'a'),
    ('B',5,'a'),
    ('C',1,'q'),
    ('C',2,'r'),
    ('C',2,'s'),
    ('C',2,'t'),
    ('C',3,'')




    select * from evid

    declare @tmp table (EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2), OK bit)
    insert into @tmp (EvidenceLetter, EvidenceNumber, EvidenceExtra, OK)
    SELECT jnk.EvidenceLetter, jnk.MaxOfEvidenceNumber, jnk.MaxOfEvidenceExtra, 'False'
    FROM
    (SELECT MaxCislaPred.EvidenceLetter, MaxCislaPred.MaxOfEvidenceNumber, MAX(P1.EvidenceExtra) MaxOfEvidenceExtra
    FROM (SELECT EvidenceLetter, MAX(EvidenceNumber) AS MaxOfEvidenceNumber
    FROM Evid
    Where evidenceletter <> '1e'
    GROUP BY EvidenceLetter
    ) MaxCislaPred INNER JOIN
    Evid P1 ON MaxCislaPred.MaxOfEvidenceNumber = P1.EvidenceNumber AND
    MaxCislaPred.EvidenceLetter = P1.EvidenceLetter
    GROUP BY MaxCislaPred.EvidenceLetter, MaxCislaPred.MaxOfEvidenceNumber) jnk
    inner join Evid P2
    on jnk.EvidenceLetter = P2.EvidenceLetter
    AND jnk.EvidenceLetter = P2.EvidenceLetter
    and jnk.MaxOfEvidenceNumber = P2.EvidenceNumber
    and jnk.MaxOfEvidenceExtra = P2.EvidenceExtra
    group by jnk.EvidenceLetter, jnk.MaxOfEvidenceNumber, jnk.MaxOfEvidenceExtra

    update @tmp set OK = 1 From Evid inner join @tmp jnk ON Evid.EvidenceLetter = jnk.EvidenceLetter AND Evid.EvidenceNumber = jnk.EvidenceNumber-1 where jnk.EvidenceExtra = ''

    update @tmp set OK = 1 From Evid inner join @tmp jnk ON Evid.EvidenceLetter = jnk.EvidenceLetter AND Evid.EvidenceNumber = jnk.EvidenceNumber and Ascii(Evid.EvidenceExtra) = Ascii(jnk.EvidenceExtra)-1 where jnk.EvidenceExtra <> ''

    select * from @tmp order by EvidenceLetter


    • This reply was modified 4 years, 3 months ago by  pdanes.
  • SELECT a, MAX(CAST(b AS varchar(2)) + c) 
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    GROUP BY a
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work wrote:

    SELECT a, MAX(CAST(b AS varchar(2)) + c) 
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    GROUP BY a

    Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).

    Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.

    • This reply was modified 4 years, 3 months ago by  pdanes.
  • pdanes wrote:

    ChrisM@Work wrote:

    SELECT a, MAX(CAST(b AS varchar(2)) + c) 
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    GROUP BY a

    Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).

    Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.

     

    Two alternatives, the second of these might eliminate a scan if you have an index on a, b, c

     


    ;WITH SampleData AS (
    SELECT *
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    )
    SELECT a, MAX(RIGHT(SPACE(3)+CAST(b AS varchar(3)),3) + c)
    FROM SampleData s
    GROUP BY a

    ;WITH SampleData AS (
    SELECT *
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    )
    SELECT *
    FROM (
    SELECT a, maxb = MAX(b)
    FROM SampleData s
    GROUP BY a
    ) d
    CROSS APPLY (
    SELECT maxc = MAX(c)
    FROM SampleData s
    WHERE s.a = d.a AND s.b = d.maxb
    ) x


    • This reply was modified 4 years, 3 months ago by  ChrisM@Work.
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The max list is easy enough to produce.  Sorry, I don't fully understand the 'OK' part well enough yet to add that to the query.  'OK' just meaning that you got the correct max row, or something else?!

    select EvidenceLetter , EvidenceNumber , EvidenceExtra
    from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Evid
    ) as evid_max
    where row_num = 1

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    The max list is easy enough to produce.  Sorry, I don't fully understand the 'OK' part well enough yet to add that to the query.  'OK' just meaning that you got the correct max row, or something else?!

    select EvidenceLetter , EvidenceNumber , EvidenceExtra
    from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Evid
    ) as evid_max
    where row_num = 1

    Yes, thank you, that partition function looks like what I was after. I've used it a few times, copying stuff I found on the net, but I still don't understand it very well. I'm going to have to do some reading and experimenting to get a good handle on it. I'm at home now, but I will test this tomorrow morning.

     

    The OK stuff is not relevant to the question in my post. It just tells me if that last number is the last in a series, or a singleton, like 21, 22, 23, 40, instead of 21, 22, 23, 24. Such a singleton usually indicates a data entry error, and my app flags such a case with a little orange asterisk. I left it in the copied code, because I wanted to post a working query, and didn't want to take the time to extract it and test the changed version.

  • ChrisM@Work wrote:

    pdanes wrote:

    ChrisM@Work wrote:

    SELECT a, MAX(CAST(b AS varchar(2)) + c) 
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    GROUP BY a

    Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).

    Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.

    Two alternatives, the second of these might eliminate a scan if you have an index on a, b, c


    ;WITH SampleData AS (
    SELECT *
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    )
    SELECT a, MAX(RIGHT(SPACE(3)+CAST(b AS varchar(3)),3) + c)
    FROM SampleData s
    GROUP BY a

    ;WITH SampleData AS (
    SELECT *
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    )
    SELECT *
    FROM (
    SELECT a, maxb = MAX(b)
    FROM SampleData s
    GROUP BY a
    ) d
    CROSS APPLY (
    SELECT maxc = MAX(c)
    FROM SampleData s
    WHERE s.a = d.a AND s.b = d.maxb
    ) x


    Thank you. I do have such an index, and will try this out tomorrow morning. I'm at home now.

  • pdanes wrote:

    ChrisM@Work wrote:

    pdanes wrote:

    ChrisM@Work wrote:

    SELECT a, MAX(CAST(b AS varchar(2)) + c) 
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    GROUP BY a

    Thank you, but casting the numeric value as a text string will give incorrect results as soon as the second column exceeds 9 (10 will be ranked before 9, when it is a text string).

    Also, I think that will lead to a table scan, because the field values are submitted to a function, and that function must be evaluated for every row in the table.

    Two alternatives, the second of these might eliminate a scan if you have an index on a, b, c


    ;WITH SampleData AS (
    SELECT *
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    )
    SELECT a, MAX(RIGHT(SPACE(3)+CAST(b AS varchar(3)),3) + c)
    FROM SampleData s
    GROUP BY a

    ;WITH SampleData AS (
    SELECT *
    FROM (VALUES
    ('A', 1, 'a'),
    ('A', 1, 'b'),
    ('A', 1, 'c'),
    ('A', 2, 'a'),
    ('B', 1, ''),
    ('B', 1, 'a'),
    ('B', 1, 'b'),
    ('B', 2, ''),
    ('C', 1, 'f'),
    ('C', 2, 'a'),
    ('C', 2, 'b')
    ) d (a, b, c)
    )
    SELECT *
    FROM (
    SELECT a, maxb = MAX(b)
    FROM SampleData s
    GROUP BY a
    ) d
    CROSS APPLY (
    SELECT maxc = MAX(c)
    FROM SampleData s
    WHERE s.a = d.a AND s.b = d.maxb
    ) x


    Thank you. I do have such an index, and will try this out tomorrow morning. I'm at home now.

    Nope - both work, but both still result in a table scan. The second version at least goes parallel, but it does two table scans and a hash join. The first does a single scan, but stays on one processor. Neither consults the index. I suppose that's understandable, in the first case, because of the function call, in the second one, because of the Where clause.

  • pdanes wrote:

    ScottPletcher wrote:

    The max list is easy enough to produce.  Sorry, I don't fully understand the 'OK' part well enough yet to add that to the query.  'OK' just meaning that you got the correct max row, or something else?!

    select EvidenceLetter , EvidenceNumber , EvidenceExtra
    from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Evid
    ) as evid_max
    where row_num = 1

    Yes, thank you, that partition function looks like what I was after. I've used it a few times, copying stuff I found on the net, but I still don't understand it very well. I'm going to have to do some reading and experimenting to get a good handle on it. I'm at home now, but I will test this tomorrow morning.

    The OK stuff is not relevant to the question in my post. It just tells me if that last number is the last in a series, or a singleton, like 21, 22, 23, 40, instead of 21, 22, 23, 24. Such a singleton usually indicates a data entry error, and my app flags such a case with a little orange asterisk. I left it in the copied code, because I wanted to post a working query, and didn't want to take the time to extract it and test the changed version.

    It works, but it still does a table scan. Only a single scan, and it goes parallel, but it pays no mind to the index, which I don't understand. It seems to me the index should have all the info needed for this query, and I can't see why the optimizer won't use it. I changed the asterisk to only the three fields I want, all of which are in the index, but it still insists on a full table scan.

    • This reply was modified 4 years, 3 months ago by  pdanes.
  • I think the optimiser doesn't use the index because there's no benefit in doing so. If you scale up the row count to something sensible and add a column to the table which isn't referenced by the query and isn't included in the index, then the index is chosen.

    You're always going to get a table scan for the first column. SQL Server won't use the index to skip from the first row of a partition to the first row of the next partition, which is I think what you're expecting to see, although there is a clever way of forcing SQL Server to do this (bookmark Paul White superfast distinct).

    Here's a scaleup version, it's crude and I know Scott will berate me for the clustered index 🙂

    IF OBJECT_ID('TempDB..#Evid') IS NOT NULL DROP TABLE #Evid;
    create table #Evid(EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2), OtherStuff uniqueidentifier)

    ;WITH One AS (
    SELECT * FROM (values
    ('A',1,'a', NEWID()),
    ('A',2,'a', NEWID()),
    ('A',3,'', NEWID()),
    ('A',4,'a', NEWID()),
    ('A',4,'b', NEWID()),
    ('A',4,'c', NEWID()),
    ('A',4,'d', NEWID()),
    ('A',5,'', NEWID()),
    ('A',6,'a', NEWID()),
    ('A',6,'b', NEWID()),
    ('A',6,'f', NEWID()),
    ('B',1,'c', NEWID()),
    ('B',1,'d', NEWID()),
    ('B',2,'c', NEWID()),
    ('B',2,'h', NEWID()),
    ('B',3,'a', NEWID()),
    ('B',5,'a', NEWID()),
    ('C',1,'q', NEWID()),
    ('C',2,'r', NEWID()),
    ('C',2,'s', NEWID()),
    ('C',2,'t', NEWID()),
    ('C',3,'', NEWID())
    ) d (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
    )

    insert into #Evid (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
    SELECT a.EvidenceLetter, a.EvidenceNumber, a.EvidenceExtra, a.OtherStuff
    FROM One a, One b, One c, One d

    CREATE UNIQUE CLUSTERED INDEX ucx_Stuff ON #Evid (OtherStuff)
    CREATE NONCLUSTERED INDEX [ix_AkcDruhaEvidence] ON #Evid ([EvidenceLetter] ASC, [EvidenceNumber] ASC, [EvidenceExtra] ASC)

    ------------------------------------------------------------------------------------

    SELECT EvidenceLetter, MAX(RIGHT(SPACE(3)+CAST(EvidenceNumber AS varchar(3)),3) + EvidenceExtra)
    FROM #evid s
    GROUP BY EvidenceLetter


    SELECT *
    FROM (
    SELECT EvidenceLetter, maxb = MAX(EvidenceNumber)
    FROM #evid s
    GROUP BY EvidenceLetter
    ) d
    CROSS APPLY (
    SELECT maxc = MAX(EvidenceExtra)
    FROM #evid s
    WHERE s.EvidenceLetter = d.EvidenceLetter
    AND s.EvidenceNumber = d.maxb
    ) x


    select EvidenceLetter , EvidenceNumber , EvidenceExtra
    from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from #Evid
    ) as evid_max
    where row_num = 1



    -- No seek:
    SELECT DISTINCT EvidenceLetter FROM #Evid

    -- No seek:
    SELECT EvidenceLetter FROM #Evid GROUP BY EvidenceLetter
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work wrote:

    I think the optimiser doesn't use the index because there's no benefit in doing so. If you scale up the row count to something sensible and add a column to the table which isn't referenced by the query and isn't included in the index, then the index is chosen.

    You're always going to get a table scan for the first column. SQL Server won't use the index to skip from the first row of a partition to the first row of the next partition, which is I think what you're expecting to see, although there is a clever way of forcing SQL Server to do this (bookmark Paul White superfast distinct).

    Here's a scaleup version, it's crude and I know Scott will berate me for the clustered index 🙂

    IF OBJECT_ID('TempDB..#Evid') IS NOT NULL DROP TABLE #Evid;
    create table #Evid(EvidenceLetter varchar(2), EvidenceNumber int, EvidenceExtra varchar(2), OtherStuff uniqueidentifier)

    ;WITH One AS (
    SELECT * FROM (values
    ('A',1,'a', NEWID()),
    ('A',2,'a', NEWID()),
    ('A',3,'', NEWID()),
    ('A',4,'a', NEWID()),
    ('A',4,'b', NEWID()),
    ('A',4,'c', NEWID()),
    ('A',4,'d', NEWID()),
    ('A',5,'', NEWID()),
    ('A',6,'a', NEWID()),
    ('A',6,'b', NEWID()),
    ('A',6,'f', NEWID()),
    ('B',1,'c', NEWID()),
    ('B',1,'d', NEWID()),
    ('B',2,'c', NEWID()),
    ('B',2,'h', NEWID()),
    ('B',3,'a', NEWID()),
    ('B',5,'a', NEWID()),
    ('C',1,'q', NEWID()),
    ('C',2,'r', NEWID()),
    ('C',2,'s', NEWID()),
    ('C',2,'t', NEWID()),
    ('C',3,'', NEWID())
    ) d (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
    )

    insert into #Evid (EvidenceLetter , EvidenceNumber , EvidenceExtra, OtherStuff)
    SELECT a.EvidenceLetter, a.EvidenceNumber, a.EvidenceExtra, a.OtherStuff
    FROM One a, One b, One c, One d

    CREATE UNIQUE CLUSTERED INDEX ucx_Stuff ON #Evid (OtherStuff)
    CREATE NONCLUSTERED INDEX [ix_AkcDruhaEvidence] ON #Evid ([EvidenceLetter] ASC, [EvidenceNumber] ASC, [EvidenceExtra] ASC)

    ------------------------------------------------------------------------------------

    SELECT EvidenceLetter, MAX(RIGHT(SPACE(3)+CAST(EvidenceNumber AS varchar(3)),3) + EvidenceExtra)
    FROM #evid s
    GROUP BY EvidenceLetter


    SELECT *
    FROM (
    SELECT EvidenceLetter, maxb = MAX(EvidenceNumber)
    FROM #evid s
    GROUP BY EvidenceLetter
    ) d
    CROSS APPLY (
    SELECT maxc = MAX(EvidenceExtra)
    FROM #evid s
    WHERE s.EvidenceLetter = d.EvidenceLetter
    AND s.EvidenceNumber = d.maxb
    ) x


    select EvidenceLetter , EvidenceNumber , EvidenceExtra
    from (
    select *, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from #Evid
    ) as evid_max
    where row_num = 1



    -- No seek:
    SELECT DISTINCT EvidenceLetter FROM #Evid

    -- No seek:
    SELECT EvidenceLetter FROM #Evid GROUP BY EvidenceLetter

    But I'm not looking at any other columns. All I'm requesting is the three fields that are all in the index.

    This is the query I'm now using:

    select EvidenceLetter , EvidenceNumber , EvidenceExtra
    from (
    select EvidenceLetter , EvidenceNumber , EvidenceExtra, row_number() over(partition by EvidenceLetter ORDER BY EvidenceNumber DESC, EvidenceExtra DESC) AS row_num
    from dbo.Podrobnosti
    ) as evid_max
    where row_num = 1

    and here is the index definition:

    CREATE NONCLUSTERED INDEX [ix_JenDruhaEvidence] ON [dbo].[Podrobnosti]
    (
    [EvidenceLetter] ASC,
    [EvidenceNumber] ASC,
    [EvidenceExtra] ASC
    )
    INCLUDE ( [AkcesAutoID],
    [CountryAutoID],
    [TypAutoID],
    [PocetKusu],
    [Poznamka],
    [Description],
    [UlozisteDocasne],
    [Original],
    [PublikovatYN],
    [StratigrafieChronoID],
    [StratigrafieLitoID],
    [TaxonAutoID])
    WHERE ([EvidenceLetter]<>'1e')
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
    GO
  • That index is seriously wide, and filtered. It will only be used by queries which have the same filter.

    What proportion of the table columns does it include?

    What's the clustered index definition?

    • This reply was modified 4 years, 3 months ago by  ChrisM@Work.
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 1 through 15 (of 30 total)

You must be logged in to reply to this topic. Login to reply