Basic query performance

  • Hi,

    I have this pretty basic query that is starting to perform slowly.

    It returns over 22m records in about 5 minutes.

    I know its difficult without table defs but any ideas on what I can try to make this return faster?

     

    SELECT [e].[eligibility_id]
    FROM [dbo].[eligibility] AS [e]
    -- Table is updated hourly with parsed member phone numbers
    INNER JOIN [memberphonelookup] AS [mpl]
    ON [mpl].[member_id] = [e].[member_id]
    WHERE 1 = 1
  • Use EXISTS() instead of a JOIN (although SQL likely already is doing the equivalent of that in the plan, it doesn't hurt to be sure).

    The only other thing you could do would be to (page) compress the data table.  Typically that will help, unless the time is mostly on the client side.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Indexes on [dbo].[eligibility]([member_id]) INCLUDE ([eligibility_id])

    and [memberphonelookup]([member_id])

  • krypto69 wrote:

    It returns over 22m records in about 5 minutes.

    According to your query, you're returning 22 million rows to the screen...

    Question:  What makes you think that's going to be fast?   And why would you return 22 million rows to the screen?  It's not like a human is going to be able to glean much from such a large return.

    krypto69 wrote:

    I know its difficult without table defs but any ideas on what I can try to make this return faster?

    Since you already know that, why didn't you post them.  Notice that's not actually a question. 😉

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  •  

    Jeff

    the query is the slow part of a larger process.its not being posted to a screen - query is part of universe of claims process that needs to look at all claims since inception.

    And I dont think its going to be fast with that many records - was really just hoping that there might be a faster way that I wasn't thinking about. Like Scotts suggestion above.

    • This reply was modified 10 months, 1 week ago by  krypto69.
    • This reply was modified 10 months, 1 week ago by  krypto69.
  • If you want to make the query you've given as fast as possible to return the rows you could use a schema bound view:

    -- Create the view with SCHEMABINDING
    CREATE VIEW dbo.FastEligibilityView
    WITH SCHEMABINDING
    AS
    SELECT [e].[eligibility_id]
    FROM [dbo].[eligibility] AS [e]
    INNER JOIN [dbo].[memberphonelookup] AS [mpl]
    ON [mpl].[member_id] = [e].[member_id];

    -- Create a unique clustered index on the view
    CREATE UNIQUE CLUSTERED INDEX IX_FastEligibilityView
    ON dbo.FastEligibilityView (eligibility_id);
  • Moving this kind of data truly comes down to hardware (or service tiers if you're in the cloud). All the suggestions will help, but mostly at the margins. More memory, faster disks, that's going to be the biggest bang for your buck unless you can change the process to actually filter data, then query tuning opportunities can arise.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Another thing that ties into what Grant said - if this is SSIS, those 22M rows are likely trying to be stored in memory. If you don't have enough memory for them, you are going to be paging to disk and that's going to be a slow operation.

    Another thing to look at would be blocking. You say that the memberphonelookup table is updated hourly. Is it being updated while that query is running? If so, that'd be blocking causing the slowdown.

    Another thing you could try is to add the MAXDOP 1 option. I had a query in my SSIS that was going parallel and then SOMETIMES causing self blocking so a 1 minute process would sometimes run for hours. By putting the MAXDOP 1 option into the query to prevent it from going parallel, that 1 minute process finished in about 45 seconds consistently. It MAY not be the situation you are hitting, but is something you can try. In your scenario I don't think it'll help, but I've seen stranger things happen before.

    My last questions for you - what changed between when it was fast and when it was slow? And what is "fast" and what is "slow"? What is acceptable performance? Was the slowdown gradual or sudden? If it was gradual, then the slowdown is likely caused by data volume. If it was sudden it is likely caused by some change, such as to a process or software, or hardware limitation such as not enough memory and you are paging to disk.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Trying to index the view

     

    --Set the options to support indexed views.
    SET NUMERIC_ROUNDABORT OFF;
    SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT,
    QUOTED_IDENTIFIER, ANSI_NULLS ON;

    --Create view with SCHEMABINDING.
    IF OBJECT_ID ('dbo.FastEligibilityView', 'view') IS NOT NULL
    DROP VIEW dbo.FastEligibilityView ;
    GO
    CREATE VIEW dbo.FastEligibilityView
    WITH SCHEMABINDING
    AS
    SELECT [e].[eligibility_id]
    FROM [dbo].[eligibility] AS [e]
    INNER JOIN [MemberPhoneLookup] AS [mpl]
    ON [mpl].[member_id] = [e].[member_id];
    GO


    ------ Create a unique clustered index on the view
    CREATE UNIQUE CLUSTERED INDEX IX_FastEligibilityView
    ON dbo.FastEligibilityView (eligibility_id);

     

    Cant seem to get past the error:

    Msg 1935, Level 16, State 1, Line 21

    Cannot create index. Object 'eligibility' was created with the following SET options off: 'ANSI_NULLS'.

  • To resolve this issue, you need to ensure that the ANSI_NULLS option is set to ON for the table 'eligibility'. You can modify the table definition to set the ANSI_NULLS option ON using the following script:

    -- Enable ANSI_NULLS option for the eligibility table
    SET ANSI_NULLS ON;
    GO

    -- Recreate the view and the clustered index
    IF OBJECT_ID('dbo.FastEligibilityView', 'V') IS NOT NULL
    DROP VIEW dbo.FastEligibilityView;
    GO

    CREATE VIEW dbo.FastEligibilityView
    WITH SCHEMABINDING
    AS
    SELECT [e].[eligibility_id]
    FROM [dbo].[eligibility] AS [e]
    INNER JOIN [MemberPhoneLookup] AS [mpl]
    ON [mpl].[member_id] = [e].[member_id];
    GO

    -- Recreate the unique clustered index
    CREATE UNIQUE CLUSTERED INDEX IX_FastEligibilityView
    ON dbo.FastEligibilityView (eligibility_id);
    GO

    As Scott pointed out, you would probably be better off using EXISTS than INNER JOIN in the query.

    An index might also help to prevent updates to the tables being slow.

  • Just a heads up about indexed views - in SQL Server Standard, indexed views must be explicitly requested to use the index. In enterprise the query optimizer will use the index. PLUS indexed views can hurt DML performance. So if either of the tables frequently have DML (insert, update, delete operations), creating the indexed view MAY hurt performance of those queries.

    For more information about indexed views, here is a link - https://learn.microsoft.com/en-us/sql/relational-databases/views/create-indexed-views?view=sql-server-ver16

    Not saying an indexed view is a bad approach, just you need to be aware of the potential risks of using it.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Thanks Jonathon.

     

    Thanks Brian - yeah that table gets massive inserts and updates so I guess indexed view is a bad option.

     

    Tried using exists - same run time:

    select distinct
    [e].[eligibility_id]
    from
    [dbo].[eligibility] as [e]
    WHERE EXISTS (select member_id from [MemberPhoneLookup] as [mpl] where [mpl].[member_id] = [e].[member_id])
  • 22 million rows in universe is too big.... and that too related to CLAIMS.

    You need to create objects specifically for this request, so that it might pull the rows faster.

    -- Create a derived table in universe, ensure you only pull few columns that are relevant in SELECT part

    -- Add WHERE conditions

    -- Add enough INDEXES to support it.

    =======================================================================

  • krypto69 wrote:

    Jeff

    the query is the slow part of a larger process.its not being posted to a screen - query is part of universe of claims process that needs to look at all claims since inception.

    And I dont think its going to be fast with that many records - was really just hoping that there might be a faster way that I wasn't thinking about. Like Scotts suggestion above.

    Knowing how a query is being used is an important part of trying to figure things out.  You're initial post had nothing in it to suggest anything but a return to the screen.  I've logically assumed otherwise on other forums only to be told that they actually wanted it to be returned to the screen and so, in the absence of other information, I assume the screen is the target on such posts.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • krypto69 wrote:

    And I dont think its going to be fast with that many records - was really just hoping that there might be a faster way that I wasn't thinking about. Like Scotts suggestion above.

    Shifting gears to that suggestion, how'd that work out for you?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply