Please help me improve this sproc! It runs too slowly!

  • I use this sproc to load data based on a month passed in by the user. Because there is a large amount of data in the table (60k+ records per month), the query takes a long time to pull the dataset back and populate the form. Can anyone give me some suggestions on how to reduce the overhead of querying the large dataset in the sproc below?

    Note: I have created an identity key on the main table but haven't yet indexed any other fields (though I could use some suggestions on how to index them).

    Please see the following example of the Stored Proc.

    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_NULLS ON
    GO

    -- =============================================
    -- Author: <Author,,Name>
    -- ALTER date: <ALTER Date,,>
    -- Description: <Description,,>
    -- =============================================
    ALTER PROCEDURE [spGetSalesCode] @Period varchar(15)
    AS
    BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON;

        SELECT [SALES CODE], UPPER(VP_ATTUID) AS VP_ATTUID, UPPER(GM_ATTUID) AS GM_ATTUID,
               UPPER(SVP_ATTUID) AS SVP_ATTUID, UPPER(SM_ATTUID) AS SM_ATTUID, UPPER(COACH_ATTUID) AS COACH_ATTUID,
               UPPER(REP_ATTUID) AS REP_ATTUID
        FROM dbo.tHeirarchyData
        WHERE PERIOD = @Period AND [SALES CODE] IS NOT NULL
        ORDER BY [SALES CODE]
    END

    Here is the Table Structure:

    CREATE TABLE [tHeirarchyData] (
        [ID] uniqueidentifier ROWGUIDCOL NULL CONSTRAINT [DF_tHeirarchyData_ID] DEFAULT (newid()),
        [SALES CODE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [REP_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [REP_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [REP_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [COACH_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [COACH_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [COACH_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SM_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SM_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SM_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [CENTER_CODE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [CITY] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [STATE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [GM_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [GM_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [GM_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [VP_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [VP_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [VP_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SVP_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SVP_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SVP_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [REGIONAL_SYS] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SVP] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [VP] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [GM] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SM] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [CITY_STATE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [COACH] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [REP] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SALES_CODE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [CENTER_TYPE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [CENTER_NAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [VPL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [TACRFT_SEG] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [VP_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [GM_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SM_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [COACH_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [REP_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [UID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SOURCE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [JOB_FUNCTION] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [HAS_ACTING_COACH] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [SALESCODE_STATUS] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [HOME_SYSTEM] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
        [Period] [datetime] NULL
    ) ON [PRIMARY]
    GO

    As you can see, I am new to T-SQL programming and could use any criticisms and suggestions you can offer.

    Thanks in advance for any help improving my code.

    Greg

  • For starters, you could add an index that uses the two columns in the WHERE clause. That will probably be a huge improvement.
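    A sketch of such an index (the index name and the INCLUDE clause are my assumptions; INCLUDE covers the SELECT list so the query can be satisfied from the index alone, avoiding lookups into the base table):

    ```sql
    -- Composite index keyed on the two WHERE-clause columns.
    -- INCLUDE columns cover the SELECT list (covering index).
    CREATE NONCLUSTERED INDEX IX_tHeirarchyData_Period_SalesCode
        ON dbo.tHeirarchyData ([Period], [SALES CODE])
        INCLUDE (VP_ATTUID, GM_ATTUID, SVP_ATTUID, SM_ATTUID, COACH_ATTUID, REP_ATTUID);
    ```

    [Period] leads the key because it is the equality filter; [SALES CODE] as the second key column also supports the ORDER BY.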

  • Two questions...

    - How many rows are in the table?

    - How many rows is this query expected to retrieve?

    Anyway, how about creating an index on the PERIOD column?

    _____________________________________
    Pablo (Paul) Berzukov

  • OK, I'm going to try indexing the Period column. This table has 211k records in it through June; with July and August, that would bring it to about 350k. I'll be retrieving approximately 60-70k rows from my query, based on which month I'm selecting.

  • The period column is a datetime datatype and your argument is a varchar. This might be better:

    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_NULLS ON
    GO

    -- =============================================
    -- Author: <Author,,Name>
    -- ALTER date: <ALTER Date,,>
    -- Description: <Description,,>
    -- =============================================
    ALTER PROCEDURE [spGetSalesCode] @Period varchar(15)
    AS
    BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON;

        -- Convert once, so the datetime column is compared against a
        -- datetime value instead of forcing an implicit conversion.
        DECLARE @p_lookup datetime
        SET @p_lookup = CAST(@Period AS datetime)

        SELECT [SALES CODE], UPPER(VP_ATTUID) AS VP_ATTUID, UPPER(GM_ATTUID) AS GM_ATTUID,
               UPPER(SVP_ATTUID) AS SVP_ATTUID, UPPER(SM_ATTUID) AS SM_ATTUID, UPPER(COACH_ATTUID) AS COACH_ATTUID,
               UPPER(REP_ATTUID) AS REP_ATTUID
        FROM dbo.tHeirarchyData
        WHERE PERIOD = @p_lookup AND [SALES CODE] IS NOT NULL
        ORDER BY [SALES CODE]
    END


    And then again, I might be wrong ...
    David Webb

  • I did that, and it did improve it some. Were you speaking of a single index, or a combined index on the two columns?

  • Thank you very much for the advice. This table and many others like it in this database were inherited. I will restructure this table and try to build it properly.

  • gregory.perry2 (8/17/2010)


    Where you speeking of a single index or a combined index with the two columns?

    Single index on the two columns in the where clause.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
  • The indexing advice will certainly improve the performance of the query. Many of us have inherited similar structures, and I've found that it's usually worth the effort, in the long run, to change those structures, as you have said you would do.

    There are a lot of people on this forum who will (as they have on this thread) help with your stated problem and then help you make the changes you should make to the table structure. It's an opportunity to ask about things you don't understand and get the benefit of years of experience dealing with similar problems. As long as you are willing to try something first, most people here will be willing to work with you.

    Please don't be discouraged by the folks who might seem overly harsh and critical. They mean well, and most genuinely care about the level of knowledge of the practitioners of our craft, even though sometimes they sound as though they were raised by wolves without any normal human contact during their formative years.

    So, here's a place to start. Grab a good book on data modeling and a book on relational theory and get started. While you're reading, think about 2 things:

    Each column should have a data type that accurately reflects the kind and size of data the column will contain.

    Columns in a table should all relate to the primary key of the table, the key being the column or set of columns that uniquely identifies the row. In your case, it looks like you have several things that might need their own table (SMs, GMs, VPs, etc.).
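    As a rough sketch of that direction (all table, column, and constraint names here are illustrative assumptions, not a finished design), the repeated *_LNAME/*_FNAME/*_ATTUID/*_EMAIL column groups could collapse into one employee table plus a hierarchy table that references it:

    ```sql
    -- Illustrative only: one row per person, instead of repeating
    -- name/ATTUID/email columns for every role in every row.
    CREATE TABLE dbo.Employee (
        EmployeeID int IDENTITY(1,1) PRIMARY KEY,
        ATTUID     nvarchar(20)  NOT NULL UNIQUE,
        LName      nvarchar(50)  NOT NULL,
        FName      nvarchar(50)  NOT NULL,
        Email      nvarchar(128) NULL
    );

    -- One row per sales code per period; each role is a foreign key.
    CREATE TABLE dbo.SalesCodeHierarchy (
        SalesCode nvarchar(20) NOT NULL,
        Period    datetime     NOT NULL,
        RepID     int NOT NULL REFERENCES dbo.Employee(EmployeeID),
        CoachID   int NULL REFERENCES dbo.Employee(EmployeeID),
        SMID      int NULL REFERENCES dbo.Employee(EmployeeID),
        GMID      int NULL REFERENCES dbo.Employee(EmployeeID),
        VPID      int NULL REFERENCES dbo.Employee(EmployeeID),
        SVPID     int NULL REFERENCES dbo.Employee(EmployeeID),
        CONSTRAINT PK_SalesCodeHierarchy PRIMARY KEY (SalesCode, Period)
    );
    ```

    Each person's details are then stored once, and the monthly rows shrink to a handful of integer keys.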

    Start a new thread asking for a design review of your changes and provide some information about the requirements of the system, and see what happens.


    David Webb

  • gregory.perry2 (8/17/2010)


    Because there is a large amount of data in the table (60k+ records per month) it is taking the query a long time to pull the dataset back and populate the form.

    Whoa! Are you telling me that you intend to display 60K rows on the user end?

    --Jeff Moden



  • Not exactly. With this problem, I have to let the user choose an employee that belongs to the data record from a pool of 60k available employees and their corresponding hierarchy or org structure. So in the end I'll only be displaying two records on this form: the employee that was credited for a transaction and the one that should have gotten credit for the transaction. This is a transaction adjustment application.

  • One other piece of advice: columns that should always contain data should not be nullable. Reserve nullable columns only for those that may legitimately have no value.

    You will learn to appreciate the reason for this as time goes by.
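    For instance (a sketch only; I'm assuming, as an illustration, that [SALES CODE] and [Period] should always be populated — back-fill or remove any NULL rows before running this):

    ```sql
    -- Tighten columns that must always hold data.
    -- ALTER COLUMN requires restating the full type (and collation, to keep it unchanged).
    ALTER TABLE dbo.tHeirarchyData
        ALTER COLUMN [SALES CODE] nvarchar(255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL;
    ALTER TABLE dbo.tHeirarchyData
        ALTER COLUMN [Period] datetime NOT NULL;
    ```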


  • So, if you're going to filter this by criteria other than just period, what are you going to use? That may have a large impact on the index you might create to speed the extraction. The more details you can provide, the better the solution we can help you develop.


    David Webb

  • I had originally planned on creating a base query that would pull all 60k potential reps from the hierarchy table (once I get it rebuilt); this query would be based on Period (because rep data changes only monthly). Then I was going to have a combo box at the application level (say, VB.NET) that would further filter the selection based on what the end user selected.

    I have to provide the entire dataset initially so the user can choose which rep they want.

  • So why pull back all 60K rows?

    SELECT DISTINCT REP_LNAME FROM [tHeirarchyData] WHERE Period = @P_lookup

    will get you just a list of rep last names from the period you are filtering on. Display that instead of delivering all 60k rows to the application (unless you have 60K unique reps in those 60K rows).

    The trick here is to use the power of the DBMS to give to the application only the data it needs.
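    If last names alone aren't unique enough to pick a rep, a variant along these lines (a sketch; @p_lookup is the converted datetime parameter from the revised proc above, and the columns come from the posted DDL) keeps the result set small while still identifying each rep:

    ```sql
    -- One row per distinct rep for the chosen period, not 60K detail rows.
    SELECT DISTINCT REP_LNAME, REP_FNAME, UPPER(REP_ATTUID) AS REP_ATTUID
    FROM dbo.tHeirarchyData
    WHERE Period = @p_lookup
    ORDER BY REP_LNAME, REP_FNAME;
    ```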


    David Webb
