August 17, 2010 at 11:44 am
I use this Sproc to load data based on a month passed in by the user. Because there is a large amount of data in the table (60k+ records per month) it is taking the query a long time to pull the dataset back and populate the form. Can anyone give me some suggestions on how I can decrease the overhead associated with querying the large dataset in the Sproc below:
Note: I have created an identity key on the main table but haven't yet indexed any other fields (though I could use some suggestions on how I should index).
Please see the following example of the Stored Proc.
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
-- =============================================
-- Author: <Author,,Name>
-- ALTER date: <ALTER Date,,>
-- Description: <Description,,>
-- =============================================
ALTER PROCEDURE [spGetSalesCode] @Period varchar(15)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
SELECT [SALES CODE], UPPER(VP_ATTUID) AS VP_ATTUID, UPPER(GM_ATTUID) AS GM_ATTUID,
UPPER(SVP_ATTUID) AS SVP_ATTUID, UPPER(SM_ATTUID) AS SM_ATTUID, UPPER(COACH_ATTUID) AS COACH_ATTUID,
UPPER(REP_ATTUID) AS REP_ATTUID
FROM dbo.tHeirarchyData
WHERE PERIOD = @Period AND [SALES CODE] IS NOT NULL
ORDER BY [SALES CODE]
END
Here is the Table Structure:
CREATE TABLE [tHeirarchyData] (
[ID] uniqueidentifier ROWGUIDCOL NULL CONSTRAINT [DF_tHeirarchyData_ID] DEFAULT (newid()),
[SALES CODE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REP_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REP_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REP_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[COACH_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[COACH_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[COACH_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SM_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SM_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SM_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CENTER_CODE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CITY] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[STATE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[GM_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[GM_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[GM_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VP_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VP_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VP_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SVP_LNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SVP_FNAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SVP_ATTUID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REGIONAL_SYS] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SVP] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VP] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[GM] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SM] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CITY_STATE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[COACH] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REP] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SALES_CODE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CENTER_TYPE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CENTER_NAME] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VPL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[TACRFT_SEG] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VP_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[GM_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SM_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[COACH_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[REP_EMAIL] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[UID] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SOURCE] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[JOB_FUNCTION] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[HAS_ACTING_COACH] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[SALESCODE_STATUS] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[HOME_SYSTEM] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Period] [datetime] NULL
) ON [PRIMARY]
GO
As you can see, I am new to T-SQL programming and could use any criticisms and suggestions you can offer.
Thanks in advance for any help improving my code.
Greg
August 17, 2010 at 11:50 am
For starters, you could add an index on the two columns in the WHERE clause. That will probably be a huge improvement.
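As a sketch, something like the following could work (the index name is illustrative; the table and column names are taken from the original post):

```sql
-- Nonclustered index matching the WHERE clause of the proc.
-- Leading on PERIOD (the equality predicate), then SALES CODE,
-- which also supports the ORDER BY.
CREATE NONCLUSTERED INDEX IX_tHeirarchyData_Period_SalesCode
    ON dbo.tHeirarchyData (PERIOD, [SALES CODE]);
```

If you want the index to fully cover the query, you could also INCLUDE the six ATTUID columns the SELECT returns, at the cost of a wider index. Test both on a non-production copy first.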
August 17, 2010 at 11:53 am
Two questions...
- How many rows are in the table?
- How many rows is this query expected to retrieve?
Anyway, how about creating an index on the PERIOD column?
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
August 17, 2010 at 12:16 pm
OK, I'm going to try indexing the Period column. This table has 211k records in it through June; with July and August that would bring it to about 350k. I'll be retrieving approximately 60-70k rows from my query based on which month I'm selecting.
August 17, 2010 at 12:45 pm
The period column is a datetime datatype and your argument is a varchar. This might be better:
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
-- =============================================
-- Author: <Author,,Name>
-- ALTER date: <ALTER Date,,>
-- Description: <Description,,>
-- =============================================
ALTER PROCEDURE [spGetSalesCode] @Period varchar(15)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
DECLARE @p_lookup datetime;
SET @p_lookup = CAST(@Period AS datetime);
SELECT [SALES CODE], UPPER(VP_ATTUID) AS VP_ATTUID, UPPER(GM_ATTUID) AS GM_ATTUID,
UPPER(SVP_ATTUID) AS SVP_ATTUID, UPPER(SM_ATTUID) AS SM_ATTUID, UPPER(COACH_ATTUID) AS COACH_ATTUID,
UPPER(REP_ATTUID) AS REP_ATTUID
FROM dbo.tHeirarchyData
WHERE PERIOD = @p_lookup AND [SALES CODE] IS NOT NULL
ORDER BY [SALES CODE]
END
August 17, 2010 at 12:49 pm
I did that, and it did improve it some. Were you speaking of a single index or a combined index with the two columns?
August 17, 2010 at 2:48 pm
Thank you very much for the advice. This table and many others like it in this database were inherited. I will restructure this table and try to build it properly.
August 17, 2010 at 3:13 pm
gregory.perry2 (8/17/2010)
Were you speaking of a single index or a combined index with the two columns?
Single index on the two columns in the where clause.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 17, 2010 at 4:29 pm
The indexing advice will certainly improve the performance of the query. Many of us have inherited similar structures and I've found that it's usually worth the effort, in the long run, to change those structures, as you have said you would do. There are a lot of people on this forum who will (as they have on this thread) help with your stated problem and then help you to make the changes you should make to the table structure. It's an opportunity to ask about things you don't understand and get the benefit of years of dealing with similar problems.
As long as you are willing to try something first, most people here will be willing to work with you. Please don't be discouraged by the folks who might seem overly harsh and overly critical. They mean well, and most genuinely care about the level of knowledge of the practitioners of our craft, even though sometimes they sound as though they have been raised by wolves without any normal human contact during their formative years.
So, here's a place to start. Grab a good book on data modeling and a book on relational theory and get started. While you're reading, think about 2 things:
Each column should have a data type that accurately reflects the kind and size of data that the column will contain.
Columns in a table should all relate to the primary key of the table, the key being the column or set of columns that uniquely identifies the row. In your case, it looks like you have several things that might need their own table (SMs, GMs, VPs, etc).
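To make that second point concrete, here is a rough, hypothetical sketch of one normalized piece: a single employee table keyed by ATTUID, so names and emails live in one place instead of being repeated per role on every hierarchy row (all names and column sizes here are illustrative, not from the original schema):

```sql
-- Hypothetical employee table; hierarchy rows would then reference
-- ATTUIDs instead of repeating LNAME/FNAME/EMAIL for each role.
CREATE TABLE dbo.tEmployee (
    ATTUID nvarchar(20)  NOT NULL CONSTRAINT PK_tEmployee PRIMARY KEY,
    LName  nvarchar(50)  NOT NULL,
    FName  nvarchar(50)  NOT NULL,
    Email  nvarchar(128) NULL
);
```

The hierarchy table would then carry foreign keys like REP_ATTUID and COACH_ATTUID pointing at this table, which shrinks the row size considerably and eliminates update anomalies when a name or email changes.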
Start a new thread asking for a design review of your changes and provide some information about the requirements of the system, and see what happens.
August 17, 2010 at 8:18 pm
gregory.perry2 (8/17/2010)
Because there is a large amount of data in the table (60k+ records per month) it is taking the query a long time to pull the dataset back and populate the form.
Whoa! Are you telling me that you intend to display 60K rows on the user end of the world?
--Jeff Moden
Change is inevitable... Change for the better is not.
August 18, 2010 at 7:35 am
Not exactly. With this problem, I have to give the user access to choose an employee that belongs to the data record from a pool of 60k available employees and their corresponding Hierarchy or Org Structure. So in the end I'll only be displaying two records on this form: the employee that was credited for a transaction and the one that should have gotten credit for the transaction. This is a transaction adjustment application.
August 18, 2010 at 1:47 pm
One other piece of advice: columns that should always contain data should not be NULLable. Reserve NULL columns only for those that may legitimately have no value.
You will learn to appreciate the reason for this as time goes by.
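For instance, a period should always be present on every row, so (assuming no existing rows have a NULL in that column) something like this would enforce it, as a sketch:

```sql
-- Tighten a column that should never be empty; this will fail if any
-- existing row already contains NULL, so clean up the data first.
ALTER TABLE dbo.tHeirarchyData
    ALTER COLUMN [Period] datetime NOT NULL;
```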
The probability of survival is inversely proportional to the angle of arrival.
August 18, 2010 at 2:02 pm
So, if you're going to filter this by criteria other than just period, what are you going to use? That may have a large impact on the index you might create to speed the extraction. The more details you can provide, the better the solution we can help you develop.
August 18, 2010 at 2:08 pm
I had originally planned on creating a base query that would pull all 60k potential reps from the Hierarchy table (once I get it rebuilt); this query would be based on Period (because rep data changes only monthly). Then I was going to have a combo box at the application level (say, VB.NET) that would further filter the selection based on what the end user selected.
I have to provide the entire dataset initially so the user can choose which rep they want.
August 18, 2010 at 2:17 pm
So why pull back all 60K rows?
SELECT DISTINCT rep_lname FROM [tHeirarchyData] WHERE Period = @P_lookup
will get you just a list of rep last names from the period you are filtering on. Display that instead of delivering all 60k rows to the application (unless you have 60K unique reps in those 60K rows).
The trick here is to use the power of the DBMS to give to the application only the data it needs.