November 2, 2009 at 6:53 am
Hi All,
I have a few queries around SQL Server optimization. A couple are quite specific and a couple are not so 🙂
Firstly, I have a database that I designed and built as a data-store type solution for a number of my company's clients. It has a web front end to show the data in a nice format, and I create views, on top of my CRUD views, for the application to get this data. This has all been great until the latest project, when the quantity of data hit the 150K-200K record mark.
I've started to experience performance issues just doing a straight SELECT * from a number of the views the application uses.
One of my major issues seems to be one of my particular table structures. I basically have a table that stores name/value pair data; I call it a variables table, but effectively it's like a bunch of virtual fields and their values, associated with the relevant "flat" record by an ID column. I then pivot this data into a view which presents a record for each "flat" row with all the virtual columns. See the table and index definitions below for the "variables" table:
CREATE TABLE [dbo].[variable_value](
[id] [int] IDENTITY(1,1) NOT NULL,
[vrbl_id] [int] NOT NULL,
[acct_id] [int] NULL,
[cont_id] [int] NULL,
[addr_id] [int] NULL,
[lead_id] [int] NULL,
[last_updated] [smalldatetime] NOT NULL,
[date_value] [smalldatetime] NULL,
[numeric_value] [int] NULL,
[unicode_string_value] [nvarchar](max) NULL,
[string_value] [varchar](max) NULL,
[switch_value] [bit] NULL,
CONSTRAINT [PK_variable_value] PRIMARY KEY NONCLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[variable_value] WITH NOCHECK ADD CONSTRAINT [FK_variable_value_account] FOREIGN KEY([acct_id])
REFERENCES [dbo].[account] ([id])
GO
ALTER TABLE [dbo].[variable_value] CHECK CONSTRAINT [FK_variable_value_account]
GO
ALTER TABLE [dbo].[variable_value] WITH NOCHECK ADD CONSTRAINT [FK_variable_value_address] FOREIGN KEY([addr_id])
REFERENCES [dbo].[address] ([id])
GO
ALTER TABLE [dbo].[variable_value] CHECK CONSTRAINT [FK_variable_value_address]
GO
ALTER TABLE [dbo].[variable_value] WITH NOCHECK ADD CONSTRAINT [FK_variable_value_contact] FOREIGN KEY([cont_id])
REFERENCES [dbo].[contact] ([id])
GO
ALTER TABLE [dbo].[variable_value] CHECK CONSTRAINT [FK_variable_value_contact]
GO
ALTER TABLE [dbo].[variable_value] WITH NOCHECK ADD CONSTRAINT [FK_variable_value_lead] FOREIGN KEY([lead_id])
REFERENCES [dbo].[lead] ([id])
GO
ALTER TABLE [dbo].[variable_value] CHECK CONSTRAINT [FK_variable_value_lead]
GO
ALTER TABLE [dbo].[variable_value] WITH NOCHECK ADD CONSTRAINT [FK_variable_value_variable] FOREIGN KEY([vrbl_id])
REFERENCES [dbo].[variable] ([id])
GO
ALTER TABLE [dbo].[variable_value] CHECK CONSTRAINT [FK_variable_value_variable]
I then hook the pivoted version of this table into a couple of other views, and these are the ones taking the time (currently circa 35 seconds for 150K rows). Looking at the execution plan, it always does an index scan of the above table, and I can't for love nor money figure out how to get it to do index seeks, which I assume would speed things up no end.
See below the code for the view that includes this pivoted data; the views are called vw_contact_variables and vw_account_variables:
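For reference, the kind of index that typically turns scans into seeks here is a covering nonclustered index keyed on the column the joins filter by. This is a sketch only; the key and included-column choices below are an assumption based on the view definitions in this thread, and the optimizer will only seek if the query actually filters on the leading key:

```sql
-- Sketch: cover lookups by contact. Key on cont_id (and vrbl_id for the
-- pivot grouping); include the value columns so the query never has to
-- touch the base table. Adjust for the acct_id / addr_id / lead_id access
-- paths as needed.
CREATE NONCLUSTERED INDEX IX_variable_value_cont
ON dbo.variable_value (cont_id, vrbl_id)
INCLUDE (string_value, unicode_string_value, date_value, numeric_value, switch_value);
GO
```

Note that FULL OUTER JOINs (as used in the pivot view later in this thread) can still force scans regardless of indexing, since every row on both sides must be examined.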
CREATE view [ssApp_views].[contacts] as
select co.external_id as [Siebel_Contact_ID], co.id as [contact_id], av.deleted as [Account_Deleted], ac.id as [Account_ID], cv.[Created],
cv.[Created_By], co.last_updated as [Updated], cv.[Updated_By], co.title as [Mr_Mrs], co.first_name as [First_Name],
co.last_name, ad.line_1 as [Contact_Address_1], ad.line_2 as [Contact_Address_2], ad.town as [Contact_City],
ad.postcode as [Contact_Post_Code], ad.country as [Contact_Country], co.email as [email_address], mn.formatted_number as [Mobile_Phone],
wn.formatted_number as [Work_Phone], cv.[Intl_Channel_Seg], cv.[Intl_Partner_Flag], cv.[Local_Channel_Seg], cv.[Source],
cv.[Certification], cv.[Validated_On], jo.title as [Business_Card_Title], cv.[Comments], cv.[Business_Function],
co.call_pref as [call_Permission], cv.[Category], cv.[Category_Value], cv.[Contact_Team],
case coalesce(co.email, '') when '' then 0 else 1 end as [email_populated],
co.email_pref as [email_Permission], cv.[Inactive_Flag], cv.[Level], co.mail_pref as [Mail_Permission],
cv.[Source_Description], cv.[Audience], ad.site_employees, cv.Demand_Generation_Campaign_UID,
cv.purl, cv.pin, cv.intimis_contact_id
from vw_contact as co join
vw_account as ac on ac.id = co.acct_id join
vw_contact_variables as cv on cv.cont_id = co.id left outer join
vw_address as ad on ad.id = co.addr_id left outer join
vw_telephone_full as mn on mn.cont_id = co.id and mn.description = 'Mobile' left outer join
vw_telephone_full as wn on wn.cont_id = co.id and wn.description = 'DDI' join
vw_account_variables as av on av.acct_id = ac.id left outer join
vw_job as jo on jo.id = co.job_id
So basically I need help! I've made some headway into the optimization, but I think I need some advice. I need to make these views run quicker. 35 seconds doesn't sound a lot, but when I join this view into another "accounts" view it can take up to 5 minutes to run, and that's not acceptable in anybody's book.
Also, as a side note, the query analyser has suggested some things, some of which I've applied and some of which made absolutely no sense to apply.
Any help would be greatly welcomed!
Thanks 🙂
Dave
November 2, 2009 at 9:32 am
Sorry, I don't have any definite answers, but maybe someone else does. Unfortunately, this is one of the problems that arise when using an EAV model to store all the data:
http://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-big-design-mistakes.html
I would recommend normalizing the database to improve performance, but since this table is already in use, that may not be possible.
You may get better performance if you change the pivot views from using PIVOT to using CASE WHEN expressions.
Can you post an example of the view used to create one of the Pivot views?
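To illustrate the CASE-based cross-tab style (a minimal sketch against the variable_value table; the variable names here are examples only, not taken from your schema):

```sql
-- Sketch of a CASE-based cross-tab: one row per contact, one MAX(CASE...)
-- column per variable name. Variable names below are hypothetical examples.
select vv.cont_id,
       max(case when v.name = 'comments' then vv.string_value end) as [comments],
       max(case when v.name = 'source'   then vv.string_value end) as [source],
       max(case when v.name = 'level'    then vv.string_value end) as [level]
from dbo.variable_value as vv
join dbo.variable as v
  on v.id = vv.vrbl_id
group by vv.cont_id;
```

The optimizer sometimes produces a better plan for this shape than for PIVOT, though that depends on the indexes available.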
November 2, 2009 at 9:46 am
Hi, thanks for the response. See below one of the pivot view definitions.
There are circa 3.2 million rows in the variable_value table.
The view is re-created dynamically, via triggers, whenever a new record is added to the variable table (i.e. a new field).
Thanks again
ALTER view [dbo].[vw_contact_variables] as select * from(
select co.id as cont_id, v.name as variable_name,
case v.data_type
when 0 then cast(vv.string_value as varchar(max))
when 1 then cast(vv.numeric_value as varchar(max))
when 2 then cast(vv.date_value as varchar(max))
when 3 then cast(vv.switch_value as varchar(max))
when 4 then cast(vv.unicode_string_value as varchar(max))
end as variable_value
from vw_contact as co full outer join
vw_variable_value as vv on vv.cont_id = co.id full outer join
vw_variable as v on v.id = vv.vrbl_id and v.applies_at in ('C', 'B')
) as T
pivot (max(variable_value) for variable_name in ([comments], [current_customer], [demand_gen_campaign_tier], [certification], [created], [created_by], [intl_channel_seg], [intl_partner_flag], [local_channel_seg], [source], [updated_by], [validated_on], [business_function], [category], [category_value], [contact_team], [inactive_flag], [level], [source_description], [audience], [purl], [pin], [demand_generation_campaign_uid])) PVT
November 2, 2009 at 10:22 am
Can you post execution plans?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2009 at 9:11 pm
Hi
You can use the @nalytics Performance Free Data Collector for Microsoft SQL Server & Windows Server; this tool can help you to solve your performance problems. I have tested it, it works excellently, and it is very easy to configure and implement.
Regards
@Analytics Performance - Microsoft SQL Server & Windows Server Data Collector
December 14, 2009 at 6:32 am
Hi guys, just wanted to share my solution to this issue in case it comes in useful for others.
Very simply, I changed my code to build an indexed table instead of producing a view. Although this method uses a bit more disk space, effectively the large, unpleasant code that has to run to return and pivot all this data needs only to be run once, not each time the data is requested. I also placed a trigger on the [variable_value] table to maintain these tables, should there be any inserts/deletes of this data between refreshes.
This method turned a 10+ minute query into a 30-ish second one; job done in my book! 😀
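In outline, the approach looks something like this (a sketch under assumptions: the table and trigger names are hypothetical, and the trigger body is simplified to a delete-and-reinsert of only the contacts touched by the statement):

```sql
-- One-off: materialise the pivoted view into a real, indexed table.
select * into dbo.tbl_contact_variables from dbo.vw_contact_variables;
create clustered index IX_tbl_contact_variables
    on dbo.tbl_contact_variables (cont_id);
GO

-- Keep it fresh between full refreshes: on any change to variable_value,
-- rebuild just the affected contacts' rows from the pivot view.
create trigger trg_variable_value_sync on dbo.variable_value
after insert, update, delete
as
begin
    set nocount on;
    delete t
    from dbo.tbl_contact_variables as t
    where t.cont_id in (select cont_id from inserted
                        union select cont_id from deleted);
    insert into dbo.tbl_contact_variables
    select v.*
    from dbo.vw_contact_variables as v
    where v.cont_id in (select cont_id from inserted
                        union select cont_id from deleted);
end
GO
```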
Cheers
Dave