January 3, 2012 at 5:15 pm
Hey Everyone,
I am in no way an expert at SQL Server and I will try to provide as much detail as possible. I have been having several deadlock issues over the last couple weeks. I believe the problem has always been there but now that the amount of data is increasing, the issue pops up.
Here is my database:
TradeVolumes: (Everything is NOT NULL)
TradeID int
AnalysisTypeID int
VolumeTypeID int
Hour int
Value decimal(19, 2)
I do NOT have a primary key set. The primary key would be: TradeID, AnalysisTypeID, VolumeTypeID, Hour (pointing to a data value for a data type, for a specific hour)
TradeID corresponds to a trade table row entry.
AnalysisTypeID and VolumeTypeID are references to other tables so I know which data type im looking at/querying
Hour is a value that is 1 to 24
The Value is a decimal value that represents a data for exactly one hour.
This table was built to store multiple data types that all share the same structure.
Now here is where things get crazy. I have a client application that is multi-threaded. When two threads go to insert/update values, I start getting the deadlock errors. From reading up on these issues I see that there must be some sort of indexing trick or locking trick that I am missing (or my design could just be totally wrong). This table now has over 10 million rows.
Here is the stored procedure that I am using:
ALTER PROCEDURE [dbo].[mysp_TradeVolumes_I]
@IN_TradeIDint,
@IN_AnalysisTypeIDint,
@IN_VolumeTypeIDint,
@IN_VolumesXML
AS
-- Delete any prior values since they are updated
DELETE from TradeVolumes
WHERE TradeID = @IN_TradeID AND AnalysisTypeID = @IN_AnalysisTypeID AND VolumeTypeID = @IN_VolumeTypeID
-- Insert the new values
INSERT INTO TradeVolumes (TradeID, AnalysisTypeID, VolumeTypeID, Hour, Value)
SELECT @IN_TradeID as TradeID, @IN_AnalysisTypeID, @IN_VolumeTypeID,
Volumes.Vols.value('./@Hr','int') [Hour],
Volumes.Vols.value('./@Vol','numeric(19,2)') Volume
FROM @IN_Volumes.nodes('/Volumes/Hr') as Volumes(Vols)
The breakdown on the sproc: I pass in XML for the Hours/Volumes so I do not have to call the DB 24 times for each instance. Here is an example of how I call this guy:
EXEC mysp_TradeVolumes_I 123, 1, 1, '<Volumes><Hr Hr="1" Vol="20"/><Hr Hr="2" Vol="50"/><Hr Hr="20" Vol="100"/><Hr Hr="24" Vol="50"/></Volumes>'
I appreciate any help and please let me know if I missed any critical pieces of information. Feel free to tell me I'm completely wrong and get me in the right direction! :w00t:
Cheers,
ATC
January 3, 2012 at 5:31 pm
What's your clustered index? That's the important part here. I notice you mention no PK but that's not necessarily the same thing.
If you don't have indexing going you're going to constantly table scan. This is problematic.
I would definately add in a clustered index on these fields, or at the least a non-clustered index:
TradeID | AnalysisTypeID | VolumeTypeID
Now, let's add to that... why are you thrashing the disk with multiple writes? Also, how are you dealing with conflicts for the same data? It's theoretically possible for two users to drop off the same xml simultaneously and then you have the same combination in there twice.
What I would do is the following:
1: Attempt to Update the existing rows.
2: Check the Rowcount, if 0:
2a: Insert the rows.
This will help with the multi-statement necessity by only hitting data once when required. Proper indexing and PK usage will avoid duplications.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 3, 2012 at 6:55 pm
Hey Craig,
Thanks for the speedy help!
I created a clustered index per your suggestion. A side question before I dive into this more. With a table of ten million entries and growing by about 60,000 a day, would it be wise to reindex this table every night? More? Less?
As for your point about duplicate data, I had an index created on the table per the columns I listed that would be the PK (unique). That prevented duplicates while I was still designing this. Ugly, I know...I'm a C# guy, not a DBA.
As for the thrashing point and the statement about attempting to update and referencing the @@rowcount. I don't believe I can so this (or maybe I just don't know how). Here is a certain situation:
At some point in the day, I make a call with the sproc with these params:
EXEC mysp_TradeVolumes_I 1234, 1, 1, '<Volumes><Hr Hr="1" Vol="20"/><Hr Hr="2" Vol="50"/><Hr Hr="3" Vol="100"/><Hr Hr="4" Vol="50"/></Volumes>'
Let's say none of those rows existed and @@rowcount returned 0. Then they would all be inserted.
Result:
1234, 1, 1, 1, 20
1234, 1, 1, 2, 50
1234, 1, 1, 3, 100
1234, 1, 1, 4, 50
Later in the future, I call it again with these params:
EXEC mysp_TradeVolumes_I 1234, 1, 1, '<Volumes><Hr Hr="1" Vol="20"/><Hr Hr="2" Vol="50"/><Hr Hr="3" Vol="100"/><Hr Hr="4" Vol="50"/><Hr Hr="5" Vol="50000"/></Volumes>'
@@rowcount would return 4 and I have 5 in the list (notice I added: <Hr Hr="5" Vol="50000"/>)
I am totally lost on how to handle this whole thing. How would I go about updating from XML? Thanks again for all of the help.
January 4, 2012 at 2:22 am
atchristian (1/3/2012)
I created a clustered index per your suggestion. A side question before I dive into this more. With a table of ten million entries and growing by about 60,000 a day, would it be wise to reindex this table every night? More? Less?
Depends on how badly it fragments. Rebuild at around 30% logical fragmentation (that's a guideline, not a directive)
How would I go about updating from XML? Thanks again for all of the help.
Are you absolutely fixated on XML? If not, read up on table-type parameters.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 4, 2012 at 5:32 am
Have you tried using MERGE for the upsert, instead of your current two-stage process?
Shredding XML isn't instant - you may gain by separating this from the table changes too:
SELECT
TradeID = @IN_TradeID,
AnalysisTypeID = @IN_AnalysisTypeID,
VolumeTypeID = @IN_VolumeTypeID,
[Hour] = Volumes.Vols.value('./@Hr','int'),
[Value] = Volumes.Vols.value('./@Vol','numeric(19,2)')
INTO #TradeVolumes
FROM @IN_Volumes.nodes('/Volumes/Hr') as Volumes(Vols)
;MERGE INTO TradeVolumes AS Target
USING #TradeVolumes AS Source (TradeID, AnalysisTypeID, VolumeTypeID, [Hour], [Value])
ON Target.TradeID = Source.TradeID
AND Target.AnalysisTypeID = Source.AnalysisTypeID
AND Target.VolumeTypeID = Source.VolumeTypeID
AND Target.[Hour] = Source.[Hour]
WHEN MATCHED THEN
UPDATE SET [Value] = Source.[Value]
WHEN NOT MATCHED BY TARGET THEN
INSERT (TradeID, AnalysisTypeID, VolumeTypeID, [Hour], [Value])
VALUES (Source.TradeID, Source.AnalysisTypeID, Source.VolumeTypeID, Source.[Hour], Source.[Value])
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 5, 2012 at 9:10 am
Thanks for the merge tip. When I tried to create a new sproc, I got this error:
"TradeID" is not a recognized table hints option. If it is intended as a parameter to a table-valued function or to the CHANGETABLE function, ensure that your database compatibility mode is set to 90.
I looked online and I saw some notes about changing the DB level but that doesn't seem correct with me. (SQL 2008 btw)
Here is the sproc (pretty much a copy and paste):
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- EXEC mysp_TradeVolumes_I_Test -1, 1, 1, '<Volumes><Hr Hr="1" Vol="20"/><Hr Hr="2" Vol="50"/><Hr Hr="20" Vol="100"/><Hr Hr="24" Vol="50"/></Volumes>'
ALTER PROCEDURE [dbo].[mysp_TradeVolumes_I_Test]
@IN_TradeIDint,
@IN_AnalysisTypeIDint,
@IN_VolumeTypeIDint,
@IN_VolumesXML
AS
SELECT
TradeID = @IN_TradeID,
AnalysisTypeID = @IN_AnalysisTypeID,
VolumeTypeID = @IN_VolumeTypeID,
[Hour] = Volumes.Vols.value('./@Hr','int'),
[Value] = Volumes.Vols.value('./@Vol','numeric(19,2)')
INTO #TradeVolumes
FROM @IN_Volumes.nodes('/Volumes/Hr') as Volumes(Vols)
;MERGE INTO rwID_TradeVolumes AS Target
USING #TradeVolumes AS Source (TradeID, AnalysisTypeID, VolumeTypeID, [Hour], [Value])
ON Target.TradeID = Source.TradeID
AND Target.AnalysisTypeID = Source.AnalysisTypeID
AND Target.VolumeTypeID = Source.VolumeTypeID
AND Target.[Hour] = Source.[Hour]
WHEN MATCHED THEN
UPDATE SET [Value] = Source.[Value]
WHEN NOT MATCHED BY TARGET THEN
INSERT (TradeID, AnalysisTypeID, VolumeTypeID, [Hour], [Value])
VALUES (Source.TradeID, Source.AnalysisTypeID, Source.VolumeTypeID, Source.[Hour], Source.[Value])
January 5, 2012 at 9:23 am
Is the database compatibility mode set to 100?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 5, 2012 at 9:32 am
Not sure. Perhaps a column list isn't required with a table source
USING #TradeVolumes AS Source --(TradeID, AnalysisTypeID, VolumeTypeID, [Hour], [Value])
other than that, the syntax looks ok.
I'd recommend writing and testing outside the scope of the sproc, though - it makes debugging much easier.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 5, 2012 at 10:18 am
That worked! Thanks everyone for the help! Things have been smooth the last 24 hours!
January 5, 2012 at 10:20 am
atchristian (1/5/2012)
That worked! Thanks everyone for the help! Things have been smooth the last 24 hours!
Compatibility mode? Fantastic 😀
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 5, 2012 at 10:57 am
Sorry, I should have been more specific.
USING #TradeVolumes AS Source --(TradeID, AnalysisTypeID, VolumeTypeID, [Hour], [Value])
That little puppy did the job.
January 5, 2012 at 11:14 am
ChrisM@home (1/4/2012)
Have you tried using MERGE for the upsert, instead of your current two-stage process?Shredding XML isn't instant - you may gain by separating this from the table changes too:
Good call on the Merge Chris. I really need to get somewhere that's upgraded so I can get more familiar with the new tools, I keep missing some obvious things... sigh...
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 5, 2012 at 12:51 pm
Evil Kraig F (1/5/2012)
ChrisM@home (1/4/2012)
Have you tried using MERGE for the upsert, instead of your current two-stage process?Shredding XML isn't instant - you may gain by separating this from the table changes too:
Good call on the Merge Chris. I really need to get somewhere that's upgraded so I can get more familiar with the new tools, I keep missing some obvious things... sigh...
Thanks. I feel for you mate - I was on 2k for waaay too long.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 5, 2012 at 1:03 pm
atchristian (1/3/2012)
...I do NOT have a primary key set. The primary key would be: TradeID, AnalysisTypeID, VolumeTypeID, Hour...
If this is the edge case it appears to be, then you may benefit significantly from creating a unique clustered index on these four columns. I worked with something similar recently, a table with 4 fk's and a value (a table in a forecasting db). Data was accessed either by all four key columns, or by column 1 (TradeID), or by column 1 and 2 (TradeID, AnalysisTypeID). Notice that TradeID is the first column in each case (the "leading edge" of the index).
You get - for free - a tiny space requirement. However, this will only work well if there's never likely to be a requirement for a NC index on say VolumeTypeID, because non-clustered indexes include the clustered index keys - you would end up with a massive NC index.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply