August 31, 2009 at 1:16 am
Hi all,
In my database, there is transaction table with two million records inserting every day.
So currently I am taking backup of that table every day and deleting those records to reduce number of records for imporving performance.
Is there anything that I can do apart from this.?
Is table partitioning helpful in this case?
ResultID bigint (PK)
ScheduleUserID bigint
TopicID bigint
QuestionID bigint
MaxScore numeric
Score numeric
Attempted bit
HintsUsed tinyint
AnswerCorrect bit
QuestionSerialNumberint
ManuallyEvaluated bit
EvaluatedDate datetime
MailSentAfterOfflineEvaluation bit
August 31, 2009 at 1:54 am
Do you have indexes to support the frequently run queries? Can the queries use those indexes?
2 million rows isn't that much. SQL can handle that with ease. If you're having performance problems, can you be more specific about what the problem is.
Partitioning may help, or it may not. Depends on the problems you're having and the specifics of your system.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 31, 2009 at 2:13 am
Thanks Gail,
There are couple of indexes which will boost query performance.
But daily there will be upto two million records insertion. This will become huge in couple of weeks.
August 31, 2009 at 3:32 am
I've worked with unpartitioned tables with over 150 million rows in. That's equivalent to 2 and a half months of your data. It's manageable and a few million is not beyond SQL's capabilities.
What help are you looking for? There's not enough information here to make recommendations regarding partitioning. There's not enough info to help you with performance problems.
What's the required retention period of this data?
Where are you encountering performance problems? Insert, querying or deleting?
What are the common queries against the table (and which, if any, are slow)
What does the data look like?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 31, 2009 at 4:03 am
Thanks Gail,
Ours is a online assessment application where students will take tests everyday.
Suppose there are 100 questions in one test, then 100 records will get inserted to the mentioned table.
Table structure
------------------
ResultID-bigint (pk)
ScheduleUserID-bigint (nonclustered index)
TopicID-bigint
QuestionID-bigint (nonclustered index)
MaxScore-numeric
Score-numeric
Attempted-bit
HintsUsed-tinyint
AnswerCorrect-bit
QuestionSerialNumber-int
ManuallyEvaluated-bit
EvaluatedDate-datetime
MailSentAfterOfflineEvaluation-bit
If 100000 students take the test, there will be 100000 X 100 = 1,00,00,000 rows will get inserted at the starting of test and 50% of rows will be updated once test is submitted. This updation is for the score that the student has taken for each question.
After this student can view his/her report by selecting the data from "Results" table.
This is an itterative process for 25 to 50 tests.
Student has option to check the result of any of his/her test.
And test will be conducted for every week. The concurrency level is 1500 users.
1. Problem is while udpating, its taking lot of time
2. Common queries are update and select
3. Can i have a denormalized table for viewing previous results?
Please let me know anything else requried here.
August 31, 2009 at 4:06 am
One more thing I would like to add...
There will be one new test for one week.
Tests will be conducted throughout the week.
August 31, 2009 at 4:21 am
You still haven't mentioned the retention period. How long does the data have to be kept? Is older data changed, or is it read-only?
You say updates are slow. Have you checked the execution plans? Are they using indexes?
Is there blocking? Are there waits and if so on what resource?
Are the selects slow? If so, what does the exec plan look like?
Yes, I'm sure it's possible to have a denormalised table for historical results. Maybe a job once a week to update one will work?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 31, 2009 at 4:36 am
Retention period will be 2 months.
Selection of records is little bit slower, but this can be overcome by having one denormalized table.
and here is the execution plan
August 31, 2009 at 4:38 am
just my 2ct:
- your rows indeed aren't that wide, so this number of rows should be manageble.
- does your table have a clustering index (I guess it does, if you created the pk just using the defaults)
- My guess, you only indexed the FKs for this table and off course the (identity or so) PK.
(why isn't TopicID indexed)
- Just a shot in the dark : you need an index on EvaluatedDate to determine "previous"
- As Gail stated, we need "typical" (update) queries to assist on these performance troubles.
- other factors that will influence performance :
-- how does your application connect (isolation level !)
-- Are you avoiding cursors ( if not, what kind of cursors are being used)
-- are the bit-columns used for filtering ?
-- what's of box is this hosted on (#cores / disks / ram)
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
August 31, 2009 at 4:57 am
another guess:
Wat's the relation for ScheduleUserID to the actual user and the actual test (s)he is conducting ?
Can a person take the same test more than once ?
You update query is missing the "conducting test" filter if these are not 1 to 1 !
Provide as much filer info as you can to assist your query !
e.g.
- topicid
- quesionid
- EvaluatedDate (IX !)
UPDATE R
SET score = T.Score
, attempted = 1
, AnswerCorrect = T.Answercorrect
FROM results R with ( nolock )
INNER JOIN (
select ( CASE WHEN QA.Choice = URT.UserResponse
AND QA.IsCorrect = 1 THEN QA.Score
ELSE QA.NegativeScore
END ) Score
, ( CASE WHEN QA.Choice = URT.UserResponse
AND QA.IsCorrect = 1 THEN 1
ELSE 0
END ) AnswerCorrect
, Q.Questionid
, URT.ScheduleUserID
, URT.UserResponse
from [UserResponse_Temp] URT with ( nolock )
inner join Question Q with ( nolock )
on URT.QuestionGUID = Q.QuestionGUID
INNER JOIN QuestionAnswers_temp QA with ( nolock )
ON QA.QuestionID = Q.QuestionId
where QA.Choice = URT.UserResponse
and URT.ScheduleUserID = @ScheduleUserID
/* here and extra index may help out ! */
and [UserResponse_Temp].EvaluatedDate >= dateadd(d,datediff(d, 0,getdate())-1 ,0) -- -1 added to avoid datechange issues
) T
ON R.scheduleuserid = T.ScheduleUserID
and R.questionid = T.questionid
AND R.scheduleuserid = @ScheduleUserID
/* an extra index may help out ! */
and R.EvaluatedDate = T.EvaluatedDate
/* an extra index may help out ! */
where R.EvaluatedDate >= dateadd(d,datediff(d, 0,getdate()) -1 ,0) -- -1 added to avoid datechange issues
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
August 31, 2009 at 4:58 am
That exec plan looks pretty optimal. Not much that can be done other than widening indexes to get rid of the lookups.
How long is that taking?
Is there blocking?
What are the IO and Time statistics for it?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 31, 2009 at 5:04 am
Hi,
This is the update statement
----------------------
UPDATE results SET score = T.Score,
attempted = 1,
AnswerCorrect = T.Answercorrect
FROM results R with(nolock)
INNER JOIN
(
select
(CASE WHEN QA.Choice = URT.UserResponse AND QA.IsCorrect=1 THEN QA.Score ELSE QA.NegativeScore END) Score,
(CASE WHEN QA.Choice = URT.UserResponse AND QA.IsCorrect=1 THEN 1 ELSE 0 END) AnswerCorrect,
Q.Questionid,URT.ScheduleUserID,URT.UserResponse from
[UserResponse_Temp] URT with(nolock)
inner join Question Q with(nolock) on URT.QuestionGUID=Q.QuestionGUID
INNER JOIN QuestionAnswers_temp QA with(nolock) ON QA.QuestionID=Q.QuestionId
where URT.ScheduleUserID = @ScheduleUserID AND QA.Choice = URT.UserResponse
)
T ON R.scheduleuserid = T.ScheduleUserID AND
R.scheduleuserid=@ScheduleUserID
and R.questionid = T.questionid
--
Table is having clustered index on primary key
As you said we can have index on Topic ID, but EvaluatedDate is not currently being used.
1. Isolation level - Read Commited
2. There are no cursors
3. Bit columns are used for filtering
4. The db is deployed on amazon cloud server. ( 15 GB RAM and 4 Core processor)
And this is the select statement
-----------------
SELECT DISTINCT T.SectionID,T.SectionOrder,TB.BlockName,
(SELECT COUNT(QuestionId) FROM Results WHERE QuestionId IN(SELECT QuestionId FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration
WHERE SectionId=TB.BlockId))AND ScheduleUserId=@ScheduleUserId)AS QuestionCount,
(SELECT COUNT(QuestionId) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId
FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration
WHERE SectionId=TB.BlockId))AND Attempted=1) AS SectionNoQuestionsAttempted,
(SELECT COUNT(QuestionId) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId
FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration WHERE
SectionId=TB.BlockId)) AND Attempted=0 ) AS SectionNoQuestionsNotAttempted,
(SELECT COUNT(QuestionId) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId
FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration
WHERE SectionId=TB.BlockId)) AND AnswerCorrect=1 ) AS SectionCorrectAnswered,
(SELECT SUM(MaxScore) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId
FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration
WHERE SectionId=TB.BlockId))) AS SectiontotalMarks,
(SELECT SUM(Score) FROM Results WHERE ScheduleuserId=@ScheduleuserId AND QuestionId IN(SELECT QuestionId
FROM TblAssessmentQuestion WHERE ConfigurationId IN(SELECT ConfigurationId FROM TblAssessmentConfiguration
WHERE SectionId=TB.BlockId))) AS MarksScored
FROM tblassessmentconfiguration T
INNER JOIN tblAssessmentBlock TB ON T.SectionID=TB.BlockID
WHERE T.AssessmentID=@AssessmentId
GROUP BY SectionId,SectionOrder,BlockName,TB.BlockId
ORDER BY SectionOrder
August 31, 2009 at 5:35 am
this is the time and io statistics
Table 'ScheduleUser'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'ScheduleDetails'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#3A4CA8FD'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentConfiguration'. Scan count 1, logical reads 1814, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentQuestion'. Scan count 18, logical reads 5418, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Results'. Scan count 18, logical reads 3447, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentBlock'. Scan count 0, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 546 ms, elapsed time = 559 ms.
Table '#3A4CA8FD'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#3B40CD36'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentBlock'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#3B40CD36'. Scan count 0, logical reads 75, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'QuestionAnswers'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Question'. Scan count 0, logical reads 163, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentQuestion'. Scan count 75, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentBlock'. Scan count 0, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentConfiguration'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 4 ms.
Table '#3B40CD36'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 8 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 562 ms, elapsed time = 571 ms.
August 31, 2009 at 6:56 am
manohar (8/31/2009)
this is the time and io statisticsTable 'ScheduleUser'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'ScheduleDetails'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#3A4CA8FD'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentConfiguration'. Scan count 1, logical reads 1814, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentQuestion'. Scan count 18, logical reads 5418, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Results'. Scan count 18, logical reads 3447, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAssessmentBlock'. Scan count 0, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 546 ms, elapsed time = 559 ms.
Well this looks like the problematic part, and this does not match the execution plan that you posted earlier. In fact, none of these execution characteristics appear to match the execution plan that you posted earlier (which has tables UserResponse_Temp, Results, QuestionAnswersTemp and Question)
Can you post the execution plan that matches these execution characteristics?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 31, 2009 at 7:26 am
can this be your Select query ... a little bit rewritten to avoid the row level count/sum selects ?
SELECT T.SectionID
, T.SectionOrder
, TB.BlockName
, COUNT( distinct TheTests.QuestionId) as QuestionCount
, sum(TheTests.SectionNoQuestionsAttempted) as SectionNoQuestionsAttempted
, sum(TheTests.SectionNoQuestionsNotAttempted) as SectionNoQuestionsNotAttempted
, sum(TheTests.SectionCorrectAnswered) as SectionCorrectAnswered
, SUM(TheTests.MaxScore) as SectiontotalMarks
, SUM(TheTests.Score) as MarksScored
, TheTests.ScheduleuserId
FROM tblassessmentconfiguration T
INNER JOIN tblAssessmentBlock TB
ON T.SectionID = TB.BlockID
/* can be altered to INNER JOIN if you don't want scheduleuserid that didn't conduct tests as nulls */
left join (
SELECT CI.SectionId, R.ScheduleuserId, R.QuestionId, R.MaxScore, R.Score
, sum( case Attempted when 1 then 1 else 0 end )SectionNoQuestionsAttempted
, sum( case Attempted when 0 then 1 else 0 end ) as SectionNoQuestionsNotAttempted
, sum( case AnswerCorrect when 1 then 1 else 0 end ) AS SectionCorrectAnswered
FROM Results R
inner join TblAssessmentQuestion TAQ
on TQA.QuestionId = R.QuestionId
inner join ConfigurationId CI
on CI.ConfigurationId = TAQ.ConfigurationId
WHERE ScheduleuserId = @ScheduleuserId
group by CI.SectionId, R.ScheduleuserId, R.QuestionId, R.Score ) TheTests
on TheTests.SectionId = TB.BlockId
WHERE T.AssessmentID = @AssessmentId
GROUP BY SectionId
, SectionOrder
, BlockName
, TB.BlockId
ORDER BY SectionOrder
If it is... I think it is more readable and may outperform your initial one 😉
Then again... I may be wrong :Whistling:
edited:
and maybe it would even be better to inclode TB.BlockId in your result just in case you get duplicate BlockNames ! You may even get some advantage with selecting min(BlockName) in stead of adding it to the group by.
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply