January 18, 2005 at 5:00 pm
I am working on a project and need to optimize the query below. It's taking 30 or so seconds to run. I am not a sql expert and am looking for any tips some of the pros here might have looking at the design of the query below - perhaps there are obvious better ways of designing to enhance performance. I know it's a bit lengthy and complex - so it may be hard to provide simple guidance .. but I thought I'd give it a shot here and see if someone can at least point me in a direction of how to rewrite or reapproach. THX.
Danielle
---------------
SELECT ST.student_id, ST.last_name + ', ' + ST.first_name + '(' + ST.grade + ')'
AS Name, R.program_provider_activity_id, PPA.program_activity_name
+ ' Currently Reg : '
+ (CASE WHEN (R.registration_is_active = 1)
THEN 'Yes' ELSE ' No' END) AS program_activity_name, R.program_id,
(SELECT SUM(datediff(hh, s.session_start_time, s.session_end_time))
FROM session S
WHERE s.session_id IN
(SELECT a.[session_id]
FROM [Attendance] A
WHERE a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN
(SELECT [session_id]
FROM [Session]
WHERE [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))
AS TotalHours,
(SELECT COUNT(DISTINCT (session_start_time))
FROM session S
WHERE s.session_id IN
(SELECT a.[session_id]
FROM [Attendance] A
WHERE a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN
(SELECT [session_id]
FROM [Session]
WHERE [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))
AS TotalDays,
(SELECT COUNT(DISTINCT (datepart(ww, session_start_time)))
FROM session S
WHERE s.session_id IN
(SELECT a.[session_id]
FROM [Attendance] A
WHERE a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN
(SELECT [session_id]
FROM [Session]
WHERE [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))
AS TotalWeeks
FROM dbo.ProgramProviderActivity PPA INNER JOIN
dbo.Registration R INNER JOIN
dbo.Student ST ON ST.student_id = R.student_id ON PPA.program_provider_activity_id =
R.program_provider_activity_id
ORDER BY ST.last_name + ' ' + ST.first_name, ST.grade
--------------------------
January 18, 2005 at 8:25 pm
For starters the subqueries you're using for TotalHours, TotalDays and TotalWeeks
(SELECT COUNT(DISTINCT (datepart(ww, session_start_time))) FROM session S WHERE s.session_id IN (SELECT a.[session_id] FROM [Attendance] A WHERE a.[attendence_status] = 1 AND a.[student_id] = ST.student_id AND a.[session_id] IN (SELECT [session_id] FROM [Session] WHERE [program_provider_activity_id] = R.program_provider_activity_id AND [program_id] = R.program_id)))
are practically the same. This means it's reading through those tables multiple times. I'd see about making that a single query and putting it in the from clause as a derived table instead. You might get a better execution plan that way.
--------------------
Colt 45 - the original point and click interface
January 20, 2005 at 5:05 pm
I'm not quite sure about what all is in your tables.. but Phill is correct.. avoid the correlated subqueries.. go for derived tables instead.
Sometimes (for debugging), it's helpful to just create a #temp table showing the TOTALS you wish to see, grouped by Student, Provider, etc.. then you can JOIN to that #temp table.
To use a derived table in the JOIN try this: (you need to fix the rest of the JOIN since I don't know how you're dealing with SessionID, but this should get you on the right track...
SELECT
st.student_id,
Name = st.last_name + ', ' + st.first_name + '(' + st.grade + ')' ,
r.program_provider_activity_id,
program_activity_name = ppa.program_activity_name + ' Currently Reg : ' + (CASE WHEN r.registration_is_active = 1
THEN 'Yes' ELSE ' No' END),
r.program_id,
TotalHours = dt.TotalHours,
TotalDays = dt.TotalDays,
TotalWeeks = dt.TotalWeeks
FROM dbo.ProgramProviderActivity ppa
INNER JOIN dbo.Registration r
ON r.program_provider_activity_id = ppa.program_provider_activity_id
INNER JOIN dbo.Student st
ON st.student_id = r.student_id
INNER JOIN (SELECT
a.Session_ID,
a.Student_ID,
s.Program_Provider_Activity_ID,
s.Program_ID,
Hrs = SUM(DATEDIFF(hh, s.session_start_time, s.session_end_time)),
Days = COUNT(DISTINCT(session_start_time)),
Weeks = COUNT(DISTINCT(DATEPART(ww, session_start_time)))
FROM Session s
INNER JOIN Attendance a
ON a.session_id = s.session_id
WHERE a.Attendence_Status = 1
GROUP BY a.Session_ID, a.Student_ID, s.Program_Provider_Activity_ID, s.Program_ID ) dt
ON dt.StudentID = r.student_id
AND dt.SessionID = ??
AND dt.Program_Provider_Activity_ID = ppa.Program_Provider_Activity_ID
AND ...
AND ...
Consider leaving the ORDER BY out.. it only causes delays...
hope this helps..
Mark Gelatt
January 20, 2005 at 5:08 pm
Sorry 'bout that.. I changed the name of the fields while I was writing.. you should use
TotalHours = dt.Hrs,
etc.. etc..
Just see what I called the columns in the derived table.
January 21, 2005 at 7:32 am
Barring me having botched order here since I don't have anything to test against this should be pretty colse to what you are after.
SELECT
ST.student_id,
ST.last_name + ', ' + ST.first_name + '(' + ST.grade + ')' AS [Name],
R.program_provider_activity_id, PPA.program_activity_name + ' Currently Reg : ' + (CASE WHEN (R.registration_is_active = 1) THEN 'Yes' ELSE ' No' END) AS program_activity_name,
R.program_id,
SUM(datediff(hh, s.session_start_time, s.session_end_time)) TotalHours,
COUNT(DISTINCT (session_start_time)) TotalDays,
COUNT(DISTINCT (datepart(ww, session_start_time))) TotalWeeks
FROM
dbo.ProgramProviderActivity PPA
INNER JOIN
dbo.Registration R
INNER JOIN
dbo.Student ST
ON
ST.student_id = R.student_id
ON
PPA.program_provider_activity_id = R.program_provider_activity_id
LEFT JOIN
dbo.Attendance A
INNER JOIN
dbo.Session S
ON
S.program_provider_activity_id = R.program_provider_activity_id AND
S.program_id = R.program_id AND
S.Session_id = A.Session_id
ON
A.attendence_status = 1 AND
A.student_id = ST.student_id
GROUP BY
ST.student_id,
ST.last_name + ', ' + ST.first_name + '(' + ST.grade + ')',
R.program_provider_activity_id, PPA.program_activity_name + ' Currently Reg : ' + (CASE WHEN (R.registration_is_active = 1) THEN 'Yes' ELSE ' No' END),
R.program_id
-- Noter may have to do some work still to get Order By right because of Group By
ORDER BY
ST.last_name + ' ' + ST.first_name,
ST.grade
January 21, 2005 at 9:46 am
Thank you all very much - I will try to implement these revisions and report back. Much appreciated.
Danielle
January 21, 2005 at 3:35 pm
You also should run Profiler and then a trace with this query. After that use Index Wizard to see if it finds any new indexes, etc.
Quand on parle du loup, on en voit la queue
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply