March 3, 2008 at 1:10 am
table have 10 million records
the following my statement :
(SELECT COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount, RecordedCalls.Channel AS [Name],
RecordedCalls.ServerName FROM RecordedCalls INNER JOIN Servers ON RecordedCalls.ServerName = Servers.Name LEFT OUTER JOIN
Tags INNER JOIN RecordedCallsTags ON Tags.ID = RecordedCallsTags.TagID ON RecordedCalls.ID = RecordedCallsTags.CallID
WHERE RecordedCalls.ID <= '9369907' AND
(RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime ) AND
cast ( '01 Jan 2210 00:00:00:000' as datetime )) AND
(RecordedCalls.Duration BETWEEN 0 AND 1000000) AND RecordedCalls.AgentID NOT IN('1000010000') AND RecordedCalls.IsDeleted='false'
GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName
ORDER BY RecordedCalls.Channel)
Thanks in advance
March 3, 2008 at 2:04 am
I'm curious about some of the conditions, like
RecordedCalls.CallDate BETWEEN cast('01 Jan 1910 00:00:00:000' as datetime)
AND cast('01 Jan 2210 00:00:00:000' as datetime)
Do you really need this? Or is this part of your schema just as an example?
Similarly the
RecordedCalls.Duration BETWEEN 0 AND 1000000
What is the purpose of these checks?
Also, you have an inner join with RecordedCallsTags. Are you using this to narrow the return row set? If not, then why is it there? (you are not referencing it)
What is the datatype of RecordedCalls.ID? You are comparing it with a string literal. Is it numeric?
Regards,
Andras
March 3, 2008 at 2:41 am
-the condition date, Duration and another conditions based on User criteria i got it from executing another SP inside my SP then i concat the result inside my statement put i sent the result statement (where conditions from executing another sp)
- i am using distinct RecordedCalls.ID FK with RecorededCallsTags.CallID
- RecordedCalls.ID [int]
* I need each relation in the statement in case i got another conditions
March 3, 2008 at 3:44 am
anyone!!
March 3, 2008 at 4:32 am
You're not going to get anywhere by shouting. I don't even see a question in your original post, so what is it that you're asking?
Please start off by posting the DDL of your table(s) in the form of CREATE TABLE statements, and some sample data in the form of INSERT statements. If your query is producing the wrong result set, then please show what you would like it to do instead.
John
March 3, 2008 at 4:53 am
first things to check:
- this query comes within brackets ... is it a kind of subquery or nested table expression ?
SELECT COUNT(DISTINCT ( RecordedCalls.ID )) AS CallsCount
, RecordedCalls.Channel AS [Name]
, RecordedCalls.ServerName
FROM RecordedCalls
INNER JOIN Servers
ON RecordedCalls.ServerName = Servers.Name
LEFT OUTER JOIN Tags
INNER JOIN RecordedCallsTags
ON Tags.ID = RecordedCallsTags.TagID
ON RecordedCalls.ID = RecordedCallsTags.CallID
WHERE RecordedCalls.ID <= '9369907'
AND ( RecordedCalls.CallDate BETWEEN cast('01 Jan 1910 00:00:00:000' as datetime)
AND cast('01 Jan 2210 00:00:00:000' as datetime) )
AND ( RecordedCalls.Duration BETWEEN 0 AND 1000000 )
AND RecordedCalls.AgentID NOT IN ( '1000010000' )
AND RecordedCalls.IsDeleted = 'false'
GROUP BY RecordedCalls.Channel
, RecordedCalls.ServerName
ORDER BY RecordedCalls.Channel
- Get rid of unneeded predicates / joins
- provide indexes for all join predicates where appropriate
- within a sproc... use parameters of the correct datatype (match your tables /views) ( avoid casting in your query ! )
- maybe indexes for some extra columns are needed:
.... RecordedCalls.CallDate
.... RecordedCalls.AgentID
.... RecordedCalls.ServerName + RecordedCalls.Channel
- try to replace the 'not in' with <> or !=
if the in-list is long, maybe you'd be better of replacing it with a
@tmpTb using a split-function (examples at SSC ) using a "not exists"
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
March 4, 2008 at 8:56 am
1) Assuming RecordedCalls is the table with 10 million records, and given this table has a DateTime column (CallDate) that you're filtering on in your WHERE clause, why don't you convert RecordedCalls to a partitioned table with the partition based on CallDate's year?
2) I noticed you're doing a LEFT OUTER JOIN on the Tags table without using an ON clause. This effectively performs a cross join between RecordedCalls and Tags. Is this really what you want to do?
3) Good luck! You'll need it!
March 4, 2008 at 10:05 am
His joins all have an ON clause. He has something like join-on-join-join-on-on.
March 4, 2008 at 10:11 am
what purpose does this serve?
LEFT OUTER JOIN Tags
INNER JOIN RecordedCallsTags
ON Tags.ID = RecordedCallsTags.TagID
ON RecordedCalls.ID = RecordedCallsTags.CallID
You are not selecting anything from it, and the DISTINCT might not be needed if you remove the extra joins.
March 4, 2008 at 10:25 am
All (especially Ralph):
There are three JOIN clauses and only two ON clauses. Each JOIN clause should have a corresponding ON clause.
Again, good luck all!!!!
March 5, 2008 at 1:42 am
Thanks for response
The table RecordedCalls contains 1000,000,000 Records, the other tables used in the join will not contain more than 15 Records for each one.
The table lookups contains like 200 Records so that’s why I put the records I want from it in a temp table (#tempLookUps) before joining it with the table RecordedCalls.
there is a clustered index (Primary key) on the RecordedCalls.ID and non-clustered indexes on the columns that are used in the Where statement and the group by field (CallType), I can’t remove any join with other tables or any condition on the where statements hence it is very dynamic and concatenated from other strored procedure and i can't remove the DISTINCT Word.
I found that when using temp tables to put the results in then imply joining on then is more efficient than using Derived Tables.
and i tired to partition the recordedcalls table based on CallDate's year but i coud'nt coz i am using sql a standard edition, i got error msg tell that the partition should be on enterprise edition is there any help here!!
the following SP takes around 1:15 minutes to finish execution against SQL Server 2005.
DECLARE @max-2 int
SELECT @max-2 = MAX(RecordedCalls.ID) FROM RecordedCalls
CREATE TABLE #tempLookups (ID int identity(0,1),Code NVARCHAR(100),NameE NVARCHAR(500),NameA NVARCHAR(500))
CREATE TABLE #tempTable (ID int identity(0,1),TypesCount INT,CallsType NVARCHAR(50))
INSERT INTO #tempLookups SELECT Code, NameE, NameA FROM lookups WHERE [Type] = 'CALLTYPES' ORDER BY Ordering ASC
INSERT INTO #tempTable SELECT COUNT(DISTINCT(RecordedCalls.ID)) As TypesCount,RecordedCalls.CallType as CallsType
FROM Servers INNER JOIN RecordedCalls ON Servers.Name = RecordedCalls.ServerName
LEFT OUTER JOIN Tags INNER JOIN RecordedCallsTags ON Tags.ID = RecordedCallsTags.TagID
ON RecordedCalls.ID = RecordedCallsTags.CallID
WHERE RecordedCalls.ID <= @max-2
AND (RecordedCalls.CallDate BETWEEN CAST ('01 Jan 1910 00:00:00:000' AS DATETIME ) AND CAST('01 Jan 2210 00:00:00:000' AS DATETIME ))
AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)
AND RecordedCalls.AgentID NOT IN('1000010000')
AND RecordedCalls.IsDeleted='FALSE'
GROUP BY RecordedCalls.CallType
SELECT IsNull(#tempTable.TypesCount, 0) AS TypesCount, CASE('English')
WHEN 'Arabic' THEN #tempLookups.NameA
ELSE #tempLookups.NameE
END AS CallsType FROM
#tempTable RIGHT OUTER JOIN #tempLookups ON #tempTable.CallsType = #tempLookups.Code
DROP TABLE #tempLookups
DROP TABLE #tempTable
March 5, 2008 at 9:41 am
JLSSCH (3/4/2008)
All (especially Ralph):There are three JOIN clauses and only two ON clauses. Each JOIN clause should have a corresponding ON clause.
Again, good luck all!!!!
Nope, the joins are correctly specified, Ralph is spot on...
SELECT
COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount,
RecordedCalls.Channel AS [Name],
RecordedCalls.ServerName
FROM RecordedCalls
INNER JOIN Servers
ON RecordedCalls.ServerName = Servers.[Name]
INNER JOIN RecordedCallsTags
ON RecordedCalls.ID = RecordedCallsTags.CallID
LEFT OUTER JOIN Tags
ON Tags.ID = RecordedCallsTags.TagID
WHERE RecordedCalls.ID <= '9369907'
AND (RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime ) AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))
AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)
AND RecordedCalls.AgentID NOT IN('1000010000')
AND RecordedCalls.IsDeleted='false'
GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName
ORDER BY RecordedCalls.Channel
... and he's also correct in suggesting that the TAGS table could be omitted since it's not referenced in the SELECT, and it's a left join.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 5, 2008 at 10:19 am
asmaa.sbaihi, you should run some experiments to establish exactly what you need in your query.
Something along the lines of...
This is your baseline, run it and record the results:
SELECT COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount,
RecordedCalls.Channel AS [Name],
RecordedCalls.ServerName
FROM RecordedCalls
INNER JOIN Servers
ON RecordedCalls.ServerName = Servers.[Name]
INNER JOIN RecordedCallsTags
ON RecordedCalls.ID = RecordedCallsTags.CallID
LEFT OUTER JOIN Tags
ON Tags.ID = RecordedCallsTags.TagID
WHERE RecordedCalls.ID <= '9369907'
AND (RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime )
AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))
AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)
AND RecordedCalls.AgentID NOT IN('1000010000')
AND RecordedCalls.IsDeleted='false'
GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName
ORDER BY RecordedCalls.Channel
Then comment out the Tags table and run again...
SELECT COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount,
RecordedCalls.Channel AS [Name],
RecordedCalls.ServerName
FROM RecordedCalls
INNER JOIN Servers
ON RecordedCalls.ServerName = Servers.[Name]
INNER JOIN RecordedCallsTags
ON RecordedCalls.ID = RecordedCallsTags.CallID
--LEFT OUTER JOIN Tags
--ON Tags.ID = RecordedCallsTags.TagID
WHERE RecordedCalls.ID <= '9369907'
AND (RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime )
AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))
AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)
AND RecordedCalls.AgentID NOT IN('1000010000')
AND RecordedCalls.IsDeleted='false'
GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName
ORDER BY RecordedCalls.Channel
Experiment with the transaction isolation level, it may not affect your results, and if not, your query will run a little faster...bring in table aliases too, it usually makes the code more readable
SELECT COUNT(DISTINCT(rc.ID)) AS CallsCount,
rc.Channel AS [Name],
rc.ServerName
FROM RecordedCalls rc (NOLOCK)
INNER JOIN Servers s (NOLOCK)
ON rc.ServerName = s.[Name]
INNER JOIN RecordedCallsTags rct (NOLOCK)
ON rc.ID = rct.CallID
--LEFT OUTER JOIN Tags
--ON Tags.ID = RecordedCallsTags.TagID
WHERE RecordedCalls.ID <= '9369907'
AND (rc.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime )
AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))
AND (rc.Duration BETWEEN 0 AND 1000000)
AND rc.AgentID NOT IN('1000010000')
AND rc.IsDeleted='false'
GROUP BY rc.Channel, rc.ServerName
ORDER BY rc.Channel
Comment out the other two joins one at a time and again compare results with the original query. Run the original query again because your result may have changed while you've been working.
This is just a start but hopefully it will point you in the right direction, as Andras and others have shown, there are several issues with this query and each needs to be addressed individually.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 5, 2008 at 1:47 pm
Have you looked at the explain plan, statistics, Memory usage, etc.
Actually have you looked at the performance of the machine? This isn't the most complex query I have ever seen, but you are talking about a fair amount of data.
You mentioned that it was standard edition, but CPU's, Memory, Disks? How much. How is the system configured.
In general, this query isn't really written poorly assuming that the underlying system supports it.
the DDL of the table was requested. 100,000,000 rows in a clustered table could be 10g of data, or it could be 500G of data. 500G grouped sorted etc without indexing in 1:15 is not that unreasonable.
March 5, 2008 at 1:55 pm
The more I look at this, there is almost nothing that you are going to be able to do to help this query. The only suggestion I have for you is
an index
Create index xxx on RecordedCalls(CallType,ID)
SELECT COUNT(DISTINCT(RecordedCalls.ID)) As TypesCount,RecordedCalls.CallType as CallsType
...
group by RecordedCalls.CallType
That's about the only thing that you can count on being useful since this is obviously a dynamically generated query.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply