January 18, 2011 at 10:21 pm
there is a kind of call log table(with 450k rows)
I would like to select only the lastest call record for each phone number
somehow the query processing time is too long, is there any way to improve it
select *
from tblCallLog r
where DateCall=
(select max(DateCall)
from tblCallLog
where phNu=r.phNu
group by phNu) and logStatus in(@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)
and DateCall >= @callFrom and DateCall < @callTo
order by DateCall
thx!
January 18, 2011 at 10:37 pm
[font="Comic Sans MS"][/font]
WITH S As (select max(DateCall) MaxDateCall from tblCallLog where phNu=r.phNu group by phNu)
S1 As ((select @a as status1 union select @b-2 union select @b2 union select @C union select @d union select @d2 union select @d3 union select @n union select @n2 union select @n3 union select @n4 union select @r union select @R2 union select @s-2 union select @s2 union select @U union select @w union select @n5)
select
*
from
tblCallLog r
inner join S1 on S1.status1 = r.logStatus
where
DateCall = S.MaxDateCall
And DateCall in between @callFrom and @callTo
order by DateCall
Try this one......If you send the query to create the tables and fill data in those then it will be quite easier to analyse the problem.
Regards
Deeptiprasad Nayak
January 18, 2011 at 11:11 pm
thank you for the reply
but something are not understand
like the union part, and why inner join with the logStatus.
WITH S As (select max(DateCall) MaxDateCall from tblCallLog where phNu=r.phNu group by phNu)
S1 As ((select @a as status1 union select @b-2 union select @b2 union select @C union
select @d union select @d2 union select @d3 union select @n union select @n2 union
select @n3 union select @n4 union select @r union select @R2 union select @s-2 union select @s2 union select @U union select @w union select @n5)
select *
from
tblCallLog r
inner join S1 on S1.status1 = r.logStatus
where
DateCall = S.MaxDateCall
And DateCall in between @callFrom and @callTo
order by DateCall
January 19, 2011 at 12:30 am
We need to minimize the records on which we are going to apply where conditions
Like:
Case-1: For max datecall, We are going to get only one record. But in ur query u r checking the same condition with almost all records.
Same as for logstatus. Evenif the status required is not there we are doing a check on that. If we filter it out before applying where clause then it will be faster.
one more thing u have to check....which query part is taking too much time.....
1. retrieving max datecall
2. logstatus
3. Date in between condition.
U need to check each part independently for analysis.
Can u provide the table with data???
January 19, 2011 at 1:06 am
Please post ddl and sqlplans.
Please see this article http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
January 19, 2011 at 1:12 am
Please provide the Table structure and what exact output you want.
January 19, 2011 at 9:39 am
Your query is using hidden RBAR which is why it is inefficient. You're better off using the Row_Number() function with the Partition By option to find the last call for each number. It wasn't clear whether you wanted to first apply the date filter and then find the max or vice versa.
WITH Calls AS (
SELECT *, Row_Number() OVER( PARTITION BY phNu ORDER BY DateCall DESC ) AS Sequence
FROM tblCallLog
WHERE logStatus IN (@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)
AND DateCall >= @callFrom
AND DateCall < @callTo
)
SELECT *
FROM Calls
WHERE Sequence = 1
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
January 19, 2011 at 4:50 pm
below is the table and some data
tblCallLog
id (int, not null)PK
DateCall(datetime, null)
logStatus(nchar(10), null)
phNu(int, not null)
pplName(nchar(100), null)
id, DataCall, logStatus, phNu, pplName
400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson
400124, 2011-01-10 15:16:00.000, b, 46555501, c white
400125, 2011-01-10 15:17:00.000, b, 38654926, k may
.........
.........
401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson
output
400124, 2011-01-10 15:16:00.000, b, 46555501, c white
400125, 2011-01-10 15:17:00.000, b, 38654926, k may
.........
.........
401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson
one phone number may appear one or few times in the log table
we are always want to know the lastest record for each phone number only
and we want to know the lastest records in the rang of date
also we're only interesting in some of log status
so i need to know all the leastest records first, then choose the rang of date and log status
it will be great, if anyone can tell me how to check which query part is taking too much time.....
-_-'' i just know the server i am working on is 2000, not 2005, so i can't use WITH in the query
January 19, 2011 at 5:25 pm
Is it safe to say that the latest call for any phone number will also correspond to the highest "id" for that phone number?
SELECT
log.id,
log.DateCall,
log.logStatus,
log.phNu,
log.pplName
FROM
tblCallLog AS log
JOIN
(
SELECT
MAX(grp.id) AS id
FROM
tblCallLog AS grp
GROUP BY grp.phNu
) gri
ON
log.id = gri.id
WHERE
log.logStatus in(@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)
AND
log.DateCall >= @callFrom and log.DateCall < @callTo
ORDER BY log.DateCall
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 19, 2011 at 6:04 pm
mister.magoo (1/19/2011)
quote]
No, the high 'id' doesn't mean the lastest call, in this case.
it named callLog, but the log is not added automatically, the only way to know is by dateCall
what is the different to use 'join' and 'where' in this?
thx!
January 20, 2011 at 4:43 pm
dlam 18073 (1/19/2011)
below is the table and some datatblCallLog
id (int, not null)PK
DateCall(datetime, null)
logStatus(nchar(10), null)
phNu(int, not null)
pplName(nchar(100), null)
id, DataCall, logStatus, phNu, pplName
400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson
400124, 2011-01-10 15:16:00.000, b, 46555501, c white
400125, 2011-01-10 15:17:00.000, b, 38654926, k may
.........
.........
401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson
output
400124, 2011-01-10 15:16:00.000, b, 46555501, c white
400125, 2011-01-10 15:17:00.000, b, 38654926, k may
.........
.........
401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson
one phone number may appear one or few times in the log table
we are always want to know the lastest record for each phone number only
and we want to know the lastest records in the rang of date
also we're only interesting in some of log status
so i need to know all the leastest records first, then choose the rang of date and log status
it will be great, if anyone can tell me how to check which query part is taking too much time.....
-_-'' i just know the server i am working on is 2000, not 2005, so i can't use WITH in the query
Since you are only interested in the latest calls, per phone number, and only if the latest calls are within a certain range and have certain statuses, then any number whose latest call is later than the top of your range isn't even worth considering -- in fact any number with any call falling later than you upper bound isn't worth considering.
Given that, there is no need to find the latest call for every phone number (nor even, any phone number). You just need to find the latest calls per phone number from within the subset of calls which meet your criteria. So, try this:
select *
from tblCallLog r
Where DateCall =
(
select max(DateCall)
from tblCall r1
where r1.phNu=r.phNu
and logStatus in (@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)
and DateCall >=@callFrom and DateCall < @callTo
and not exists
(
select 1
FROM tblCallLog
WHERE phNu=r1.phNu
AND DateCall > @callTo
)
group by phNu
)
January 20, 2011 at 5:12 pm
yes, ur right i only want the phone number within the rang,
so DateCall >=@callFrom and DateCall < @callTo
is alreadly set the rang,
one thing here iam not understand, do we need to use
not exists
(
select 1
FROM tblCallLog
WHERE phNu=r1.phNu
AND DateCall > @callTo
)
to set the upbound again?, or they are doing different things?
January 20, 2011 at 6:21 pm
dlam 18073 (1/20/2011)
yes, ur right i only want the phone number within the rang,so
DateCall >=@callFrom and DateCall < @callTo
is alreadly set the rang,
one thing here iam not understand, do we need to use
not exists
(
select 1
FROM tblCallLog
WHERE phNu=r1.phNu
AND DateCall > @callTo
)
to set the upbound again?, or they are doing different things?
They are doing different things. without that clause, you will find all of the latest calls within that range for each phone number which has a call within that range, but that will include phone numbers which have calls which are later than that range -- using the clause ensures that you only get phone numbers whose latest calls are within the range.
For example, say there are only three phone calls in your database -- two from one number and one from another number:
400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson
400124, 2011-01-10 15:16:00.000, b, 46555501, c white
401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson
Now, let's say that the date range you are interested in is: 2011-01-10 00:00:00 to 2011-01-11 00:00:00
With that clause in there, your query should return only one call entry:
400124, 2011-01-10 15:16:00.000, b, 46555501, c white
No entry would be returned for phone number 31695555 because, though it had a call within your specified range, it's latest call is outside your range.
If you remove that clause, you should get two entries:
400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson
400124, 2011-01-10 15:16:00.000, b, 46555501, c white
Both calls are within the range you specified and both are, for their respective phone numbers, the latest calls within that range. But you indicated that you were only interested in the latest calls for the phone numbers, and then only if they met the other criteria. The latest call for 31695555 is:
401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson
which falls outside of the range -- as I understood your posts, you would not be interested in seeing any entries for that number because its latest call is not within the range.
- Les
January 20, 2011 at 10:22 pm
DateCall >=@callFrom and DateCall < @callTo
<<< does this part of code already set the range of dateCall
without
not exists
(
select 1
FROM tblCallLog
WHERE phNu=r1.phNu
AND DateCall > @callTo
)
also it select all of the max date call with in the range over all of the max date call
for example:
test data
402111, 2010-01-10 15:15:00.000, n, 32226001, b smith
402222, 2010-03-21 15:16:00.000, n, 32226001, b smith
402333, 2010-10-19 15:17:00.000, n, 32226001, b smith
402444, 2010-11-21 16:01:00.000, n, 32226001, b smith
with this query
select *
from tblCallLog r
Where DateCall =
(
select max(DateCall)
from tblCall r1
where r1.phNu=r.phNu
and logStatus in (@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)
and DateCall >=@callFrom and DateCall < @callTo
group by phNu
)
call date range
DateCall >='2010-10-01' and DateCall < '2010-11-30 23:59'
the outcome is >> 402444, 2010-11-21 16:01:00.000, n, 32226001, b smith
but is there any different between puting the where clause in the sub-query and out side the sub-query
January 21, 2011 at 7:46 am
dlam 18073 (1/20/2011)
but is there any different between puting the where clause in the sub-query and out side the sub-query
Absolutely. The WHERE clause in the main query would be equivalent to a HAVING clause in the subquery. In both cases, the filter is applied on the aggregate, but the WHERE clause in the subquery the filter is applied before aggregating.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply