November 19, 2008 at 2:33 pm
I have the following columns in a table: AccountNumber, TelephoneNumber, DateCalled.
I'd like to find and show all of the occurrences where the same AccountNumber and TelephoneNumber were called on the same date. There is no primary key on the table. There should only be one telephone call per day, but I want to list out the info when there are more than one and include all three columns in the list. Since DateCalled is a datetime to the seconds position there are no occurrences where two rows woudl exist with exactly the same datetime down to the second.
November 19, 2008 at 3:51 pm
Hi,
I have written the following query using the temp table #calls. Just change it to the name of your table and check the column names are correct. Basically I am putting all the duplicates into a CTE and then joining that back to the original records. This allows you to see all the details for duplicate records (ie, the specific datetime called).
This bit of code:
DATEADD(DAY,DATEDIFF(DAY,1,dateCalled),1)
converts the datetime into the same date and 00:00:00 time
with cteCalls(AccountNumber, TelephoneNumber, callDay, callDayCount)
as
(select AccountNumber , TelephoneNumber, DATEADD(DAY,DATEDIFF(DAY,1,dateCalled),1) as callDay, count(DATEADD(DAY,DATEDIFF(DAY,1,dateCalled),1) ) as callDayCount
from #calls
group by AccountNumber , TelephoneNumber, DATEADD(DAY,DATEDIFF(DAY,1,dateCalled),1)
having count(DATEADD(DAY,DATEDIFF(DAY,1,dateCalled),1) ) > 1)
select #calls.* from #calls inner join cteCalls on
#calls.AccountNumber = cteCalls.AccountNumber and #calls.TelephoneNumber = cteCalls.TelephoneNumber and DATEADD(DAY,DATEDIFF(DAY,1,#calls.dateCalled),1) = cteCalls.callday
B
November 19, 2008 at 4:07 pm
Here's a way to do it without the CTE.
DECLARE @Calls TABLE (AccountNumber int, TelephoneNumber varchar(10), DateCalled datetime)
INSERT INTO @Calls
SELECT 1, 4026146434, '2008-11-17 17:01:27.023' UNION ALL
SELECT 2, 4026115201, '2008-11-19 17:01:27.023' UNION ALL
SELECT 1, 4026146434, '2008-11-19 17:01:27.023' UNION ALL
SELECT 3, 4026782001, '2008-11-19 17:01:27.023' UNION ALL
SELECT 1, 4026146434, '2008-11-19 15:01:27.023'
SELECT C1.*
FROM @Calls C1
INNER JOIN (
SELECT AccountNumber,
TelephoneNumber,
CAST(CONVERT(varchar,DateCalled,101) as datetime) as DateCalled
FROM @Calls
GROUP BY AccountNumber, TelephoneNumber, CAST(CONVERT(varchar,DateCalled,101) as datetime)
HAVING COUNT(*) > 1
) C2
ON C1.AccountNumber = C2.AccountNumber AND C1.TelephoneNumber = C2.TelephoneNumber
AND CAST(CONVERT(varchar,C1.DateCalled,101) as datetime) = C2.DateCalled
November 19, 2008 at 4:15 pm
I get the same execution plan for both queries and the exact same run-time. But I am running it on a measly 5 records. I'd be interested to know if the original poster gets any different results from the two solutions.... but I am guessing they will be identical.
November 19, 2008 at 9:16 pm
Thanks all for the help. They both run in the exact same time, 1 minute 4 seconds, and return ~76,000 records.
November 20, 2008 at 9:59 am
I'd be curious to see how that runs on a large table > 1M rows. I don't have the time to test it myself, but I'd be curious. I've seen where the optimizer translates CTE's into correlated subqueries. I would guess that both options would be similar with a small number of rows, but the larger the table becomes, the more concern I would have on the CTE.
November 20, 2008 at 10:53 am
Using SSCrazy data you might want to try:
SELECT 1, 4026146434, '2008-11-17 17:01:27.023' UNION ALL
SELECT 2, 4026115201, '2008-11-19 17:01:27.023' UNION ALL
SELECT 1, 4026146434, '2008-11-19 17:01:27.023' UNION ALL
SELECT 3, 4026782001, '2008-11-19 17:01:27.023' UNION ALL
SELECT 1, 4026146434, '2008-11-19 15:01:27.023'
;with numbered as(SELECT rowno=row_number() over
(partition by AccountNumber, TelephoneNumber order by AccountNumber),AccountNumber,TelephoneNumber, DateCalled
from @Calls)
select * from numbered WHERE Rowno > 1
This will give you:
rownoAccountNumberTelephoneNumberDateCalled
2 1 4026146434 2008-11-19 17:01:27.023
3 1 4026146434 2008-11-19 15:01:27.023
November 21, 2008 at 8:03 am
I think bitbucket's attempt will fall short whenever the same account number dials the same phone number once on one day, once the next, and then twice on another day after that. Here's my shot at it:
DECLARE @PHONE_ACTIVITY TABLE (
AccountNumber int,
TelephoneNumber bigint,
DateCalled DateTime
)
INSERT INTO @PHONE_ACTIVITY
SELECT 1, 4026146434, '2008-11-17 17:01:27.023' UNION ALL
SELECT 2, 4026115201, '2008-11-19 17:01:27.023' UNION ALL
SELECT 1, 4026146434, '2008-11-19 17:01:27.023' UNION ALL
SELECT 3, 4026782001, '2008-11-19 17:01:27.023' UNION ALL
SELECT 1, 4026146434, '2008-11-19 15:01:27.023'
SELECT Y.AccountNumber, Y.TelephoneNumber, PA.DateCalled
FROM (
SELECT X.AccountNumber, X.TelephoneNumber, MIN(DateCalled) AS DateCalled
FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY AccountNumber, TelephoneNumber,
CAST(CAST(YEAR(DateCalled) AS char(4))+'-'+RIGHT('0'+CAST(MONTH(DateCalled) AS varchar(2)),2)+'-'+
RIGHT('0'+CAST(DAY(DateCalled) AS varchar(2)),2) AS DateTime)
ORDER BY AccountNumber, TelephoneNumber) AS RN, AccountNumber, TelephoneNumber,
CAST(CAST(YEAR(DateCalled) AS char(4))+'-'+RIGHT('0'+CAST(MONTH(DateCalled) AS varchar(2)),2)+'-'+
RIGHT('0'+CAST(DAY(DateCalled) AS varchar(2)),2) AS DateTime) AS DateCalled
FROM @PHONE_ACTIVITY
) AS X
GROUP BY AccountNumber, TelephoneNumber,
CAST(CAST(YEAR(DateCalled) AS char(4))+'-'+RIGHT('0'+CAST(MONTH(DateCalled) AS varchar(2)),2)+'-'+
RIGHT('0'+CAST(DAY(DateCalled) AS varchar(2)),2) AS DateTime)
HAVING MAX(X.RN) > 1
) AS Y LEFT OUTER JOIN @PHONE_ACTIVITY AS PA
ON Y.AccountNumber = PA.AccountNumber AND
Y.TelephoneNumber = PA.TelephoneNumber AND
PA.DateCalled BETWEEN Y.DateCalled AND DATEADD(ms,-3,DATEADD(d,1,Y.DateCalled))
ORDER BY Y.AccountNumber, Y.TelephoneNumber, PA.DateCalled
It's been tested, although I'm not sure about all the extra function use not affecting performance negatively.
Steve
(aka smunson)
:):):)
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
November 21, 2008 at 9:17 am
Smunson:
think bitbucket's attempt will fall short whenever the same account number dials the same phone number once on one day, once the next, and then twice on another day after that.
You are CORRECT my suggestion will report incorrectly.
To all sorry about that, fell victim to writing code before thinking. To make it work correctly it would have to be modified to:
;with numbered as(SELECT rowno=row_number() over
(partition by AccountNumber, TelephoneNumber, DATEPART(yy, DateCalled),DATEPART(mm,DateCalled),DATEPART(dd,DateCalled)
order by DateCalled),AccountNumber,TelephoneNumber, DateCalled
from @Calls)
Smunson - Thank you for pointing out the problem
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply