March 22, 2010 at 4:39 pm
Hey guys, feel like this one should be obvious to me but it's just not working. I'm trying to get the maximum date that would similarly occur amongst the clients that I search for (which I'm not getting working), and also occur in another table.
declare @Table1 table(Name nvarchar(25),date datetime)
declare @Table2 table(Item nvarchar(25),date datetime)
declare @Date datetime
insert into @Table1
select 'John','2010-03-22'
union
select 'Mary','2010-03-22'
union
-- Note Mary doesn't have a '2010-03-23' value
select 'John','2010-03-23'
union
select 'Mark','2010-03-22'
union
-- Duplicates can occur
select 'Mark','2010-03-22'
union
select 'Mark','2010-03-23'
union
select 'Mark','2009-09-01'
union
-- Not searching for Neville, just indicating that there are other
-- entries that are ignored
select 'Neville','2009-09-01'
union
-- As there would be earlier values where Clients could have the same date
select 'Mary','2009-01-01'
union
select 'Mark','2009-01-01'
union
select 'John','2009-01-01'
insert into @Table2
select 'Table','2010-03-22'
union
select 'Table','2000-03-22'
union
select 'Table','2010-03-23'
set @Date = (select max(nm.date) from @Table1 nm, @Table2 obj
where nm.date = obj.date
and exists (select date from @Table1 where name = 'John')
and exists (select date from @Table1 where name = 'Mary')
and exists (select date from @Table1 where name = 'Mark')
)
select @date
So effectively I'm looking for '2010-03-22' (not '2010-03-23'), as the Maximum Date where all three clients that I'm looking for would return values, and then only if the other table has that date - EDIT - that is, of this date doesn't occur in @Table2, it should return an earlier date where the date exists for each of the Clients in @Table1 and also @Table2 - , which I believe works (hence not populating a set of Data that meets the Table 1 criteria but not the Table2 - joining the tables should work but checking the existance of two, three values in a column is tripping me up).
Thanks!
March 23, 2010 at 6:50 am
Hi Nugby, try this. But let me warn you, this is a crooked way of doing it.
And it may not give good performance for large tables (see the functions in the where clause).
create table #t(name varchar(25), date datetime)
insert into #t
select * from @table1 where name = 'Mary' or name = 'John' or name = 'Mark'
union
select * from @table2 where item = 'Mary' or item = 'John' or item = 'Mark'
;with cte as
(
select date, (select name + ',' from #t where date = t.date for xml path(''))'Names'
from #t t
group by date
)
select max(date) from cte
where
charindex('mark',names,0) > 0
and charindex('mary',names,0) > 0
and charindex('john',names,0) > 0
drop table #t
https://sqlroadie.com/
March 23, 2010 at 7:49 am
select
@date = max(obj.date)
from
@Table2 obj
join
@Table1 t1 on t1.date = obj.date and t1.name = 'John'
join
@Table1 t2 on t2.date = obj.date and t2.name = 'Mary'
join
@Table1 t3 on t3.date = obj.date and t3.name = 'Mark'
March 23, 2010 at 9:23 am
SELECT MAX(date)
FROM (
SELECT date
FROM @Table1 T1
WHERE Name IN ('John', 'Mark', 'Mary')
AND EXISTS (SELECT T1.date INTERSECT SELECT T2.date FROM @Table2 T2)
GROUP BY date
HAVING COUNT(*) = 3
) SQ;
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 23, 2010 at 9:36 am
Hi Paul, as you just mentioned in another thread, "Never use a UNION where a UNION ALL will do"! I believe the test script contains an error because the comments say that duplicates are allowed but UNION instead of UNION ALL is used.
If you change UNION to UNION ALL in the test script your solution returns 2009-01-01.
Peter
March 23, 2010 at 9:39 am
And the modified version returns NULL
March 23, 2010 at 11:08 am
Peter Brinkhaus (3/23/2010)
I believe the test script contains an error
Oh well. Never mind then.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 23, 2010 at 2:48 pm
Thanks guys!
I'm glad you mentioned the performance aspect, Arjun. Unfortunately (for me!) @Table1 reflects the largest table I ever have to use. Merely doing a "Select * from" just now took 1min03secs (Thank God for Indexes!) so I'll probably use Peter's solution seeing they both work perfectly.
Had never seen a table joined to itself before (have always used Subqueries for similar Join tasks, which would have given me each client's individual Max Date and therefore not worked in this case). Will definitely find this technique useful in future!
March 23, 2010 at 2:59 pm
If you are looking for performance I should try Paul's solution as well, with a minor change (replace COUNT(*) with COUNT(DISTINCT name)). Given the small test data set and no indexing I am not sure if my solution will perform better than Paul's.
Peter
March 23, 2010 at 3:19 pm
Sure thing. I'd taken Paul's "Never mind then" comment as conceding it didn't work having not tested it myself. If he's in NZ, Code I'd written at that time of the morning (I'm in Australia) wouldn't have worked either!
Once I get knock over these damn VBA Macros i'm going to be stuck with this morning I'll test the solutions out against the Production data and see if there's a difference in performance in the solutions and report back. Incidentally, I do have an Index against that table on the Date and Name fields so I'm still expecting to see this execute in the 3-4 second range.
March 23, 2010 at 4:21 pm
If he's in NZ, Code I'd written at that time of the morning (I'm in Australia) wouldn't have worked either!
Paul made a mistake based on an improper data set. Even then, his contribution was original and worth to look at. Anyway, good luck with the tests and glad we all could help.
Peter
March 24, 2010 at 12:54 am
Thanks for taking the time to fix my solution Peter 🙂
I have another chance to look at this, and find the following interesting:
Setup:
CREATE TABLE #Table1
(
Name NVARCHAR(25) NOT NULL,
date DATETIME NOT NULL
);
CREATE TABLE #Table2
(
Item NVARCHAR(25) NOT NULL,
date DATETIME NOT NULL
);
INSERT #Table1
SELECT 'John', '2010-03-22' UNION ALL
SELECT 'Mary', '2010-03-22' UNION ALL
SELECT 'John', '2010-03-23' UNION ALL
SELECT 'Mark', '2010-03-22' UNION ALL
SELECT 'Mark', '2010-03-22' UNION ALL
SELECT 'Mark', '2010-03-23' UNION ALL
SELECT 'Mark', '2009-09-01' UNION ALL
SELECT 'Neville', '2009-09-01' UNION ALL
SELECT 'Mary', '2009-01-01' UNION ALL
SELECT 'Mark', '2009-01-01' UNION ALL
SELECT 'John','2009-01-01';
INSERT #Table2
SELECT 'Table', '2010-03-22' UNION ALL
SELECT 'Table', '2000-03-22' UNION ALL
SELECT 'Table', '2010-03-23';
-- Indexes help!
CREATE INDEX nc1 ON #Table1 (date, name);
CREATE INDEX nc2 ON #Table2 (date);
Solution:
SELECT MAX(SQ.date)
FROM (
SELECT T1.date,
T1.Name,
rnk = RANK() OVER (PARTITION BY Date ORDER BY Name)
FROM #Table1 T1
WHERE Name IN (N'John', N'Mark', N'Mary')
GROUP BY
T1.date,
T1.Name
) SQ
WHERE SQ.rnk = 3
HAVING EXISTS (SELECT date FROM #Table2 T2 WHERE T2.date = MAX(SQ.date));
Execution plan:
I like this plan because there is only one (semi) join, no blocking operators, and no memory grant.
Four logical reads on the sample data provided!
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 24, 2010 at 4:22 pm
Hi Paul. I don't think that solution works if Table 2 doesn't contain the Max Date from Table 1. Your other solution works perfectly though.
Finally managed to test all of these (I had managed to make a simple mistake with Arjun's code looking at the Production data which managed to take half an hour to locate :pinch:). Each takes around 1-2 seconds normally, though when I test this against Production (where @Table2 Item column becomes effectively left(datasource,9) = "etcetcetc"), I got the following averages from about 40 attempts each (10 sets of Running each four times and leaving for a bit).
Arjun's - Min 2, Ave 5, Max 10
Peter's - Min 2, Ave 5, Max 8
Paul's (1st Solution) - Min 1, Ave 3, Max 7
Given I'd added the left(datasource) etc, I ended up removing the Having Count line from Paul's solution, but have otherwise kept the structure.
If you guys would like to see the specific code I used, you're welcome to PM me, but I'm pretty happy with Paul's query's performance given that the table it's hitting is the only thing slowing it down!
Thanks again guys, sorry I didn't get back yesterday (like I thought I would).
March 24, 2010 at 10:22 pm
Nugby (3/24/2010)
Hi Paul. I don't think that solution works if Table 2 doesn't contain the Max Date from Table 1.
Did you try it? I think it meet the criteria originally set out - return the highest date that all three people have in common, but only if that date exists in the other table. I would also expect it to be fastest.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 24, 2010 at 10:52 pm
Of course! I promise I'd never be so laissez-faire when someone is using their valuable time to help me on here.
I found that while it would work perfectly for this data, if I changed the 22nd of the 3rd references in @Table2 (this is part of a function hence my constantly reverting to Table Variables) to the 21st and added a 21st reference for each of the Clients in @Table1 (e.g. so the maximum date for the three clients was the 22nd but there was no match in @Table 2 for that date) I'd get nothing back despite there being a potential match for the 21st. In hindsight, I probably didn't make that requirement clear enough though (that I'd need @Table2 to contain that value also rather than just joining what data is available).
Your existing solution works perfectly for me though and never took even 2 seconds to run by itself when I tested them all individually against this data (it's the join to a 1.6 Million row table that slows it down to the 3 Second average despite the indexes on that table).
Viewing 15 posts - 1 through 15 (of 22 total)
You must be logged in to reply to this topic. Login to reply