Maximum Date that exists for two or three parties and also another table

Question

Post reply

Maximum Date that exists for two or three parties and also another table

Nugby

SSCommitted

Points: 1612
More actions
March 22, 2010 at 4:39 pm

#143707

Hey guys, feel like this one should be obvious to me but it's just not working. I'm trying to get the maximum date that would similarly occur amongst the clients that I search for (which I'm not getting working), and also occur in another table.
declare @Table1 table(Name nvarchar(25),date datetime)
declare @Table2 table(Item nvarchar(25),date datetime)
declare @Date datetime
insert into @Table1
select 'John','2010-03-22'
union
select 'Mary','2010-03-22'
union
-- Note Mary doesn't have a '2010-03-23' value
select 'John','2010-03-23'
union
select 'Mark','2010-03-22'
union
-- Duplicates can occur
select 'Mark','2010-03-22'
union
select 'Mark','2010-03-23'
union
select 'Mark','2009-09-01'
union
-- Not searching for Neville, just indicating that there are other
-- entries that are ignored
select 'Neville','2009-09-01'
union
-- As there would be earlier values where Clients could have the same date
select 'Mary','2009-01-01'
union
select 'Mark','2009-01-01'
union
select 'John','2009-01-01'
insert into @Table2
select 'Table','2010-03-22'
union
select 'Table','2000-03-22'
union
select 'Table','2010-03-23'
set @Date = (select max(nm.date) from @Table1 nm, @Table2 obj
where nm.date = obj.date
and exists (select date from @Table1 where name = 'John')
and exists (select date from @Table1 where name = 'Mary')
and exists (select date from @Table1 where name = 'Mark')
)
select @date
So effectively I'm looking for '2010-03-22' (not '2010-03-23'), as the Maximum Date where all three clients that I'm looking for would return values, and then only if the other table has that date - EDIT - that is, of this date doesn't occur in @Table2, it should return an earlier date where the date exists for each of the Clients in @Table1 and also @Table2 - , which I believe works (hence not populating a set of Data that meets the Table 1 criteria but not the Table2 - joining the tables should work but checking the existance of two, three values in a column is tripping me up).
Thanks!

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply

Arjun Sivadasan SSCertifiable Points: 5902 More actions · Answer 1

Hi Nugby, try this. But let me warn you, this is a crooked way of doing it.

And it may not give good performance for large tables (see the functions in the where clause).

create table #t(name varchar(25), date datetime)

insert into #t

select * from @table1 where name = 'Mary' or name = 'John' or name = 'Mark'

union

select * from @table2 where item = 'Mary' or item = 'John' or item = 'Mark'

;with cte as

(

select date, (select name + ',' from #t where date = t.date for xml path(''))'Names'

from #t t

group by date

)

select max(date) from cte

where

charindex('mark',names,0) > 0

and charindex('mary',names,0) > 0

and charindex('john',names,0) > 0

drop table #t

https://sqlroadie.com/

matzer SSCrazy Eights Points: 9727 More actions · Answer 2

select

@date = max(obj.date)

from

@Table2 obj

join

@Table1 t1 on t1.date = obj.date and t1.name = 'John'

join

@Table1 t2 on t2.date = obj.date and t2.name = 'Mary'

join

@Table1 t3 on t3.date = obj.date and t3.name = 'Mark'

Paul White SSC Guru Points: 150468 More actions · Answer 3

SELECT MAX(date)

FROM (

SELECT date

FROM @Table1 T1

WHERE Name IN ('John', 'Mark', 'Mary')

AND EXISTS (SELECT T1.date INTERSECT SELECT T2.date FROM @Table2 T2)

GROUP BY date

HAVING COUNT(*) = 3

) SQ;

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

matzer SSCrazy Eights Points: 9727 More actions · Answer 4

Hi Paul, as you just mentioned in another thread, "Never use a UNION where a UNION ALL will do"! I believe the test script contains an error because the comments say that duplicates are allowed but UNION instead of UNION ALL is used.

If you change UNION to UNION ALL in the test script your solution returns 2009-01-01.

Peter

matzer SSCrazy Eights Points: 9727 More actions · Answer 5

matzer

SSCrazy Eights

Points: 9727

March 23, 2010 at 9:39 am

#1138330

And the modified version returns NULL

Paul White SSC Guru Points: 150468 More actions · Answer 6

Peter Brinkhaus (3/23/2010)
I believe the test script contains an error

Oh well. Never mind then.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Nugby SSCommitted Points: 1612 More actions · Answer 7

Thanks guys!

I'm glad you mentioned the performance aspect, Arjun. Unfortunately (for me!) @Table1 reflects the largest table I ever have to use. Merely doing a "Select * from" just now took 1min03secs (Thank God for Indexes!) so I'll probably use Peter's solution seeing they both work perfectly.

Had never seen a table joined to itself before (have always used Subqueries for similar Join tasks, which would have given me each client's individual Max Date and therefore not worked in this case). Will definitely find this technique useful in future!

matzer SSCrazy Eights Points: 9727 More actions · Answer 8

If you are looking for performance I should try Paul's solution as well, with a minor change (replace COUNT(*) with COUNT(DISTINCT name)). Given the small test data set and no indexing I am not sure if my solution will perform better than Paul's.

Peter

Nugby SSCommitted Points: 1612 More actions · Answer 9

Sure thing. I'd taken Paul's "Never mind then" comment as conceding it didn't work having not tested it myself. If he's in NZ, Code I'd written at that time of the morning (I'm in Australia) wouldn't have worked either!

Once I get knock over these damn VBA Macros i'm going to be stuck with this morning I'll test the solutions out against the Production data and see if there's a difference in performance in the solutions and report back. Incidentally, I do have an Index against that table on the Date and Name fields so I'm still expecting to see this execute in the 3-4 second range.

matzer SSCrazy Eights Points: 9727 More actions · Answer 10

If he's in NZ, Code I'd written at that time of the morning (I'm in Australia) wouldn't have worked either!

Paul made a mistake based on an improper data set. Even then, his contribution was original and worth to look at. Anyway, good luck with the tests and glad we all could help.

Peter

Paul White SSC Guru Points: 150468 More actions · Answer 11

Thanks for taking the time to fix my solution Peter 🙂

I have another chance to look at this, and find the following interesting:

Setup:

CREATE TABLE #Table1

(

Name NVARCHAR(25) NOT NULL,

date DATETIME NOT NULL

);

CREATE TABLE #Table2

(

Item NVARCHAR(25) NOT NULL,

date DATETIME NOT NULL

);

INSERT #Table1

SELECT 'John', '2010-03-22' UNION ALL

SELECT 'Mary', '2010-03-22' UNION ALL

SELECT 'John', '2010-03-23' UNION ALL

SELECT 'Mark', '2010-03-22' UNION ALL

SELECT 'Mark', '2010-03-23' UNION ALL

SELECT 'Mark', '2009-09-01' UNION ALL

SELECT 'Neville', '2009-09-01' UNION ALL

SELECT 'Mary', '2009-01-01' UNION ALL

SELECT 'Mark', '2009-01-01' UNION ALL

SELECT 'John','2009-01-01';

INSERT #Table2

SELECT 'Table', '2010-03-22' UNION ALL

SELECT 'Table', '2000-03-22' UNION ALL

SELECT 'Table', '2010-03-23';

-- Indexes help!

CREATE INDEX nc1 ON #Table1 (date, name);

CREATE INDEX nc2 ON #Table2 (date);

Solution:

SELECT MAX(SQ.date)

FROM (

SELECT T1.date,

T1.Name,

rnk = RANK() OVER (PARTITION BY Date ORDER BY Name)

FROM #Table1 T1

WHERE Name IN (N'John', N'Mark', N'Mary')

GROUP BY

T1.date,

T1.Name

) SQ

WHERE SQ.rnk = 3

HAVING EXISTS (SELECT date FROM #Table2 T2 WHERE T2.date = MAX(SQ.date));

Execution plan:

I like this plan because there is only one (semi) join, no blocking operators, and no memory grant.

Four logical reads on the sample data provided!

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Nugby SSCommitted Points: 1612 More actions · Answer 12

Hi Paul. I don't think that solution works if Table 2 doesn't contain the Max Date from Table 1. Your other solution works perfectly though.

Finally managed to test all of these (I had managed to make a simple mistake with Arjun's code looking at the Production data which managed to take half an hour to locate :pinch:). Each takes around 1-2 seconds normally, though when I test this against Production (where @Table2 Item column becomes effectively left(datasource,9) = "etcetcetc"), I got the following averages from about 40 attempts each (10 sets of Running each four times and leaving for a bit).

Arjun's - Min 2, Ave 5, Max 10

Peter's - Min 2, Ave 5, Max 8

Paul's (1st Solution) - Min 1, Ave 3, Max 7

Given I'd added the left(datasource) etc, I ended up removing the Having Count line from Paul's solution, but have otherwise kept the structure.

If you guys would like to see the specific code I used, you're welcome to PM me, but I'm pretty happy with Paul's query's performance given that the table it's hitting is the only thing slowing it down!

Thanks again guys, sorry I didn't get back yesterday (like I thought I would).

Paul White SSC Guru Points: 150468 More actions · Answer 13

Nugby (3/24/2010)
Hi Paul. I don't think that solution works if Table 2 doesn't contain the Max Date from Table 1.

Did you try it? I think it meet the criteria originally set out - return the highest date that all three people have in common, but only if that date exists in the other table. I would also expect it to be fastest.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

Nugby SSCommitted Points: 1612 More actions · Answer 14

Of course! I promise I'd never be so laissez-faire when someone is using their valuable time to help me on here.

I found that while it would work perfectly for this data, if I changed the 22nd of the 3rd references in @Table2 (this is part of a function hence my constantly reverting to Table Variables) to the 21st and added a 21st reference for each of the Clients in @Table1 (e.g. so the maximum date for the three clients was the 22nd but there was no match in @Table 2 for that date) I'd get nothing back despite there being a potential match for the 21st. In hindsight, I probably didn't make that requirement clear enough though (that I'd need @Table2 to contain that value also rather than just joining what data is available).

Your existing solution works perfectly for me though and never took even 2 seconds to run by itself when I tested them all individually against this data (it's the join to a 1.6 Million row table that slows it down to the 3 Second average despite the indexes on that table).