September 18, 2009 at 1:02 pm
Hi Everyone,
This probably has a simple answer that I just can't see now- I have 2 tables with course numbers, table 1 in the format "1234", table 2 in the format "1234 5678" (or more).
What I am trying to do via select is identify the data in table 1 that cannot be found in the values in table 2.
I've tried several simpler queries with no success. I am thinking the only way this might be possible is via nested cursors to compare each table 1 value to each table 2 value via charindex, and if charindex equals 0 when fetch status equals zero the table 1 value was not found in table 2 so return the table 1 value.
If anyone knows a quicker and easier way to accomplish this I would be very impressed as well as appreciative.
Thanks!
September 18, 2009 at 1:19 pm
Check EXCEPT operator, like it was designed to solve your conundrum.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.September 18, 2009 at 8:17 pm
You definitely do not need to use a one-at-a-time solution with a cursor or WHILE loop. Besides using EXCEPT to find the members of one resultset not found in another, you also can find rows that don't match specified criteria (in this case, whether CharIndex returns >0) by putting the match condition in the ON clause of a LEFT OUTER JOIN that relates the two tables and then using the WHERE clause to test for NULL in one of the fields of the table to be searched.
Select TableA.ColumnA
from TableA a
left outer join TableB b on -- condition here - your charindex comparison, for instance
where b.anycolumn is null
In general, procedural solutions are less efficient in SQL Server than specifying a set-based result definition and letting the optimizer work out the best way to find the answer.
September 19, 2009 at 12:20 am
This is a fast and dirty way of doing what you are asking for. However, this query won't win any speed contensts, so if you have a lot of records, you may look at something else.
--table1 is the table with the search for values (eg '1234')
--table2 is the table with the values being searched (eg '1234 5678')
select * from table1 a
inner join table2 b on b.Col1 like '%'+a.Col1+'%'
September 21, 2009 at 10:14 am
Carleton (9/19/2009)
This is a fast and dirty way of doing what you are asking for. However, this query won't win any speed contensts, so if you have a lot of records, you may look at something else.
--table1 is the table with the search for values (eg '1234')
--table2 is the table with the values being searched (eg '1234 5678')
select * from table1 a
inner join table2 b on b.Col1 like '%'+a.Col1+'%'
Yes, the "LIKE" operator is another tool that would help, and may be easier to understand than the charindex approach. As is, this code snippet doesn't identify the table1 entries NOT found in table2 (the original problem), but perhaps one could compare table1 data with the output of this SELECT with an EXCEPT operation to get that final resultset.
And, I don't really see how LIKE would work here anyway. Isn't it now looking for literal percent signs (inside quotes)? As they're quoted, they're no longer wildcards.
As for winning speed contests, none of the approaches suggested would be able to use an index to find matches in table2, so you'll always have a full scan in the execution plan. If this is for a real-world application, you may want to explore normalizing that data -- putting each value that's in the "already there" table into its own row. That way an index could be established and queries would be much faster.
September 21, 2009 at 10:38 am
My appologies, I overlooked that you wanted records that were NOT a match. In this case adding the word NOT to my original script suggestion should work:
select * from #table1 a
inner join #table2 b on b.Col1 NOT like '%'+a.Col1+'%'
John, I must respectfully disagree on the wildcard comment you mentioned:
And, I don't really see how LIKE would work here anyway. Isn't it now looking for literal percent signs (inside quotes)? As they're quoted, they're no longer wildcards.
If you want to search on the literal characters % or _ you must enclose them in brackets ([%] and [_]). If you run this query, you will find that the % characters are treated as wildcards even tho they are in the single quotes:
create table #Table1 (Col1 varchar(max))
Create table #Table2 (Col1 varchar(max))
insert into #table1 values ('1234')
insert into #table1 values ('1')
insert into #table1 values ('2')
insert into #table1 values ('3')
insert into #table2 values ('1234 5678')
insert into #table2 values ('5678 1234')
insert into #table2 values ('1234')
insert into #table2 values ('0')
select * from #table1 a
inner join #table2 b on b.Col1 NOT like '%'+a.Col1+'%'
Cheers,
Carleton
September 21, 2009 at 11:37 am
Carleton,
Thank you for that correction. You're absolutely right about the wildcards. I'm here to learn and appreciate the chance for dialog.
Now, however, the NOT LIKE still doesn't do the trick here. When I run the code you posted, each of the four rows in table1 finds a row in table2 that meets the condition of not being "like" the table1 value, so all four are in the resultset, even though the intention is not to include '1234', for instance. One could still use the LIKE operator to define a filter result set to be compared via EXCEPT.create table #Table1 (Col1 varchar(max))
Create table #Table2 (Col1 varchar(max))
insert into #table1 values ('1234')
insert into #table1 values ('2345')
insert into #table1 values ('2')
insert into #table1 values ('3')
insert into #table2 values ('1234 5678')
insert into #table2 values ('5678 1234')
insert into #table2 values ('1234')
insert into #table2 values ('0')
Select * from #table1
EXCEPT
select a.* from #table1 a
inner join #table2 b on b.Col1 like '%'+a.Col1+'%'
An additional gotcha turns up here: table1 values of "1", "2", and "3" are found in table2 even though the intention appears to be to match only on full four-digit numbers. Again, I'd encourage the OP to get that data normalized if he's dealing with a real-world database.
September 21, 2009 at 11:55 am
Well there you have it then, a true example of teaming! So John, you and I can split the revenue for this solution.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply