Join to only one record

Question

Post reply

Join to only one record

mrogers-674473

Old Hand

Points: 391
More actions
June 19, 2008 at 12:54 pm

#70864

Back in the day, when I had to join tables manually through code, I could handle single row joins. but in SQL I don't know how.
I want to join a driver table to a second table.. but really only as a check. I only want to see if a match exists. I don't need to join to all the records it matches. Is there are way to do this?
Psuedoquery follows
select * from customer
join orders on 'first' orders.cust number = customer.cust-number
i don't want to join to all the other records for efficiency reasons. this example above is not the real world situation, but essentially the problem is I don't have a master table to join to so when the join happens I have to make MANY unnecessary joins (thousands) just to see if there is a match. Both the driver and the joined table are huge. The combined joined result is obviously even larger and takes forever to complete (if it even does)
Thanks

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Johan Bijnens SSC Guru Points: 135380 More actions · Answer 1

read Jeff Modens article on "tally table"

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

tbeadle SSCommitted Points: 1635 More actions · Answer 2

[font="Arial"]

Hello,

If I understand your issue correctly, you have a driver table and a related table.

Let's say:

create table #driverTable

( rowId int identity

columnOne varchar(10) null,

column99 int )

Let's say the columnOne contains a customer's name as in the defined table Customer Table below.

Lets say there's an existing Customertable you want to restrict the driver table to for only those records

that have a matching customerName in it:

Customer

( customer_table_id int identity,

customerName varchar(10 ) null )

select columnOne, column99

from #driverTable

where columnOne in ( select distinct customerName from Customer )

Is that what you mean?

Regards,

Terry

[/font]

mrogers-674473 Old Hand Points: 391 More actions · Answer 3

Essentially, yes. However picture the customer table you refer to having 40 million records in it and thousands of potential matches on customer name. But perhaps this is still the most efficient way to do it. And I was under the impression i should avoid the 'IN' if the list was to be larger than say 100 total entries. I will have tens of thousands.

Jeffrey Williams SSC Guru Points: 90313 More actions · Answer 4

mrogers (6/19/2008)
Back in the day, when I had to join tables manually through code, I could handle single row joins. but in SQL I don't know how.
I want to join a driver table to a second table.. but really only as a check. I only want to see if a match exists. I don't need to join to all the records it matches. Is there are way to do this?
Psuedoquery follows
select * from customer
join orders on 'first' orders.cust number = customer.cust-number
i don't want to join to all the other records for efficiency reasons. this example above is not the real world situation, but essentially the problem is I don't have a master table to join to so when the join happens I have to make MANY unnecessary joins (thousands) just to see if there is a match. Both the driver and the joined table are huge. The combined joined result is obviously even larger and takes forever to complete (if it even does)
Thanks

What would define the 'first' row from Orders? You can do something like:

SELECT *

FROM Customer AS c

JOIN Orders AS o ON o.cust_number = c.cust-number

AND o.'column that defines first' = 'first value';

Jeff

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

tbeadle SSCommitted Points: 1635 More actions · Answer 5

[font="Arial"]

For a 40 million row process, I would create a single column table of the unique customer names/id's. Create a clustered key of those customer names/id's and use it as the table to compare to. That will get you fast as it can go with no duplicate entries to compare. Obviously this would have to be built each time you ran the process but that's not a problem.

If you can add a column to the 40 million row table, I would also use an update instead of a table scan process to populate the 40 million row added colmun; (like customer_id_verified int null ). That way you could screen out the records in the 40 that weren't verified and then use the result for your process ( if any ).

Regarding your comment of the limit of 'IN' use to 100 records...that's probably a comment that's is applicable to smaller table results and not big ones like 40 million. It's a good idea to keep it's use to smaller result sets but several thousand is more like it. If it gets to a design criteria above that, that's when the create a temp or processing table with the key field as explained above is better. You may have 40 million client names to validate but the escential question is how many unique customer names. The unique values can be quickly checked and then you can decide which way is best.

Regards,

Terry

[/font]

mrogers-674473 Old Hand Points: 391 More actions · Answer 6

See that's the thing. There's nothing in the orders file that I can limit to. I just want to know that there is at least one record in there that matches on customer number

mrogers-674473 Old Hand Points: 391 More actions · Answer 7

I think you may be right... creating a temp table with the disctinct matches may be the best way. I was trying to get it in one statement and that's probably just not practical.

Thanks

Sergiy SSC Guru Points: 110208 More actions · Answer 8

Easy.

You need what?

List of customers which have existing orders?

Just ask SQL Server for it!

select * from customer

WHERE EXISTS (select 1 from orders where orders.cust-number = customer.cust-number)

_____________
Code for TallyGenerator

Jeffrey Williams SSC Guru Points: 90313 More actions · Answer 9

Then maybe something like:

SELECT *

FROM Customer c

INNER JOIN (SELECT DISTINCT cust_number

FROM Orders) o ON o.cust_number = c.cust-number;

Now, if you also want to see the Customers without any orders you would change the join to an outer join (LEFT OUTER JOIN). Performance will probably not be the best - and if not, you could probably try using a temp table:

CREATE TABLE #tOrders (cust_number int PRIMARY KEY CLUSTERED);

INSERT INTO #tOrders

SELECT DISTINCT cust_number FROM Orders);

SELECT *

FROM Customer c

INNER JOIN #tOrders o ON o.cust_number = c.cust-number;

Jeff

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

mrogers-674473 Old Hand Points: 391 More actions · Answer 10

Sergiy (6/19/2008)
Easy.
You need what?
List of customers which have existing orders?
Just ask SQL Server for it!
select * from customer
WHERE EXISTS (select 1 from orders where orders.cust-number = customer.cust-number)

That's the ticket! Exactly what I was trying to do.

Thanks for all the input from everyone