April 1, 2021 at 10:09 pm
Hi, possible to give me a tip on how do I join these 2 tables?
TIA
April 1, 2021 at 10:52 pm
OK I did a LEFT JOIN into a temp table and used CTE with ROW_NUMBER() OVER PARTITION BY taking only ROW=1
Maybe there's a better way?
April 2, 2021 at 3:12 pm
You could also do it with a cross applied subquery. Whether or not it this a "better" way usually means which runs faster. Performance will depend on the existence of appropriate supporting indexes for the query (TableB needs an index over OrderID and DateTime, not separate indexes for each.) It will also depend on the relative volumes between TableA and TableB.
Select a.OrderID, a.OrderDateTime, a.OrderStatus, b.DateTime
from TableA a
cross apply (select top(1) DateTime
from TableB b
where b.OrderID = a.OrderID
and b.DateTime >= a.OrderDateTime
order by DateTime) b
One word of advice for the future. Questions phrased in conversational language don't get answered as quickly as questions supported by data and the code that you are using. This is because the people answering the questions are just doing it out of a sense of community, and would rather not spend a lot of time having to read between the lines to understand your intent, then create the data so they can code and test a solution.
Next time, don't show pictures of the data. Instead, set up a script to create and populate the sample tables and then show the expected results.
Similarly with your code. When you say you did a left join and used the row_number() function in a CTE, you aren't giving any significant detail. Why not just post the code you used for your solution, along with the data creation scripts? Help the people you are asking to help you, and you will get a wealth of information and tested solutions back. I strongly urge you to read this article and follow its guidelines in the future.
Best of luck.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
April 2, 2021 at 3:59 pm
Very similar, but using an OUTER APPLY because you said you used a LEFT JOIN. OUTER APPLY is like a "LEFT APPLY" and CROSS APPLY is like an "INNER APPLY".
Don't worry about using "*" in the inner/B query, it is NOT a performance issue because SQL will still only pull the column(s) from the table that are referenced in the outer query: you can verify this by looking at the query plan. Thus, I almost always use * for APPLYs.
SELECT A.*, B.OrderId, B.DateTime
FROM dbo.TABLE_A A
OUTER APPLY (
SELECT TOP (1) B.*
FROM dbo.TABLE_B B
WHERE B.OrderID = A.OrderID AND
B.DateTime >= A.DateTime
ORDER BY B.DateTime
) AS B
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
April 3, 2021 at 8:43 pm
Good catch, Scott. I wasn't thinking about his use of a LEFT Join.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
April 7, 2021 at 3:14 pm
In the future, you might want to follow the netiquette that is been established for over 35 years in SQL forums. That means posting DDL. Because of your lack of even basic manners, we have to get keys constraints data types and do the work that you were supposed to do before posting. Right now you don't know the difference between a row and a record, the table needs to have a key, and that naming a table is important. Also, a data element should have one and only one name in the entire schema. Your two timestamps are different data elements, and therefore they need different names.
Look at the currently available data types in SQL Server; there is no reason to use the old Sybase DATETIME data type. I also seriously doubt if you keep anything accurate to nanoseconds.
CREATE TABLE Invoices
(order_id CHAR(10) NOT NULL,
order_timestamp DATETIME2(0) NOT NULL, -- learn the current data types in SQL Server
PRIMARY KEY (order_id, order_timestamp), -- required, not an option
order_status CHAR(12) NOT NULL
CHECK (order_status IN ('in process', ???));
CREATE TABLE Schedule
(order_id CHAR(10) NOT NULL,
schedule_timestamp DATETIME2(0) NOT NULL,
PRIMARY KEY (order_id, schedule_timestamp));
What you are doing is called a T-Join and versions of it were invented by Dr. Codd in his last book. This was many years ago, but everything lives on the Internet. Her approach of ordering the two sets and then finding matches should be quite workable for the problem you presented. You're going to have some problems when one set is bigger than the other, or you run out of pairings or have timeslots in the schedule that are identical according to your rules. Try the suggestions you got and see if they work.
Please post DDL and follow ANSI/ISO standards when asking for help.
April 7, 2021 at 4:03 pm
Thanks all! The OUTER APPLY was very helpful. Will keep in mind proper posting etiquette in the future.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply