Join with where clause?

  • I was able to search the forums for how to join two databases using the following statement:

    select A.*, B.*

    FROM DBA.dbo.TABLEA A INNER JOIN

    DBB.dbo.TABLEB B ON A.Id = B.FId

    I'm a newb, and need to find out if I can add some type of where clause to this so I'm only pulling records from last month. Is this possible? If so can someone help with how to author the syntax?

    Thanks in advance...

    Ron

  • JOINS are processed and then the WHERE clause. You can treat the ON clause the same as the WHERE clause, depending how you want to organize your query. I'd advise you to buy a basic TSQL book.

  • Thanks for the advice.

  • If I understand you want to make the join in the WHERE clause,... something like this:

    select A.*, B.*

    FROM TABLEA A, TABLEB B 

    WHERE A.Id = B.FId

    which is a 100% valid sql statement.

    ------------
    When you 've got a hammer, everything starts to look like a nail...

  • Nope, he's asking (from my reading) how to add a filter to pull only last month's records.

    You can only do that if one, or both tables contains a field that tells when it was inserted. If you do (assuming it's in DBA.dbo.TableA and called DateEntered) then the syntax is as follws

    select A.*, B.*

    FROM DBA.dbo.TABLEA A INNER JOIN

    DBB.dbo.TABLEB B ON A.Id = B.FId

    WHERE DBA.dbo.TableA.DateEntered >= '2006/03/01' -- retrieves all records where the DateEnterf field is after the first of March.

    Word of advice for a beginner, don't write select *, take a few moments longer and list the fields that you're intterested in.

    Either get yourself a good intro to T-SQL book, or spend some time reading through the SQL Books Online (SQL help file)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Accurate.... Gila, accurate!

    ------------
    When you 've got a hammer, everything starts to look like a nail...

  • Several things that I've read on these forums seem to indicate that you can get even better results combining your WHERE filter into your JOIN clause. For example, if you were using this to create some sort of view, you might use something like (using Gail's example):

    select A.column1, A.column2, B.column1, B.column2

    FROM DBA.dbo.TABLEA A INNER JOIN

    DBB.dbo.TABLEB B ON A.Id = B.FId

    AND DBA.dbo.TableA.DateEntered >= DATEADD(m,-1,getdate())

    -- in this case, the filter is part of the JOIN, so you're joining ON A.id=B.Fid as well as on your date clause. This prevents the query from gathering all the data in the tables and then disgarding what doesn't fit the WHERE clause. Instead this will only gather the data that also fits the date criteria and you've constrained your data more tightly.

    -- retrieves all records where the DateEntered field is >= 1 month ago from today

  • You can use the WHERE clause with INNER JOIN with no problems:

    SELECT <fields>

    FROM a INNER JOIN b ON a.id = b.id

    WHERE a.x > 0 AND b.x > 0

    If you want to use LEFT JOIN or RIGHT JOIN then you have to put some filter conditions in the ON clause instead, or it will produce the same results as an INNER JOIN.

    SELECT <fields>

    FROM a LEFT JOIN b ON a.id = b.id AND b.x > 0

    WHERE a.x > 0

    You can also use derived tables to specifiy pre-join, join, and post-join conditions.  You probably wouldn't want to do this much writing every time you do a join, but possibly use this form to check the results of something written more concisely with complex joins and/or filters.

    SELECT <fields>

    FROM (

        SELECT <fields> FROM a

        WHERE a.x > 0  -- pre-join filters on a

    ) a

    LEFT JOIN (

        SELECT <fields> FROM b

        WHERE b.x > 0 -- pre-join filters on b

    ) b ON a.id = b.id  -- and other join conditions

    WHERE -- more filter condtitions

  • If you all really want to know my they created the "ON" clause, it's for left/right joins...

    SELECT a.*

    FROM a

    LEFT JOIN b

    on a.id = b.id

    and b.val > 1

    left join c

    on isnull(b.id,0) = c.id

    where c.val = 8

  • Don't look now, but your "where c.val = 8" just turned all the left joins into inner joins.

  • Makes absolutely no difference on an inner join. On a left or right join, filters the joined table before it's joined in. Doing this can be difficult to understand, at least for T-SQL beginners.

    I personally always advocate that if you want to filter a table before it's joined, use a subquery in the from clause. (but then I work with a lot of C# developers who think they can write SQL   )

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Consider the three equivalent queries (produces the same results set):

    --query 1: Using all consitions in the ON clause.

    select o.Orderid, o.line, i.item

    from items i inner join orderlines o

    ON i.Itemid = o.Itemid

    AND o.Qty >= 100

    --query 2: Using ON and WHERE clause.

    select o.Orderid, o.line, i.item

    from items i inner join orderlines o

    ON i.Itemid = o.Itemid

    where o.Qty >= 100

    --query 3: Using only WHERE clause.

    select o.Orderid, o.line, i.item

    from items i, orderlines o

    WHERE i.Itemid = o.Itemid

    AND o.Qty >= 100

    Using the above queries in QA with 'Statistics' and 'Show execution plan' seem that query 3 is... better (contrary to what many people believe).

    My result set produces 500,545 rows for all three (in SQL S2000).

    ------------
    When you 've got a hammer, everything starts to look like a nail...

  • What database did you use for that? I'd like to see what the change in the plan was for the 3rd query.

    I tried the following in Northwind. All 3 came out with identical execution plans and stats. Rows are much less, only 23 records

    -- Using ON

    select

    o.Orderid, o.ProductID, p.ProductName

    from Products p inner join [Order Details] o

    ON p.Productid = o.Productid

    AND o.Quantity >= 100

    -- Using ON and WHERE

    select o.Orderid, o.ProductID, p.ProductName

    from Products p inner join [Order Details] o

    ON p.Productid = o.Productid

    where o.Quantity >= 100

    -- Using WHERE

    select o.Orderid, o.ProductID, p.ProductName

    from Products p, [Order Details] o

    WHERE p.Productid = o.Productid

    AND o.Quantity >= 100

    |--Nested Loops(Inner Join, OUTER REFERENCES: ([o].[ProductID])) 11 3 1 Nested Loops Inner Join OUTER REFERENCES: ([o].[ProductID]) NULL 23 0 9.614E-05 108 0.05219676 [p].[ProductName], [o].[OrderID], [o].[ProductID] NULL PLAN_ROW 0 1

    |--Clustered Index Scan(OBJECT: ([Northwind].[dbo].[Order Details].[PK_Order_Details] AS [o]), WHERE: ([o].[Quantity]>=100)) 11 4 3 Clustered Index Scan Clustered Index Scan OBJECT: ([Northwind].[dbo].[Order Details].[PK_Order_Details] AS [o]), WHERE: ([o].[Quantity]>=100) [o].[Quantity], [o].[OrderID], [o].[ProductID] 23 0.04350442 0.002449 37 0.04595343 [o].[Quantity], [o].[OrderID], [o].[ProductID] NULL PLAN_ROW 0 1

    |--Clustered Index Seek(OBJECT: ([Northwind].[dbo].[Products].[PK_Products] AS [p]), SEEK: ([p].[ProductID]=[o].[ProductID]) ORDERED FORWARD) 11 5 3 Clustered Index Seek Clustered Index Seek OBJECT: ([Northwind].[dbo].[Products].[PK_Products] AS [p]), SEEK: ([p].[ProductID]=[o].[ProductID]) ORDERED FORWARD [p].[ProductName] 1 0.003203425 7.9603E-05 79 0.005112794 [p].[ProductName] NULL PLAN_ROW 0 23

    Table 'Products'. Scan count 23, logical reads 46, physical reads 0, read-ahead reads 0.

    Table 'Order Details'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0.

    Edit: fixed smilies and missing query

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I'm also curious how #3 differs from the others.

    Adding to that, example #3 uses the 'soon-to-be-deprecated' legacy join syntax and is highly discouraged to use these days.

    /Kenneth

  • ok,... lets start from the beginning:

    The tables that i use are:

    CREATE TABLE [dbo].[Items] (

    [ItemID] [int] NOT NULL ,

    [Item] [varchar] (50) COLLATE Greek_CI_AS NULL ,

    [Price] [money] NULL

    ) ON [PRIMARY]

    with PK the [ItemID] having 1001 rows.

    and

    CREATE TABLE [dbo].[OrderLines] (

    [OrderID] [int] NOT NULL ,

    [Line] [int] NOT NULL ,

    [ItemID] [int] NULL ,

    [Qty] [int] NULL

    ) ON [PRIMARY]

    with PK the [OrderID]+[Line] having 500545 rows.

    1st... i didnt say that the above queries differ to their 'execution plan' but to their performance... how is this possible...?

    here is the show statistics result between query#1 & query#3:

    FOR QUERY 1:

    -------------

    Application Profile Statistics

    Timer resolution (milliseconds)00

    Number of INSERT, UPDATE, DELETE statements00

    Rows effected by INSERT, UPDATE, DELETE statements00

    Number of SELECT statements22

    Rows effected by SELECT statements500549500549

    Number of user transactions55

    Average fetch time00

    Cumulative fetch time00

    Number of fetches00

    Number of open statement handles00

    Max number of opened statement handles00

    Cumulative number of statement handles00

    Network Statistics

    Number of server roundtrips33

    Number of TDS packets sent33

    Number of TDS packets received21852182,86

    Number of bytes sent472472

    Number of bytes received8,9283e+0068,92727e+006

    Time Statistics

    Cumulative client processing time20,571429

    Cumulative wait time on server replies3130908,143

    FOR QUERY 3:

    -------------

    Application Profile Statistics

    Timer resolution (milliseconds)00

    Number of INSERT, UPDATE, DELETE statements00

    Rows effected by INSERT, UPDATE, DELETE statements00

    Number of SELECT statements20,777778

    Rows effected by SELECT statements500549333697

    Number of user transactions51,44444

    Average fetch time00

    Cumulative fetch time00

    Number of fetches00

    Number of open statement handles00

    Max number of opened statement handles00

    Cumulative number of statement handles00

    Network Statistics

    Number of server roundtrips31,22222

    Number of TDS packets sent31,22222

    Number of TDS packets received21851454,22

    Number of bytes sent426232,444

    Number of bytes received8,92827e+0065,95024e+006

    Time Statistics

    Cumulative client processing time10,222222

    Cumulative wait time on server replies29751998,78

    ------------
    When you 've got a hammer, everything starts to look like a nail...

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply