Query/SubQuery unexpected behavior

  • Is this correct subquery behavior or a bug?

    CREATE TABLE CurrentCollectDate

    ( Collect_Date datetime)

    CREATE TABLE History

    ( CollectDate datetime,

    FieldA nvarchar(128), -- or any

    FieldB int -- or any

    )

    SELECT * FROM History

    WHERE CollectDate =

    ( SELECT CollectDate FROM CurrentCollectDate)

    Subquery alone errors due to CollectDate being invalid - CollectDate not in table CurrentCollectDate.

    The query/subquery as a whole not only does NOT error, but returns a result set in which CollectDate is NOT equal Collect_Date. Apparently it is comparing CollectDate to CollectDate (self) and returning all rows.

    I would expect subquery within parens, IF it was correct, to evaluate first to establish the value to compare and filter results. If not that, then at least it should fail, as it does standalone, due to the invalid field name.

  • Isnt this just an oversight from you?

    One table has "Collect_Date" as column name and the other one has "CollectDate" as column name. So it will return that error.

    -Roy

  • Yes, it WAS an oversight. That's the problem. As I stated, the subquery on its own errors out, as expected becuse the field does not exist However, within the enclosing query, it does NOT fail, and in fact returns a result set that is not the expected or correct result. The point is that the field in the subquery does NOT exist in the inner table, but DOES exists in the outer table.

  • you're essentially right in how it works, but it's by design.

    During the resolve phase, it's going to try to resolve the field names in the sub-query by itself. Once the resolve fails against the inner table, it will go out and resolve against the outer table/query. This is how correlated sub-queries work. It's assuming that that is what you wanted.

    This is where prefixing the columns you use against the table aliases you want them to be pulled from comes into play. If the column was prefixed with the tablename, it would have failed immediately.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Hey, SSCrazy. That sounds good except that now I have to question the design. You're right that the alias makes it fail correctly ( isn't that ironic? You have to get it right to get it wrong. ) but practically speaking, I'd still call it a bug.

    Basically it's saying that in spite of not being able to resolve the value to query ON, it's going to query on SOMETHING anyway. I guess I'm just unreasonable for expecting it to work logically...I would think due to the parens, it shouldn't need the alias.

  • The issue is - it is allowed because certain "legal" queries require the ability to use values from the outer query in the inner query. They're called correlated subqueries.

    for example - say I had an customer table and an address table, and I wanted to know the phone number of every customer that didn't have an address. One way to write this would be:

    Table Customers: custID, custName,phoneNum, (other fields as needed)

    table addresses: addID,addCustID, address1, (etc....)

    select c.custID,

    c.custName,

    c.phoneNum

    from Customers C

    where NOT EXISTS (select * from addresses a where a.addcustid=c.custID)

    Notice that I prefixed those for readability, but I could just as easily written it as

    select c.custID,

    c.custName,

    c.phoneNum

    from Customers C

    where NOT EXISTS (select * from addresses a where addcustid=custID)

    And it would have run correctly.

    Prefix or no - this type of query is just one of several which would REQUIRE the INNER to see stuff in the OUTER. Unfortunately - your error is just an unfortunate by-product of this being allowed.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Hey, SSCrazy. That sounds good except that now I have to question the design. You're right that the alias makes it fail correctly ( isn't that ironic? You have to get it right to get it wrong. ) but practically speaking, I'd still call it a bug.

    Basically it's saying that in spite of not being able to resolve the value to query ON, it's going to query on SOMETHING anyway. I guess I'm just unreasonable for expecting it to work logically...I would think due to the parens, it shouldn't need the alias.

    CREATE TABLE CurrentCollectDate

    ( Collect_Date datetime)

    CREATE TABLE History

    ( CollectDate datetime,

    FieldA nvarchar(128), -- or any

    FieldB int -- or any

    )

    SELECT * FROM History

    WHERE CollectDate =

    ( SELECT CollectDate FROM CurrentCollectDate)

    it's not a bug. since your column references do not include any table names/aliases, the sql parser tries to figure out what tables you're referencing by looking for an in-scope table that has that column in it. your subquery refers to CollectDate which only exists in your History table. thus the full query works. but when you run just the subquery, it fails because CollectDate does not exist within CurrentCollectDate.

  • Yeah, agreed, except that I don't want it to 'try to figure out'. Thanks, guys, for the verification it's not a bug. I still don't like it though. I'm just gettin' to old for this *&^^&^%

  • Yeah, agreed, except that I don't want it to 'try to figure out'. Thanks, guys, for the verification it's not a bug. I still don't like it though. I'm just gettin' to old for this *&^^&^%

    in any query where you don't qualify column names, the parser will "try to figure it out".

    if table Employees has (empId, empName, address, zip)

    and table Departments has (deptId, deptName, mgrId) then

    select deptName, empName

    from Employees join Departments on mgrId = empId

    will work just fine because the parser determines that deptName is from Departments and empName is from Employees. by qualifying column names, you tell the parser explicitly where to get the data. so if you don't like how the parser works, always qualify your column names.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply