Query Code returning incorrect results

  • Hello Community,

    I have the following logic:

    Find customers who didn't place orders from 2019-02-01 to 2019-03-01. Output customer's first name.

    I am working with the following dataset:

    CREATE TABLE #tmpTable (
    id int,
    first_name varchar(50),
    last_name varchar(50),
    city varchar(50),
    address varchar(50),
    phone_number varchar(50),
    id int,
    cust_id int,
    order_date date,
    order_quantity int,
    order_details varchar(50),
    order_cost int)

    INSERT #tmpTable VALUES
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',1,3,CONVERT(DATETIME, '2019-03-04', 120),1,'Coat',100),
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',2,3,CONVERT(DATETIME, '2019-03-01', 120),1,'Shoes',80),
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',3,3,CONVERT(DATETIME, '2019-03-10', 120),1,'Skirt',30),
    (7,'Jill','Michael','Florida','','813-297-0692',4,7,CONVERT(DATETIME, '2019-02-01', 120),1,'Coat',100),
    (7,'Jill','Michael','Florida','','813-297-0692',5,7,CONVERT(DATETIME, '2019-03-10', 120),1,'Shoes',80),
    (15,'Mia','Owen','Hawaii','','808-640-5201',6,15,CONVERT(DATETIME, '2019-02-01', 120),2,'Boats',100),
    (15,'Mia','Owen','Hawaii','','808-640-5201',7,15,CONVERT(DATETIME, '2019-01-11', 120),3,'Shirts',60),
    (15,'Mia','Owen','Hawaii','','808-640-5201',8,15,CONVERT(DATETIME, '2019-03-11', 120),1,'Slipper',20),
    (15,'Mia','Owen','Hawaii','','808-640-5201',9,15,CONVERT(DATETIME, '2019-03-01', 120),2,'Jeans',80),
    (15,'Mia','Owen','Hawaii','','808-640-5201',10,15,CONVERT(DATETIME, '2019-03-09', 120),3,'Shirts',50),
    (5,'Henry','Jackson','Hawaii','','808-601-7513',11,5,CONVERT(DATETIME, '2019-02-01', 120),1,'Shoes',80),
    (12,'Eva','Lucas','Arizona','4379 Skips Lane','301-509-8805',12,12,CONVERT(DATETIME, '2019-01-11', 120),3,'Shirts',60),
    (12,'Eva','Lucas','Arizona','4379 Skips Lane','301-509-8805',13,12,CONVERT(DATETIME, '2019-03-11', 120),1,'Slipper',20),
    (4,'William','Daniel','Colorado','','813-368-1200',14,4,CONVERT(DATETIME, '2019-02-01', 120),1,'Shoes',80),
    (4,'William','Daniel','Colorado','','813-368-1200',15,4,CONVERT(DATETIME, '2019-01-11', 120),3,'Shirts',60),
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',16,3,CONVERT(DATETIME, '2019-04-19', 120),3,'Shirts',50),
    (7,'Jill','Michael','Florida','','813-297-0692',17,7,CONVERT(DATETIME, '2019-04-19', 120),1,'Suit',150),
    (15,'Mia','Owen','Hawaii','','808-640-5201',18,15,CONVERT(DATETIME, '2019-04-19', 120),1,'Skirt',30),
    (15,'Mia','Owen','Hawaii','','808-640-5201',19,15,CONVERT(DATETIME, '2019-04-19', 120),2,'Dressess',200),
    (12,'Eva','Lucas','Arizona','4379 Skips Lane','301-509-8805',20,12,CONVERT(DATETIME, '2019-01-11', 120),1,'Coat',100)

    SELECT * FROM #tmpTable

    My solution attempt is as follows:

    SELECT
    customers.first_name
    FROM (SELECT
    orders.cust_id
    FROM dbo.orders
    WHERE orders.order_date BETWEEN '2019-02-01' AND '2019-03-01') SubQuery
    ,dbo.customers
    INNER JOIN dbo.orders
    ON customers.id = orders.cust_id
    WHERE customers.id NOT IN (SubQuery.cust_id)
    GROUP BY customers.first_name

    However, the query is incorrect.

    The correct result should like the following:

    FirstName
    John
    Emma
    Liam
    Mark
    Eva
    Jack
    Mona
    Lili
    Justin
    Frank

    Can someone take a look at my code and let me know where I'm going wrong?

     

     

     

     

     

     

  • Untested, but should get you closer.

    select c.FirstName
    from dbo.customers c
    where not exists (select 1 from dbo.orders o where o.cust_id = c.id and o.order_date BETWEEN '20190201' AND '20190301')

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Can I also point out, that I'm trying to achieve the result without using LEFT JOIN

  • carlton 84646 wrote:

    Can I also point out, that I'm trying to achieve the result without using LEFT JOIN

    If you are referring to my code, that's called a SEMI JOIN and is one of the most efficient ways of solving this type of problem.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • SSC Guru

    Thanks for reaching out.

    Thats great

    Can you let me know why my code won't work

  • The table #tmpTabl specifies the column 'id' twice.  Presumably the second 'id' is the unique identifier of the 'orders' table so here it's renamed as 'o_id'.

    drop table if exists #tmpTabl;
    go
    CREATE TABLE #tmpTable (
    id int,
    first_name varchar(50),
    last_name varchar(50),
    city varchar(50),
    address varchar(50),
    phone_number varchar(50),
    o_id int,
    cust_id int,
    order_date date,
    order_quantity int,
    order_details varchar(50),
    order_cost int)

    INSERT #tmpTable VALUES
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',1,3,CONVERT(DATETIME, '2019-03-04', 120),1,'Coat',100),
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',2,3,CONVERT(DATETIME, '2019-03-01', 120),1,'Shoes',80),
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',3,3,CONVERT(DATETIME, '2019-03-10', 120),1,'Skirt',30),
    (7,'Jill','Michael','Florida','','813-297-0692',4,7,CONVERT(DATETIME, '2019-02-01', 120),1,'Coat',100),
    (7,'Jill','Michael','Florida','','813-297-0692',5,7,CONVERT(DATETIME, '2019-03-10', 120),1,'Shoes',80),
    (15,'Mia','Owen','Hawaii','','808-640-5201',6,15,CONVERT(DATETIME, '2019-02-01', 120),2,'Boats',100),
    (15,'Mia','Owen','Hawaii','','808-640-5201',7,15,CONVERT(DATETIME, '2019-01-11', 120),3,'Shirts',60),
    (15,'Mia','Owen','Hawaii','','808-640-5201',8,15,CONVERT(DATETIME, '2019-03-11', 120),1,'Slipper',20),
    (15,'Mia','Owen','Hawaii','','808-640-5201',9,15,CONVERT(DATETIME, '2019-03-01', 120),2,'Jeans',80),
    (15,'Mia','Owen','Hawaii','','808-640-5201',10,15,CONVERT(DATETIME, '2019-03-09', 120),3,'Shirts',50),
    (5,'Henry','Jackson','Hawaii','','808-601-7513',11,5,CONVERT(DATETIME, '2019-02-01', 120),1,'Shoes',80),
    (12,'Eva','Lucas','Arizona','4379 Skips Lane','301-509-8805',12,12,CONVERT(DATETIME, '2019-01-11', 120),3,'Shirts',60),
    (12,'Eva','Lucas','Arizona','4379 Skips Lane','301-509-8805',13,12,CONVERT(DATETIME, '2019-03-11', 120),1,'Slipper',20),
    (4,'William','Daniel','Colorado','','813-368-1200',14,4,CONVERT(DATETIME, '2019-02-01', 120),1,'Shoes',80),
    (4,'William','Daniel','Colorado','','813-368-1200',15,4,CONVERT(DATETIME, '2019-01-11', 120),3,'Shirts',60),
    (3,'Farida','Joseph','Florida','3153 Rhapsody Street','813-368-1200',16,3,CONVERT(DATETIME, '2019-04-19', 120),3,'Shirts',50),
    (7,'Jill','Michael','Florida','','813-297-0692',17,7,CONVERT(DATETIME, '2019-04-19', 120),1,'Suit',150),
    (15,'Mia','Owen','Hawaii','','808-640-5201',18,15,CONVERT(DATETIME, '2019-04-19', 120),1,'Skirt',30),
    (15,'Mia','Owen','Hawaii','','808-640-5201',19,15,CONVERT(DATETIME, '2019-04-19', 120),2,'Dressess',200),
    (12,'Eva','Lucas','Arizona','4379 Skips Lane','301-509-8805',20,12,CONVERT(DATETIME, '2019-01-11', 120),1,'Coat',100)

    SELECT * FROM #tmpTable

    Then the code.  Well, imo the place to start is always the FROM clause because it's evaluated first.   Issues (in no particular order)

    • In this case the query uses 2 different styles of join.  The first is the older (and unfavored, for a variety of reasons) comma separated table(s) and the second is the current ANSI 92 style JOIN.
    • There is no alias assigned to the 'customers' table.  This makes the query not read-able and less self-documenting.  It also runs the risk of there being duplicate object names prior to the de-bugging necessitated by the lack of table alias(es).
    • The subquery is "non-correlated" (meaning it doesn't reference any object not in it's own FROM clause) and uses the older (and unfavored, for a variety of reasons) FROM clause location.  Imo the current preferred way would be to use a CTE, common table expression.  CTE's have the advantage of being updatable/deletable (if all keys and constraints are prove-able).
    • Instead of 'not in' the query could use 'where not exists' to filter out purchaser cust_id's.  This makes the subquery "correrlated" by adding an (aliased) reference to the 'customers' table.

    So what could the query look?  Maybe something like this

    select distinct c.first_name
    from dbo.customers c
    where not exists (select 1
    from dbo.orders o
    where c.id=o.cust_id
    and o.order_date between '2019-02-01'
    and '2019-03-01');

    It selects the unique first names from the 'customers' table where not a single order was placed by that cust_id between the 2 test dates.

     

    • This reply was modified 3 years, 10 months ago by  Steve Collins.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Hi Steve,

    I'm getting the error here where c.cust_id=o.cust_id with your code

  • I think maybe it should be 'where c.id=o.cust_id'  🙂

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • carlton 84646 wrote:

    Can I also point out, that I'm trying to achieve the result without using LEFT JOIN

    Curious, what issues will a LEFT JOIN cause?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John wrote:

    carlton 84646 wrote:

    Can I also point out, that I'm trying to achieve the result without using LEFT JOIN

    Curious, what issues will a LEFT JOIN cause?

    Duplicates - requiring DISTINCT or GROUP BY to remove????

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • You don't seem to know the table must have a key. By definition, not as an option. Furthermore, we cannot be all NULLs. Basically, you posted a very badly designed deck of punch cards and you seem to think that everything is a string of 50 characters and you have two columns, both named the vague and useless "id", instead of "<something in particular>-id".

    You really need to take the time to learn how to design a schema instead of just throwing out garbage like this. Once you learn the basics, you will simply write good schemas without any thought about it. Here's an attempt at fixing up what you posted, and normalizing it.

    CREATE TABLE Customers

    (customer_nbr CHAR(16) NOT NULL PRIMARY KEY,

    customer_first_name VARCHAR(35) NOT NULL,

    customer_last_name VARCHAR(35) NOT NULL,

    city_name VARCHAR(35) NOT NULL,

    street_address VARCHAR(35) NOT NULL,

    state_code CHAR(2) NOT NULL,

    zip_code CHAR(5) NOT NULL CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]')

    customer_phone_nbr CHAR(35) NOT NULL

    CHECK(customer_phone_nbr LIKE '[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][09]))

    );

    Looking at your points score, you've probably been using SQL for maybe a few weeks at most. Please notice this table deals the set of one and only one kind of entity, that the name field. Follow U.S. Postal Service rules about label with, and look at the column names. The post fixes like "_code", "_name", etc. are called attribute properties. They are what make something specific instead of vague and general. Notice the use of check constraints to guarantee that data elements are formatted properly.

    The remaining columns in your non-table seem to belong to an order. However, getting order details into a character string is insane. The order details are separate entities by themselves and would require another table. So let's try and fix that.

    CREATE TABLE Orders

    (order_nbr CHAR(10) NOT NULL PRIMARY KEY,

    customer_nbr CHAR(16) NOT NULL REFERENCES Customers ON DELETE CASCADE,

    order_date DATE DEFAULT CURRENT TIMESTAMP NOT NULL,

    order_qty INTEGER NOT NULL CHECK (order_qty > 0) ,

    order_gtin CHAR(15) NOT NULL REFERENCES Inventory (gtin) ON DELETE CASCADE,

    unit_cost DECIMAL (10,2) NOT NULL);

    The GTIN is a universal code for "global trade identification number, of items. The important thing here is notice the use of the references clauses. This is called a relational database because things are related. Notice the use of check constraints to ensure data integrity. Currency amounts are expressed as a decimal data type, not as integers. Likewise, dates are done as a temporal data type (usually DATE) and not converted into strings for display. That was 1960s.COBOL and it has no place in SQL. real SQL programmers use cast () and not the old Sybase convert ().

    I'm not going to copy over your insertions. For what it's worth, using just the word INSERT is a proprietary thing that SQL Server got from the original Sybase product. The ANSI/ISO syntax is INSERT INTO. While it will work, it just says that you don't know what you're doing and that you only talk and a local dialect. Another giveaway, is the way you arranged your select statement. You laid it out the way we would have done this with postcards.

    SELECT C.customer_first_name

    FROM Customers AS C, Orders AS O

    WHERE C.cust_nbr = O.cust_nbr

    AND O.order_date BETWEEN '2019-02-01' AND '2019-03-01';

    You are actually making this has to be.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Jeffrey Williams wrote:

    Michael L John wrote:

    carlton 84646 wrote:

    Can I also point out, that I'm trying to achieve the result without using LEFT JOIN

    Curious, what issues will a LEFT JOIN cause?

    Duplicates - requiring DISTINCT or GROUP BY to remove????

    Something like should give no duplicates(unless there's duplicates in the customers table).  May or may not perform better than a not exists.

    select c.*
    from dbo.customers c
    LEFT OUTER JOIN dbo.orders o
    ON c.id=o.cust_id
    and o.order_date between '2019-02-01'
    and '2019-03-01'
    WHERE o.cust_id IS NULL

     

  • ZZartin wrote:

    Jeffrey Williams wrote:

    Michael L John wrote:

    carlton 84646 wrote:

    Can I also point out, that I'm trying to achieve the result without using LEFT JOIN

    Curious, what issues will a LEFT JOIN cause?

    Duplicates - requiring DISTINCT or GROUP BY to remove????

    Something like should give no duplicates(unless there's duplicates in the customers table).  May or may not perform better than a not exists.

    select c.*
    from dbo.customers c
    LEFT OUTER JOIN dbo.orders o
    ON c.id=o.cust_id
    and o.order_date between '2019-02-01'
    and '2019-03-01'
    WHERE o.cust_id IS NULL

    I was guessing at the OP's reasoning...not that it would actually have any duplicates.

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Hi Jeffrey,

    It wouldn't cause an issue.

    The challenge was just to achieve the result without using LEFT JOIN.

  • SELECT C.first_name

    FROM dbo.customers AS C

    WHERE ID NOT IN

    (

    SELECT CustomerID

    FROM dbo.orders AS O

    WHERE O.order_date

    BETWEEN '2019-02-01' AND '2019-03-01'

    );

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply