Subtract one result set from another

  • Hi all,

    Very newbie seeking assistance....

    I have a table with a list of employees and a table with a list of holidays.

    I need a list of employees not on holiday, so I am after

    ActiveEmployees = AllEmployees(DateOfLeaving NULL) - EmployeesWithHolidayBookingsForToday.

    My first query was:

    SELECTEmployeeNumber, Forename, Surname

    FROM Employee

    WHERE (DateofLeaving IS NULL OR DateofLeaving > GETDATE())

    AND EmployeeNumber NOT IN

    (SELECT EmployeeNumber

    FROM ECP_HolidayBooking

    WHERE DateFrom <=GETDATE()

    AND DateTo>=GETDATE()

    ORDER BY EmployeeNumber

    Which gave me the result I was after but from what I have read up on so far give me a cartesian set which is not a good thing.(?)

    After a bit more digging around I found another solution using EXCEPT:

    SELECTEmployeeNumber

    FROM Employee

    WHERE (DateofLeaving IS NULL OR DateofLeaving > GETDATE())

    EXCEPT

    SELECT EmployeeNumber

    FROM ECP_HolidayBooking

    WHERE DateFrom <=GETDATE()

    AND DateTo>=GETDATE()

    ORDER BY EmployeeNumber

    I've been looking at subtraction joins (more efficient?) but can't see how to get that to work.

    Any suggestions would be much appreciated.

    Thanks.

  • Mate, post your table structures (CREATE TABLE statements), Some sample data (INSERT INTO script), Indexes and Constraints if any and your expected result based on the sample data. THis will help us help u.

  • I did not understand your requirement fully, so assuming this is what u are asking, i have made some sample data and query for your requirement.

    Assumptions :

    You need Employees that are

    1. NOT ON LEAVE TODAY

    2. WHOSE LEAVE END DATE DINT FALL ON TODAY

    IF OBJECT_ID('TEMPDB..#EMPLOYEE') IS NOT NULL

    DROP TABLE #EMPLOYEE

    CREATE TABLE #EMPLOYEE

    (

    EMP_ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_EMPID_EMPLOYEE PRIMARY KEY,

    EMP_NAME VARCHAR(10),

    )

    INSERT INTO #EMPLOYEE

    SELECT 'Emp1'

    UNION ALL

    SELECT 'Emp2'

    UNION ALL

    SELECT 'Emp3'

    UNION ALL

    SELECT 'Emp4'

    UNION ALL

    SELECT 'Emp5'

    UNION ALL

    SELECT 'Emp6'

    UNION ALL

    SELECT 'Emp7'

    UNION ALL

    SELECT 'Emp8'

    UNION ALL

    SELECT 'Emp9'

    UNION ALL

    SELECT 'Emp10'

    IF OBJECT_ID('TEMPDB..#LEAVE_DETAILS') IS NOT NULL

    DROP TABLE #LEAVE_DETAILS

    CREATE TABLE #LEAVE_DETAILS

    (

    EMP_NAME VARCHAR(10),

    LEAVE_START_DATE DATETIME,

    LEAVE_END_DATE DATETIME,

    )

    INSERT INTO #LEAVE_DETAILS

    SELECT 'Emp1','2010-04-07', '2010-04-07'

    UNION ALL

    SELECT 'Emp2','2010-04-20', '2010-04-28'

    UNION ALL

    SELECT 'Emp3','2010-03-20', '2010-03-28'

    UNION ALL

    SELECT 'Emp4','2010-05-20', '2010-05-28'

    UNION ALL

    SELECT 'Emp5',NULL, NULL

    UNION ALL

    SELECT 'Emp6',NULL, NULL

    UNION ALL

    SELECT 'Emp7','2010-04-07', '2010-04-07'

    UNION ALL

    SELECT 'Emp8',NULL, NULL

    UNION ALL

    SELECT 'Emp9',NULL, NULL

    UNION ALL

    SELECT 'Emp10','2010-01-20', '2010-01-28'

    UNION ALL

    SELECT 'Emp11','2010-04-01', '2010-04-07'

    --SELECT EMP_ID,EMP_NAME FROM #EMPLOYEE

    --SELECT EMP_NAME , LEAVE_START_DATE, LEAVE_END_DATE FROM #LEAVE_DETAILS

    --HERE IS THE CODE TO RETRIEVE EMPLOYEES THAT ARE

    SELECT E.EMP_ID, E.EMP_NAME FROM #EMPLOYEE E

    LEFT JOIN #LEAVE_DETAILS L

    ON E.EMP_NAME = L.EMP_NAME

    WHERE (LEAVE_START_DATE IS NULL

    OR DATEPART(DD,LEAVE_START_DATE) <> DATEPART(DD,GETDATE())

    OR DATEPART(DD,LEAVE_END_DATE) <> DATEPART(DD,GETDATE()))

    Please inform us if those assumptions are correct and the code provides you what you wanted..

    If those are wrong, then kindly go thro this article http://www.sqlservercentral.com/articles/Best+Practices/61537/ and post the necessary data, not to forget the desired result.

    Hope it helps

    Cheers,

    C'est Pras!!

  • Hi COldCoffee,

    Thanks for the assist...

    Your intepretation of my poorly expressed question is good, and thanks for pointing me to the best pratices...

    I've adjusted the code you supplied with the table setup I have to work with. I hope this make a bit more sense.

    IF OBJECT_ID('TEMPDB..#EMPLOYEE') IS NOT NULL

    DROP TABLE #EMPLOYEE

    CREATE TABLE #EMPLOYEE

    (

    IDINT IDENTITY(1,1) NOT NULL CONSTRAINT PK_EMPID_EMPLOYEE PRIMARY KEY,

    EmployeeNumberINT,

    ForenameVARCHAR(100),

    SurnameVARCHAR(100),

    EmploymentEndDATETIME

    )

    INSERT INTO #EMPLOYEE

    SELECT '0001','Fred','Bloggs','2010-02-01 00:00:00'

    UNION ALL

    SELECT '0002','Simon','Hughes',NULL

    UNION ALL

    SELECT '0004','John','Smith',NULL

    UNION ALL

    SELECT '0005','Peter','Jones',NULL

    UNION ALL

    SELECT '1004','Tim','Ward',NULL

    UNION ALL

    SELECT '1005','Darren','Smith',NULL

    UNION ALL

    SELECT '1008','John','Morris',NULL

    UNION ALL

    SELECT '2020','Sarah','Miller','2010-01-10 00:00:00'

    UNION ALL

    SELECT '2021','Susan','Lowe',NULL

    UNION ALL

    SELECT '0008','Doug','Williams',NULL

    IF OBJECT_ID('TEMPDB..#LEAVE_DETAILS') IS NOT NULL

    DROP TABLE #LEAVE_DETAILS

    CREATE TABLE #LEAVE_DETAILS

    (

    EMP_ID INT,

    LEAVE_START_DATE DATETIME,

    LEAVE_END_DATE DATETIME,

    )

    INSERT INTO #LEAVE_DETAILS

    SELECT '0001','2010-04-07', '2010-04-07'

    UNION ALL

    SELECT '0001','2010-04-20', '2010-04-28'

    UNION ALL

    SELECT '0002','2010-03-20', '2010-03-28'

    UNION ALL

    SELECT '1004','2010-05-20', '2010-05-28'

    UNION ALL

    SELECT '2020','2010-04-07', '2010-04-07'

    UNION ALL

    SELECT '0008','2010-01-20', '2010-01-28'

    UNION ALL

    SELECT '0004','2010-04-01', '2010-04-07'

    When an employee leaves the company a date is entered into the EmploymentEnd column. All active employees have a NULL value in the DateOfLeaving.

    When leave is booked a row is inserted into the LEAVE_DETAILS table based on the EmployeeNumber.

    What I am after is an efficient way to list current employeeNumbers, Forename, Surname that are that are not on holiday today.

    This gives me my list of active Employees:

    SELECT * FROM #EMPLOYEE

    WHERE (EmploymentEnd IS NULL OR EmploymentEnd > GETDATE())

    and this give me my list of employees on leave to day.

    SELECT * FROM #LEAVE_DETAILS

    WHERE LEAVE_START_DATE <=GETDATE()

    AND LEAVE_END_DATE >=GETDATE()

    So I am trying to subtract the list of employees on leave from the list of current employees.

    I'm probably looking at this the wrong way or something, so any suggestions are welcome...

    Many thanks.

  • JoelyB,

    I have a written a piece of code which I believe can help you get some directions:

    [p][/p]

    CREATE TABLE [dbo].[t1]([c1] [int] NOT NULL PRIMARY KEY,[c2] [varchar](10))

    CREATE TABLE [dbo].[t11]([c1] [int] NOT NULL REFERENCES t1(c1),[c2] [varchar](10))

    -- insert into t1 rows with PK values 1,2,3,4,5

    -- insert into t11 rows that match FK values 1,3,5 (matching rows from t1's c1 column)

    ---now our aim to get those rows from t1 for which there is no matching records in t11

    ---i.e. rows with c1 values: 2 and 4

    --this can be written as JoelyB as written using NOT IN as

    SELECT c1 from t1 where c1 NOT IN (select c1 from t11)

    --or can be written using EXCEPT as

    SELECT c1 from t1 EXCEPT select c1 from t11

    --please note that the above queries (EXCEPT and NOT IN options) results into a LEFT ANTI SEMI JOIN

    --however a more cleaner way (and generally accepted) way of rewriting this query would be

    select T1.c1 From t1 left join t11 on t1.c1 = t11.c1

    where t11.c1 IS NULL

    [p]Most important question of all: why do you think NOT IN or EXCEPT are less performant than regular joins. I am not aware of any specific issues where LEFT ANTI SEMI JOIN operator has been a cause of slow performance as such.[/p]

    [p]Let me know if this helps.[/p]

  • Why do you think the NOT IN answer is giving you a Cartesian Product or Cross Join? This seems to be an acceptable way of getting the answer you are looking for...is the performance unacceptable?

  • Rustman, I did not get your point. do you mean to say that using NOT IN or EXCEPT results in cartesian product? I have already specified that it results in LEFT ANTI SEMI JOIN which is a form of LEFT OUTER JOIN. Run the queries with statistics profile on and you will see it for yourself.

  • mangeshd (4/8/2010)


    Rustman, I did not get your point. do you mean to say that using NOT IN or EXCEPT results in cartesian product? I have already specified that it results in LEFT ANTI SEMI JOIN which is a form of LEFT OUTER JOIN. Run the queries with statistics profile on and you will see it for yourself.

    Sorry mangeshd,

    I was replying back to JoelyB as it was stated in the Initial post "Which gave me the result I was after but from what I have read up on so far give me a cartesian set which is not a good thing.(?)" This was refering to the NOT IN queries which was the first query tried.

  • mangeshd (4/7/2010)


    Most important question of all: why do you think NOT IN or EXCEPT are less performant than regular joins. I am not aware of any specific issues where LEFT ANTI SEMI JOIN operator has been a cause of slow performance as such.

    NOT IN produces incorrect results if the subquery or expression contains one or more NULLs.

    EXCEPT performs an extra DISTINCT, and may therefore perform less well, or produce different results.

    NOT EXISTS generally produces an anti-semi join too (unless transformed).

    True anti-semi joins typically offer no advantage over the equivalent outer join, since every row must be checked.

    The query optimizer recognises this, and may transform one to the other, regardless of the textual form of the query, if the logical effect is provably identical (as is frequently the case).

  • Hi all,

    Thank you for all your responses, unfortunately I have had to shelve this project for the moment as another urgent (crisis!) has come up.

    mangeshd, thanks for the code, it has given me something to learn and work with...

    Paul White NZ, thanks for the explainations, yet more reading up to do...

    All your answers here have certainly given me something to work with, now all I need is a window of time...

    Thanks

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply