April 6, 2010 at 12:34 pm
Hi all,
Very newbie seeking assistance....
I have a table with a list of employees and a table with a list of holidays.
I need a list of employees not on holiday, so I am after
ActiveEmployees = AllEmployees(DateOfLeaving NULL) - EmployeesWithHolidayBookingsForToday.
My first query was:
SELECTEmployeeNumber, Forename, Surname
FROM Employee
WHERE (DateofLeaving IS NULL OR DateofLeaving > GETDATE())
AND EmployeeNumber NOT IN
(SELECT EmployeeNumber
FROM ECP_HolidayBooking
WHERE DateFrom <=GETDATE()
AND DateTo>=GETDATE()
ORDER BY EmployeeNumber
Which gave me the result I was after but from what I have read up on so far give me a cartesian set which is not a good thing.(?)
After a bit more digging around I found another solution using EXCEPT:
SELECTEmployeeNumber
FROM Employee
WHERE (DateofLeaving IS NULL OR DateofLeaving > GETDATE())
EXCEPT
SELECT EmployeeNumber
FROM ECP_HolidayBooking
WHERE DateFrom <=GETDATE()
AND DateTo>=GETDATE()
ORDER BY EmployeeNumber
I've been looking at subtraction joins (more efficient?) but can't see how to get that to work.
Any suggestions would be much appreciated.
Thanks.
April 6, 2010 at 11:40 pm
Mate, post your table structures (CREATE TABLE statements), Some sample data (INSERT INTO script), Indexes and Constraints if any and your expected result based on the sample data. THis will help us help u.
April 7, 2010 at 12:28 am
I did not understand your requirement fully, so assuming this is what u are asking, i have made some sample data and query for your requirement.
Assumptions :
You need Employees that are
1. NOT ON LEAVE TODAY
2. WHOSE LEAVE END DATE DINT FALL ON TODAY
IF OBJECT_ID('TEMPDB..#EMPLOYEE') IS NOT NULL
DROP TABLE #EMPLOYEE
CREATE TABLE #EMPLOYEE
(
EMP_ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_EMPID_EMPLOYEE PRIMARY KEY,
EMP_NAME VARCHAR(10),
)
INSERT INTO #EMPLOYEE
SELECT 'Emp1'
UNION ALL
SELECT 'Emp2'
UNION ALL
SELECT 'Emp3'
UNION ALL
SELECT 'Emp4'
UNION ALL
SELECT 'Emp5'
UNION ALL
SELECT 'Emp6'
UNION ALL
SELECT 'Emp7'
UNION ALL
SELECT 'Emp8'
UNION ALL
SELECT 'Emp9'
UNION ALL
SELECT 'Emp10'
IF OBJECT_ID('TEMPDB..#LEAVE_DETAILS') IS NOT NULL
DROP TABLE #LEAVE_DETAILS
CREATE TABLE #LEAVE_DETAILS
(
EMP_NAME VARCHAR(10),
LEAVE_START_DATE DATETIME,
LEAVE_END_DATE DATETIME,
)
INSERT INTO #LEAVE_DETAILS
SELECT 'Emp1','2010-04-07', '2010-04-07'
UNION ALL
SELECT 'Emp2','2010-04-20', '2010-04-28'
UNION ALL
SELECT 'Emp3','2010-03-20', '2010-03-28'
UNION ALL
SELECT 'Emp4','2010-05-20', '2010-05-28'
UNION ALL
SELECT 'Emp5',NULL, NULL
UNION ALL
SELECT 'Emp6',NULL, NULL
UNION ALL
SELECT 'Emp7','2010-04-07', '2010-04-07'
UNION ALL
SELECT 'Emp8',NULL, NULL
UNION ALL
SELECT 'Emp9',NULL, NULL
UNION ALL
SELECT 'Emp10','2010-01-20', '2010-01-28'
UNION ALL
SELECT 'Emp11','2010-04-01', '2010-04-07'
--SELECT EMP_ID,EMP_NAME FROM #EMPLOYEE
--SELECT EMP_NAME , LEAVE_START_DATE, LEAVE_END_DATE FROM #LEAVE_DETAILS
--HERE IS THE CODE TO RETRIEVE EMPLOYEES THAT ARE
SELECT E.EMP_ID, E.EMP_NAME FROM #EMPLOYEE E
LEFT JOIN #LEAVE_DETAILS L
ON E.EMP_NAME = L.EMP_NAME
WHERE (LEAVE_START_DATE IS NULL
OR DATEPART(DD,LEAVE_START_DATE) <> DATEPART(DD,GETDATE())
OR DATEPART(DD,LEAVE_END_DATE) <> DATEPART(DD,GETDATE()))
Please inform us if those assumptions are correct and the code provides you what you wanted..
If those are wrong, then kindly go thro this article http://www.sqlservercentral.com/articles/Best+Practices/61537/ and post the necessary data, not to forget the desired result.
Hope it helps
Cheers,
C'est Pras!!
April 7, 2010 at 3:55 am
Hi COldCoffee,
Thanks for the assist...
Your intepretation of my poorly expressed question is good, and thanks for pointing me to the best pratices...
I've adjusted the code you supplied with the table setup I have to work with. I hope this make a bit more sense.
IF OBJECT_ID('TEMPDB..#EMPLOYEE') IS NOT NULL
DROP TABLE #EMPLOYEE
CREATE TABLE #EMPLOYEE
(
IDINT IDENTITY(1,1) NOT NULL CONSTRAINT PK_EMPID_EMPLOYEE PRIMARY KEY,
EmployeeNumberINT,
ForenameVARCHAR(100),
SurnameVARCHAR(100),
EmploymentEndDATETIME
)
INSERT INTO #EMPLOYEE
SELECT '0001','Fred','Bloggs','2010-02-01 00:00:00'
UNION ALL
SELECT '0002','Simon','Hughes',NULL
UNION ALL
SELECT '0004','John','Smith',NULL
UNION ALL
SELECT '0005','Peter','Jones',NULL
UNION ALL
SELECT '1004','Tim','Ward',NULL
UNION ALL
SELECT '1005','Darren','Smith',NULL
UNION ALL
SELECT '1008','John','Morris',NULL
UNION ALL
SELECT '2020','Sarah','Miller','2010-01-10 00:00:00'
UNION ALL
SELECT '2021','Susan','Lowe',NULL
UNION ALL
SELECT '0008','Doug','Williams',NULL
IF OBJECT_ID('TEMPDB..#LEAVE_DETAILS') IS NOT NULL
DROP TABLE #LEAVE_DETAILS
CREATE TABLE #LEAVE_DETAILS
(
EMP_ID INT,
LEAVE_START_DATE DATETIME,
LEAVE_END_DATE DATETIME,
)
INSERT INTO #LEAVE_DETAILS
SELECT '0001','2010-04-07', '2010-04-07'
UNION ALL
SELECT '0001','2010-04-20', '2010-04-28'
UNION ALL
SELECT '0002','2010-03-20', '2010-03-28'
UNION ALL
SELECT '1004','2010-05-20', '2010-05-28'
UNION ALL
SELECT '2020','2010-04-07', '2010-04-07'
UNION ALL
SELECT '0008','2010-01-20', '2010-01-28'
UNION ALL
SELECT '0004','2010-04-01', '2010-04-07'
When an employee leaves the company a date is entered into the EmploymentEnd column. All active employees have a NULL value in the DateOfLeaving.
When leave is booked a row is inserted into the LEAVE_DETAILS table based on the EmployeeNumber.
What I am after is an efficient way to list current employeeNumbers, Forename, Surname that are that are not on holiday today.
This gives me my list of active Employees:
SELECT * FROM #EMPLOYEE
WHERE (EmploymentEnd IS NULL OR EmploymentEnd > GETDATE())
and this give me my list of employees on leave to day.
SELECT * FROM #LEAVE_DETAILS
WHERE LEAVE_START_DATE <=GETDATE()
AND LEAVE_END_DATE >=GETDATE()
So I am trying to subtract the list of employees on leave from the list of current employees.
I'm probably looking at this the wrong way or something, so any suggestions are welcome...
Many thanks.
April 7, 2010 at 1:37 pm
JoelyB,
I have a written a piece of code which I believe can help you get some directions:
[p][/p]
CREATE TABLE [dbo].[t1]([c1] [int] NOT NULL PRIMARY KEY,[c2] [varchar](10))
CREATE TABLE [dbo].[t11]([c1] [int] NOT NULL REFERENCES t1(c1),[c2] [varchar](10))
-- insert into t1 rows with PK values 1,2,3,4,5
-- insert into t11 rows that match FK values 1,3,5 (matching rows from t1's c1 column)
---now our aim to get those rows from t1 for which there is no matching records in t11
---i.e. rows with c1 values: 2 and 4
--this can be written as JoelyB as written using NOT IN as
SELECT c1 from t1 where c1 NOT IN (select c1 from t11)
--or can be written using EXCEPT as
SELECT c1 from t1 EXCEPT select c1 from t11
--please note that the above queries (EXCEPT and NOT IN options) results into a LEFT ANTI SEMI JOIN
--however a more cleaner way (and generally accepted) way of rewriting this query would be
select T1.c1 From t1 left join t11 on t1.c1 = t11.c1
where t11.c1 IS NULL
[p]Most important question of all: why do you think NOT IN or EXCEPT are less performant than regular joins. I am not aware of any specific issues where LEFT ANTI SEMI JOIN operator has been a cause of slow performance as such.[/p]
[p]Let me know if this helps.[/p]
April 8, 2010 at 6:56 am
Why do you think the NOT IN answer is giving you a Cartesian Product or Cross Join? This seems to be an acceptable way of getting the answer you are looking for...is the performance unacceptable?
April 8, 2010 at 8:35 am
Rustman, I did not get your point. do you mean to say that using NOT IN or EXCEPT results in cartesian product? I have already specified that it results in LEFT ANTI SEMI JOIN which is a form of LEFT OUTER JOIN. Run the queries with statistics profile on and you will see it for yourself.
April 8, 2010 at 8:41 am
mangeshd (4/8/2010)
Rustman, I did not get your point. do you mean to say that using NOT IN or EXCEPT results in cartesian product? I have already specified that it results in LEFT ANTI SEMI JOIN which is a form of LEFT OUTER JOIN. Run the queries with statistics profile on and you will see it for yourself.
Sorry mangeshd,
I was replying back to JoelyB as it was stated in the Initial post "Which gave me the result I was after but from what I have read up on so far give me a cartesian set which is not a good thing.(?)" This was refering to the NOT IN queries which was the first query tried.
April 11, 2010 at 10:36 am
mangeshd (4/7/2010)
Most important question of all: why do you think NOT IN or EXCEPT are less performant than regular joins. I am not aware of any specific issues where LEFT ANTI SEMI JOIN operator has been a cause of slow performance as such.
NOT IN produces incorrect results if the subquery or expression contains one or more NULLs.
EXCEPT performs an extra DISTINCT, and may therefore perform less well, or produce different results.
NOT EXISTS generally produces an anti-semi join too (unless transformed).
True anti-semi joins typically offer no advantage over the equivalent outer join, since every row must be checked.
The query optimizer recognises this, and may transform one to the other, regardless of the textual form of the query, if the logical effect is provably identical (as is frequently the case).
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
April 12, 2010 at 7:13 am
Hi all,
Thank you for all your responses, unfortunately I have had to shelve this project for the moment as another urgent (crisis!) has come up.
mangeshd, thanks for the code, it has given me something to learn and work with...
Paul White NZ, thanks for the explainations, yet more reading up to do...
All your answers here have certainly given me something to work with, now all I need is a window of time...
Thanks
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply