January 24, 2019 at 8:54 am
Hi,
I have a task which I have been working on but am struggling finding the end result.
Task Description
Check each row in Main table
If column a = ‘x’ then assign one of the 4 ‘x’ employees from employee table picked randomly to that row
If column a = ‘y’ then get one of the 4 ‘y’ Standards officers from employee table picked randomly to that row
Employee Table
Id Dept Employee
1 x x1
2 x x2
3 x x3
4 x x4
5 y y1
6 y y2
7 y y3
8 y y4
Main table (DATA) contains many columns and I need end column (Employee) to be populated by a Employee based on Dept and shauffled in employee table to randomise also.
Hope you can help
Thank you
January 24, 2019 at 9:18 am
This should get you started. Since you're new here, I created your table and added some data. If you want a tested answer, you should provide this.CREATE TABLE Employee (
ID INT,
Dept CHAR,
EmpID CHAR(2)
);
GO
INSERT INTO Employee
VALUES (1, 'x', 'x1'),
(2, 'x', 'x2'),
(3, 'x', 'x3'),
(4, 'x', 'x4'),
(5, 'y', 'y1'),
(6, 'y', 'y2'),
(7, 'y', 'y3'),
(8, 'y', 'y4');
SELECT TOP (4) *
FROM Employee
ORDER BY NEWID();
January 24, 2019 at 10:49 am
At first glance, you could do this with a correlated subquery. This would pick a random X or Y employee for each DataTable row.
UPDATE dbo.DataTable SET Employee = (
SELECT TOP (1) Employee
FROM dbo.EmployeeTable e
WHERE e.Dept = DataTable.a
ORDER BY NEWID() );
The problem here is, if the Employee table is large, the subquery could get expensive. You also don't say whether there are more than 4 employees in the X and Y departments to choose from.
If you want to first select a set of 4 random X employees and a set of 4 random Y employees, then randomly assign from those groups to each DataTable row, it might look like this. The column "PK" would be the primary key of the Data table.
WITH xemp AS (
-- Pick 4 X employees at random
SELECT TOP (4) Employee
FROM dbo.EmployeeTable
WHERE Dept = 'x'
ORDER BY NEWID() ),
yemp AS (
-- Pick 4 Y employees at random
SELECT TOP (4) Employee
FROM dbo.EmployeeTable
WHERE Dept = 'y'
ORDER BY NEWID() )
UPDATE d SET Employee = e.Employee
FROM dbo.DataTable d
INNER JOIN (
-- Generate 4 matches for each DataTable row, order them randomly to pick one
SELECT dt.PK, Employee = ISNULL(x.Employee, y.Employee),
PickOne = ROW_NUMBER() OVER (PARTITION BY dt.PK ORDER BY (SELECT NEWID()))
FROM dbo.DataTable dt
LEFT JOIN xemp ON dt.a = 'x'
LEFT JOIN yemp ON dt.a = 'y'
) e ON d.PK = e.PK AND PickOne = 1;
January 24, 2019 at 6:30 pm
I'm pretty sure that no matter what happens, there's going to be a table or index scan involved and that there will also be a SORT involved. Borrowing heavily on the test harness that pietlinden provided, we can do it all for any number of departments in a single scan of the table and without having to make helper tables.
naveed.tuf - Thursday, January 24, 2019 8:54 AMHi,
I have a task which I have been working on but am struggling finding the end result.
Task Description
Check each row in Main table
If column a = ‘x’ then assign one of the 4 ‘x’ employees from employee table picked randomly to that row
If column a = ‘y’ then get one of the 4 ‘y’ Standards officers from employee table picked randomly to that rowEmployee Table
Id Dept Employee
1 x x1
2 x x2
3 x x3
4 x x4
5 y y1
6 y y2
7 y y3
8 y y4
Main table (DATA) contains many columns and I need end column (Employee) to be populated by a Employee based on Dept and shauffled in employee table to randomise also.
Hope you can help
Thank you
How many times is Dept repeated in the "Main Table"?
--Jeff Moden
Change is inevitable... Change for the better is not.
January 25, 2019 at 5:22 am
It occurred to me that instead of randomly reordering the Employees over and over again for each DataTable row, the NTILE function could be used to randomly divide DataTable into four groups in one pass.
UPDATE d SET Employee = e.Employee
FROM dbo.DataTable d
INNER JOIN (
-- Assign each DataTable row to one of four groups randomly
SELECT dt.PK, Dept = dt.a,
n = NTILE(4) OVER (PARTITION BY a ORDER BY (SELECT NEWID()))
FROM dbo.DataTable dt
WHERE dt.a IN ('x', 'y')
) dn ON d.PK = dn.PK
INNER JOIN (
-- Pick X employees and Y employees at random, only the top 4 of each will match an NTILE value
SELECT Dept, Employee,
n = ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY (SELECT NEWID()))
FROM dbo.EmployeeTable
WHERE Dept IN ('x', 'y')
) e ON dn.Dept = e.Dept AND dn.n = e.n
January 25, 2019 at 5:55 am
naveed.tuf - Thursday, January 24, 2019 8:54 AMHi,
I have a task which I have been working on but am struggling finding the end result.
Task Description
Check each row in Main table
If column a = ‘x’ then assign one of the 4 ‘x’ employees from employee table picked randomly to that row
If column a = ‘y’ then get one of the 4 ‘y’ Standards officers from employee table picked randomly to that rowEmployee Table
Id Dept Employee
1 x x1
2 x x2
3 x x3
4 x x4
5 y y1
6 y y2
7 y y3
8 y y4
Main table (DATA) contains many columns and I need end column (Employee) to be populated by a Employee based on Dept and shauffled in employee table to randomise also.
Hope you can help
Thank you
We're operating blind here. Please answer my previous question. Is the Dept column in the "Main table (DATA)" unique or not?
--Jeff Moden
Change is inevitable... Change for the better is not.
January 28, 2019 at 4:15 am
Hello all,
Thank you for your responses.
I have been able to resolve this by adding a random column (random values generated between 1 to 4) to my main table and then concatenated the dept column and random column .
For anyone that may be looking for a similar solution the TSQL is:
"update [DATA]
set rnd = CAST(SUBSTRING(CAST(NEWID() AS binary(16)),1, 3) AS int) % 5"
Appreciate all your responses, i am sure I will be using these at some later stage.
Thank you again!
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply