pick random (shuffle by dept) employees from employees table and update main table with employee name

Question

pick random (shuffle by dept) employees from employees table and update main table with employee name

naveed.tuf

SSC Veteran

Points: 268
More actions
January 24, 2019 at 8:54 am

#416651

Hi,
I have a task which I have been working on but am struggling finding the end result.
Task Description
Check each row in Main table
If column a = ‘x’ then assign one of the 4 ‘x’ employees from employee table picked randomly to that row
If column a = ‘y’ then get one of the 4 ‘y’ Standards officers from employee table picked randomly to that row
Employee Table
Id   Dept               Employee
1              x              x1
2              x              x2
3              x              x3
4              x              x4
5              y              y1
6              y              y2
7              y              y3
8              y              y4
Main table (DATA) contains many columns and I need end column (Employee) to be populated by a Employee based on Dept and shauffled in employee table to randomise also.
Hope you can help
Thank you

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63503 More actions · Answer 1

This should get you started. Since you're new here, I created your table and added some data. If you want a tested answer, you should provide this.
CREATE TABLE Employee ( ID INT, Dept CHAR, EmpID CHAR(2) ); GO INSERT INTO Employee VALUES (1, 'x', 'x1'), (2, 'x', 'x2'), (3, 'x', 'x3'), (4, 'x', 'x4'), (5, 'y', 'y1'), (6, 'y', 'y2'), (7, 'y', 'y3'), (8, 'y', 'y4');

SELECT TOP (4) * FROM Employee ORDER BY NEWID();

Scott Coleman One Orange Chip Points: 27452 More actions · Answer 2

At first glance, you could do this with a correlated subquery. This would pick a random X or Y employee for each DataTable row.

UPDATE dbo.DataTable SET Employee = ( SELECT TOP (1) Employee FROM dbo.EmployeeTable e WHERE e.Dept = DataTable.a ORDER BY NEWID() );

The problem here is, if the Employee table is large, the subquery could get expensive. You also don't say whether there are more than 4 employees in the X and Y departments to choose from.

If you want to first select a set of 4 random X employees and a set of 4 random Y employees, then randomly assign from those groups to each DataTable row, it might look like this. The column "PK" would be the primary key of the Data table.

WITH xemp AS ( -- Pick 4 X employees at random SELECT TOP (4) Employee FROM dbo.EmployeeTable WHERE Dept = 'x' ORDER BY NEWID() ), yemp AS ( -- Pick 4 Y employees at random SELECT TOP (4) Employee FROM dbo.EmployeeTable WHERE Dept = 'y' ORDER BY NEWID() ) UPDATE d SET Employee = e.Employee FROM dbo.DataTable d INNER JOIN ( -- Generate 4 matches for each DataTable row, order them randomly to pick one SELECT dt.PK, Employee = ISNULL(x.Employee, y.Employee), PickOne = ROW_NUMBER() OVER (PARTITION BY dt.PK ORDER BY (SELECT NEWID())) FROM dbo.DataTable dt LEFT JOIN xemp ON dt.a = 'x' LEFT JOIN yemp ON dt.a = 'y' ) e ON d.PK = e.PK AND PickOne = 1;

Jeff Moden SSC Guru Points: 1004339 More actions · Answer 3

I'm pretty sure that no matter what happens, there's going to be a table or index scan involved and that there will also be a SORT involved. Borrowing heavily on the test harness that pietlinden provided, we can do it all for any number of departments in a single scan of the table and without having to make helper tables.

naveed.tuf - Thursday, January 24, 2019 8:54 AM
Hi,
I have a task which I have been working on but am struggling finding the end result.
Task Description
Check each row in Main table
If column a = â€˜xâ€™ then assign one of the 4 â€˜xâ€™ employees from employee table picked randomly to that row
If column a = â€˜yâ€™ then get one of the 4 â€˜yâ€™ Standards officers from employee table picked randomly to that row
Employee Table
Id   Dept               Employee
1              x              x1
2              x              x2
3              x              x3
4              x              x4
5              y              y1
6              y              y2
7              y              y3
8              y              y4
Main table (DATA) contains many columns and I need end column (Employee) to be populated by a Employee based on Dept and shauffled in employee table to randomise also.
Hope you can help
Thank you

How many times is Dept repeated in the "Main Table"?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Scott Coleman One Orange Chip Points: 27452 More actions · Answer 4

It occurred to me that instead of randomly reordering the Employees over and over again for each DataTable row, the NTILE function could be used to randomly divide DataTable into four groups in one pass.

UPDATE d SET Employee = e.Employee FROM dbo.DataTable d INNER JOIN ( -- Assign each DataTable row to one of four groups randomly SELECT dt.PK, Dept = dt.a, n = NTILE(4) OVER (PARTITION BY a ORDER BY (SELECT NEWID())) FROM dbo.DataTable dt WHERE dt.a IN ('x', 'y') ) dn ON d.PK = dn.PK INNER JOIN ( -- Pick X employees and Y employees at random, only the top 4 of each will match an NTILE value SELECT Dept, Employee, n = ROW_NUMBER() OVER (PARTITION BY Dept ORDER BY (SELECT NEWID())) FROM dbo.EmployeeTable WHERE Dept IN ('x', 'y') ) e ON dn.Dept = e.Dept AND dn.n = e.n

Jeff Moden SSC Guru Points: 1004339 More actions · Answer 5

naveed.tuf - Thursday, January 24, 2019 8:54 AM
Hi,
I have a task which I have been working on but am struggling finding the end result.
Task Description
Check each row in Main table
If column a = â€˜xâ€™ then assign one of the 4 â€˜xâ€™ employees from employee table picked randomly to that row
If column a = â€˜yâ€™ then get one of the 4 â€˜yâ€™ Standards officers from employee table picked randomly to that row
Employee Table
Id   Dept               Employee
1              x              x1
2              x              x2
3              x              x3
4              x              x4
5              y              y1
6              y              y2
7              y              y3
8              y              y4
Main table (DATA) contains many columns and I need end column (Employee) to be populated by a Employee based on Dept and shauffled in employee table to randomise also.
Hope you can help
Thank you

We're operating blind here. Please answer my previous question. Is the Dept column in the "Main table (DATA)" unique or not?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

naveed.tuf SSC Veteran Points: 268 More actions · Answer 6

Hello all,

Thank you for your responses.

I have been able to resolve this by adding a random column (random values generated between 1 to 4) to my main table and then concatenated the dept column and random column .

For anyone that may be looking for a similar solution the TSQL is:

"update [DATA]
set rnd = CAST(SUBSTRING(CAST(NEWID() AS binary(16)),1, 3) AS int) % 5"

Appreciate all your responses, i am sure I will be using these at some later stage.

Thank you again!