query help

Question

Post reply

query help

Mvs2k11

Ten Centuries

Points: 1223
More actions
January 29, 2014 at 10:04 pm

#281659

I need a query which should update the master table with status based
on the action columns values.
If any of the values of the tableB for action column are 1,3,4,5,6 then update the status to 'Y' for the
SID in the tableA else the SID will be 'N'.
Below is the sample data with expected output.
Thanks for your help in advance..
TableA
SID Name
---- -----
10 Andy
11 Sam
12 pat
13 Mat
14 John
TableB
SID action
---- -------
10 1
10 3
10 9
11 5
11 6
12 2
12 7
13 1
14 10
Expected Output:
SID Name status
---- ----- -------
10 Andy Y
11 Sam Y
12 pat N
13 Mat Y
14 John N

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

KtmGuy Old Hand Points: 321 More actions · Answer 1

Try executing this query ;

UPDATE a

SET

a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END

FROM

TableA a

JOIN

TableB b

ON

a.SID = b.SID

Alan Burstein SSC Guru Points: 61141 More actions · Answer 2

KtmGuy (1/29/2014)
Try executing this query ;
UPDATE a
SET
a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END
FROM
TableA a
JOIN
TableB b
ON
a.SID = b.SID

Well done! Just a quick tip: you can make this much faster with a left outer join instead of the inner join.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Sean Lange SSC Guru Points: 286590 More actions · Answer 3

Alan.B (1/30/2014)
KtmGuy (1/29/2014)
Try executing this query ;
UPDATE a
SET
a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END
FROM
TableA a
JOIN
TableB b
ON
a.SID = b.SID
Well done! Just a quick tip: you can make this much faster with a left outer join instead of the inner join.

I am not sure I follow what you mean that a left join would make this faster. Can you explain that?

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Alan Burstein SSC Guru Points: 61141 More actions · Answer 4

Sean Lange (1/31/2014)
Alan.B (1/30/2014)
KtmGuy (1/29/2014)
Try executing this query ;
UPDATE a
SET
a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END
FROM
TableA a
JOIN
TableB b
ON
a.SID = b.SID
Well done! Just a quick tip: you can make this much faster with a left outer join instead of the inner join.
I am not sure I follow what you mean that a left join would make this faster. Can you explain that?

First, and I wanted to mention this in my original response, I don't think an inner join would guarantee the correct answer. In the OP's sample data there is a match in tableB for every SID in tableA. If that were not the case then we would lose data from the LEFT table.

That said, I was making an assumption based the following DDL and Sample Data:

USE tempdb

GO

IF OBJECT_ID('tempdb.dbo.tableB') IS NOT NULL DROP TABLE dbo.tableB;

IF OBJECT_ID('tempdb.dbo.tableA') IS NOT NULL DROP TABLE dbo.tableA;

CREATE TABLE dbo.tableA([SID] int primary key, name varchar(10) not null, [status] CHAR(1) null);

CREATE TABLE dbo.tableB(g_id int identity primary key, [SID] int foreign key references dbo.tableA([SID]), [action] int not null);

INSERT dbo.tableA([SID],name) VALUES(10,'Andy'),(11,'Sam'),(12,'pat'),(13,'Mat'),(14,'John');

INSERT dbo.tableB([SID],action) VALUES(10,1),(10,3),(10,9),(11,5),(11,6),(12,2),(12,7),(13,1),(14,10);

UPDATE a

SET a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END

FROM dbo.TableA a

JOIN dbo.TableB b ON a.SID = b.SID

UPDATE a

SET a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END

FROM dbo.TableA a

LEFT JOIN dbo.TableB b ON a.SID = b.SID

UPDATE a

SET a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END

FROM dbo.TableB b

RIGHT JOIN dbo.TableA a ON a.SID = b.SID

Based on my DDL (and if there was a matching SID in tableB for each SID in tableA) all three queries should produce the correct result but the OUTER joins produce what appears to be a better query plan based on cost...

Below is some code based I threw together to test roughly 1,000,000 rows...

IF OBJECT_ID('tempdb.dbo.tableB') IS NOT NULL DROP TABLE dbo.tableB;

IF OBJECT_ID('tempdb.dbo.tableA') IS NOT NULL DROP TABLE dbo.tableA;

IF OBJECT_ID('tempdb..#prep') IS NOT NULL DROP TABLE #prep;

IF OBJECT_ID('tempdb..#stg') IS NOT NULL DROP TABLE #stg;

CREATE TABLE dbo.tableA([SID] int primary key, name varchar(10) not null, [status] CHAR(1) null);

CREATE TABLE dbo.tableB(g_id int identity primary key, [SID] int foreign key references dbo.tableA([SID]), [action] int not null);

CREATE TABLE #prep(n int primary key, eid char(8) not null, rnd tinyint not null);

CREATE TABLE #stg(SID_PK int not null, eid char(8) not null, g_id int not null, SID_FK int not null, action tinyint not null);

WITH iTally(n,eid,rnd) AS

(SELECT TOP 420000

ROW_NUMBER() OVER (ORDER BY (SELECT ($))),

LEFT(convert(char(36),newid()),8),

convert(int,floor(3*rand(convert(varbinary,newid())))+2) -- rows: 2-5

FROM sys.all_columns a CROSS JOIN sys.all_columns b)

INSERT #prep

SELECT n,eid,rnd

FROM iTally a;

INSERT #stg

SELECT

x.n AS [SID_PK], x.eid,

ROW_NUMBER() OVER (ORDER BY (SELECT ($))) AS g_id,

x.n AS [SID_FK], t.c AS [action]

FROM #prep x

CROSS APPLY

(SELECT TOP (x.rnd)

convert(int,ceiling(5*rand(convert(varbinary,newid()))))

FROM #prep) t(c)

GROUP BY x.n, x.eid, t.c;

INSERT dbo.tableA(SID,name)

SELECT DISTINCT SID_PK, eid

FROM #stg;

INSERT dbo.tableB(SID,action)

SELECT SID_FK, action

FROM #stg;

I ran this update a few times, here's a result comparison (I ran this a few times and combined the results below)

SET STATISTICS TIME ON;

UPDATE a

SET a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END

FROM dbo.TableA a

JOIN dbo.TableB b ON a.SID = b.SID

UPDATE a

SET a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END

FROM dbo.TableA a

LEFT JOIN dbo.TableB b ON a.SID = b.SID

UPDATE a

SET a.Status = CASE WHEN b.action IN(1,3,4,5,6) THEN 'Y' ELSE 'N' END

FROM dbo.TableB b

RIGHT JOIN dbo.TableA a ON a.SID = b.SID

SET STATISTICS TIME OFF;

GO

and these were the results:

--inner

SQL Server Execution Times:

run1: CPU time = 6847 ms, elapsed time = 3228 ms.

run2: CPU time = 6568 ms, elapsed time = 3172 ms.

run3: CPU time = 3946 ms, elapsed time = 2522 ms.

--left

SQL Server Execution Times:

run1: CPU time = 6054 ms, elapsed time = 2469 ms.

run2: CPU time = 5836 ms, elapsed time = 2458 ms.

run3: CPU time = 4619 ms, elapsed time = 1706 ms.

--right

SQL Server Execution Times:

run1: CPU time = 5945 ms, elapsed time = 2691 ms.

run2: CPU time = 5725 ms, elapsed time = 2491 ms.

run3: CPU time = 2823 ms, elapsed time = 1453 ms.

I'm sorry this is such a long response. I hope I have been clear. I am interested in your thoughts...

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Sean Lange SSC Guru Points: 286590 More actions · Answer 5

Nice work Alan. I too suspected that an inner join could (and probably will) not produce exactly the right results. Of course in our business accuracy is the first requirement. It doesn't much matter how fast the code runs if the results are incorrect. 😀

Interesting reading the results of your test using the different join types. Somewhat surprised the inner join was slower.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/