Rewriting an update statement without using a cursor

Question

Rewriting an update statement without using a cursor

Julie Hargraves

SSCarpal Tunnel

Points: 4457
More actions
June 30, 2006 at 12:09 pm

#168999

I wrote the following SQL, but I know there has to be a better way to do what I'm trying to accomplish. Basically, I've created a new column in the Drivers table that I need to update only if the driver's license state is different from the one corresponding row in the Policy table. I've been told how evil cursors are in terms of efficiency, but I'm really wondering if I have to use one in this case since I haven't had any luck writing an update that also has a join in it.
The relevant data in the tables looks like this:
Drivers
Id char(20)
DriverNumber smallint
State char(2)
LicenseStateDifferent char(1)
Policy
Id char(20)
State char(2)
INSERT INTO Policy (Id, State) VALUES ( '1', 'MI' )
INSERT INTO Policy (Id, State) VALUES ( '2', 'IL' )
INSERT INTO Drivers (Id, DriverNumber, State) VALUES ( '1', '1', 'MI' )
INSERT INTO Drivers (Id, DriverNumber, State) VALUES ( '1', '2', 'OH' )
INSERT INTO Drivers (Id, DriverNumber, State) VALUES ( '1', '3', '' )
INSERT INTO Drivers (Id, DriverNumber, State) VALUES ( '2', '1', 'IL' )
The only row in Drivers that should get updated is Id = 1, DriverNumber = 2. The primary key on Policy is Id, and the primary key on Drivers is Id and DriverNumber. My query currently looks like this:
DECLARE @Id  char(20)
DECLARE @DriverNumber smallint
DECLARE temp_cursor CURSOR FOR
      SELECT Drivers.Id
                , Drivers.DriverNumber
                    FROM Drivers
                            LEFT JOIN Policy
                                   ON Policy.Id = Drivers.Id
                    WHERE RTrim ( IsNull ( Drivers.State , '' ) ) <> ''
                              AND Policy.State <> Drivers.State
OPEN temp_cursor
FETCH FROM temp_cursor INTO @Id, @DriverNumber
WHILE (@@FETCH_STATUS = 0)
     BEGIN
        UPDATE Drivers
             SET LicenseStateDifferent = 'Y'
             WHERE Drivers.Id = @Id
                    AND Drivers.DriverNumber = @DriverNumber
         FETCH FROM temp_cursor INTO @Id, @DriverNumber
     END
CLOSE temp_cursor
DEALLOCATE temp_cursor
Any help on eliminating the cursor would be appreciated.

The SELECT statement in the view cannot contain these Transact-SQL syntax elements:

The select list cannot use the * or table_name.* syntax to specify columns. Column names must be explicitly stated.
A table column name used as a simple expression cannot be specified in more than one view column. A column can be referenced multiple times provided all, or all but one, reference to the column is part of a complex expression or a parameter to a function. For example, this select list is invalid:
```
SELECT ColumnA, ColumnB, ColumnA
```
These select lists are valid:
```
SELECT ColumnA, AVG(ColumnA), ColumnA + Column B AS AddColAColBSELECT SUM(ColumnA), ColumnA % ColumnB AS ModuloColAColB
```
A derived table.
Rowset functions.
UNION operator.
Subqueries.
Outer or self joins.
TOP clause.
ORDER BY clause.
DISTINCT keyword.
COUNT(*) (COUNT_BIG(*) is allowed.)

The AVG, MAX, MIN, STDEV, STDEVP, VAR, or VARP aggregate functions. If AVG, MAX, MIN, STDEV, STDEVP, VAR, or VARP are specified in queries referencing the indexed view, the optimizer can often calculate the needed result if the view select list contains these substitute functions.

Complex aggregate function

Substitute simple aggregate functions

AVG(X)

SUM(X), COUNT_BIG(X)

STDEV(X)

SUM(X), COUNT_BIG(X), SUM(X**2)

STDEVP(X)

SUM(X), COUNT_BIG(X), SUM(X**2)

VAR(X)

SUM(X), COUNT_BIG(X), SUM(X**2)

VARP(X)

SUM(X), COUNT_BIG(X), SUM(X**2)

For example, an indexed view select list cannot contain the expression AVG(SomeColumn). If the view select list contains the expressions SUM(SomeColumn) and COUNT_BIG(SomeColumn), SQL Server can calculate the average for a query that references the view and specifies AVG(SomeColumn).

A SUM function that references a nullable expression.
The full-text predicates CONTAINS or FREETEXT.
COMPUTE or COMPUTE BY clause.

If GROUP BY is not specified, the view select list cannot contain aggregate expressions.
If GROUP BY is specified, the view select list must contain a COUNT_BIG(*) expression, and the view definition cannot specify HAVING, CUBE, or ROLLUP.
A column resulting from an expression that either evaluates to a float value or uses float expressions for its evaluation cannot be a key of an index in an indexed view or a table.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

PW-201837 SSC-Insane Points: 20805 More actions · Answer 1

UPDATE D

SET LicenseStateDifferent =

CASE WHEN P.State IS NULL THEN 'Y' ELSE 'N' END

FROM Drivers As D

LEFT JOIN Policy As P

ON ( D.Id = P.Id AND

(D.State = P.State OR D.State = '') )

stax68 SSChampion Points: 11711 More actions · Answer 2

It might be better just to write a view for this - especially if drivers might change policy or state. That way you know the denormalised values always reflect the underlying data. You can index the view if performance is an issue.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 3

Shouldn't need to index the view if the underlying tables are properly indexed... indexing a view also requires the view to be schema bound and that all UDF's addressed by the view also be created with the schema binding option. On top of that, I think the restrictions for an indexed view just aren't worth it... the query that PW posted supposedly (I haven't tried it) will not work in an indexed view because it contains an outer join. From BOL...

Julie Hargraves SSCarpal Tunnel Points: 4457 More actions · Answer 4

Considering this is going to be a one-time conversion once we add the column to the table, I don't really see the need for a view. The data itself can change, but we take care of data integrity programmatically in our web application. I have yet to try the query posted here, too, but I'll go test that out right now.

Julie Hargraves SSCarpal Tunnel Points: 4457 More actions · Answer 5

Looks like the only thing I need to do differently on the query is only update the rows that will have LicenseStateDifferent = 'Y'. The field is originally defaulted to 'N' since this is the most common value. I don't want to update every row in the table since there are around 300,000 rows or something like that. I figured I would default it to 'N' for everyone and then only update the ones that are 'Y' with this conversion. If the state in Drivers is blank, we're leaving the value to 'N' because it means the driver hasn't selected their state yet, and it's most likely to be 'N' but when they do select their state, the web app will take care of the data integrity part. Thanks for everyone's help so far! I might be able to figure it out from here on my own, but I wouldn't turn down a little extra help.

Julie Hargraves SSCarpal Tunnel Points: 4457 More actions · Answer 6

Never mind. I did get it on my own. The following query is the one I'll be using:

UPDATE D

SET LicenseStateDifferent = 'Y'

FROM Drivers As D

LEFT JOIN Policy As P

ON D.Id = P.Id

WHERE D.State <> P.State

AND RTrim ( IsNull ( D.State, '' ) ) <> ''

stax68 SSChampion Points: 11711 More actions · Answer 7

Nearly there, but you don't and never did need the left join (Jeff!). Your initialisation query may be once only, but the minatenance of the data isn't once only, is it. The web app has to do it. I don't like web apps maintaining my data.

Jeff:

>Shouldn't need to index the view if the underlying tables are properly indexed

Well that isn't generally true, is it. An example - indexing highly aggregated data from a huge table. Note I specified 'if performance is an issue'. It probably won't be if the tables are indexed, no. I generally mention this possibility to forestall specious objections to eliminating denormalised data, based on performance/contention.

>the restrictions for an indexed view just aren't worth it

Which of the disallowed features do you wish you could use? Select *? Float data? Perhaps preconcatenating data with a UNION? Surely not derived tables, since you can use a view instead, which promotes code reuse hence consistency. Rowset functions, perhaps? Nah. And I can't imagine you're bothered by being unable to enforce an ephemeral order on a whole view's worth of data (without necessarily indexing the ordering columns), when you can always order in the outermost SELECT. Maybe it's the fact that some aggregates have to be stored in decomposed form, even though the aggregates will still be usable from outside the view? Seems a bit picky. Try averaging some averages one day and see how you get on...

No, it must be subqueries and left joins. Neither are needed here, of course, or in many other expensive queries. In those cases, the restrictions are neither worth it nor not worth it. Worth what?

There are other reasons wieghing against using indexed views, e.g. maintenance overhead in a highly volatile OLTP environment, but in a given case, the restrictions will usually either apply, and rule it out altogether, or not apply, so having no relevance. It's not a question of being 'worth it'.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 8

>Shouldn't need to index the view if the underlying tables are properly indexed

Well that isn't generally true, is it.

Yes, I've found that it is generally true especially on aggragated data from a huge table if the aggragates

and indexes are well formed. Check it out (single table test, multiple table join works in a similar fashion)...

--===== Create and populate a 4 million row test table to test with.

SELECT TOP 4000000

IDENTITY(INT,1,1) AS RowNum,

CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT) AS UserID,

CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))

+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) AS SomeValue,

'A column for kicks' AS Kicks,

'Still another column just for proofing' AS StillAnother,

CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY) AS SomeNumber1,

CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY) AS SomeNumber2,

CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY) AS SomeNumber3,

CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME) AS ADate --(>=01/01/2000 <01/01/2010)

INTO dbo.BigTest

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

GO

--===== Add a calculated column to strip the time from the date.

ALTER TABLE dbo.BigTest

ADD AMonth AS YEAR(ADate)*100+MONTH(ADate)

GO

--===== Create a test view

CREATE VIEW vBigTest

AS

SELECT SomeValue, AMonth,

CASE

WHEN LEFT(SomeValue,1)=RIGHT(SomeValue,1)

THEN AVG(SomeNumber1+SomeNumber2+SomeNumber3)

WHEN LEFT(SomeValue,1)<RIGHT(SomeValue,1)

THEN AVG(SomeNumber1+SomeNumber2)

WHEN LEFT(SomeValue,1)>RIGHT(SomeValue,1)

THEN AVG(SomeNumber2+SomeNumber3)

END AS AvgSomeNumber

FROM dbo.BigTest

GROUP BY SomeValue,AMonth

--===== Show the execution plan without and with indexes including a primary key

GO

SET SHOWPLAN_TEXT ON

GO

SELECT * FROM vBigTest

GO

SET SHOWPLAN_TEXT OFF

GO

ALTER TABLE dbo.BigTest ADD PRIMARY KEY CLUSTERED (RowNum)

GO

SET SHOWPLAN_TEXT ON

GO

SELECT * FROM vBigTest

GO

SET SHOWPLAN_TEXT OFF

GO

CREATE INDEX BigTest2 ON dbo.BigTest(SomeValue, AMonth, SomeNumber1, SomeNumber2, SomeNumber3)

GO

SET SHOWPLAN_TEXT ON

GO

SELECT * FROM vBigTest

GO

SET SHOWPLAN_TEXT OFF

GO

... produces... {edit... previously included result of a "cubed" view instead of simple "group by". Has been updated.}


(400000 row(s) affected)

StmtText                  
------------------------- 
SELECT * FROM vBigTest

(1 row(s) affected)

StmtText                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  |--Compute Scalar(DEFINE: ([Expr1005]=If (substring(Convert([BigTest].[SomeValue]), 1, 1)=right(Convert([BigTest].[SomeValue]), 1)) then [Expr1002] else If (substring(Convert([BigTest].[SomeValue]), 1, 1)<right(Convert([BigTest].[SomeValue]), 1)) then [Expr1003] else If (substring(Convert([BigTest].[SomeValue]), 1, 1)>right(Convert([BigTest].[SomeValue]), 1)) then [Expr1004] else NULL))
       |--Compute Scalar(DEFINE: ([Expr1002]=If ([Expr1024]=0) then NULL else ([Expr1025]/Convert([Expr1024])), [Expr1003]=If ([Expr1026]=0) then NULL else ([Expr1027]/Convert([Expr1026])), [Expr1004]=If ([Expr1028]=0) then NULL else ([Expr1029]/Convert([Expr1028]))))
            |--Hash Match(Aggregate, HASH: ([BigTest].[SomeValue], [BigTest].[AMonth]), RESIDUAL: ([BigTest].[SomeValue]=[BigTest].[SomeValue] AND [BigTest].[AMonth]=[BigTest].[AMonth]) DEFINE: ([Expr1024]=COUNT_BIG([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1025]=SUM([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1026]=COUNT_BIG([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]), [Expr1027]=SUM([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]), [Expr1028]=COUNT_BIG([BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1029]=SUM([BigTest].[SomeNumber2]+[BigTest].[SomeNumber3])))
                 |--Compute Scalar(DEFINE: ([BigTest].[AMonth]=datepart(year, dbo.BigTest.[ADate])*100+datepart(month, dbo.BigTest.[ADate])))
                      |--Table Scan(OBJECT: ([BELUTION].[dbo].[BigTest]))

(5 row(s) affected)

StmtText                  
------------------------- 
SELECT * FROM vBigTest

(1 row(s) affected)

StmtText                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  |--Compute Scalar(DEFINE: ([Expr1005]=If (substring(Convert([BigTest].[SomeValue]), 1, 1)=right(Convert([BigTest].[SomeValue]), 1)) then [Expr1002] else If (substring(Convert([BigTest].[SomeValue]), 1, 1)<right(Convert([BigTest].[SomeValue]), 1)) then [Expr1003] else If (substring(Convert([BigTest].[SomeValue]), 1, 1)>right(Convert([BigTest].[SomeValue]), 1)) then [Expr1004] else NULL))
       |--Compute Scalar(DEFINE: ([Expr1002]=If ([Expr1024]=0) then NULL else ([Expr1025]/Convert([Expr1024])), [Expr1003]=If ([Expr1026]=0) then NULL else ([Expr1027]/Convert([Expr1026])), [Expr1004]=If ([Expr1028]=0) then NULL else ([Expr1029]/Convert([Expr1028]))))
            |--Hash Match(Aggregate, HASH: ([BigTest].[SomeValue], [BigTest].[AMonth]), RESIDUAL: ([BigTest].[SomeValue]=[BigTest].[SomeValue] AND [BigTest].[AMonth]=[BigTest].[AMonth]) DEFINE: ([Expr1024]=COUNT_BIG([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1025]=SUM([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1026]=COUNT_BIG([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]), [Expr1027]=SUM([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]), [Expr1028]=COUNT_BIG([BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1029]=SUM([BigTest].[SomeNumber2]+[BigTest].[SomeNumber3])))
                 |--Compute Scalar(DEFINE: ([BigTest].[AMonth]=datepart(year, dbo.BigTest.[ADate])*100+datepart(month, dbo.BigTest.[ADate])))
                      |--Clustered Index Scan(OBJECT: ([BELUTION].[dbo].[BigTest].[PK__BigTest__0B9350C2]))

(5 row(s) affected)

StmtText                  
------------------------- 
SELECT * FROM vBigTest

(1 row(s) affected)

StmtText                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  |--Compute Scalar(DEFINE: ([Expr1005]=If (substring(Convert([BigTest].[SomeValue]), 1, 1)=right(Convert([BigTest].[SomeValue]), 1)) then [Expr1002] else If (substring(Convert([BigTest].[SomeValue]), 1, 1)<right(Convert([BigTest].[SomeValue]), 1)) then [Expr1003] else If (substring(Convert([BigTest].[SomeValue]), 1, 1)>right(Convert([BigTest].[SomeValue]), 1)) then [Expr1004] else NULL))
       |--Compute Scalar(DEFINE: ([Expr1002]=If ([Expr1024]=0) then NULL else ([Expr1025]/Convert([Expr1024])), [Expr1003]=If ([Expr1026]=0) then NULL else ([Expr1027]/Convert([Expr1026])), [Expr1004]=If ([Expr1028]=0) then NULL else ([Expr1029]/Convert([Expr1028]))))
            |--Stream Aggregate(GROUP BY: ([BigTest].[SomeValue], [BigTest].[AMonth]) DEFINE: ([Expr1024]=COUNT_BIG([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1025]=SUM([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1026]=COUNT_BIG([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]), [Expr1027]=SUM([BigTest].[SomeNumber1]+[BigTest].[SomeNumber2]), [Expr1028]=COUNT_BIG([BigTest].[SomeNumber2]+[BigTest].[SomeNumber3]), [Expr1029]=SUM([BigTest].[SomeNumber2]+[BigTest].[SomeNumber3])))
                 |--Index Scan(OBJECT: ([BELUTION].[dbo].[BigTest].[BigTest2]), ORDERED FORWARD)

(4 row(s) affected)

 Try averaging some averages one day and see how you get on

Don't need to try it... Proper use of ROLLUP and CUBE usually make that task pretty simple.

Which of the disallowed features do you wish you could use?

These, Tim...

stax68 SSChampion Points: 11711 More actions · Answer 9

Here's an indexed view version.
Performance stats are shown at the bottom for three tests, comparing use of base table indexes v use of view indexes.

[edit: revised and expanded, nothing essential changed.]

--/*--session settings-------------

set

numeric_roundabort off

set ansi_nulls on

set ansi_padding on

set ansi_warnings on

set arithabort on

set concat_null_yields_null on

set quoted_identifier on

--*/--end session settings---------

go--

--

/*----CLEANUP CODE---------------------------------------------------------

drop view dbo.vBigTestA_C_FromTables

go--

drop view dbo.vBigTestA_C_FromViewIndex

go--

drop view dbo.vBigTestA_C_AGG

go--

drop table dbo.MoreData

go--

drop table dbo.BigTest

go--

--

*/----END CLEANUP CODE-----------------------------------------------------

--

--/*--DDL AND TEST DATA----------------------------------------------------

create

table MoreData(RowNum int identity not null primary key clustered, SomeNumber3 money not null)

go--

--

create

table dbo.BigTest

(

Rownum int identity not null unique clustered, MoreDataRowNum int not null

, Userid int not null, SomeValue varchar(2) not null

, Kicks varchar(1000) null

, EighthOfAPage varchar(1000) null

, SomeNumber1 money not null, SomeNumber2 money not null

, ADate datetime not null, AMonth as YEAR(ADate)*100+MONTH(ADate)

)

go--

--

--in the TOP expression, use the square root of the desired total number of records

--so using 1415 results in 2,002,225 records in table BigTest

insert

MoreData(SomeNumber3) select top 2000 rand(cast(newid() as varbinary))*100 SomeNumber3

from

master.dbo.SysColumns sc1

cross

join master.dbo.SysColumns sc2

go--

--

declare

@monot_all bigint, @monot_1_26 bigint, @monot_1_1000 bigint

select @monot_all = count_big(*)

, @monot_1_26 = count_big(*)/26+1

,

@monot_1_1000 = count_big(*)/1000+1 from MoreData

insert

dbo.BigTest(MoreDataRowNum,Userid,SomeValue,Kicks,EighthOfAPage,SomeNumber1,SomeNumber2,ADate)

select

m.Rownum MoreDataRowNum

,rand(cast(newid() as varbinary))*50000+1 Userid

,CHAR(STR(m.Rownum/@monot_1_26+65))

+ CHAR(STR(m2.Rownum/@monot_1_26+65)) SomeValue

,'A column for kicks' Kicks

,'X' --replicate('X',1000) EighthOfAPage

,m.Rownum/@monot_1_1000 + (1.0 - 1.0/m.Rownum) SomeNumber1

,m2.Rownum/@monot_1_1000 + 55.0/m.Rownum SomeNumber2

,rand(cast(newid() as varbinary))*3653.0+36524.0 ADate --(>=01/01/2000 <01/01/2010)

from MoreData m

cross join MoreData m2

go--

--

create

index IX_BigTest_covering on dbo.BigTest(Rownum, MoreDataRowNum, SomeValue, AMonth, SomeNumber1, SomeNumber2)

go--

--

create

view dbo.vBigTestA_C_FromTables

AS

SELECT SomeValue, AMonth,

CASE

WHEN LEFT(SomeValue,1)=RIGHT(SomeValue,1)

THEN AVG(SomeNumber1+SomeNumber2+SomeNumber3)

WHEN LEFT(SomeValue,1)<RIGHT(SomeValue,1)

THEN AVG(SomeNumber1+SomeNumber2)

WHEN LEFT(SomeValue,1)>RIGHT(SomeValue,1)

THEN AVG(SomeNumber2+SomeNumber3)

END AS AvgSomeNumber,

sum(SomeNumber1) SomeNumber1 ,sum(SomeNumber2) SomeNumber2, sum(SomeNumber3) SomeNumber3

from dbo.BigTest b join dbo.MoreData m

on b.MoreDataRowNum = m.RowNum

where right(SomeValue,1) not between 'A' and 'C'

group

by SomeValue,AMonth

go--

--

create

view dbo.vBigTestA_C_AGG

with

schemabinding

as

select AMonth

,left(

b.SomeValue,1) SomeValue_LEFT1

,right(

b.SomeValue,1) SomeValue_RIGHT1

,

sum(b.SomeNumber1) SomeNumber1_SUM

,

sum(b.SomeNumber2) SomeNumber2_SUM

,

sum(m.SomeNumber3) SomeNumber3_SUM

,

count_big(SomeValue) _COUNT

from

dbo.BigTest b

join

dbo.MoreData m

on b.MoreDataRowNum = m.Rownum

where

right(SomeValue,1) not between 'A' and 'C'

group

by AMonth,left(SomeValue,1),right(SomeValue,1)

go--

--

create

unique clustered index IXC_vBigTestA_C_AGG_covering on vBigTestA_C_AGG(SomeValue_LEFT1, SomeValue_RIGHT1, AMonth)

go--

--

create

view vBigTestA_C_FromViewIndex

as

SELECT SomeValue_LEFT1+SomeValue_RIGHT1 SomeValue, AMonth,

case when SomeValue_LEFT1=SomeValue_RIGHT1

then (SomeNumber1_SUM+SomeNumber2_SUM+SomeNumber3_SUM)/_COUNT

when SomeValue_LEFT1<SomeValue_RIGHT1

then (SomeNumber1_SUM+SomeNumber2_SUM)/_COUNT

when SomeValue_LEFT1>SomeValue_RIGHT1

then (SomeNumber2_SUM+SomeNumber3_SUM)/_COUNT

end as AvgSomeNumber

from dbo.vBigTestA_C_AGG with(noexpand)

go--

--/*--END DDL AND TEST DATA------------------------------------------------

--

--/*--performance test code------------------------------------------------

--/*--session settings-------------

set

numeric_roundabort off

set ansi_nulls on

set ansi_padding on

set ansi_warnings on

set arithabort on

set concat_null_yields_null on

set quoted_identifier on

--*/--end session settings---------

set

statistics profile on

set statistics time on

go-- TEST 1: full data set based on original spec, ordered

select

SomeValue, AMonth, AvgSomeNumber from vBigTestA_C_FromTables order by SomeValue

go--

select

SomeValue, AMonth, AvgSomeNumber from vBigTestA_C_FromViewIndex

--

go-- TEST 2: moderately selective query

select

SomeValue

,

AMonth

,

right(SomeValue,1) SomeValueRight1

,

abs(SomeNumber1-SomeNumber3) SomeNumber1_3diff

,

count_big(*) Cnt

from

vBigTestA_C_FromTables

where

right(SomeValue,1) in ('F','X','J','Z','R') or abs(SomeNumber1-SomeNumber3) <=100

group

by SomeValue, AMonth, right(SomeValue,1), abs(SomeNumber1-SomeNumber3)

order

by SomeValue

go--

select

SomeValue_LEFT1+SomeValue_RIGHT1 SomeValue

,

AMonth

,

SomeValue_RIGHT1 SomeValueRight1

,

abs(SomeNumber1_SUM - SomeNumber3_SUM) SomeNumber1_3diff

,

count_big(*) Cnt

from

dbo.vBigTestA_C_AGG with(noexpand)

where

SomeValue_RIGHT1 in ('F','X','J','Z','R') or abs(SomeNumber1_SUM - SomeNumber3_SUM) <=100

group

by SomeValue_LEFT1,SomeValue_RIGHT1, AMonth, abs(SomeNumber1_SUM - SomeNumber3_SUM)

--

go-- TEST 3: more selective query

select

SomeValue

,

AMonth

,

right(SomeValue,1) SomeValueRight1

,

count_big(*) Cnt

from

vBigTestA_C_FromTables

where

SomeValue in ('FR','XR','DR', 'RR')

or

SomeValue between 'JF' and 'JZ'

group

by SomeValue, AMonth, right(SomeValue,1)

order

by SomeValue

go--

select

SomeValue_LEFT1+SomeValue_RIGHT1 SomeValue

,

AMonth

,

SomeValue_RIGHT1 SomeValueRight1

,

count_big(*) Cnt

from

dbo.vBigTestA_C_AGG with(noexpand)

where

(SomeValue_RIGHT1 = 'R' and SomeValue_LEFT1 in ('F','X','D','R'))

or (SomeValue_LEFT1 = 'J' and SomeValue_RIGHT1 between 'F' and 'Z')

group

by SomeValue_LEFT1,SomeValue_RIGHT1, AMonth

go--

set

statistics time off

set statistics profile off

go--

--*/--end performance test code--------------------------------------------

Here's the output:

TEST 1: Complete output from views.

    Query 1 (table indexes) v Query 2 (view indexes).

    result: view indexes found to be 10 times faster than table indexes.
                        table indexes estimated to be 128 times more costly than view indexes.

TEST 2: Moderately selective complex filter applied to views.

    Query 3 (table indexes) v Query 4 (view indexes).

    result: view indexes found to be 20 times faster than table indexes.
                        table indexes estimated to be 89 times more costly than view indexes.

TEST 3: Highly selective filter, grouping applied to views.

    Query 5 (table indexes) v Query 6 (view indexes).

    result: view indexes found to be 70 times faster than table indexes.
                        table indexes estimated to be 1,283 times more costly than view indexes.
                        
COMMENTS: The optimiser's cost estimates may be inaccurate. 
It is possible to check the optimiser's accuracy in predicting rowcounts,
since the actual row count is stored during execution.

For queries 1,3 and 5 (table indexes) the rowcounts have been significantly underestimated for
the two main operations (index scan and hash join).

In queries 2 and 6 the rowcount estimates where correct.                
In query 4 the rowcount was doubled.

This means that the cost ratios given above are likely to be overgenerous(!) to the table index method.

------------------------------------------------------------------------------------------------------------
Query 1: query cost (relative to the batch): 40%
------------------------------------------------------------------------------------------------------------

    Statement text:

           select SomeValue, AMonth, AvgSomeNumber from vBigTestA_C_FromTables order by SomeValue

    Execution Plan:

           StmtText                                                                                                                                                                                                                                               
           -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
           select SomeValue, AMonth, AvgSomeNumber from vBigTestA_C_FromTables order by SomeValue                                                                                                                                                                 
                |--Sort(ORDER BY: (.[SomeValue] ASC))                                                                                                                                                                                                          
                     |--Compute Scalar(DEFINE: ([Expr1010]=CASE WHEN substring([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue],(1),(1))=right([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue],(1)) THEN [Expr1004] ELSE CASE 
                          |--Compute Scalar(DEFINE: ([Expr1004]=CASE WHEN [Expr1028]=(0) THEN NULL ELSE [Expr1029]/CONVERT_IMPLICIT(money,[Expr1028],0) END, [Expr1005]=CASE WHEN [Expr1030]=(0) THEN NULL ELSE [Expr1031]/CONVERT_IMPLICIT(money,[
                               |--Hash Match(Aggregate, HASH: (.[SomeValue], .[AMonth]), RESIDUAL: ([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue] = [Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue] AND [Scheduler].[
                                    |--Hash Match(Inner Join, HASH: ([m].[RowNum])=(.[MoreDataRowNum]))                                                                                                                                            
                                         |--Clustered Index Scan(OBJECT: ([Scheduler].[dbo].[MoreData].[PK__MoreData__24DD5622] AS [m]))                                                                                                           
                                         |--Compute Scalar(DEFINE: (.[AMonth]=[Scheduler].[dbo].[BigTest].[AMonth] as .[AMonth], [Expr1011]=[Scheduler].[dbo].[BigTest].[SomeNumber1] as .[SomeNumber1]+[Scheduler].[dbo].[BigTest].[SomeN
                                              |--Index Scan(OBJECT: ([Scheduler].[dbo].[BigTest].[IX_BigTest_covering] AS ),  WHERE: (right([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue],(1))<'A' OR right([Scheduler].[dbo].

=[Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_LEFT1]+[Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_RIGHT1], [Expr1004]=CASE WHEN [Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_LEFT1]=[Sch

|--Clustered Index Scan(OBJECT: ([Scheduler].[dbo].[vBigTestA_C_AGG].[IXC_vBigTestA_C_AGG_covering]))

Rows PhysicalOp (LogicalOp) EstimateRows EstimateIO EstimateCPU AvgRowSize SubtreeCost

--------- ---------------------- ------------ ---------- ----------- ---------- -----------

71760 71760 0.4390236

0 Compute Scalar 71760 0 0.007176 24 0.4390236

71760 Clustered Index Scan 71760 0.3527546 0.079093 49 0.4318476

    Time statistics:

           SQL Server Execution Times:
                 CPU time = 201 ms,  elapsed time = 2084 ms.
           SQL Server parse and compile time: 
                 CPU time = 40 ms, elapsed time = 140 ms.

------------------------------------------------------------------------------------------------------------
Query 3: query cost (relative to the batch): 37%
------------------------------------------------------------------------------------------------------------

    Statement text:

           select SomeValue, AMonth, right(SomeValue,1) SomeValueRight1, abs(SomeNumber1-SomeNumber3) SomeNumber1_3diff, count_big(*) Cnt from vBigTestA_C_FromTables where right(SomeValue,1) in ('F','X','J','Z','R') or abs(SomeNumber1-SomeN

    Execution Plan:

           StmtText                                                                                                                                                                                                                                               
           -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
           select SomeValue, AMonth, right(SomeValue,1) SomeValueRight1, abs(SomeNumber1-SomeNumber3) SomeNumber1_3diff, count_big(*) Cnt from vBigTestA_C_FromTables where right(SomeValue,1) in ('F','X','J','Z','R') or abs(SomeNumber1-SomeN
                |--Sort(ORDER BY: (.[SomeValue] ASC))                                                                                                                                                                                                          
                     |--Compute Scalar(DEFINE: ([Expr1012]=abs([Expr1007]-[Expr1009]), [Expr1013]=CASE WHEN (0) IS NULL THEN (0) ELSE (1) END))                                                                                                                
                          |--Filter(WHERE: ([Expr1014]='R' OR [Expr1014]='Z' OR [Expr1014]='J' OR [Expr1014]='X' OR [Expr1014]='F' OR abs([Expr1007]-[Expr1009])<=($100.0000)))                                                                             
                               |--Compute Scalar(DEFINE: ([Expr1014]=right([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue],(1))))                                                                                                             
                                    |--Hash Match(Aggregate, HASH: (.[SomeValue], .[AMonth]), RESIDUAL: ([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue] = [Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue] AND [Schedul
                                         |--Hash Match(Inner Join, HASH: ([m].[RowNum])=(.[MoreDataRowNum]))                                                                                                                                    
                                              |--Clustered Index Scan(OBJECT: ([Scheduler].[dbo].[MoreData].[PK__MoreData__24DD5622] AS [m]))                                                                                                   
                                              |--Compute Scalar(DEFINE: (.[AMonth]=[Scheduler].[dbo].[BigTest].[AMonth] as .[AMonth]))                                                                                                    
                                                   |--Index Scan(OBJECT: ([Scheduler].[dbo].[BigTest].[IX_BigTest_covering] AS ),  WHERE: (right([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue],(1))<'A' OR right([Scheduler].
           
           
           Rows     PhysicalOp (LogicalOp) EstimateRows EstimateIO EstimateCPU AvgRowSize SubtreeCost  
           -------- ----------------------  ---------- ---------- ----------- ---------- -----------  
           15716                            38759.57                                     52.92312        
           15716    Sort                    38759.57   0.01126126   2.711797       35    52.92312        
           0        Compute Scalar          38759.57   0            0.003875957    35    50.20007        
           15716    Filter                  38759.57   0            0.190273       35    50.19619        
           0        Compute Scalar          79946.64   0            0.007994664    35    50.00592        
           71760    Hash Match (Aggregate)  79946.64   0            16.06204       33    49.99792        
           3540000  Hash Match (Inner Join) 2034241    0            10.936         33    33.93587        
           2000     Clustered Index Scan    2000       0.00682870   0.002357       19    0.009185703  
           0        Compute Scalar          2040000    0            0.4            31    19.47069        
           3540000  Index Scan              2040000    14.67053     4.400157       29    19.07069

    Time statistics:

           SQL Server Execution Times:
                 CPU time = 12948 ms,  elapsed time = 17230 ms.
           SQL Server parse and compile time: 
                 CPU time = 801 ms, elapsed time = 1538 ms.

------------------------------------------------------------------------------------------------------------
Query 4: query cost (relative to the batch): 0%
------------------------------------------------------------------------------------------------------------

    Statement text:

           select SomeValue_LEFT1+SomeValue_RIGHT1 SomeValue, AMonth, SomeValue_RIGHT1 SomeValueRight1, abs(SomeNumber1_SUM - SomeNumber3_SUM) SomeNumber1_3diff, count_big(*) Cnt from dbo.vBigTestA_C_AGG with(noexpand) where SomeValue_RIGHT

    Execution Plan:
    
           StmtText                                                                                                                                                                                                                                               
           -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
           select SomeValue_LEFT1+SomeValue_RIGHT1 SomeValue, AMonth, SomeValue_RIGHT1 SomeValueRight1, abs(SomeNumber1_SUM - SomeNumber3_SUM) SomeNumber1_3diff, count_big(*) Cnt from dbo.vBigTestA_C_AGG with(noexpand) where SomeValue_RIGHT
                |--Compute Scalar(DEFINE: ([Expr1004]=CASE WHEN (0) IS NULL THEN (0) ELSE (1) END))                                                                                                                                                               
                     |--Compute Scalar(DEFINE: ([Expr1003]=abs([Scheduler].[dbo].[vBigTestA_C_AGG].[SomeNumber1_SUM]-[Scheduler].[dbo].[vBigTestA_C_AGG].[SomeNumber3_SUM]), [Expr1005]=[Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_LEFT1]+[Sche
                          |--Clustered Index Scan(OBJECT: ([Scheduler].[dbo].[vBigTestA_C_AGG].[IXC_vBigTestA_C_AGG_covering]), WHERE: ([Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_RIGHT1]='R' OR [Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValu
           
           
           Rows     PhysicalOp (LogicalOp) EstimateRows EstimateIO EstimateCPU AvgRowSize SubtreeCost  
           -------- --------------------- ------------ ---------- ----------- ---------- -----------  
           15716                          32448                                           0.5915293    
           0        Compute Scalar        32448        0           0.0032448     35       0.5915293    
           0        Compute Scalar        32448        0           0.007176      27       0.4390236    
           15716    Clustered Index Scan  32448        0.3527546   0.079093      34       0.4318476

    Time statistics:

           SQL Server Execution Times:
                 CPU time = 100 ms,  elapsed time = 786 ms.
           SQL Server parse and compile time: 
                 CPU time = 40 ms, elapsed time = 105 ms.

------------------------------------------------------------------------------------------------------------
Query 5: query cost (relative to the batch): 23%
------------------------------------------------------------------------------------------------------------

    Statement text:

           select SomeValue, AMonth, right(SomeValue,1) SomeValueRight1, count_big(*) Cnt from vBigTestA_C_FromTables where SomeValue in ('FR','XR','DR', 'RR') or SomeValue between 'JF' and 'JZ' group by SomeValue, AMonth, right(SomeValue,1

    Execution Plan:
    
           StmtText                                                                                                                                                                                                                                               
           -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
           select SomeValue, AMonth, right(SomeValue,1) SomeValueRight1, count_big(*) Cnt from vBigTestA_C_FromTables where SomeValue in ('FR','XR','DR', 'RR') or SomeValue between 'JF' and 'JZ' group by SomeValue, AMonth, right(SomeValue,1
                |--Compute Scalar(DEFINE: ([Expr1012]=CASE WHEN (0) IS NULL THEN (0) ELSE (1) END, [Expr1013]=right([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue],(1))))                                                                             
                     |--Sort(ORDER BY: (.[SomeValue] ASC))                                                                                                                                                                                                  
                          |--Hash Match(Aggregate, HASH: (.[SomeValue], .[AMonth]), RESIDUAL: ([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue] = [Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue] AND [Scheduler].[dbo].
                               |--Hash Match(Inner Join, HASH: ([m].[RowNum])=(.[MoreDataRowNum]))                                                                                                                                                    
                                    |--Clustered Index Scan(OBJECT: ([Scheduler].[dbo].[MoreData].[PK__MoreData__24DD5622] AS [m]))                                                                                                                   
                                    |--Compute Scalar(DEFINE: (.[AMonth]=[Scheduler].[dbo].[BigTest].[AMonth] as .[AMonth]))                                                                                                                    
                                         |--Index Scan(OBJECT: ([Scheduler].[dbo].[BigTest].[IX_BigTest_covering] AS ),  WHERE: ((right([Scheduler].[dbo].[BigTest].[SomeValue] as .[SomeValue],(1))<'A' OR right([Scheduler].[dbo].[BigTest]

           Rows     PhysicalOp (LogicalOp)  EstimateRows EstimateIO EstimateCPU AvgRowSize SubtreeCost  
           -------- ----------------------- ------------ ---------- ----------- ---------- -----------  
           3000                             1276.67                                        32.26978        
           0        Compute Scalar          1276.67      0          0.000127666   27       32.26978        
           3000     Sort                    1276.67      0.01126126 0.02064973    17       32.26965        
           3000     Hash Match (Aggregate)  1276.67      0          0.4595915     17       32.23774        
           148148   Hash Match (Inner Join) 63695.74     0          0.378274      21       31.77815        
           2000     Clustered Index Scan    2000         0.00682870 0.002357      11       0.009185703  
           0        Compute Scalar          63876.08     0          0.4           27       19.47069        
           148148   Index Scan              63876.08     14.67053   4.400157      21       19.07069

    Time statistics:

           SQL Server Execution Times:
                 CPU time = 10204 ms,  elapsed time = 13017 ms.
           SQL Server parse and compile time: 
                 CPU time = 20 ms, elapsed time = 140 ms.

------------------------------------------------------------------------------------------------------------
Query 6: query cost (relative to the batch): 0%
------------------------------------------------------------------------------------------------------------

    Statement text:

           select SomeValue_LEFT1+SomeValue_RIGHT1 SomeValue, AMonth, SomeValue_RIGHT1 SomeValueRight1, count_big(*) Cnt from dbo.vBigTestA_C_AGG with(noexpand) where (SomeValue_RIGHT1 = 'R' and SomeValue_LEFT1 in ('F','X','D','R')) or (Som

    Execution Plan:
    
           StmtText                                                                                                                                                                                                                                               
           -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
           select SomeValue_LEFT1+SomeValue_RIGHT1 SomeValue, AMonth, SomeValue_RIGHT1 SomeValueRight1, count_big(*) Cnt from dbo.vBigTestA_C_AGG with(noexpand) where (SomeValue_RIGHT1 = 'R' and SomeValue_LEFT1 in ('F','X','D','R')) or (Som
                |--Compute Scalar(DEFINE: ([Expr1003]=CASE WHEN (0) IS NULL THEN (0) ELSE (1) END))                                                                                                                                                               
                     |--Compute Scalar(DEFINE: ([Expr1004]=[Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_LEFT1]+[Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_RIGHT1]))                                                                                      
                          |--Concatenation                                                                                                                                                                                                                  
                                |--Clustered Index Seek(OBJECT: ([Scheduler].[dbo].[vBigTestA_C_AGG].[IXC_vBigTestA_C_AGG_covering]), SEEK: ([Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_LEFT1]='D' AND [Scheduler].[dbo].[vBigTestA_C_AGG].[Some
                                |--Clustered Index Seek(OBJECT: ([Scheduler].[dbo].[vBigTestA_C_AGG].[IXC_vBigTestA_C_AGG_covering]), SEEK: ([Scheduler].[dbo].[vBigTestA_C_AGG].[SomeValue_LEFT1]='J' AND [Scheduler].[dbo].[vBigTestA_C_AGG].[Some

           Rows     PhysicalOp (LogicalOp)  EstimateRows EstimateIO EstimateCPU AvgRowSize SubtreeCost  
           -------- ----------------------- ------------ ---------- ----------- ---------- -----------  
           3000                             3000                                              0.02513807   
           0        Compute Scalar          3000         0          0.0003      27            0.02513807   
           0        Compute Scalar          3000         0          0.0003      19            0.02483807   
           3000     Concatenation           3000         0          0.0003      19            0.02423807   
           480      Clustered Index Seek    480          0.00534722 0.000685    19            0.006032222  
           2520     Clustered Index Seek    2520         0.01497685 0.002929    19            0.01790585

    Time statistics:

           SQL Server Execution Times:
                 CPU time = 0 ms,  elapsed time = 179 ms.
           SQL Server parse and compile time: 
                 CPU time = 0 ms, elapsed time = 1 ms.

Tim Wilkinson

"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 10

Must be a nasty copy'n'paste error... The code doesn't work as posted...

GO must be the only thing on a line...

Every /* must be match with an */...

...

However, I agree that the indexed view is faster... much faster... in fact, 31.7578740 times faster (actual run

time on a single user server from your first example using a query specific tuned index instead of just a covering index)...

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(71760 row(s) affected)

32266 Table view duration (milliseconds)

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(71760 row(s) affected)

1016 Indexed view duration (milliseconds)

The DBCC messages are because I clear cache and buffers before each run.

From a CPU perspective, I got the following (top numbers are the table run) on a separate run...

SQL Server Execution Times:

CPU time = 24421 ms, elapsed time = 31702 ms.

SQL Server Execution Times:

CPU time = 406 ms, elapsed time = 1047 ms.

What a pain in the hiney to get there, though Like the book says,

you absolutely cannot have nulls in the aggragated view columns (a good table won't but it does happen),

you can't combine aggragates with columns so you need to make two views to get 1 indexed view,

you must build your own AVG as SUM(x)/COUNT_BIG(*),

forget about MIN or MAX (did test THAT, I'll take their word on that one based on AVG),

and if you absolutely need to do an outer join, just forget it.

I've not checked what the overhead cause by the index on the for doing an INSERT into the base tables

but I can imagine it would be any more than any other clustered index (which, by the way, can be

a very high cost depending on the index and data being inserted)

Still, with that much speed and low resource usage, you can do some awesome stuff.

So Tim, you were right and I understood what you were saying... yes, base table indexes are used on

normal views but indexed views are much faster and can be "worth it".

Thanks for taking the time to put the code example together.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)