Help writing a SQL

Garadin One Orange Chip Points: 29613 More actions · Answer 1

RBarryYoung (9/22/2009)
Anyhow, here it is, note it probably still needs that Clustered Key thing for optimal performance

Are you SUUUUURE? I wouldn't want to play any favorites!! 😉

Thanks for posting this Barry. I'm gonna take a look at this in depth tomorrow.

Seth Phelabaum

Consistency is only a virtue if you're not a screwup. 😉

Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

RBarryYoung SSC Guru Points: 143329 More actions · Answer 2

Seth: any updates on this?

-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc. "Performance is our middle name."

Garadin One Orange Chip Points: 29613 More actions · Answer 3

Sorry, I just got back from a cruise to the Bahamas, have been completely unplugged from the world for 4 days now. May take today to 'recover' from my 'vacation', but tomorrow I'll definitely have some new updates.

RBarryYoung SSC Guru Points: 143329 More actions · Answer 4

Garadin (9/27/2009)
Sorry, I just got back from a cruise to the Bahamas, have been completely unplugged from the world for 4 days now. May take today to 'recover' from my 'vacation', but tomorrow I'll definitely have some new updates.

Hrrm. I can't honestly say that you're eliciting a lot of sympathy from me here ...

😛

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 5

I concede that my guts can be wrong 😀

But just to add to the debate, this goes some way to resolving quirky update ordering issues.

I haven't seen anything quite like this on the web.

Seems OK on 2005; further testing is appreciated.

Drop table mytab
go
create table mytab
(
    Idx char(1),
    Col1 integer,
    Col2 integer
)
go
Create clustered index idx1 on mytab(idx)
go
drop view vw1
go
create view vw1
as
Select top 99999999999 * from mytab
order by col1
go
delete from mytab
go
insert into mytab(Idx,Col1) values('Z',1)
insert into mytab(Idx,Col1) values('Y',2)
insert into mytab(Idx,Col1) values('X',4)
insert into mytab(Idx,Col1) values('W',3)
insert into mytab(Idx,Col1) values('V',6)
go
declare @roll integer
Select @roll =0
update vw1
set @roll = col1+@Roll,
    col2 =@roll
go
select * from vw1
select * from mytab
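
One way to sanity-check the result, offered as a sketch rather than as part of the original post: compute the running total with a plain correlated subquery ordered by Col1 and put it next to what the quirky update wrote to Col2. It assumes Col1 is unique in mytab, which holds for the five sample rows; the aliases QuirkyTotal and ControlTotal are made up for this example.

```sql
-- Control query: set-based running total in Col1 order.
-- Assumes Col1 values are unique (true for the five sample rows above).
select t.Idx,
       t.Col1,
       t.Col2 as QuirkyTotal,
       (select sum(t2.Col1)
          from mytab t2
         where t2.Col1 <= t.Col1) as ControlTotal
  from mytab t
 order by t.Col1
```

For the sample rows, ControlTotal comes out as 1, 3, 6, 10, 16. Any row where QuirkyTotal differs was updated in some other order.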

Clear Sky SQL
My Blog

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 6

Or even....

declare @roll integer
Select @roll =0
;with cte(Idx , Col1 , Col2)
as
(
    Select top 9999999 Idx , Col1 , Col2
    from mytab order by col1
)
update cte
set @roll = col1+@Roll,
    col2 =@roll
go
select * From mytab
order by col1

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 7

Or how about this if you can't add a column to the original table.

drop table #mytab2
go
create table #mytab2
(
    Idx char(1),
    roll integer)
go
insert into #mytab2 (idx,roll)
select idx,NULL
from mytab
go
declare @roll integer
Select @roll =0
;with cte(Idx , Col1 , Col2 , idxupd)
as
(
    Select top 9999999 mytab.Idx , Col1 , roll,#mytab2.idx
    from mytab join #mytab2
      on mytab.idx = #mytab2.idx
    order by col1
)
update cte
set @roll = col1+@Roll,
    col2 =@roll,
    Idxupd = idx
output inserted.Idxupd,inserted.col2

Jeff Moden SSC Guru Points: 1004474 More actions · Answer 8

Dave Ballantyne (9/28/2009)
this goes some way to resolving quirky update ordering issues.

Heh... what ordering issues?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 9

Jeff Moden (9/28/2009)
Heh... what ordering issues?

Well, just that on your base table you don't (disclaimer: more testing needed) need a clustered index in the order you want to update the data.

Jeff Moden SSC Guru Points: 1004474 More actions · Answer 10

Dave Ballantyne (9/28/2009)
Jeff Moden (9/28/2009)
Heh... what ordering issues?
Well, just that on your base table you don't (disclaimer: more testing needed) need a clustered index in the order you want to update the data.

You have to trust me on this... if you don't have a clustered index in the order you want to update the data, it will fail in a most unpredictable manner. Further, if the clustered index engages during the update (and it frequently does despite your best attempts), it will still update in clustered index order.

If you don't have the correct clustered index on the original table, the only safe way to use the quirky update is to copy the data to a Temp Table and put the correct one on that.

I haven't quite finished the article yet (trying to trim it down from 26 MS Word pages), but I have the proof of the failures I'm talking about in the form of demonstrable code in the article.
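
The copy-to-a-temp-table approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the unfinished article: the tables #Source and #Work, their columns, and the sample rows are all hypothetical, and the TABLOCKX and MAXDOP 1 hints are precautions commonly paired with this technique rather than something stated in this thread.

```sql
-- Hypothetical sample data; every table and column name here is an assumption.
IF OBJECT_ID('tempdb..#Source') IS NOT NULL DROP TABLE #Source;
IF OBJECT_ID('tempdb..#Work')   IS NOT NULL DROP TABLE #Work;

CREATE TABLE #Source (AcctId int NOT NULL, TranDate datetime NOT NULL, Amount money NOT NULL);
INSERT INTO #Source (AcctId, TranDate, Amount) VALUES (2, '20090105', 10);
INSERT INTO #Source (AcctId, TranDate, Amount) VALUES (1, '20090103', 5);
INSERT INTO #Source (AcctId, TranDate, Amount) VALUES (1, '20090101', 7);
INSERT INTO #Source (AcctId, TranDate, Amount) VALUES (2, '20090102', 3);

-- Copy the rows to a work table that we are free to index as needed.
SELECT AcctId, TranDate, Amount, CAST(NULL AS money) AS RunningTotal
  INTO #Work
  FROM #Source;

-- The clustered index is in the order the running total should follow.
CREATE CLUSTERED INDEX IX_Work ON #Work (AcctId, TranDate);

DECLARE @RunningTotal money, @PrevAcct int;
SELECT @RunningTotal = 0, @PrevAcct = -1;

-- Quirky update: the variable carries the total from row to row.
-- The processing order is undocumented behaviour, so verify the results.
UPDATE w
   SET @RunningTotal = RunningTotal =
           CASE WHEN AcctId = @PrevAcct THEN @RunningTotal + Amount
                ELSE Amount
           END,
       @PrevAcct = AcctId
  FROM #Work w WITH (TABLOCKX)
OPTION (MAXDOP 1);

SELECT AcctId, TranDate, Amount, RunningTotal
  FROM #Work
 ORDER BY AcctId, TranDate;
```

Because the update order is not guaranteed by the product, it is worth comparing the output against a cursor or another control query before trusting it.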

Garadin One Orange Chip Points: 29613 More actions · Answer 11

Dave Ballantyne (9/28/2009)
Jeff Moden (9/28/2009)
Heh... what ordering issues?
Well, just that on your base table you don't (disclaimer: more testing needed) need a clustered index in the order you want to update the data.

Also, if you decide not to take Jeff's word for it and want to test this yourself, make sure you don't insert the data in the exact order it needs to be in.
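
A minimal sketch of loading test rows in a scrambled order, so that insertion order cannot accidentally match the order the update needs. The table #TestData and its contents are hypothetical; master.dbo.spt_values is an undocumented system table used here only as a convenient source of the numbers 0 to 999.

```sql
-- Hypothetical test table; names and row counts are assumptions.
IF OBJECT_ID('tempdb..#TestData') IS NOT NULL DROP TABLE #TestData;

CREATE TABLE #TestData (acct int NOT NULL, ndt datetime NOT NULL);

-- 1,000 rows: 10 accounts x 100 consecutive dates, inserted in shuffled order.
INSERT INTO #TestData (acct, ndt)
SELECT n.number % 10 + 1,
       DATEADD(day, n.number / 10, '20090101')
  FROM master.dbo.spt_values n
 WHERE n.type = 'P'
   AND n.number < 1000
 ORDER BY NEWID();
```

An ORDER BY on an INSERT does not guarantee the physical layout of a heap, but shuffling at least removes the case where the load happens to arrive already sorted.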

Garadin One Orange Chip Points: 29613 More actions · Answer 12

OK, so I had some time back at my machine to test your solution, Barry. Once I slightly adjusted your script to match this scenario, it comes very close to tying the quirky update, but doesn't *quite* manage it. I ran the time trials without the execution plan, then re-ran to generate it; it is attached. One interesting thing I found is that yours improves slightly when re-running it, whereas the quirky update stays approximately the same on repeated runs. It also returns the exact same number of rows. A very close second for those who do not "trust" the 3 part update. Thanks for generating this, Barry. I'm going to see if I can get that varbinary code I created the other day working to see if it comes close to this.

Barry's Method

XML Built . Time Elapsed: 16 seconds

(64202 row(s) affected)

Select to Table Done. Total Time Elapsed: 18 seconds

Final Row Count, Barry Pseudo-XML Method:64202

Modified Code:

DECLARE @timer datetime,
        @FRC int
SET @timer = GETDATE()
--create clustered index ix_acct_ndt on MR (acct, ndt)
declare @xml varchar(MAX), @acct int, @nextdate DATETIME, @datstr as varchar(80)
select @xml = '', @acct = -1, @nextdate = '1900-01-01', @datstr = ''
Declare @i int, @z int, @rowstate int
Declare @S16 Varchar(MAX), @S14 Varchar(MAX), @S12 Varchar(MAX), @s10 Varchar(MAX)
Select @i=0, @S16='', @S14='', @S12='', @s10=''
-- @rowstate: 2 = new account, 1 = date at or past the 150-day cutoff, 0 = skip this row
-- @xml is flushed into @s10 every 1024 kept rows, @s10 into @S12 every 4096,
-- @S12 into @S14 every 16384 and @S14 into @S16 every 65536
select
    @z = ROW_NUMBER() OVER(ORDER BY acct,ndt),
    @rowstate = CASE WHEN @acct <> acct THEN 2 WHEN ndt >= @nextdate THEN 1 ELSE 0 END,
    @nextdate = CASE @rowstate WHEN 0 THEN @nextdate ELSE DATEADD(day, 150, ndt) END,
    @i = CASE @rowstate WHEN 0 THEN @i ELSE @i+1 END,
    @datstr = CASE @rowstate WHEN 0 THEN '' ELSE '<r a="' + convert(varchar(20), acct) + '" t="' + convert(varchar(10), ndt, 121) + '"/>' END,
    @S16 = CASE WHEN @rowstate=0 THEN @S16 WHEN @i%65536=0 THEN @S16+@S14+@S12+@S10+@xml+@datstr Else @S16 END,
    @S14 = CASE WHEN @rowstate=0 THEN @S14 WHEN @i%65536=0 THEN '' WHEN @i%16384=0 THEN @S14+@S12+@S10+@xml+@datstr Else @S14 END,
    @S12 = CASE WHEN @rowstate=0 THEN @S12 WHEN @i%16384=0 Then '' WHEN @i%4096=0 THEN @S12+@S10+@xml+@datstr Else @S12 END,
    @s10 = CASE WHEN @rowstate=0 THEN @s10 WHEN @i%4096=0 Then '' WHEN @i%1024=0 THEN @s10+@xml+@datstr Else @s10 END,
    @xml = CASE WHEN @rowstate=0 THEN @xml WHEN @i%1024=0 THEN '' ELSE @xml+@datstr END,
    @acct = acct
from MR
ORDER BY acct,ndt
SELECT @xml = @S16+@S14+@S12+@S10+@xml
Declare @x as XML
select @x = '<d>'+@xml+'</d>'
PRINT 'XML Built . Time Elapsed: ' + CAST(DATEDIFF(s,@Timer,GETDATE()) AS varchar(10)) + ' seconds'
SELECT acct, ndt38
INTO #3
FROM (
    SELECT
        T.c.value('@a[1]', 'int') AS [acct],
        T.c.value('@t[1]', 'varchar(24)') AS [ndt38]
    FROM @x.nodes('/d/r') AS T(c)
) T
--where RIGHT(ndt38,1)='2'
PRINT 'Select to Table Done. Total Time Elapsed: ' + CAST(DATEDIFF(s,@Timer,GETDATE()) AS varchar(10)) + ' seconds'
SET @FRC = (SELECT COUNT(*) FROM #3)
PRINT 'Final Row Count, Barry Pseudo-XML Method:' + CAST(@FRC AS varchar(10))

[Edit] Issue posting the .sqlplan, will post when able.

[Edit2] Appears it is an issue with Chrome.

Garadin One Orange Chip Points: 29613 More actions · Answer 13

I realized after the fact that it might have been a little unclear why I didn't believe Barry's solution *beat* the quirky update, as the quirky takes 19 seconds when it has to create the temp table vs. Barry's 18.

The reason for this is that there are only 2 reasons why you need to make the temp table in the first place.

1. The big one. The clustered index. It's not realistic to assume that the clustered index on your table will always be the one needed for the quirky update.

2. The inability to add any fields to the base table. This is (in my opinion) a much smaller reason and one that won't come into play in most situations. The field required by the quirky update is a bit field which would either be a very minimal size increase or (depending on the other fields in the table) might not actually increase the size of the table at all.

The first reason seemed moot, as Barry's solution used the same clustered index to achieve those performance numbers. The second is minor, but it is a valid reason in some circumstances and his method would win by a few seconds in the situation if the clustered index was correct on the table but you were not allowed to add the field. When I said it almost beat it, I was comparing it to the quirky method sans temp table, where it was only 2-3 seconds behind.

Now, I decided to take my own advice and completely re-create test data in a new table and never index it. The results were... surprising. Note that because the data is now completely unindexed, a temp table is *required* for the quirky update. I'm honestly shocked at how fast Barry's method runs without any index available.

Quirky Method (w/Temp Table)

(2000000 row(s) affected)

Temp Table Insert Done. Time Elapsed: 12 seconds

(2000000 row(s) affected)

Update Done. Time Elapsed: 40 seconds

(64425 row(s) affected)

Select to Table Done. Total Time Elapsed: 42 seconds

Final Row Count, Quirky Method:64425

Barry Method: No Clustered Index.

(1 row(s) affected)

XML Built . Time Elapsed: 20 seconds

(64425 row(s) affected)

(1 row(s) affected)

Select to Table Done. Total Time Elapsed: 23 seconds

(1 row(s) affected)

Final Row Count, Barry Pseudo-XML Method:64425

Original CTE Method: No Clustered Index.

I re-ran this as well but cut it off at 2m 30s as it still wasn't a contender.

Barry's method appears to win hands down when the data is not ordered to begin with. What is confusing me (and why I can't figure out if I did something wrong or not) is that the quirky update itself seems to take significantly longer even after I've inserted the data into the temp table. (30 seconds for 2M rows? It seems like I did something wrong, I'm just not seeing it.)

Execution Plans and code attached. If anyone spots a mistake, please let me know.

RBarryYoung SSC Guru Points: 143329 More actions · Answer 14

Garadin (9/28/2009)
Ok, so I had some time back at my machine to test your solution Barry. It comes very close to tying the quirky update, but doesn't *quite* manage it. (Once I slightly adjusted your script to match this scenario) I ran the time trials without the execution plan then re-ran to generate it. It is attached. One interesting thing I found is that yours improves slightly when re-running it, whereas the quirky update stays approximately the same on repeated runs. It also returns the exact same number of rows. Very close second for those who do not "trust" the 3 part update. Thanks for generating this Barry, I'm going to see if I can get that varbinary code I created the other day working to see if it comes close to this...

I am actually quite surprised that it did as well as it did, and I have to give credit to Richard Fryar for his "string to XML" idea which seems to have fixed one of the major problems that my previous varbinary version had.

However, even though it may have worked well in this case, I still do not believe that it will scale as well as Jeff's solution. Even with all of the tricks that I employ in this approach, it is still basically filling up memory with a gigantic string, which works great until it runs out of memory. At that point its performance will very quickly get 10x or even 100x worse. Not good.

That's why I originally went from a Varchar(MAX)/NVarchar(MAX) approach to a Varbinary(MAX) one: it's probably the most efficient way to pack together a boatload of concatenated data. Unfortunately, the part of that that I thought would be super-fast, extracting the arrayed data using SUBSTRING and fixed-size matrix offset calculations, turned out to be terribly slow. I still don't know for sure why, but apparently big MAX-type strings cannot just be accessed with simple direct-memory offsets (duh, I should have realized that).

Anyway, apparently putting it into an XML-structured variable fixes that (so far), as the extract times have been really good.
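
As a small illustration of the extract step, here is a sketch that shreds an attribute-only XML value of the same `<d><r a=".." t=".."/></d>` shape back into rows. The two `<r>` entries are invented sample values, not data from the tests in this thread.

```sql
-- Build a tiny XML value in the same shape as the pseudo-XML string.
-- The two <r> rows are made-up sample values.
DECLARE @x xml;
SET @x = '<d><r a="101" t="2009-01-15"/><r a="102" t="2009-06-14"/></d>';

-- nodes() yields one row per <r> element; value() reads each attribute.
SELECT T.c.value('@a', 'int')         AS acct,
       T.c.value('@t', 'varchar(24)') AS ndt38
  FROM @x.nodes('/d/r') AS T(c);
```

This returns two rows, one per `<r>` element, with the attribute values converted to the requested SQL types.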

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 15

Garadin (9/28/2009)
Also, if you decide not to take Jeff's word for it and want to test this yourself, make sure you don't insert the data in the exact order it needs to be in.

In my little test scripts I did post the data in a 'random' order and then created the clustered index so it was in a different order. 🙂

Anyway Jeff, I'm sure you've been there, done it and got the T-shirt 😀 and I look forward to your completed document, but I really am struggling to get incorrect results. I've scaled up my test to run on AdventureWorks, and no matter what order or filtering I use, my CTE quirky update returns the same as my 'control' cursor. Must get on with some 'real' work now 😉

Drop table #Balance
go
drop table #CurBalance
go
drop index Sales.SalesOrderHeader.idxOrderDate
go
create index idxOrderDate on Sales.SalesOrderHeader(Orderdate) include(SalesOrderId,SalesPersonId,subtotal,taxamt,freight)
go
create index idxSalesPerson on Sales.SalesOrderHeader(SalesPersonId) include(SalesOrderId,OrderDate,subtotal,taxamt,freight)
go
Create Table #Balance(
    SalesOrderId integer,
    RollingBalance money
)
Create Table #CurBalance(
    SalesOrderId integer,
    RollingBalance money
)
insert into #Balance(SalesOrderId,RollingBalance)
select SalesOrderID,NULL
from Sales.SalesOrderHeader SOH
join Sales.SalesPerson SP
  on SP.SalesPersonID = SOH.SalesPersonID
join Sales.SalesTerritory ST
  on St.TerritoryID = SP.TerritoryID
where OrderDate between '01jan03' and '01may03'
-- and SOH.SalesPersonID in(276 ,277)
-- CountryRegionCode = 'CA'
order by OrderDate, SOH.SalesPersonID
--order by SOH.SalesPersonID ,OrderDate
--Order by st.TerritoryID,OrderDate
--Order by st.CountryRegionCode,OrderDate
Declare @OrderYear integer,
        @OrderMonth integer,
        @SalesPersonId integer,
        @TerritoryId integer,
        @CountryRegionCode char(2),
        @RollingBalance money
Select @OrderYear = 0
Select @OrderMonth = 0
Select @RollingBalance = 0
Select @SalesPersonId =0
;with cteValue(SalesOrderId,OrderDate,SalesPersonId, TotalDue ,RollingBalance,TerritoryID , CountryRegionCode)
as(
    select top 99999999999 SOH.SalesOrderID,OrderDate,SOH.SalesPersonID,TotalDue,RollingBalance, SP.TerritoryID,CountryRegionCode
    from Sales.SalesOrderHeader SOH
    join Sales.SalesPerson SP
      on SP.SalesPersonID = SOH.SalesPersonID
    join Sales.SalesTerritory ST
      on St.TerritoryID = SP.TerritoryID
    join #Balance
      on #Balance.SalesOrderId = SOH.SalesOrderID
    where OrderDate between '01jan03' and '01may03'
    -- and SOH.SalesPersonID in(276 ,277)
    -- CountryRegionCode ='CA'
    --order by OrderDate, SOH.SalesPersonID
    order by SOH.SalesPersonID ,OrderDate
    --Order by st.TerritoryID,OrderDate
    -- Order by st.CountryRegionCode,OrderDate
)
update cteValue
set @RollingBalance = case when @SalesPersonId <> cteValue.SalesPersonId or
                                @OrderMonth <> DATEPART(mm,OrderDate) or
                                @OrderYear <> DATEPART(yy,OrderDate) or
                                @TerritoryId <> TerritoryId
                           then TotalDue
                           else @RollingBalance +TotalDue end,
    RollingBalance = @RollingBalance,
    @OrderYear = DATEPART(yy,OrderDate),
    @OrderMonth = DATEPART(mm,OrderDate),
    @SalesPersonId = cteValue.SalesPersonId,
    @TerritoryId = cteValue.TerritoryID,
    @CountryRegionCode = cteValue.CountryRegionCode
go
Declare @OrderYear integer,
        @OrderMonth integer,
        @SalesPersonId integer,
        @TerritoryId integer,
        @CountryRegionCode char(2),
        @RollingBalance money,
        @PrevOrderYear integer,
        @PrevOrderMonth integer,
        @PrevSalesPersonId integer,
        @PrevTerritoryId integer,
        @PrevCountryRegionCode char(2),
        @SalesOrderId integer,
        @OrderDate datetime,
        @TotalDue money
Select @OrderYear = 0,
       @OrderMonth = 0,
       @RollingBalance = 0,
       @SalesPersonId =0,
       @PrevOrderYear = 0,
       @PrevOrderMonth = 0,
       @PrevSalesPersonId =0,
       @PrevTerritoryId =0,
       @PrevCountryRegionCode='XX'
declare balancecur cursor for
select SOH.SalesOrderID,OrderDate,SOH.SalesPersonID,TotalDue, SP.TerritoryID,CountryRegionCode
from Sales.SalesOrderHeader SOH
join Sales.SalesPerson SP
  on SP.SalesPersonID = SOH.SalesPersonID
join Sales.SalesTerritory ST
  on St.TerritoryID = SP.TerritoryID
where OrderDate between '01jan03' and '01may03'
--and SOH.SalesPersonID in(276 ,277)
--CountryRegionCode ='CA'
--order by OrderDate, SOH.SalesPersonID
order by SOH.SalesPersonID ,OrderDate
--Order by st.TerritoryID,OrderDate
--Order by st.CountryRegionCode,OrderDate
open balancecur
while(0=0) begin
    fetch next from balancecur into @SalesOrderId,@OrderDate,@SalesPersonId,@TotalDue,@TerritoryId,@CountryRegionCode
    if(@@FETCH_STATUS<>0) break
    Select @OrderMonth = DATEPART(mm,@OrderDate),
           @OrderYear = DATEPART(yy,@OrderDate)
    if( @OrderMonth <> @PrevOrderMonth or
        @OrderYear <> @PrevOrderYear or
        @TerritoryId <> @PrevTerritoryId or
        @CountryRegionCode <> @PrevCountryRegionCode or
        @SalesPersonId <> @PrevSalesPersonId
      ) begin
        Select @RollingBalance = 0,
               @PrevOrderMonth = @OrderMonth,
               @PrevOrderYear = @OrderYear,
               @PrevSalesPersonId = @SalesPersonId,
               @PrevCountryRegionCode = @CountryRegionCode ,
               @PrevTerritoryId = @TerritoryId
    end
    Select @RollingBalance = @RollingBalance + @TotalDue
    insert into #CurBalance(SalesOrderId,RollingBalance)
    values(@SalesOrderId,@RollingBalance)
end
close balancecur
deallocate balancecur
go
select COUNT(*) from #CurBalance
go
select COUNT(*) from #Balance
go
select COUNT(*) from #CurBalance join #Balance
  on #CurBalance.SalesOrderId = #Balance.SalesOrderId
 and #CurBalance.RollingBalance= #Balance.RollingBalance
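
The three counts only show whether the totals agree. As a possible follow-up, sketched here as a suggestion rather than as part of the original test, a full outer join can list the individual orders where the cursor and the quirky update disagree. It relies on the #CurBalance and #Balance tables populated by the script above; the column aliases are made up for this example.

```sql
-- Rows where the two methods disagree, or that exist in only one table.
-- An empty result means the quirky update matched the cursor for this run.
select coalesce(c.SalesOrderId, b.SalesOrderId) as SalesOrderId,
       c.RollingBalance as CursorBalance,
       b.RollingBalance as QuirkyBalance
  from #CurBalance c
  full outer join #Balance b
    on b.SalesOrderId = c.SalesOrderId
 where c.SalesOrderId is null
    or b.SalesOrderId is null
    or b.RollingBalance is null
    or c.RollingBalance <> b.RollingBalance
 order by 1
```

Seeing which rows differ, rather than just how many, makes it easier to tell an ordering failure from a difference in the reset logic between the two versions.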
