Quirky Update query...works on server ...fails on PC.

  • Jeff Moden (1/6/2012)


    It would also be interesting to see if that problem causes a regular update to fail for that many rows... perhaps just using ROW_NUMBER() over the PK.

    Not sure if this is what you had in mind, but in a quick test with 256MB, this:

    ;WITH cteBS AS
    (
        SELECT
            RowCounter = ROW_NUMBER() OVER(
                ORDER BY sm.LocID, sm.ProdID, sm.StkTransDate, sm.TranID),
            *
        FROM dbo.StockMovements AS sm
    )
    UPDATE cteBS WITH (TABLOCKX)
    SET StkLocProdRunTotal = RowCounter
    OPTION (MAXDOP 1)

    ...runs successfully and correctly, with:

    Table 'StockMovements'. Scan count 1, logical reads 30947773, physical reads 46407, read-ahead reads 46487

    Table 'Worktable'. Scan count 1, logical reads 30523792, physical reads 56203, read-ahead reads 0

    (10090000 row(s) affected)

    By correctly, I mean this returns no rows after the UPDATE:

    WITH cteBS AS
    (
        SELECT
            RowCounter = ROW_NUMBER() OVER(
                ORDER BY sm.LocID, sm.ProdID, sm.StkTransDate, sm.TranID),
            *
        FROM dbo.StockMovements AS sm
    )
    SELECT *
    FROM cteBS
    WHERE RowCounter <> StkLocProdRunTotal

  • That's about what I had in mind. Thanks, Paul.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • SQL Kiwi (1/6/2012)


    Peter Brinkhaus (1/6/2012)


    Configuration 256MB: the script doesn't terminate. The process is blocking itself.

    The way I read your comments makes it sound like this mysterious self-blocking only occurs when TF 652 is OFF; is that correct? Is there anything extra you can share about the nature of the blocking (things like wait types and so on)? I'm in the process of setting up a 32-bit VM here so I can install 32-bit SQL Servers and have a go myself.

    I finally got the same results as on SQL Server 2005. Results (all runs with a cold data cache):

    run 1: max. server memory 512 MB, traceflag 652=off: division by zero

    run 2: max. server memory 512 MB, traceflag 652=on: succeeded

    Table 'StockMovements'. Scan count 1, logical reads 30316289, physical reads 71123, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 66376, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 126251 ms, elapsed time = 181798 ms.

    run 3: max. server memory 1024 MB, traceflag 652=off: succeeded

    Table 'StockMovements'. Scan count 1, logical reads 30947108, physical reads 19, read-ahead reads 46496, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 100355 ms, elapsed time = 101714 ms.

    run 4: max. server memory 1024 MB, traceflag 652=on: succeeded

    Table 'StockMovements'. Scan count 1, logical reads 30316289, physical reads 5813, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 99685 ms, elapsed time = 106168 ms.

    run 5: max. server memory 256 MB, traceflag 652=off: division by zero

    run 6: max. server memory 256 MB, traceflag 652=on: succeeded

    Table 'StockMovements'. Scan count 1, logical reads 30316289, physical reads 82415, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 66376, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 113818 ms, elapsed time = 161112 ms.

    Note: initially I restarted the server after changing 'max server memory' (according to BOL: to reduce max server memory you may need to restart SQL Server to release the memory). However, I was not able to create the StockMovements table using the script, because SELECT TOP 8000000 ... INTO StockMovements ... caused a deadlock with both max server memory 256MB and 1024MB. So I started with 512MB, created the test tables using the script, and then changed memory to 1024MB and 256MB without restarting the server.
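
    For anyone repeating these runs, a minimal sketch of how 'max server memory' is changed between tests (the values are simply the ones used in the runs above):

    EXEC sys.sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sys.sp_configure 'max server memory (MB)', 512; -- 256, 512 or 1024 depending on the run
    RECONFIGURE;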

    In case it is of interest, the deadlock was caused by trying to convert a lock to an exclusive lock on tempdb.sys.syscolpars. Output from DBCC TRACEON(1222, -1):

    Date,Source,Severity,Message

    01/07/2012 11:57:08,spid27s,Unknown,waiter id=process3612e38 mode=X requestType=convert

    01/07/2012 11:57:08,spid27s,Unknown,waiter-list

    01/07/2012 11:57:08,spid27s,Unknown,owner id=process3612e38 mode=S

    01/07/2012 11:57:08,spid27s,Unknown,owner id=process3612e38 mode=U

    01/07/2012 11:57:08,spid27s,Unknown,owner-list

    01/07/2012 11:57:08,spid27s,Unknown,keylock hobtid=562949956108288 dbid=2 objectname=tempdb.sys.syscolpars indexname=nc id=lock5da31c0 mode=U associatedObjectId=562949956108288

    01/07/2012 11:57:08,spid27s,Unknown,resource-list

    01/07/2012 11:57:08,spid27s,Unknown,CROSS JOIN sys.all_columns ac3

    01/07/2012 11:57:08,spid27s,Unknown,CROSS JOIN sys.all_columns ac2

    01/07/2012 11:57:08,spid27s,Unknown,FROM sys.all_columns ac1

    01/07/2012 11:57:08,spid27s,Unknown,INTO StockMovements

    01/07/2012 11:57:08,spid27s,Unknown,StkLocProdRunTotal = 0

    01/07/2012 11:57:08,spid27s,Unknown,StkTransDate = Dateadd(dd<c/> Abs(Checksum(Newid()))%Datediff(dd<c/> '2009'<c/> '2011')<c/> '2009')<c/>

    01/07/2012 11:57:08,spid27s,Unknown,Qty = -10 + CAST(Rand(Checksum(Newid())) * 10 AS INT)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,LocID = 1 +CAST(Abs(Checksum(Newid()) % 9 ) AS INT)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,ProdID = 1 +CAST(Abs(Checksum(Newid()) % 9000 ) AS INT)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,TranID = IDENTITY(INT<c/> 1<c/> 1)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,SELECT TOP 8000000---- NOTE 8 MILLION rows

    01/07/2012 11:57:08,spid27s,Unknown,--OUTWARD MOVEMENTS

    01/07/2012 11:57:08,spid27s,Unknown,inputbuf

    01/07/2012 11:57:08,spid27s,Unknown,CROSS JOIN sys.all_columns ac3

    01/07/2012 11:57:08,spid27s,Unknown,CROSS JOIN sys.all_columns ac2

    01/07/2012 11:57:08,spid27s,Unknown,FROM sys.all_columns ac1

    01/07/2012 11:57:08,spid27s,Unknown,INTO StockMovements

    01/07/2012 11:57:08,spid27s,Unknown,StkLocProdRunTotal = 0

    01/07/2012 11:57:08,spid27s,Unknown,StkTransDate = Dateadd(dd<c/> Abs(Checksum(Newid()))%Datediff(dd<c/> '2009'<c/> '2011')<c/> '2009')<c/>

    01/07/2012 11:57:08,spid27s,Unknown,Qty = -10 + CAST(Rand(Checksum(Newid())) * 10 AS INT)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,LocID = 1 +CAST(Abs(Checksum(Newid()) % 9 ) AS INT)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,ProdID = 1 +CAST(Abs(Checksum(Newid()) % 9000 ) AS INT)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,TranID = IDENTITY(INT<c/> 1<c/> 1)<c/>

    01/07/2012 11:57:08,spid27s,Unknown,SELECT TOP 8000000---- NOTE 8 MILLION rows

    01/07/2012 11:57:08,spid27s,Unknown,frame procname=adhoc line=3 stmtstart=46 sqlhandle=0x02000000a0c08108c99c578ed4a30afece45a28e8b418372

    01/07/2012 11:57:08,spid27s,Unknown,executionStack

    01/07/2012 11:57:08,spid27s,Unknown,process id=process3612e38 taskpriority=0 logused=2804 waitresource=KEY: 2:562949956108288 (8c02f49baf53) waittime=1071 ownerId=11282 transactionname=droptemp lasttranstarted=2012-01-07T11:57:07.673 XDES=0x6b5c280 lockMode=X schedulerid=4 kpid=3636 status=suspended spid=54 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2012-01-07T11:56:51.790 lastbatchcompleted=2012-01-07T11:56:37.883 clientapp=Microsoft SQL Server Management Studio - Query hostname=HELSINKI hostpid=7952 loginname=HELSINKI\Peter isolationlevel=read committed (2) xactid=10950 currentdb=2 lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200

    01/07/2012 11:57:08,spid27s,Unknown,process-list

    01/07/2012 11:57:08,spid27s,Unknown,deadlock victim=process3612e38

    01/07/2012 11:57:08,spid27s,Unknown,deadlock-list

    Looking at the owner- and waiting-list, the query seems to deadlock itself. Anybody got an idea why?
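
    For readability, the statement captured in the inputbuf above (the trace writes its lines in reverse order, with commas encoded as <c/>) reconstructs to roughly this:

    --OUTWARD MOVEMENTS
    SELECT TOP 8000000 ---- NOTE 8 MILLION rows
           TranID = IDENTITY(INT, 1, 1),
           ProdID = 1 + CAST(Abs(Checksum(Newid()) % 9000) AS INT),
           LocID  = 1 + CAST(Abs(Checksum(Newid()) % 9) AS INT),
           Qty    = -10 + CAST(Rand(Checksum(Newid())) * 10 AS INT),
           StkTransDate = Dateadd(dd, Abs(Checksum(Newid())) % Datediff(dd, '2009', '2011'), '2009'),
           StkLocProdRunTotal = 0
    INTO   StockMovements
    FROM   sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2
    CROSS JOIN sys.all_columns ac3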

  • So what is the conclusion here? That you shouldn't try to do "enterprise-scale" quirky updates on a box that has less memory than most of today's handheld scientific calculators? 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/7/2012)


    So what is the conclusion here? That you shouldn't try to do "enterprise-scale" quirky updates on a box that has less memory than most of today's handheld scientific calculators? 😀

    I'll leave the conclusion to those considering using the quirky update; I have never had the need for it. I am just trying to give Paul some of the answers he asked for. And, as the deadlock problem shows, things can go well with less memory than most of today's handheld scientific calculators while failing with more.

  • Peter Brinkhaus (1/7/2012)


    I finally got the same results as on SQL Server 2005. Results (all runs with a cold data cache):

    Fantastic work Peter, and luckily no surprises, so great confirmation.

    Looking at the owner- and waiting-list, the query seems to deadlock itself. Anybody got an idea why?

    I originally misunderstood you - I thought you meant the quirky update ended up blocking itself. The deadlocked script looks very much like a bug (hopefully long since fixed).

  • Jeff Moden (1/7/2012)


    So what is the conclusion here? That you shouldn't try to do "enterprise-scale" quirky updates on a box that has less memory than most of today's handheld scientific calculators? 😀

    I think that conclusion, smiley notwithstanding, risks offending those that have spent significant personal time looking into this.

    My reasonably firm belief right now is that SQL Server (since at least 2005, and on both processor architectures) exhibits behaviour that breaks the quirky update when memory is low enough that (a) significant read-ahead is required to bring pages into memory; and (b) there is some kind of contention between what read-ahead is trying to do and other memory management features.

    So, my expectation is that this behaviour is likely to occur on a server with any amount of RAM, if the server as a whole is under memory pressure. The only sensible move at this point seems to be to recommend that the quirky update is only ever used with the safety check in place. Without the safety check, as we have seen, the quirky update can produce wrong results silently, even where all the published rules have been followed exactly.

    I have experienced test failures on 64-bit (where structures and memory use are generally larger than on 32-bit) with as much as 1GB max server memory set, with no concurrent workload, obviously. Even if the point about larger memory systems under memory pressure is disputed, there will be many x64 systems out there running 1GB or so - think VMs, lower SKUs, that sort of thing. If this were my method, I would be taking all this pretty seriously. After all, the test table in question is only a few million rows in about 350MB - hardly up in any exclusive "Enterprise-scale" realm.

    To go much deeper, I would need to do some work with a debugger attached to SQL Server - but that is extremely time-intensive. Tests on a larger amount of memory with concurrent workload (memory pressure) should be easier.
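
    For readers who have not seen it, here is a minimal sketch of the kind of safety check being referred to. The Seq column is a hypothetical helper assumed to hold ROW_NUMBER() values computed in clustered-key order (as in the UPDATE posted earlier in the thread), and the running-total expression is illustrative only, not the exact script under test:

    -- Sketch only: Seq is assumed to be pre-populated in clustered-key order
    -- (LocID, ProdID, StkTransDate, TranID).
    DECLARE @RunTotal int, @PrevLocID int, @PrevProdID int, @Counter int;
    SELECT  @RunTotal = 0, @PrevLocID = -1, @PrevProdID = -1, @Counter = 0;

    UPDATE sm WITH (TABLOCKX)
    SET @RunTotal = StkLocProdRunTotal =
            CASE WHEN LocID = @PrevLocID AND ProdID = @PrevProdID
                 THEN @RunTotal + Qty
                 ELSE Qty
            END,
        -- Safety check: if rows are ever processed out of Seq order, 1/0 raises
        -- "Divide by zero" and aborts the update rather than silently writing
        -- wrong results (the error seen in the low-memory tests above).
        @Counter = CASE WHEN Seq = @Counter + 1 THEN @Counter + 1 ELSE 1/0 END,
        @PrevLocID  = LocID,
        @PrevProdID = ProdID
    FROM dbo.StockMovements AS sm
    OPTION (MAXDOP 1);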

  • SQL Kiwi (1/7/2012)


    I think that conclusion, smiley notwithstanding, risks offending those that have spent significant personal time looking into this.

    Sorry ol' friend. Didn't mean to come across as snarky. I realize that you good folks have put in a good amount of work on this subject and my comment wasn't meant to devalue that in any way.

    However, and without any offense to anyone, there have been a large number of posts on how to do massive deletes and updates, and why they should be segmented into smaller batches even on enterprise-class machines. For example, updating a million rows on a given machine might take 3 seconds. Updating 2 million rows on the same machine might take only double that, or 6 seconds. Yet, when updating 3 million rows, instead of taking only triple the time (9 seconds), the single update might take 20 minutes to 2 hours (numbers from a couple of personal experiences). It just seems incredibly odd to me that the participants of this thread haven't mentioned that, quirky update or not, it's not a good thing to do a single 10-million-row update on a machine that's not much more capable than a toaster, or on a machine that is already under memory pressure.
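
    (For illustration only, a minimal sketch of that kind of batched update, keyed on the TranID identity from the test table. The batch size, the loop, and the trivial SET expression are assumptions for the sake of the example, not part of any script tested in this thread.)

    DECLARE @BatchSize int, @FromID int, @MaxID int;
    SELECT  @BatchSize = 1000000, @FromID = 1, @MaxID = MAX(TranID)
    FROM    dbo.StockMovements;

    WHILE @FromID <= @MaxID
    BEGIN
        -- Each pass touches at most one batch, keeping the transaction,
        -- log usage and memory footprint of each statement small.
        UPDATE dbo.StockMovements
        SET    StkLocProdRunTotal = 0      -- placeholder expression
        WHERE  TranID >= @FromID
        AND    TranID <  @FromID + @BatchSize;

        SET @FromID = @FromID + @BatchSize;
    END;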

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • As I have said earlier...and for the record

    I haven't had any issues before now...and this occurrence ONLY materialized on a memory compromised PC....it has never occurred on our servers.

    The reason why I raised it was because I was interested in understanding why it was failing on the PC...and I believe that others (Paul in particular) have identified the cause....for which I very much appreciate their efforts and detailed analysis.

    I always use the safety check...and this is what will save my ar5e...if and when a server comes under memory pressure. (these updates are only performed out of hours)

    @jeff..."it's not a good thing to do a single 10 million row update on a machine that's not much more capable than a toaster".... :-P:-P:-P......I do understand your sentiment and will be using your analogy to ask for a new PC.

    Thanks to all.

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • SQL Kiwi (1/7/2012)


    The only sensible move at this point seems to be to recommend that the quirky update is only ever used with the safety check in place.

    P.S. I almost forgot... I absolutely agree with that statement and have since you made that particular change on the related thread.

    Shifting gears... I wonder if the old index hint that I used to use (you were actually a bit disappointed that I dumped the hint, IIRC) would fix the problems you good folks are running into with shy amounts of memory.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • as I spawned this thread...thought it only polite to report back

    based on my pc (with the memory of a toaster) .......

    using traceflag 652=ON.....works every time

    using WITH (INDEX(0)) in the cte...works every time

    though not surprisingly.....far slower in both cases.
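
    For clarity, the two workarounds look roughly like this. This is a sketch only: the query shown is the ROW_NUMBER() form posted earlier in the thread, standing in for the actual update, and the trace flag is enabled server-wide as in the other tests reported here.

    -- Workaround 1: trace flag 652 (disables read-ahead), server-wide.
    DBCC TRACEON (652, -1);

    -- Workaround 2: INDEX(0) hint on the table reference inside the cte,
    -- forcing an unordered scan of the clustered index.
    ;WITH cteBS AS
    (
        SELECT RowCounter = ROW_NUMBER() OVER(
                   ORDER BY sm.LocID, sm.ProdID, sm.StkTransDate, sm.TranID),
               *
        FROM dbo.StockMovements AS sm WITH (INDEX(0))
    )
    UPDATE cteBS WITH (TABLOCKX)
    SET StkLocProdRunTotal = RowCounter
    OPTION (MAXDOP 1);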

    regards

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Sorry about delay. Results from home lappy, specs posted earlier:

    /* maxmem 3072:

    Table 'StockMovements'. Scan count 1, logical reads 30947106, physical reads 7, read-ahead reads 70, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 107968 ms, elapsed time = 110757 ms.

    */

    /* 1024:

    Table 'StockMovements'. Scan count 1, logical reads 30947106, physical reads 7, read-ahead reads 70, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 107890 ms, elapsed time = 110236 ms.

    */

    /* 512:

    run 1. Msg 8134, Level 16, State 1, Line 104 Divide by zero error encountered.

    run 2. Msg 8134, Level 16, State 1, Line 104 Divide by zero error encountered.

    run 3. Msg 8134, Level 16, State 1, Line 104 Divide by zero error encountered.

    DBCC TRACEON (652, -1)

    run 1: Table 'StockMovements'. Scan count 1, logical reads 30316287, physical reads 70767, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 66375, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    run 2: Table 'StockMovements'. Scan count 1, logical reads 30316287, physical reads 70777, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 1, logical reads 30596864, physical reads 66370, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    DBCC TRACEOFF (652, -1)

    run 1. Msg 8134, Level 16, State 1, Line 105 Divide by zero error encountered.

    run 2. Msg 8134, Level 16, State 1, Line 105 Divide by zero error encountered.

    */

    No deadlocks were detected.


    Low-hanging fruit picker and defender of the moggies

    For better assistance in answering your questions, please read this.


    Understanding and using APPLY, (I) and (II) - Paul White

    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop - Jeff Moden

  • ChrisM@home (1/7/2012)


    Sorry about delay. Results from home lappy, specs posted earlier:

    Thanks Chris. I am able to get an error with max memory = 1000MB, but not 1024MB 🙂

    Intuitively, the cut-off seems likely to vary with the size of the table, among other things. Thanks again for the test - I think we have enough data points now to say at least that the issue is not limited to a particular architecture or SQL Server version.

  • J Livingston SQL (1/7/2012)


    as I spawned this thread...thought it only polite to report back

    based on my pc (with the memory of a toaster) .......

    using traceflag 652=ON.....works every time

    using WITH (INDEX(0)) in the cte...works every time

    though not surprisingly.....far slower in both cases.

    Thank you also. With INDEX(0) I get two sorts (no table spool) in the plan; INDEX(0) forces an unordered scan of the clustered index. The first sort gets rows into order for the safety check (LocID, ProdID, StkTransDate, TranID); the second is a minor sort on the uniquifier (LocID, ProdID, StkTransDate, TranID, uniquifier) for the update. Defining the clustered index as unique would avoid this, by the way. Just to complete the picture, I get the safety-check error at both 512MB and 256MB on x64 R2.
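
    A sketch of the suggested change (the index name is assumed here; making the clustered key unique means no hidden uniquifier is added, so the minor second sort disappears):

    -- Illustrative names; DROP_EXISTING assumes an index of this name
    -- already exists on the table.
    CREATE UNIQUE CLUSTERED INDEX CIX_StockMovements
        ON dbo.StockMovements (LocID, ProdID, StkTransDate, TranID)
        WITH (DROP_EXISTING = ON);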

  • Jeff Moden (1/7/2012)


    Sorry ol' friend. Didn't mean to come across as snarky. I realize that you good folks have put in a good amount of work on this subject and my comment wasn't meant to devalue that in any way.

    That's ok, I have relaxed about it a bit now, apologies for stating my point a bit too strongly, perhaps.

    However, and without any offense to anyone, there have been a large number of posts on how to do massive deletes and updates, and why they should be segmented into smaller batches even on enterprise-class machines. For example, updating a million rows on a given machine might take 3 seconds. Updating 2 million rows on the same machine might take only double that, or 6 seconds. Yet, when updating 3 million rows, instead of taking only triple the time (9 seconds), the single update might take 20 minutes to 2 hours (numbers from a couple of personal experiences). It just seems incredibly odd to me that the participants of this thread haven't mentioned that, quirky update or not, it's not a good thing to do a single 10-million-row update on a machine that's not much more capable than a toaster, or on a machine that is already under memory pressure.

    This is a valid point. I'm still curious about the exact mechanism involved because the thing about Quirky Update is that we've never had a really good repro of it failing with all the best practices followed before (leaving aside a fairly artificial example Hugo provided once). It is technically interesting to me, that's the biggest part of it. I have no issue with people making an informed decision to use QU.
