March 19, 2017 at 5:33 am
Sergiy - Saturday, March 18, 2017 4:33 PMHow many records per SKU?
Is it a unique key?
Is the uniqueness enforced through the UNIQUE constraint?Apart from that - the logic of the query seems troubled.
SKU(pkey) is the primary Key and Uniqueness is enforced through Primary Key Constraint.
Apart from that - the logic of the query seems troubled - Will you be able detail , what you mean by this ?
March 19, 2017 at 6:04 am
sqlrob - Sunday, March 19, 2017 5:33 AMSergiy - Saturday, March 18, 2017 4:33 PMHow many records per SKU?
Is it a unique key?
Is the uniqueness enforced through the UNIQUE constraint?Apart from that - the logic of the query seems troubled.
SKU(pkey) is the primary Key and Uniqueness is enforced through Primary Key Constraint.
Apart from that - the logic of the query seems troubled - Will you be able detail , what you mean by this ?
There is a technique named "warming up the updated pages" which in some cases helps improve performance of updates by putting the pages into the buffer using SELECT:
BEGN TRANSACTION
IF (SELECT TOP 1 available + overselling - reserved
FROM Inventory
WHERE PKey = 500 ) > 5
UPDATE Inventory
SET reserved = reserved + 1
,Modtime = getdate()
WHERE PKey = 500
COMMIT
go
As for logic - the query increases reserved number on some condition.
But it does not actually reserve any particular item.
An actual item reservation must be happening somewhere else.
On a busy system with multiple reservations happening in parallel the condition may change while execution moves from individual reservation to updating the aggregates.
So, you might have the situation when an item was actually reserved but Inventory was not updated because another reservation updated the Inventory in between those events.
This update must be executed only as a follow-up of an actual item reservation.
And it must be unconditional (only SKU filter to be applied).
If an item has been reserved then aggregated "reserved" value must be increased regardless of any other conditions.
_____________
Code for TallyGenerator
March 19, 2017 at 11:15 am
Sergiy - Sunday, March 19, 2017 6:04 AMsqlrob - Sunday, March 19, 2017 5:33 AMSergiy - Saturday, March 18, 2017 4:33 PMHow many records per SKU?
Is it a unique key?
Is the uniqueness enforced through the UNIQUE constraint?Apart from that - the logic of the query seems troubled.
SKU(pkey) is the primary Key and Uniqueness is enforced through Primary Key Constraint.
Apart from that - the logic of the query seems troubled - Will you be able detail , what you mean by this ?
There is a technique named "warming up the updated pages" which in some cases helps improve performance of updates by putting the pages into the buffer using SELECT:
BEGN TRANSACTION
IF (SELECT TOP 1 available + overselling - reserved
FROM Inventory
WHERE PKey = 500 ) > 5
UPDATE Inventory
SET reserved = reserved + 1
,Modtime = getdate()
WHERE PKey = 500
COMMIT
goAs for logic - the query increases reserved number on some condition.
But it does not actually reserve any particular item.
An actual item reservation must be happening somewhere else.
On a busy system with multiple reservations happening in parallel the condition may change while execution moves from individual reservation to updating the aggregates.
So, you might have the situation when an item was actually reserved but Inventory was not updated because another reservation updated the Inventory in between those events.This update must be executed only as a follow-up of an actual item reservation.
And it must be unconditional (only SKU filter to be applied).
If an item has been reserved then aggregated "reserved" value must be increased regardless of any other conditions.
I don't think that TOP 1 does what you think it does. It gets ANY ONE inventory record (and WHICH one is indeterminant due to not haveing an ORDER BY clause), and computes the value of the 3 columns with the math you have. If there are ever more than one inventory record you do not have a predictable, correct, repeatable query.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 19, 2017 at 7:46 pm
GilaMonster - Thursday, March 16, 2017 5:26 AMMost likely there is blocking. Can you check sys.dm_exec_requests when there's load and see what the query is waiting for?
This is the question you need to answer, as it will tell you why the updates are running slow. Everything else is guesswork and dubious forum advice. (Mess with the Lock Modes setting without even knowing what the problem is? Really??)
You can also watch them in Activity Monitor, and keep an eye on the Wait Type column (it's showing the wait_type column from sys.dm_exec_requests, which is one of the data sources for Activity Monitor).
Some likely Wait Types you may see there, and what they generally indicate:
1. LCK_M_* (such as LCK_M_X, LCK_M_U, LCK_M_S, etc. There's on for each type of lock) This means you're waiting on a lock (you're being blocked). This means that multiple simultaneous calls are hitting the same row, or there is a separate wider insert/update that grabbed locks on this row or table.
2. PAGELATCH_* (such as PAGELATCH_U and PAGELATCH_X. Note: NOT PAGEIOLATCH) This usually means hotspotting: separate updates are hitting different rows on the same physical disk page, and they are fighting over the page header to update the page's LSN and CHECKSUM values. Yet another way defrag hurts instead of helps.
3. PAGEIOLATCH_ This wait means the page is being fetched from disk.
There are many more, but these are the most likely based on your query plan. The Wait Type and Wait Resource, which is the object/index/row/whatever on which the session is waiting, are necessary for diagnosing the issue.
-Eddie
Eddie Wuerch
MCM: SQL
March 19, 2017 at 9:45 pm
Eddie Wuerch - Sunday, March 19, 2017 7:46 PMGilaMonster - Thursday, March 16, 2017 5:26 AMMost likely there is blocking. Can you check sys.dm_exec_requests when there's load and see what the query is waiting for?This is the question you need to answer, as it will tell you why the updates are running slow. Everything else is guesswork and dubious forum advice. (Mess with the Lock Modes setting without even knowing what the problem is? Really??)
You can also watch them in Activity Monitor, and keep an eye on the Wait Type column (it's showing the wait_type column from sys.dm_exec_requests, which is one of the data sources for Activity Monitor).
Some likely Wait Types you may see there, and what they generally indicate:
1. LCK_M_* (such as LCK_M_X, LCK_M_U, LCK_M_S, etc. There's on for each type of lock) This means you're waiting on a lock (you're being blocked). This means that multiple simultaneous calls are hitting the same row, or there is a separate wider insert/update that grabbed locks on this row or table.
2. PAGELATCH_* (such as PAGELATCH_U and PAGELATCH_X. Note: NOT PAGEIOLATCH) This usually means hotspotting: separate updates are hitting different rows on the same physical disk page, and they are fighting over the page header to update the page's LSN and CHECKSUM values. Yet another way defrag hurts instead of helps.
3. PAGEIOLATCH_ This wait means the page is being fetched from disk.
There are many more, but these are the most likely based on your query plan. The Wait Type and Wait Resource, which is the object/index/row/whatever on which the session is waiting, are necessary for diagnosing the issue.
-Eddie
Hi Eddie,
The wait type is LCK_M_U, which is waiting for acquire a Update Lock on the PAGE, which makes sense. When I used Profiler to trace the Update Query is taking more than 4 minutes to complete the update Query, when the server is under heavy Load like 1000 users trying to place order on same product.
March 19, 2017 at 9:47 pm
sqlrob - Sunday, March 19, 2017 9:45 PMEddie Wuerch - Sunday, March 19, 2017 7:46 PMGilaMonster - Thursday, March 16, 2017 5:26 AMMost likely there is blocking. Can you check sys.dm_exec_requests when there's load and see what the query is waiting for?This is the question you need to answer, as it will tell you why the updates are running slow. Everything else is guesswork and dubious forum advice. (Mess with the Lock Modes setting without even knowing what the problem is? Really??)
You can also watch them in Activity Monitor, and keep an eye on the Wait Type column (it's showing the wait_type column from sys.dm_exec_requests, which is one of the data sources for Activity Monitor).
Some likely Wait Types you may see there, and what they generally indicate:
1. LCK_M_* (such as LCK_M_X, LCK_M_U, LCK_M_S, etc. There's on for each type of lock) This means you're waiting on a lock (you're being blocked). This means that multiple simultaneous calls are hitting the same row, or there is a separate wider insert/update that grabbed locks on this row or table.
2. PAGELATCH_* (such as PAGELATCH_U and PAGELATCH_X. Note: NOT PAGEIOLATCH) This usually means hotspotting: separate updates are hitting different rows on the same physical disk page, and they are fighting over the page header to update the page's LSN and CHECKSUM values. Yet another way defrag hurts instead of helps.
3. PAGEIOLATCH_ This wait means the page is being fetched from disk.
There are many more, but these are the most likely based on your query plan. The Wait Type and Wait Resource, which is the object/index/row/whatever on which the session is waiting, are necessary for diagnosing the issue.
-Eddie
Hi Eddie,
The wait type is LCK_M_U, which is waiting for acquire a Update Lock on the PAGE, which makes sense. When I used Profiler to trace the Update Query is taking more than 4 minutes to complete the update Query, when the server is under heavy Load like 1000 users trying to place order on same product.request_mode=U
request_type=LOCK
request_status=WAIT
resource_type=KEY
wait_type=LCK_M_U
March 19, 2017 at 9:54 pm
Sergiy - Sunday, March 19, 2017 6:04 AMsqlrob - Sunday, March 19, 2017 5:33 AMSergiy - Saturday, March 18, 2017 4:33 PMHow many records per SKU?
Is it a unique key?
Is the uniqueness enforced through the UNIQUE constraint?Apart from that - the logic of the query seems troubled.
SKU(pkey) is the primary Key and Uniqueness is enforced through Primary Key Constraint.
Apart from that - the logic of the query seems troubled - Will you be able detail , what you mean by this ?
There is a technique named "warming up the updated pages" which in some cases helps improve performance of updates by putting the pages into the buffer using SELECT:
BEGN TRANSACTION
IF (SELECT TOP 1 available + overselling - reserved
FROM Inventory
WHERE PKey = 500 ) > 5
UPDATE Inventory
SET reserved = reserved + 1
,Modtime = getdate()
WHERE PKey = 500
COMMIT
goAs for logic - the query increases reserved number on some condition.
But it does not actually reserve any particular item.
An actual item reservation must be happening somewhere else.
On a busy system with multiple reservations happening in parallel the condition may change while execution moves from individual reservation to updating the aggregates.
So, you might have the situation when an item was actually reserved but Inventory was not updated because another reservation updated the Inventory in between those events.This update must be executed only as a follow-up of an actual item reservation.
And it must be unconditional (only SKU filter to be applied).
If an item has been reserved then aggregated "reserved" value must be increased regardless of any other conditions.
Thanks Sergiy for the quick response. I will try warming up the technique.
This query does not the Reserve the Item, provided there is stock avialable . The maths part checks for Stock availbility. It is not handled some where else.
March 19, 2017 at 10:37 pm
Sample Blocking capture..
March 20, 2017 at 1:03 am
sqlrob - Sunday, March 19, 2017 10:37 PMSample Blocking capture..
Thanks for that image, the details raise some interesting questions.
Those are Key locks, not Page locks (the Wait Resource starts with "KEY:". What follows that is the lock hash (identifier) for a given key value.
Is the column between CPU and Memory showing Physical Reads? If so, those numbers are awfully high for a single-row update using a primary key. Are there other queries that are also updating values in this table that could produce a collision, either on the specific row or on values in the indexes?
Also, all of the waiting sessions are two transaction levels deep. Are these updates being called in a loop as part of a larger batch? If so, can those separate calls be individually committed (meaning you will need to separately take steps to ensure the entire batch is fully executed and committed, even through errors and failures)?
You may be blaming the victim of a different call that should be changed instead.
Eddie Wuerch
MCM: SQL
March 20, 2017 at 4:23 am
I don't think that TOP 1 does what you think it does.
It does exactly what I think it does.
It makes sure that there is no way to get an error "subquery returned more than 1 value".
It should not be happening anyway, because the record is filtered out by an exact match to a PK value.
Only 1 record should be returned.
Therefore ORDER BY would be only an unnecessary overhead.
Unless the table definition is changed.
Then the logic of the query (probably not only this one) would require revisiting.
But while developers working on it the TOP 1 would prevent run-time errors in Production.
_____________
Code for TallyGenerator
March 20, 2017 at 9:37 am
Sergiy - Monday, March 20, 2017 4:23 AMI don't think that TOP 1 does what you think it does.
It does exactly what I think it does.It makes sure that there is no way to get an error "subquery returned more than 1 value".It should not be happening anyway, because the record is filtered out by an exact match to a PK value.Only 1 record should be returned.Therefore ORDER BY would be only an unnecessary overhead.Unless the table definition is changed.Then the logic of the query (probably not only this one) would require revisiting.But while developers working on it the TOP 1 would prevent run-time errors in Production.
My apologies. I had a brain cloud and had it in my head that there were multiple Inventory rows per Product - like a ledger system (carried over from another post I had answered recently). 🙂
Curious to think that a conditional, select (with lock(s) taken as additional overhead) and top operation prior to the update would lead to overall improved performance. Not saying it isn't viable for certain workloads, just an interesting proposition.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 20, 2017 at 9:40 am
Eddie Wuerch - Monday, March 20, 2017 1:03 AMsqlrob - Sunday, March 19, 2017 10:37 PMSample Blocking capture..Thanks for that image, the details raise some interesting questions.
Those are Key locks, not Page locks (the Wait Resource starts with "KEY:". What follows that is the lock hash (identifier) for a given key value.Is the column between CPU and Memory showing Physical Reads? If so, those numbers are awfully high for a single-row update using a primary key. Are there other queries that are also updating values in this table that could produce a collision, either on the specific row or on values in the indexes?
Also, all of the waiting sessions are two transaction levels deep. Are these updates being called in a loop as part of a larger batch? If so, can those separate calls be individually committed (meaning you will need to separately take steps to ensure the entire batch is fully executed and committed, even through errors and failures)?You may be blaming the victim of a different call that should be changed instead.
WAY back in this thread I mentioned the very high call volume and that the infrastructure may not be up to the task. I don't see where the OP ever answered that. I also mentioned doing differential file IO stall and wait stats analyses and using sp_whoisactive in diff mode too.
No one has asked about the potential for trigger(s) on the table either...
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
March 20, 2017 at 5:27 pm
TheSQLGuru - Monday, March 20, 2017 9:40 AMEddie Wuerch - Monday, March 20, 2017 1:03 AMsqlrob - Sunday, March 19, 2017 10:37 PMSample Blocking capture..Thanks for that image, the details raise some interesting questions.
Those are Key locks, not Page locks (the Wait Resource starts with "KEY:". What follows that is the lock hash (identifier) for a given key value.Is the column between CPU and Memory showing Physical Reads? If so, those numbers are awfully high for a single-row update using a primary key. Are there other queries that are also updating values in this table that could produce a collision, either on the specific row or on values in the indexes?
Also, all of the waiting sessions are two transaction levels deep. Are these updates being called in a loop as part of a larger batch? If so, can those separate calls be individually committed (meaning you will need to separately take steps to ensure the entire batch is fully executed and committed, even through errors and failures)?You may be blaming the victim of a different call that should be changed instead.
Thanks Edie.
Are there other queries that are also updating values in this table that could produce a collision, either on the specific row or on values in the indexes ?
- There are no other UPDATE QUERIES on that table doing update. But it gets blocked by same UPDATE Query run by other SPIDs. Having said that once SPID acquires exclusive lock on that Key, how is it possible to have a collision? If you could detail me your thoughts?
Also, all of the waiting sessions are two transaction levels deep. Are these updates being called in a loop as part of a larger batch? If so, can those separate calls be individually committed (meaning you will need to separately take steps to ensure the entire batch is fully executed and committed, even though errors and failures)?
- Good Observation. I will have a look at why they are having two transaction waiting?WAY back in this thread I mentioned the very high call volume and that the infrastructure may not be up to the task. I don't see where the OP ever answered that. I also mentioned doing differential file IO stall and wait stats analyses and using sp_whoisactive in diff mode too. - There is plenty of CPU and Memory when this happens. So I don't beleive that is the bottleneck. I will check the IO stall..
No one has asked about the potential for trigger(s) on the table either... - There are no Triggers in that table
March 20, 2017 at 8:22 pm
There is plenty of CPU and Memory when this happens. So I don't beleive that is the bottleneck. I will check the IO stall..
In my experience with a heavy load going on that is suffering performance problems (and there is no blocking to speak of) then having plenty of CPU and memory available means there is a very high probability of an IO bottleneck.
I would like to point out that this issue has been going on for at LEAST a week now, possibly much more than that before you decided to post here. I know of a number of consultants that could identify the root cause of your issue(s) in less than an hour, almost guaranteed. Perhaps in a little as a few minutes. About 45 seconds is my record, and that was from a 2-minute profiler trace on a system I had never seen. It only took that long because I wanted to make sure that what I saw as soon as I opened the trace was the worst thing the app was doing. 😎
I also noticed that you did mention in your OP that there were "no disk IO constraints". Can you be specific about what that actually means? Did you measure IO stalls during the load event?
Another question: did this process EVER work acceptably for thousands of concurrent operations? If so the obvious question is what changed. If not, I refer you again to my comment about getting a consultant on board. I find it hard to understand why you would go round-and-round on a forum for so long while your system is taking up to 4 minutes to complete a tiny update.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 14 posts - 16 through 28 (of 28 total)
You must be logged in to reply to this topic. Login to reply