December 10, 2018 at 11:46 am
Eric M Russell - Monday, December 10, 2018 10:36 AM

sgmunson - Monday, December 10, 2018 8:41 AM

Sounds to me like page splits are likely here, and in large quantities. Might be worth taking a look at the number of varchar (or nvarchar) columns and seeing just how much usage these columns get compared to their declared length. I've seen an awful lot of situations where nvarchar was in use and there was absolutely no reason for it, as the data never justified it. Then there's the varchar(4000) where no one ever used more than 30 characters. You say that no one is willing to engage in data structure changes, but to be honest, that might well be the "real problem"...

Just to clarify: defining columns as VARCHAR doesn't cause page splits; it's updating the columns with a value larger than the original that results in an increase of row size and page splitting. So, LastName VARCHAR(4000) might not make sense in terms of data constraint, but it's no more prone to cause page splits than VARCHAR(40), so long as the user isn't updating the values after the initial insert.
+100.
Just to write it a different way, such "ExpAnsive" updates will cause massive fragmentation of the clustered index and any indexes that include such columns, but the mere presence of variable-width columns will not, by itself, cause bad page splits. And, just to be sure, if the update results in no expansion of the data, it will not cause a page split unless the update is an "out-of-order" update to the index key columns and it causes a row to move to a different page.
Last but not least, you CAN have an expansion of data during an update that won't split a page... There just has to be enough room on the page to be able to handle the expansive update without running out of room on the page.
Of course, updates that change the order of a row in an index can also be a huge source of bad page splits, but not always (although such a move will almost always waste space on the page the row left behind, which also means you're wasting memory when those pages are loaded into memory). Like everything else, "It Depends".
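A minimal sketch of the effect described above (the table, column, and values are made up purely for illustration): fill pages with short values, then force "ExpAnsive" updates, and read the resulting fragmentation back from sys.dm_db_index_physical_stats.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE dbo.SplitDemo
(
    Id       INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    SomeText VARCHAR(4000) NULL
);

-- Fill pages tightly with short values...
INSERT dbo.SplitDemo (SomeText)
SELECT REPLICATE('x', 30)
FROM sys.all_columns;          -- just a convenient row source

-- ...then make an "ExpAnsive" update: rows grow, pages overflow, splits occur.
UPDATE dbo.SplitDemo
SET SomeText = REPLICATE('x', 3000);

-- Check the resulting logical fragmentation of the clustered index.
SELECT avg_fragmentation_in_percent, page_count
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID('dbo.SplitDemo'), NULL, NULL, 'LIMITED');
```

Running the same script without the UPDATE shows near-zero fragmentation, which is the "it's the expansion, not the VARCHAR" point in a nutshell.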
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2018 at 11:46 am
Jeff Moden - Monday, December 10, 2018 11:46 AM

Eric M Russell - Monday, December 10, 2018 10:36 AM

sgmunson - Monday, December 10, 2018 8:41 AM

Sounds to me like page splits are likely here, and in large quantities. Might be worth taking a look at the number of varchar (or nvarchar) columns and seeing just how much usage these columns get compared to their declared length. I've seen an awful lot of situations where nvarchar was in use and there was absolutely no reason for it, as the data never justified it. Then there's the varchar(4000) where no one ever used more than 30 characters. You say that no one is willing to engage in data structure changes, but to be honest, that might well be the "real problem"...

Just to clarify: defining columns as VARCHAR doesn't cause page splits; it's updating the columns with a value larger than the original that results in an increase of row size and page splitting. So, LastName VARCHAR(4000) might not make sense in terms of data constraint, but it's no more prone to cause page splits than VARCHAR(40), so long as the user isn't updating the values after the initial insert.
+100.
Just to write it a different way, such "ExpAnsive" updates will cause massive fragmentation of the clustered index and any indexes that include such columns, but the mere presence of variable-width columns will not, by itself, cause bad page splits. And, just to be sure, if the update results in no expansion of the data, it will not cause a page split unless the update is an "out-of-order" update to the index key columns and it causes a row to move to a different page.
Last but not least, you CAN have an expansion of data during an update that won't split a page... There just has to be enough room on the page to be able to handle the expansive update without running out of room on the page.
Of course, updates that change the order of a row in an index can also be a huge source of bad page splits, but not always (although such a move will almost always waste space on the page the row left behind, which also means you're wasting memory when those pages are loaded into memory). Like everything else, "It Depends".
I never meant to imply that varchar columns alone cause page splits, but I do see how one might have come to that conclusion if they didn't know better. For that, my bad. I should have made that clear. However, looking at your varchar columns and understanding the nature of how they get updated is particularly valuable. "Know your data" is always a good idea...
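In the "know your data" spirit, here's a hedged sketch of how one might compare declared (n)varchar sizes against what's actually stored; the table and column names (dbo.SomeTable, LastName) are placeholders to be pointed at your own schema.

```sql
-- Declared sizes of every (n)varchar column in a table
-- (max_length is in bytes; -1 means (MAX)):
SELECT c.name AS column_name, t.name AS type_name, c.max_length AS declared_bytes
FROM sys.columns AS c
JOIN sys.types   AS t
    ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID('dbo.SomeTable')
  AND t.name IN ('varchar', 'nvarchar');

-- Longest value actually stored in a given column:
SELECT MAX(LEN(LastName)) AS longest_used
FROM dbo.SomeTable;
```

A varchar(4000) column whose longest_used comes back as 30 is exactly the smell described above.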
Steve (aka sgmunson) π π π
Rent Servers for Income (picks and shovels strategy)
December 19, 2018 at 3:44 pm
So far, running with the OPTIMIZE FOR UNKNOWN hint has resolved the issue; things have been running OK.
The way the code was written was not the best; for now, until it causes a larger issue, it won't be rewritten.
When the user interface page is created, it inserts initial rows into the table with no values; many default to NULL.
Once the user starts to enter data, the rows will be updated and there will be fragmentation created from it; no way around it.
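For reference, here's a sketch of the hint as it would be applied, plus one common mitigation for the insert-empty-then-grow pattern described above. The FILLFACTOR value is purely illustrative, not a recommendation; it trades page density for room to absorb expansive updates without splitting.

```sql
-- The hint: compile the plan against average density statistics
-- rather than sniffing the first parameter values seen.
SELECT *
FROM dbo.SUBMISSION_QUESTION_ANSWERS
WHERE DOCUMENTID = @documentID
  AND REVISION   = @revision
  AND CONTEXT    = @context
OPTION (OPTIMIZE FOR UNKNOWN);

-- One mitigation for rows that start near-empty and grow on update:
-- rebuild with free space on each page so the expansion fits in place.
ALTER INDEX ALL ON dbo.SUBMISSION_QUESTION_ANSWERS
REBUILD WITH (FILLFACTOR = 80);   -- illustrative value; test against your workload
```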
Thanks for all the help and suggestions.
December 20, 2018 at 3:38 am
With a lengthy output column list and only a few rows in the result set, this query is made for a key lookup using a skinny index to pack as many rows as possible into each page. Your initial seeks are fast and the number of lookups to the clustered index to retrieve the remaining columns is small. Why leave it to chance when you can mimic this activity in code?
Assuming the clustered index is on a single column called ID_SUBMISSION_QUESTION_ANSWERS, first create this index:
CREATE INDEX ix_Tester ON dbo.SUBMISSION_QUESTION_ANSWERS (DOCUMENTID, REVISION, CONTEXT, ID_SUBMISSION_QUESTION_ANSWERS)
-- Remember, every ordinary index already contains the cluster keys, but only once.
Next, pick up the cluster keys of the required rows using the search criteria:
SELECT ID_SUBMISSION_QUESTION_ANSWERS
INTO #SUBMISSION_QUESTION_ANSWERS
FROM dbo.SUBMISSION_QUESTION_ANSWERS
WHERE DOCUMENTID = @documentID
AND REVISION = @revision
AND CONTEXT = @context
Finally, get the remaining columns by joining back to the main table:
SELECT
sqa.SUBMISSION_QUESTION_ANSWER_ID, sqa.DOCUMENTID, sqa.REVISION, sqa.QUESTIONNAIREID, sqa.SUBMISSION_QUESTION_DEFINITION_ID,
sqa.RANK, sqa.REQUIRED_FOR_SUBMISSION, sqa.QUESTION_TEXT, sqa.DATE_VALUE, sqa.INTEGER_VALUE, sqa.DECIMAL_VALUE, sqa.TEXT_VALUE, sqa.LIST_VALUE,
sqa.LIST_DESCRIPTION, sqa.LIST_ITEM_ID, sqa.DATATYPE, sqa.CONTEXT, sqa.AVAILABLE_ITEMS, sqa.REVISION_INDEPENDENT_ID, sqa.CORRESPONDING_AUTHOR, sqa.ACTIVE, sqa.DRAFT_MODE,
sqa.QUESTION_MAP_ID,
sqa.INCLUDE_IN_MERGE
FROM #SUBMISSION_QUESTION_ANSWERS t
INNER LOOP JOIN dbo.SUBMISSION_QUESTION_ANSWERS sqa
ON t.ID_SUBMISSION_QUESTION_ANSWERS = sqa.ID_SUBMISSION_QUESTION_ANSWERS
ORDER BY sqa.RANK, sqa.SUBMISSION_QUESTION_DEFINITION_ID
A couple of notes: keep the ordinary index as skinny as possible - don't worry about putting the ORDER BY columns in it because there's no guarantee that they will be used.
Try with and without the loop join hint. This hint forces the exact behavior you want but may not be necessary.
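One way to run that comparison (the column picked is arbitrary; SET STATISTICS IO reports logical reads for each run in the Messages tab, which makes the two variants easy to weigh against each other):

```sql
SET STATISTICS IO ON;   -- report logical reads per table

-- Hinted version: forces nested loops seeking into the clustered index.
SELECT sqa.QUESTION_TEXT
FROM #SUBMISSION_QUESTION_ANSWERS t
INNER LOOP JOIN dbo.SUBMISSION_QUESTION_ANSWERS sqa
    ON t.ID_SUBMISSION_QUESTION_ANSWERS = sqa.ID_SUBMISSION_QUESTION_ANSWERS;

-- Unhinted version: let the optimizer choose the join type.
SELECT sqa.QUESTION_TEXT
FROM #SUBMISSION_QUESTION_ANSWERS t
INNER JOIN dbo.SUBMISSION_QUESTION_ANSWERS sqa
    ON t.ID_SUBMISSION_QUESTION_ANSWERS = sqa.ID_SUBMISSION_QUESTION_ANSWERS;

SET STATISTICS IO OFF;
```

If the unhinted plan already picks the loop join, dropping the hint leaves the optimizer free to adapt when row counts change.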
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden