Stalled Query?

  • Evening Guys,

I am making a change to a table that contains an NTEXT column and an NVARCHAR(4000) column.

    The table contains about 100 million rows.

    There are 10 other fields, all BIGINTs.

    ALTER TABLE dbo.[MyTable] ALTER COLUMN [Notes] nvarchar(4000) COLLATE Latin1_General_CI_AS NULL

    All I am doing is changing the collation of this field. The data type is staying the same, as is its nullability.

    Now, in doing this, the transaction log ran out of space. The query did not fail.

    The pagefile was also on a volume that ran out of space. Again, the query did not fail.

    I managed to assign more space (it's a VM) and now there is about 100GB free and available to tempdb, the log, and the pagefile.

    However, the query has so far taken 10 hours longer than when I tested it originally, and it is still running.

    I am concerned that this query is somehow stalled and will never complete.

    sp_who2 suggests the spid is switching between runnable and suspended, but it's MOSTLY suspended.

    Fortunately, this is a test run for something we will do in production. So I can stop this, restart and break this as much as I need to.

    So, is it possible for SQL Server to effectively stop processing but not actually fail the query? If so, other than just waiting until I can wait no more is there a better way to identify this?

    CPU time and disk IO are increasing, as one would expect for a query that is running.
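    (For reference, beyond sp_who2 I've been watching the request DMV with something along these lines; spid 55 is a placeholder for the real one:)

    SELECT session_id, status, wait_type, wait_time, last_wait_type,
           cpu_time, logical_reads, reads, writes
    FROM   sys.dm_exec_requests
    WHERE  session_id = 55;  -- the spid running the ALTER

    If wait_type sat on something like LOGBUFFER or PAGEIOLATCH_EX the whole time, at least I'd know what it was stuck on.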

    But based on previous tests I would have expected this query to have completed by now.

    What do you think?

    Cheers
    Alex

  • alex.sqldba - Tuesday, September 5, 2017 2:24 PM


    What's the total number of reserved bytes for the table?  How much of that is data?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Morning 🙂

    sp_spaceused returns the following:

    name       rows         reserved        data            index_size     unused
    my_table   197836642    198715216 KB    132216576 KB    66036736 KB    461904 KB

    It actually contains closer to 200 million records, not 100 million as I said in my first post.
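    (For reference, that output came from a call along these lines:)

    EXEC sp_spaceused @objname = N'dbo.my_table';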

  • Sit and wait. I guess it has to rewrite the whole table and its indexes, i.e. 190+ GB (an I/O-bound operation).
    It all depends on the original datatype and length of the Notes column.

    Johan

    Learn to play, play to learn !

    Don't drive faster than your guardian angel can fly ...
    but keeping both feet on the ground won't get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Aye, I suppose it is quite a large amount of data. It's just that, in my mind, it only takes 12 hours to restore the entire DB, which is also IO bound. And the fact this ran out of space twice has me a bit concerned. But I'll give it another day before throwing in the towel.

    Cheers!

  • alex.sqldba - Wednesday, September 6, 2017 12:48 PM


    On that note, let me ask one final question.  How many GB is the whole database (not including the transaction log file) and how many file groups are there?

    --Jeff Moden



  • This is a 4.5 TB database.
    There are 9 file groups and two logs.

    This table resides on a single filegroup.

    All the indices and keys have been dropped, so it's currently a heap.

    Cheers!

  • From my own experience with these types of changes, my advice is to create another table with the desired datatypes and then do batch inserts from the source table into the new table.

    After all the data is copied to the new table, drop the original, rename the new table to the old name, and add the required indexes.

    Yes, you do need more space in the datafile itself, but the whole operation is a lot faster and your log file will not grow nearly as much.

    As this is a once-off operation, and if it is not expected that the table(s) will grow into and reuse the extra space, then doing a shrink is also a possibility to reclaim the space.

    Process flow (a T-SQL sketch follows below):

    1 - create the new table
    2 - batch loop: insert X records at a time into the new table, where X will vary depending on server specs, IO speed and so on; wrap each batch in an explicit BEGIN TRANSACTION/COMMIT
    3 - drop the original table
    4 - rename the new table to the original name
    5 - shrink the database file if required
    6 - create the required indexes

    Note that if the table is the only table on the filegroup, a better option than the shrink is to create a new filegroup for the new table and, once the operation is finished, just drop the old filegroup.
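    Something like this, just as a sketch; the table names, the Id key, and the batch size are placeholders to adjust for your own schema:

    DECLARE @BatchSize bigint = 500000,
            @Id        bigint,
            @MaxId     bigint;

    SELECT @Id = MIN(Id), @MaxId = MAX(Id) FROM dbo.MyTable;

    WHILE @Id <= @MaxId
    BEGIN
        BEGIN TRANSACTION;

        INSERT INTO dbo.MyTable_New (Id, Notes /* , ...the bigint columns... */)
        SELECT Id, Notes /* , ... */
        FROM   dbo.MyTable
        WHERE  Id >= @Id AND Id < @Id + @BatchSize;

        COMMIT TRANSACTION;  -- committing per batch keeps each batch's log usage small and reusable

        SET @Id += @BatchSize;
    END;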

  • I like this method a lot. I shall give it a go on test soon.

    Incidentally, the script eventually failed with an out-of-log-space message, indicating that the extra log file I added was not used (perhaps not recognised because it didn't exist before the query started?). So it's evident I also need a better way of managing the log for this process, which the batched inserts above (Frederico's) will help with.

    Cheers!

  • alex.sqldba - Thursday, September 7, 2017 12:55 AM

    This table resides on a single filegroup.

    PERFECT!  This is going to be easy and relatively painless IF you can make a new file group of the appropriate size.  (T-SQL sketches of the key steps follow below.)

    1.  Make the new file group for the database.
    2.  Create the table structure (with the new collation on the desired column) in the new file group except, instead of making it a HEAP, add the correct clustered index or clustered PK.  That way, we don't have to blow out the size of the NDF file by building it later.  Of course, the table will temporarily need to be named something else. 
    3.  If not already there, change the database to either the BULK LOGGED or SIMPLE recovery model (do understand what happens to Point-in-Time backups and restores for either).
    4.  Do a DBCC TRACEON(610).  This enables a special type of minimal logging that will minimally log any new pages created during the copy process.
    5.  Do the suggested chunked inserts, making sure the rows are inserted in clustered-index order (use an ORDER BY on the source), even between inserts.  Make sure that you use WITH (TABLOCK) on the table being inserted to (required for minimal logging without TF610, but it works fine with it as well).  Also, if your chunked inserts have ANY variables in the INSERT/SELECT code, make sure that you use OPTION (RECOMPILE) on the code (that's not documented in any MS documentation I've ever seen, but it's usually required).
    6.  Verify that the new table is golden.
    7.  Drop the old table and the old file group it lived in.
    8.  Rename the new table.
    9.  Build any non-clustered indexes you may need.
    10.  Set the database back to the recovery model you started with.  If you did go to SIMPLE and started from FULL, take a DIFF backup to reestablish the log chain.
    11.  Drink beer and celebrate!
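
    Something along these lines for steps 1 to 4. It's a sketch only, with placeholder database, file, and column names:

    ALTER DATABASE MyDb ADD FILEGROUP FG_MyTableNew;
    ALTER DATABASE MyDb ADD FILE
        (NAME = N'MyTableNew01', FILENAME = N'E:\Data\MyTableNew01.ndf', SIZE = 200GB)
        TO FILEGROUP FG_MyTableNew;

    CREATE TABLE dbo.MyTable_New
    (
        Id    bigint NOT NULL,
        Notes nvarchar(4000) COLLATE Latin1_General_CI_AS NULL,
        -- ...the remaining bigint columns...
        CONSTRAINT PK_MyTable_New PRIMARY KEY CLUSTERED (Id)  -- step 2: build the CI up front
    ) ON FG_MyTableNew;

    ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;  -- step 3: mind your point-in-time restores

    DBCC TRACEON (610);  -- step 4: run in the session that will do the inserts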

    What this will do is...
    1.  Nearly double the speed of the transfer, because only the occasional partially full page and changes to the B-Tree will be logged, which is tiny compared to the data.
    2.  You end up with a clustered index on the table without having to build it afterwards, which would cause the space requirements of the NDF file to double.  If you'd rather keep the table as a heap (not my recommendation, but it's not my table, either), just don't add the clustered index to the new table to begin with.
    3.  Since logging will be minimal, you won't have to babysit the log file space.  There's a pretty good chance the log file won't even use 1GB in the evolution.

    I also recommend that you write something to a "log" table for each iteration of the chunking code so that 1) you can check the progress and 2) you know where to pick up if something goes haywire, so you don't need to start all over.
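
    Here's a sketch of step 5 plus the progress log. The names dbo.MyTable_New and dbo.MigrationLog and the Id key are placeholders:

    CREATE TABLE dbo.MigrationLog
    (
        BatchStartId bigint       NOT NULL,
        RowsCopied   bigint       NOT NULL,
        FinishedAt   datetime2(3) NOT NULL DEFAULT SYSDATETIME()
    );

    DECLARE @BatchSize bigint = 1000000,
            @Id        bigint,
            @MaxId     bigint,
            @Rows      bigint;

    SELECT @Id = MIN(Id), @MaxId = MAX(Id) FROM dbo.MyTable;

    WHILE @Id <= @MaxId
    BEGIN
        INSERT INTO dbo.MyTable_New WITH (TABLOCK)  -- TABLOCK for minimal logging
               (Id, Notes /* , ... */)
        SELECT Id, Notes /* , ... */
        FROM   dbo.MyTable
        WHERE  Id >= @Id AND Id < @Id + @BatchSize
        ORDER BY Id               -- same order as the new clustered index
        OPTION (RECOMPILE);       -- because variables appear in the INSERT/SELECT

        SET @Rows = @@ROWCOUNT;

        INSERT INTO dbo.MigrationLog (BatchStartId, RowsCopied)
        VALUES (@Id, @Rows);      -- progress marker; also the restart point if it dies

        SET @Id += @BatchSize;
    END;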

    --Jeff Moden



  • frederico_fonseca - Thursday, September 7, 2017 2:24 AM


    You don't want to do a shrink on something this size.

    --Jeff Moden



  • Jeff, beauty, thanks! This is excellent.

    I am looking forward to trying this out tomorrow. That trace flag looks super, not just for this hurdle but for some future bits too!

    Thanks so much

    Alex

  • Thanks for the feedback.  There's a whole lot more to all of this, especially concerning minimal logging and TF610.  Although the document is quite old (written for SQL Server 2008), it still applies.  See the article at the following link.  The title of the article (just to whet your appetite) is "The Data Loading Performance Guide".
    https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx

    --Jeff Moden



  • Jeff, (and others),

    I've not got to testing this yet, but I just had a thought: would the DBs that are in availability groups remain synchronised, or would I need to take the DB out of the group first?

    Would AGs survive bulk-logged mode and the subsequent minimal logging during the re-loading of the table?
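
    (For what it's worth, I can at least check whether the DB is flagged as an AG member with a quick look at sys.databases; the DB name is a placeholder:)

    SELECT name, recovery_model_desc, replica_id  -- replica_id is non-NULL for AG members
    FROM   sys.databases
    WHERE  name = N'MyDb';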

    Cheers
    Alex

  • alex.sqldba - Saturday, September 9, 2017 12:28 PM


    I don't know for sure because I don't use AGs, but I strongly suspect that an AG will NOT survive a trip through the Bulk Logged recovery model, never mind the subsequent minimal logging.

    --Jeff Moden


