July 14, 2009 at 4:21 pm
kevinwinters (7/10/2009)
I am running this query in the query analyzer:
update claimfile set recreason = ''
The computer is not a dog but it has been running for 3 hours - is there a better way to change a field to be "blank" instead of "null" in 3 million records?
If that is really the code you executed, then you are updating the value of recreason for every single row in the table to an empty string. You have stated that what you wanted to do is update all rows where recreason is null to an empty string. That would be:
UPDATE claimfile SET recreason = '' WHERE recreason IS NULL;
The question I have is how many rows do you have where the value is null, and do you have an index on the column recreason? Run the following:
SELECT count(*) FROM claimfile WHERE recreason IS NULL;
I also would like to ask why you are doing this - what problem are you trying to solve? If you change all of the nulls to empty strings but do not fix the table and/or the application, it is just going to insert nulls again. Is there a problem with having a null in that column?
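If the goal is just to stop future nulls at the source, one option is a default constraint on the column. This is a sketch only - the constraint name is made up, and note (as comes up later in this thread) that a default only affects rows inserted after it exists:
-- Sketch: give recreason a default of empty string for new rows
-- that don't supply a value. Existing rows are NOT changed by this.
ALTER TABLE dbo.claimfile
ADD CONSTRAINT DF_claimfile_recreason DEFAULT ('') FOR recreason;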
One way around all of this would be to create a computed column and use that instead of the column with a null. Or, you could create a view where the column is converted for you. For example:
ALTER TABLE claimfile ADD recreason_nonulls AS COALESCE(recreason, '');
CREATE VIEW dbo.claimfile_view AS
SELECT {list of columns}, COALESCE(recreason, '') AS recreason_nonulls
FROM claimfile;
GO
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
July 14, 2009 at 4:50 pm
yes, that is the exact query.
*Every* row of that field is null. I am trying to make every row of that field blank so I can concatenate the field to some new information.
What I have in my effort to automate is:
I create that column after the table is imported; typically the table is 3 million records or more.
Then I have several queries that I run one at a time that add information to that field but concatenate the original information in that field, called "recreason". Like:
UPDATE [testtable] SET [testtable].rfa = 'Y', [testtable].recreason = [testtable].recreason +' '+'DRG' where blah blah.....
I need that field to end up after 10 queries to look like " DRG KOT DER " (say that 3 of the queries were "hits")
Now if recreason is null this doesn't work, but if it is blank then it does work....
You can probably guess I am not an expert, but I am willing to learn if this is a lousy effort.
Thanks
July 14, 2009 at 5:23 pm
kevinwinters (7/14/2009)
yes, that is the exact query. *Every* row of that field is null. I am trying to make every row of that field blank so I can concatenate the field to some new information.
What I have in my effort to automate is:
I create that column after the table is imported; typically the table is 3 million records or more.
Then I have several queries that I run one at a time that add information to that field but concatenate the original information in that field, called "recreason". Like:
UPDATE [testtable] SET [testtable].rfa = 'Y', [testtable].recreason = [testtable].recreason +' '+'DRG' where blah blah.....
I need that field to end up after 10 queries to look like " DRG KOT DER " (say that 3 of the queries were "hits")
Now if recreason is null this doesn't work, but if it is blank then it does work....
You can probably guess I am not an expert, but I am willing to learn if this is a lousy effort.
Thanks
I don't know what the logic is for each update query on this column. If possible, I would not create a new column and update it - instead, I would create a view with the calculated value as a column.
But, to solve the immediate problem - don't update the column's value to empty string first. You can use either the ISNULL or COALESCE function instead. I recommend COALESCE because it is the ANSI standard function, but ISNULL will work just as well here. (There are other differences between the two that can pose problems, but I don't think they will in your case. You can look up both functions in Books Online for further details.)
Using ISNULL:
UPDATE [testtable] SET [testtable].rfa = 'Y', [testtable].recreason = ISNULL([testtable].recreason, '') +' '+'DRG' where blah blah.....
Using COALESCE:
UPDATE [testtable] SET [testtable].rfa = 'Y', [testtable].recreason = COALESCE([testtable].recreason, '') +' '+'DRG' where blah blah.....
Both functions check the value of the column and, if the value is null, return the empty string instead.
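To see why this matters, here is a small standalone example. It is just a sketch - the variable stands in for your all-null column, and varchar(250) is an assumed size. With the default CONCAT_NULL_YIELDS_NULL setting, a null anywhere in a concatenation makes the whole result null:
DECLARE @recreason varchar(250) -- starts out NULL, like your newly created column

SELECT @recreason + ' ' + 'DRG' AS without_function, -- returns NULL (the NULL propagates through +)
       COALESCE(@recreason, '') + ' ' + 'DRG' AS with_coalesce -- returns ' DRG'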
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
July 14, 2009 at 5:32 pm
Thank you Jeffrey, that sounds like a good solution to my immediate problem, I'm off to try that tonight.
July 15, 2009 at 2:34 am
Hello
I noticed that you have the column data type as "Text".
This is a special data type that is used to store huge amounts of text (in excess of thousands of characters, normally).
In your sample data it seems as if you only want to store a small amount of text, say less than a few hundred characters.
You may find a huge improvement in the performance of your query if you use varchar(250) [Or whatever size works for your expected data] instead of using the special "Text" data type.
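As a sketch only - assuming the claimfile/recreason names from earlier in the thread, and that no existing value is longer than the new size - the change would look something like:
-- Move the column off the special TEXT type (SQL Server 2005 and up).
-- The 250 is just an example size; fit it to your expected data.
ALTER TABLE dbo.claimfile
ALTER COLUMN recreason varchar(250) NULL;
A varchar column can also be concatenated with +, which the TEXT type does not support.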
HTH
Regards,
Joon
July 15, 2009 at 5:19 am
Robert klimes (7/14/2009)
I believe that the default constraint will only work for newly inserted rows, not existing ones.
Looks like you're right - didn't realize that. Unfortunate :/
July 16, 2009 at 8:35 am
A spid can show as blocking itself - a 'feature' added in SQL 2005. IIRC it happens when a block of memory is latched and it blocks continued execution until the memory is filled with the IO it expects from the IO subsystem. My guess is that this system sees a lot of IO stalls. Could also be excessive CXPACKET waits - especially if it is a hyperthreading-enabled system.
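If you want to verify, a quick hedged check against the SQL 2005+ wait-stats DMV (numbers are cumulative since the last service restart):
-- Look for heavy PAGEIOLATCH_* (IO stalls) and CXPACKET waits.
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('PAGEIOLATCH_SH', 'PAGEIOLATCH_EX', 'CXPACKET')
ORDER BY wait_time_ms DESC;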
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 16, 2009 at 10:59 am
Doing large updates in one batch will often lead to blocking, because SQL Server escalates from row/page locks to a table lock once that becomes cheaper for it to manage.
Always try to batch the update statement; this keeps the blocking to a minimum and, just as importantly, keeps the transaction log from growing large (each batch commits separately, so the log space can be reused between batches under the simple recovery model or with regular log backups).
You could use code like the below to do a mass update on a table:
DECLARE @Updates TABLE (PK_Id int NOT NULL PRIMARY KEY)
DECLARE @x int

SET @x = 0 -- Dummy statement to set @@ROWCOUNT = 1 so the loop runs at least once

WHILE @@ROWCOUNT > 0
    UPDATE TOP (1000) SomeHugeTable
    SET SomeColumn = 'something'
    OUTPUT inserted.PK_Id -- Record the updated keys so those rows aren't updated in subsequent passes
    INTO @Updates
    FROM dbo.SomeHugeTable AS SomeHugeTable
    WHERE NOT EXISTS (
        SELECT *
        FROM @Updates AS Updates
        WHERE Updates.PK_Id = SomeHugeTable.PK_Id
    )
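For the specific case in this thread you would not even need the tracking table, because an updated row no longer matches the WHERE clause. A simpler sketch, using the table and column names from the earlier posts:
-- Batched update: each pass converts up to 1000 NULLs to empty string.
WHILE 1 = 1
BEGIN
    UPDATE TOP (1000) dbo.claimfile
    SET recreason = ''
    WHERE recreason IS NULL

    IF @@ROWCOUNT = 0 BREAK -- nothing left to update
END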
Hope this helps.
Thanks,
Amol
Amol Naik
July 19, 2009 at 8:39 pm
kevinwinters (7/14/2009)
yes, that is the exact query. *Every* row of that field is null. I am trying to make every row of that field blank so I can concatenate the field to some new information.
If that column truly is a TEXT datatype, you'll be disappointed because you cannot "concatenate" a TEXT column with the + operator... period.
Also, Lowell has the exact correct idea of updating that many rows in smaller batches because every server has a "tipping" point. 1 Million rows may take just 3 seconds to update. On that same server, 2 Million rows may take just 6 seconds to update. But, 3 million rows may take 3 hours because of the "tipping" point. It's not really predictable when the "tipping" point may occur but it generally WON'T occur for a million rows or less.
The only thing I'd add to Lowell's code is a WITH (TABLOCKX) hint because of the number of rows you're updating. Yes, that will lock everyone out for the duration of the update but that might happen anyway because of the position and number of rows being updated. If you have full control of the table during the update, the update will occur very quickly and most users won't even notice that they've been temporarily blocked.
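Lowell's code isn't quoted on this page, but the hint is just a table hint on the UPDATE itself. As a sketch only, using the table from the original post:
-- Take an exclusive table lock up front instead of letting
-- lock escalation happen partway through the update.
UPDATE dbo.claimfile WITH (TABLOCKX)
SET recreason = ''
WHERE recreason IS NULL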
--Jeff Moden
Change is inevitable... Change for the better is not.
July 20, 2009 at 9:24 am
Here is what I do when I need to update big tables (50M+ records)
SELECT
    Column_1,
    Column_2,
    CAST('' AS varchar(50)) AS recreason
INTO t_claimfile
FROM claimfile

DROP TABLE claimfile

EXEC sp_rename 't_claimfile', 'claimfile'
Execution time is between a few seconds and a few minutes (depending on the size of the table), largely because SELECT ... INTO can be minimally logged under the simple or bulk-logged recovery model.
Good luck!
Jacob
Jacob Milter
July 20, 2009 at 10:14 am
If you do that, just remember to recreate indexes, verify FK's, establish the PK, setup constraints, copy any triggers, and rebuild any views that reference the table. 😉
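As a sketch, you can inventory most of that before the swap with standard tools (SQL 2005+; object names taken from the thread):
-- Lists the indexes, constraints, and identity settings that SELECT...INTO won't copy.
EXEC sp_help 'dbo.claimfile'

-- Any triggers on the table.
SELECT name
FROM sys.triggers
WHERE parent_id = OBJECT_ID('dbo.claimfile')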
--Jeff Moden
Change is inevitable... Change for the better is not.