April 9, 2012 at 10:25 pm
Hey guys, im stuck again..
I have already migrated all 6 million records and Ids to my normalized database. Integrity and FK constraints are set up just great. Now i'm trying to fix lots of errors i found on the data i migrated. Lets cut to the point:
This is the main table:
tbl_NewPeople: {*idperson(pk), name, lastname, fatherId, motherId ,(whole bunch more irrelevant fields) }
fatherId and motherId are both varchar, and they have tons of garbage.
This is how they should be formatted:
An 11-digit integer number with no spaces, punctuation signs or symbols, for example: 22365987454
Some of them are good to go..but MOST of them are like this:
223-56598-546 (Separated with '-')
265655 (not even the required amount of digits)
- (only '-')
(even blank space!)
sdferf (some of them got letters)
Ovbiously the individual who designed the originalPeople table did not think of this..but its my task to set it right..or at least homogenize the existing data to a point that makes some sense
Probabbly the field will have to keep its original varchar type, since i have already migrated all the data to its place, and it took me several days to do so since im working on 1 gb of ram...
April 10, 2012 at 2:20 am
The LIKE clause could provide the level of checking required. A demonstration is listed below.
The next choice you want to make is what you want to do with the erroneous rows.
Do you want to
1) replace the columns that contain the error (losing the incorrect values)
2) create a new column to hold the correct values
3) create a new column to hold the erronoeous values and then remove the errors from the columns
create table Mess(
ID int,
Mother varchar(11),
Father varchar(11)
)
truncate table Mess
insert into Mess values (1,'12345678901','23456789012')
insert into Mess values (2,'-','')
insert into Mess values (3,'1234ABC','12909-90990')
insert into Mess values (4,'99999999999','12909-90990')
-- Single Query : find errors
select *
from Mess
where Mother NOT LIKE
('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
OR Father NOT LIKE
('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
-- Unioned : find error and identify column
select 'Mother' as ErrorCol, *
from Mess
where Mother NOT LIKE
('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
UNION
select 'Father' as ErrorCol,*
from Mess
where Father NOT LIKE
('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
order by ID
-- Edit Choice 1 : Editing the column to remove errors
update mess
set Mother = NULL
where Mother NOT LIKE ('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
update mess
set Father = NULL
where Father NOT LIKE ('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
-- Edit Choice 2 : Adding extra columns
alter table mess add Mother2 varchar(11)
alter table mess add Father2 varchar(11)
update mess
set Mother2 = NULL
where Mother NOT LIKE ('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
update mess
set Father2 = NULL
where Father NOT LIKE ('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
-- Edit Choice 3 : Adding extra column to hold errors
alter table mess add Mother_Old varchar(11)
alter table mess add Father_Old varchar(11)
update mess
set Mother_Old = Mother, Father_Old = Father
update mess
set Mother = NULL
where Mother NOT LIKE ('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
update mess
set Father = NULL
where Father NOT LIKE ('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
The choice is yours.
Fitz
April 10, 2012 at 8:27 am
I was about to try these things when i found out my pc was completely jammed..it wouldnt do anything..
Right after i posted my last message, i decided to leave it on overnight running a query..apparentely it lasted all night long and consumed all my RAM..I had to shutdown the pc manually by force. I turned it back on and now the database im working on says (In Recovery). It wouldnt let me do anything and its just there. I cant see the tables or anything..its kind of blocked. Im about to have a stroke..what does this means..and how can i manage to restore it?
April 10, 2012 at 8:37 am
The edits suggested by Mark are fine where there's no obvious way of getting the correct value, but I imagine that before doing what he suggests it might be a good idea to correct the cases where the error is a formatting error only: for example the value '223-56598-546' mentioned by the OP should probably be changed to '22356598546', and if the fields are varchar(N) with N higher than 11 any field which is an 11 digit number plus some leading spaces and/or some trailing spaces should have the leading and trailing spaces trimmed away (and of course leading and trailing spaces on a string like '223-56598-546' don't make that impossible to correct either). That sort of editing is likely to be easy, and will reduce the number of hard errors that have to be dealt with.
Tom
April 10, 2012 at 8:53 am
Wait.
In recovery means that SQL is busy running crash recovery on the database. Wait until it is finished.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 10, 2012 at 9:03 am
Sometimes a database taking a ling time to run crash recovery means you need to work on the transaction log because you have too many VLF's. Check out this post by Kimberly Tripp, http://www.sqlskills.com/BLOGS/KIMBERLY/post/Transaction-Log-VLFs-too-many-or-too-few.aspx
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
April 10, 2012 at 10:56 am
I was waiting for the auto recovery to do its job, but after a couple hours of wait, it throwed an error saying something like the process cannot be completed. Im pretty sure im lacking some serious RAM.
About the blog post you suggested http://www.sqlskills.com/BLOGS/KIMBERLY/post/Transaction-Log-VLFs-too-many-or-too-few.aspx, i went through it and now i have an idea of why it happened..yet i have no clue on how to deal with the .log file. It weights almost 4gb and that is all i know..
When using the .log file, what am i supposed to look for?
How do i open/review/browse the .log?
April 10, 2012 at 2:47 pm
No, nothing to do with RAM. Post the error it gave you and we can advise.
What .log file? SQL transaction log files are .ldf by default, and you don't open them to look at, they're part of SQL databases
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 10, 2012 at 10:22 pm
In order for me to see the message again, i would need to give it 3+ hours running, i dont have that much time for trying troubleshooting that might not guarantee a quick way around. I'm supposed to deliver tomorrow first thing in the moorning :\. I still have the original people table and all the treatment queries stored somewhere..I will start off from the beginning overnight and hope for the best :(. I will get back to you later guys..thanks for keeping up
Viewing 9 posts - 16 through 23 (of 23 total)
You must be logged in to reply to this topic. Login to reply