July 31, 2013 at 7:51 am
Greetings friends,
I have a .dta file that's formatted in a strange way and I would like to discuss ways of importing into a staging database.
The first field of each record consists of a record type. For each entity there will be one record of type 01. Also, for each record type 01 there will be one or more record type 02. There maybe zero or more record type 03, 04, 05 and 06
For example
01*9517469000*117844165*0335*The Occupier* CRICKET CLUB***
02*1*Ground*Tea Room*35.40*9.50*336
02*2*Ground*Changing*28.00*9.50*266
02*3*Ground*Kitchen*16.40*9.50*156
02*4*Ground*Scorer*3.80*9.50*36
02*5*Ground*Mower Shed*15.00*2.38*36
03*Cricket Square*2.0*100.00*+200
03*Cricket Ground*2.0*850.00*+1726
01*9517470000*200027165*0335*The Occupier* CRICKET CLUB***
02*1*Ground*Club Room*32.60*19.00*619
02*2*Ground*Changing Room*21.60*12.67*274
02*3*Ground*Internal Storage*2.80*9.50*27
02*4*Ground*Bar*13.90*19.00*264
03*Cricket Square*1.0*100.00*+100
03*Cricket Pitch*2.8*500.00*+1420
01*9517471000*203509165*0335*The Occupier*CRICKET & FOOTBALL CLUB***
02*1*Ground*Pavilion*86.20*20.00*1724
02*2*Ground*Garage*23.40*10.00*234
02*3*Ground*Changing Room*25.00*13.33*333
03*Cricket Square*1.0*100.00*+100
03*Playing Field*1.8*500.00*+910
01*9517472000*200736165*0335*The Occupier*PT GNDOUSE****ROWOSE*
02*1*Ground*Kitchen*37.14*16.67*619
I would like to create 6 staging tables to hold the different types but I'm not sure if it's best to load all the data into SQL first then split it out or go down the SSIS route for this kind of thing? Obviously once the data is split there must be a unique ID which links the records together. Any suggestions?
Thanks in advance.
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
July 31, 2013 at 9:43 am
I would like to create 6 staging tables to hold the different types but I'm not sure if it's best to load all the data into SQL first then split it out or go down the SSIS route for this kind of thing? Obviously once the data is split there must be a unique ID which links the records together. Any suggestions?
Thanks in advance.
Looks like there is no natural "linkage" beween ros in the data file provided. The only thing what links your records is their natural sequencial position.
I guess in order to ensure the link/rleationship between your data rows, you need to preprocess your file and ad something which will identify the group. It needs to be done row by row, otherwise there will be no guarantee that you will link rows properly.
Or, ask your data provider to do a bit better job than this sh*t.
July 31, 2013 at 9:47 am
I can see that records of type 1 have something looking like ID: *9517469000*
Basically you need this one in each related row.
It should not take a lot of time for provider to add it into their extract.
Otherwise, create a script (vb, pearl or anything else for working with text files), which will grab this bit of data from line with type "01" and copy it into all subsequent lines of text, until next line which starts with "01"
July 31, 2013 at 10:11 am
In addition to what Eugene has already pointed out, it looks like the type 01 and type 02 rows contain different # of fields. This is going to cause you issues if you try to import with bcp or with a standard flat file source component in SSIS. This looks to me like a script source in the making. Once in the script source, you can read the rows in, determine the row type, and re-route them in the pipeline on different routes based on type. The script task will also let you tag the type 02 rows with the previous type 01 identifier to maintain the relationship.
July 31, 2013 at 10:27 am
Yea I've already started with the script task in SSIS route. I will post another update over in the next 24 hours.
Thanks to you both for your contribution.
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
August 2, 2013 at 10:07 am
To keep it simple, I've used streamreader to get each line and output the records into the relevant failes of Type01 to Typ07
public void Main()
{
String inputstring;
String strUARN;
String[] strRT1;
strUARN = "999999";
StreamReader sr = new StreamReader(@"C:\Users00bie\TestData.txt");
StreamWriter sw_type01 = new StreamWriter(@"C:\Users00bie\Type01.txt");
StreamWriter sw_type02 = new StreamWriter(@"C:\Users00bie\Type02.txt");
StreamWriter sw_type03 = new StreamWriter(@"C:\Users00bie\Type03.txt");
StreamWriter sw_type04 = new StreamWriter(@"C:\Users00bie\Type04.txt");
StreamWriter sw_type05 = new StreamWriter(@"C:\Users00bie\Type05.txt");
StreamWriter sw_type06 = new StreamWriter(@"C:\Users00bie\Type06.txt");
StreamWriter sw_type07 = new StreamWriter(@"C:\Users00bie\Type07.txt");
while ((inputstring = sr.ReadLine()) != null)
{
strRT1 = inputstring.Split('*');
if (strRT1[0] == "01")
{
strUARN = strRT1[2].ToString();
sw_type01.WriteLine(strUARN + '*' + inputstring);
}
if (strRT1[0] == "02")
{
sw_type02.WriteLine(strUARN + '*' + inputstring);
}
if (strRT1[0] == "03")
{
sw_type03.WriteLine(strUARN + '*' + inputstring);
}
if (strRT1[0] == "04")
{
sw_type04.WriteLine(strUARN + '*' + inputstring);
}
if (strRT1[0] == "05")
{
sw_type05.WriteLine(strUARN + '*' + inputstring);
}
if (strRT1[0] == "06")
{
sw_type06.WriteLine(strUARN + '*' + inputstring);
}
if (strRT1[0] == "07")
{
sw_type07.WriteLine(strUARN + '*' + inputstring);
}
}
sw_type01.Close();
sw_type02.Close();
sw_type03.Close();
sw_type04.Close();
sw_type05.Close();
sw_type06.Close();
sw_type07.Close();
Dts.TaskResult = (int)ScriptResults.Success;
}
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply