September 19, 2011 at 11:05 am
Hello,
My name is Joe and I am truly a newbie at any type of programming. I mean really new. I was a technician for Porsche for 33 years and when I got cancer I just couldn't take the heat of the shop anymore. A friend has a 41 year old lawnmower repair shop and his current database program keeps eating files so he asked me to help. The database has about 48,000 tab delimited files in it. (I learn quickly). It also has 2 large text files, (also tab delimited) that contain:
1. parts (44,000 tab delimited part number entries one line at a time was easy to import into excel).
2. Taxes on invoices or workorders (same story 48,332 entries imported into excel in about a minute
I had never used access until last week but I sort of knew what it was. I did a little reading and I created tables for all of my data types so nothing would be entered twice, I created an index of tables that has a one to many relationship with each table I create. I then created the following tables:
1. Index_Of_tables (previously mentioned)
2. Home_Info
3. Home_Info_Details- (subdata sheet for home info) (Some of the homes have more than one client who may have more than one piece or type of equipment)
4. Customer_Info
5. Customer_Info_Details- subdata of custinfo
6. Company_Info
7. Company_Info_Details- subdata of company info
8. Phone_Type (Cell, Home, Business, Fax)
9 4 phone tables (cell, home, company, fax) each with a subdata sheet called phone details
10. I then created a whole bunch of table for vendors (1 table), manufacturers (There are 53 manufacturers of lawn equipment Kin deals with from 30 or more different vendors), inventory, equipment type, sales type (internet, counter, workorder, wholesale, etc), sales, purchases,
11. I have specified the subdata sheets for almost all of the tables and many of the subdata sheets use other subdata sheets as their subdatasheet or I created a new sheet to fit.
12. Arranging the tables in this manner forced Access to write SQL queries for the subdatasheets automatically as my original model had the whole project being ported to a remote SQL server where 84 year old Kin couldn't hurt it. I also didn't have to learn Visual Basic For Applications language on top of studying SQL every day.
13. I borrowed a few free forms from other downloadable Access Templates and remapped them to my subdata info and the whole thing worked the first time I tried it.
14. I spent 5 days on the phone and I have most all of the vendors either sending me weekly price list tables or giving me a portal to access their live data. ( I am learning).
15. I seperated the 2 text files which I had already imported into excel/ Access/ csv
16. Now is the part where I got stucK.............I cant get the original files (which have no file extension so windows sees tham as folders) into the database. There are three types of files in this one HUGE folder.:w00t: Workorders, Telephone numbers, Counter Sales, and each file either has the transaction number or the telephone number as its title so the filename is part of the data!!!:w00t:
17. I seperated the files which were named after the telephone number and put them in a seperate folder called old_phone. (These files contain only the invoice number(s) associated with that phone number).
18. I seperated the Counter sale files which have no user information but must be kept intact because they contain parts sales (inventory tracking information) including the filename which is the transaction number. I then seperated the Work Order Files, and Counter Sale files in the same fashion and now have all 3 of these data types in seperate folders
19. As I stated the filename for each file is the invoice number, countersale number or telephone number for all of the files and the data is stored in tab delimited form and has all of the client and transaction information in it.
18. I have tried about 100 times to import this information along with the filenames into ANY type of table with no luck. I do have a whole bunch of really long text files or csv files but they all look like one really big order from 16,000 people:hehe:
19. I AM STUCK!! I am the type of person that does not ask for help until it becomes impractical to go without outside help. If you have taken the time to read this far I appreciate your valuable time. Getting this data into the database is all that is stopping me from being able to bill Kin and have him write me a check. (I did this whole project for 1500 dollars). I REALLY need this money so I can pay my bills so anything you could do to help would be appreciated. If I did something wrong in the database design that you think will haunt me later I would appreciate tips on that as well.
If you need any automotive advice I have the answer to just about any automotive question in my head and would gladly reciprocate.
September 19, 2011 at 11:23 am
Not knowing what the files look like, I need to assume a bit. Of course, that means we are all in trouble.
Are the files in folders or embedded into access?
The first line of code starts with coffee. The last line ends with alcohol.
September 19, 2011 at 11:24 am
Just to clarify:
Are you using SSIS (Integration Services) to try to import these files, or are you trying to do it via Access, or are you trying to do it via SQL Server?
I'd do it via SSIS. You can create a For Each Next object in there that will loop through all the files in a directory. It can store the file name in a variable that you can use as part of your data. So long as all the files have the same structure (same columns, same delimiter), that will be a very efficient way to pull the data into an SQL Server database.
Is that what you're looking to do?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 19, 2011 at 11:54 am
Hello and thank your for answering.
I love the coffee to alchohol transition as this project has been the source of a couple of drinks in the evening so far.
The files appear in folders in windows because they have no extension, ( I did make a copy and use a rename dos command to append a tsv or tab separated vaue to the files. Excel can see them but only one at a time.
Here is an example of the tab seperated ascii files as viewed in notepad:
(Note: because the original program used the invoice number as the filename the invoice number is not present IN the file)
The filename appears just like this: 117001
(I also forgot to add that these files are just sitting in a folder and I am just trying to get them into a table so that I can see them using my local copy of SQL)
When you choose which program to open the file with and use notepad you get this:
SAMPLE CLIENT
229-1953
4221 N.E. 17 TERR.
OAKLAND PARK
FLORIDA
06/18/08
7/10/2008 4:10:27 PM
MURRAY R/B PUSH
CHECK OUT SERVICE *( CUST. NRW PLUG )*
CLEAN FUEL SYSTEM CARB. OVERHALL KIT FUELLINE AIR FILTER & GASKET BLADE OIL
KIN 06/03/08 ( PAID CHECK # 791 )
1
WAL
K11-LMR
1
BS
491588S
2
BS
272653S
7
TEC
430173
1
ORE
97-121
1
LAB
2
1
LAB
1
September 19, 2011 at 12:04 pm
I am going to have to go with GSquared here. You really would be served well by investigating SSIS and using the for while functions to load the data.
The first line of code starts with coffee. The last line ends with alcohol.
September 19, 2011 at 12:16 pm
Mr GSquared,
With the terms you use I am assuming that you have been doing this for a while and I appreciate your time. I have tried to import using Access, Excel and about 20 other text merging utitlities. I am just trying to form a table which would include the filenames and the the data inside. I could actually auto-generate the telephone numbers associated with the invoices once I get the Work Order data into a table.
I feel bad that I do not know the SSIS term as I am just getting familiar with terms like Join and One To Many. I am embarrassed that i did not learn more terms before posting and I will continue to look up these terms as I see or hear them.
P.S. I re-read your post and to answer your question that is exactly what I am looking to do. That would give me the filename as a searchable column as well as the tab delimited data which I could port to a new table with column headings appropriate for the type of data. Then I woill split this data into the existing table types and deliver it to the client!!!!!
September 19, 2011 at 12:45 pm
Hello again,
I have discovered that SSIS stands for SQL Server Inegration Services and it looks pretty cool. Unfortunately I have not been able to find a download for this. I did download a tutorial and I am looking through the component list on the MSDN site to see if I can find the component GSquared made reference to in his post. I have only a small version of SQL present on my old laptop that must have installed with Visual Studio, (I'm g;lad I got something I could use from that download), and I wasn't even sure that was there until I did a localhost check on the files which I placed in my public folder and then was able to manipulate them using my old desktop. I have all intentions on hosting the data on a remote, (and hopefully more powerful) server and I do not know if I have room or hopsepwer enough in my old laptop, (2.8GHZ processor 1.25 Gig Ram) to run this program if I do find it. Is the reason I dont see a download link because SSIS in already present on my system or is this just another falure of the Microsoft Technet search engine to actually give me the download link?
Thanks again for your help.
P.S. The files have a new home sitting directly on the root drive. They are in 3 folders: C:\Old_Inv, C:\old_phone, and C:\Old_Work_Order
September 19, 2011 at 12:54 pm
SSIS is part of one of the paid versions of SQL Server: Standard,Enterrpsie,Developer or Datacenter;
the free version fo SQL (SQL Express) installed with Visual Studio, or downloaded for free does not contain SSIS.
Lowell
September 19, 2011 at 1:05 pm
is that what it really looks like, or is it really tab delimited, so it's something more like this with the row of data on a single line?
SAMPLE CLIENT.⇒229-19534221 N.E. 17 TERR.⇒OAKLAND PARK⇒FLORIDA⇒06/18/08⇒⇒7/10/2008 4:10:27 PM⇒MURRAY R/B PUSH⇒CHECK OUT SERVICE *( CUST. NRW PLUG )*⇒CLEAN FUEL SYSTEM CARB. OVERHALL KIT FUELLINE AIR FILTER & GASKET BLADE OIL⇒KIN 06/03/08 ( PAID CHECK # 791 )⇒1⇒WAL⇒K11-LMR ⇒1⇒BS ⇒491588S ⇒2⇒BS ⇒272653S ⇒7⇒TEC⇒430173 ⇒1⇒ORE⇒97-121 ⇒1⇒LAB⇒2 ⇒1⇒LAB⇒1
pc_doc2001 (9/19/2011)
When you choose which program to open the file with and use notepad you get this:SAMPLE CLIENT
229-1953
4221 N.E. 17 TERR.
OAKLAND PARK
FLORIDA
06/18/08
7/10/2008 4:10:27 PM
MURRAY R/B PUSH
CHECK OUT SERVICE *( CUST. NRW PLUG )*
CLEAN FUEL SYSTEM CARB. OVERHALL KIT FUELLINE AIR FILTER & GASKET BLADE OIL
KIN 06/03/08 ( PAID CHECK # 791 )
1
WAL
K11-LMR
1
BS
491588S
2
BS
272653S
7
TEC
430173
1
ORE
97-121
1
LAB
2
1
LAB
1
Lowell
September 19, 2011 at 1:08 pm
Hello Lowell and thank you for your reply,
Ignore the reviews of the 2006 Toshiba Satellite!!! I am off-loading all of my music, video, picture and web design software files onto a portable hard drive to make room for a full version of SQL Server Standard on my hard drive all while responding to this forum and double-checking the relationship tables in Access 2010. I did get the technical specs, (my Laptop qualifies!!), and my yearly MSDN suscription will allow me a license number for this software. (I have 2 MCSE Certificates and I have been an MSDN member for about 12 years).
Thank you, Joe
September 19, 2011 at 1:16 pm
Hello again Lowell,
This is great. I get the feeling that I am not as alone in this project anymore and I really apreciate that. The data looks EXACTLY like I showed you and I originally thought that the data was separated by the enter key but since that wasn't a choice in the lists of data types I did a little bit of reading and I read that even though the enter key was programatically used, tab delimited indicates that the next part of data is coming after the end of a paragraph statement. (Looks good from my house) 🙂 When I chose Tab delimited as the type of data to import into: Word, Excel, Access, and a bunch of other programs it always comes out the same way which makes it really hard to figure out how to sort it.
Addendum: I got a link from my hours of searching on Google and the MSDN website and I found this script for vb that lists al of the names of a file so that I can at least get the names of the files in a table.
http://www.word.mvps.org/FAQs/MacrosVBA/InsertFileNames.htm.
I just got finished downloading SQL and I will install it when I have cleared out enough room. (Man, I was happier when I thought I could get away without installing a fulll blown copy of SQL on my local computer. It sounded great. Build, test, deploy, get paid. If anyone knows a work-around to installing SQL server on my laptop I would appreciate it. I will still get plenty of opportunity to use SQL as Kin, (The owner of the Lawnmower shop) has told me that he would feel more comfortable having his data on-site and letiing our new internet clients get the data from us. (Just in case we get another hurricane that kills all of the power and phone lines we will still be able to service the equipment needed to put everyting back together again). Thanks again, Joe
The best things in life may be free but they rarely pay the bills.
September 20, 2011 at 6:14 am
In that sample data, what does this part mean?
1
WAL
K11-LMR
1
BS
491588S
2
BS
272653S
7
TEC
430173
1
ORE
97-121
1
LAB
2
1
LAB
1
How would you translate that into rows/columns?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 20, 2011 at 10:31 am
Hello GSquared,
I was hoping you would look back on this post. From your previous extremely quick answer ir seems that you are blessed with plenty of knowledge on SQL.
First to answer your question the sample that I posted is a Work order file and the 3 items are:
1. Quantity of part
2. Manufacturer code (3 letter code indicating the manufacturer of the part)
3. Manfacturer part number
These 3 numbers are repeated as many times as needed indicating the amount of parts sold on that repair order (LAB is the code for labor in this antiquated system) In the Excel template that I have that is waiting for data I have room for 20 parts because the most I have on any invoice is 18. (Red Cross buying spare parts during a repair after Hurricane Wilma)
I will elaborate:
In the large folder that was created by the original program I actually have 5 data types:
1. Parts info (A large single page text file with one part number, part description, manufacturer code, etc.)
2. Tax Info ( A large single page text file much like the parts.txt)
3. Counter Sale Number files with a 1- 5 digit filename numbered sequentially. (These are the counter sale invoices and the filename is the transaction ID The first transaction is 1 the latest is in the 14,000 range)
4. Telephone number files (named by the phone number and containing invoice numbers that relate to that phone number)
5. Work Order Number files (named by the Work Order number and containing all of the client details and the items that client purchased during the repair on their equipment)
The Work Order files and counter sale files are the ones I am the most concerned about because I will auto-generate Work Order numbers by phone and counter sales numbers by phone data as well as the tax data by date when I get the information into the database to keep from having inconsistencies.
I'm sorry to give such a long answer but when I am lucky enough to have the audience of great technicians the clearer I speak the better chance of getting the information I need. I must mention that I am in Fort Lauderdale, Florida USA so even though I have Windows Messenger running and I answer any post questions as soon as I see the email I believe we are in different time zones so I may not see updates to the post right away.
Thanks again for your time, Joe
September 21, 2011 at 6:26 am
Got it on the bottom half. Now the top part.
SAMPLE CLIENT
229-1953
4221 N.E. 17 TERR.
OAKLAND PARK
FLORIDA
06/18/08
7/10/2008 4:10:27 PM
MURRAY R/B PUSH
CHECK OUT SERVICE *( CUST. NRW PLUG )*
CLEAN FUEL SYSTEM CARB. OVERHALL KIT FUELLINE AIR FILTER & GASKET BLADE OIL
KIN 06/03/08 ( PAID CHECK # 791 )
1
WAL
K11-LMR
1
BS
491588S
2
BS
272653S
7
TEC
430173
1
ORE
97-121
1
LAB
2
1
LAB
1
Is the part of the file I marked in bold consistent? Does it have a fixed pattern, like "always 12 rows", or "the last row is always like ...."? Anything like that?
If so, it should be possible to tell an import process where to start turning it into Qty, Mft, PtNo values, and what to put elsewhere. If not, it's going to be quite tricky to automate pulling in a lot of these files.
Edit: Or do you have a list of the Manufacturer Codes? If so, it can look for the first row that matches one of those, and start parsing the table from the row right above that.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 21, 2011 at 8:04 am
Yea I was thinking alternatively just looking for the line "KIN 06/03/08 ( PAID CHECK # 791 )", since that seems also like a fixed point. Doing a raw file import against the entire file, with one row per line, and putting an identity field on the table, then saying "where ID > ###", corresponding to that number above.
Then, you can find the relevant pieces of information by finding rows and using a modulus operator to find only the one corresponding to the first in a set, and then self-joining using the identity + 1 and +2 to find the other two rows.
So all in all, the query would do something like:
DECLARE @ID INT
SELECT TOP 1 @ID = PK FROM TABLE WHERE Field LIKE 'KIN %' ORDER BY PK
SELECT Row1.Field AS Quantity, Row2.Field AS ManufacturerCode, Row3.Field AS ManufacturerPartNumber
FROM TABLE Row1
JOIN TABLE Row2 ON (Row1.PK + 1) = Row2.PK
JOIN TABLE Row3 ON (Row1.PK + 2) = Row3.PK
WHERE ((@ID - PK) % 3) = 1 AND PK > @ID
Then inserting that row into a new table, and then looping again where @ID > the previous ID.
So whole script becomes something like:
CREATE TABLE #RawRecords
(
PK INT IDENTITY PRIMARY KEY,
[Field] NVARCHAR(MAX)
)
-- Sample record insertion. Real insertion would be done through SSIS.
INSERT INTO #RawRecords (Field)
VALUES ('SAMPLE CLIENT')
INSERT INTO #RawRecords (Field)
VALUES ('229-1953')
INSERT INTO #RawRecords (Field)
VALUES ('4221 N.E. 17 TERR.')
INSERT INTO #RawRecords (Field)
VALUES ('OAKLAND PARK')
INSERT INTO #RawRecords (Field)
VALUES ('FLORIDA')
INSERT INTO #RawRecords (Field)
VALUES ('06/18/08')
INSERT INTO #RawRecords (Field)
VALUES ('')
INSERT INTO #RawRecords (Field)
VALUES ('7/10/2008 4:10:27 PM')
INSERT INTO #RawRecords (Field)
VALUES ('MURRAY R/B PUSH')
INSERT INTO #RawRecords (Field)
VALUES ('CHECK OUT SERVICE *( CUST. NRW PLUG )*')
INSERT INTO #RawRecords (Field)
VALUES ('CLEAN FUEL SYSTEM CARB. OVERHALL KIT FUELLINE AIR FILTER & GASKET BLADE OIL')
INSERT INTO #RawRecords (Field)
VALUES ('KIN 06/03/08 ( PAID CHECK # 791 )')
INSERT INTO #RawRecords (Field)
VALUES ('1')
INSERT INTO #RawRecords (Field)
VALUES ('WAL')
INSERT INTO #RawRecords (Field)
VALUES ('K11-LMR')
INSERT INTO #RawRecords (Field)
VALUES ('1')
INSERT INTO #RawRecords (Field)
VALUES ('BS')
INSERT INTO #RawRecords (Field)
VALUES ('491588S')
INSERT INTO #RawRecords (Field)
VALUES ('2')
INSERT INTO #RawRecords (Field)
VALUES ('BS')
INSERT INTO #RawRecords (Field)
VALUES ('272653S')
INSERT INTO #RawRecords (Field)
VALUES ('7')
INSERT INTO #RawRecords (Field)
VALUES ('TEC')
INSERT INTO #RawRecords (Field)
VALUES ('430173')
INSERT INTO #RawRecords (Field)
VALUES ('1')
INSERT INTO #RawRecords (Field)
VALUES ('ORE')
INSERT INTO #RawRecords (Field)
VALUES ('97-121')
INSERT INTO #RawRecords (Field)
VALUES ('1')
INSERT INTO #RawRecords (Field)
VALUES ('LAB')
INSERT INTO #RawRecords (Field)
VALUES ('2')
INSERT INTO #RawRecords (Field)
VALUES ('1')
INSERT INTO #RawRecords (Field)
VALUES ('LAB')
INSERT INTO #RawRecords (Field)
VALUES ('1')
CREATE TABLE #ManufacturerPartOrder
(
PK INT IDENTITY PRIMARY KEY,
Quantity INT,
ManufacturerCode VARCHAR(20),
ManufacturerPartNumber VARCHAR(20)
)
DECLARE @MaxID INT
SELECT @MaxID = MAX(PK) FROM #RawRecords WHERE Field LIKE 'KIN %'
DECLARE @ID INT
SET @ID = 0
WHILE @ID < @MaxID
BEGIN
SELECT
TOP 1 @ID = PK
FROM #RawRecords
WHEREField LIKE 'KIN %'
AND PK > @ID
ORDER BY PK
INSERT INTO #ManufacturerPartOrder (Quantity, ManufacturerCode, ManufacturerPartNumber)
SELECT
Row1.Field AS Quantity,
Row2.Field AS ManufacturerCode,
Row3.Field AS ManufacturerPartNumber
FROM #RawRecords Row1
JOIN #RawRecords Row2 ON (Row1.PK + 1) = Row2.PK
JOIN #RawRecords Row3 ON (Row1.PK + 2) = Row3.PK
WHERE ((Row1.PK - @ID) % 3) = 1 AND Row1.PK > @ID
END
SELECT * FROM #ManufacturerPartOrder
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply