November 7, 2009 at 5:16 pm
Hello all. First off, im not sure if this is the forum i should be posting this on. If it's not im sorry and please direct me to the correct one.
The scenario im in right now got me puzzled as to what to do (and boy, i like being puzzled).
Ok, so i have 5 .csv files to import, all separated by TAB (\t) with different amount of columns.
The application we have in the company import the files into a single table with the amount of columns i specify in the application, but the problem is: i need to specify the maximum amount of columns(22) but instead of inserting nulls for the columns missing on the files that have lesser columns the app just keeps inserting and filling those columns, which completly ruins queries i need to run later on.
Now, i can tell the application to import everything into a single varchar column.
what i did then was to add a new column to the table to know which file is which worked well if a cursor.
i then added a CHAR(9)(tab) before and after each line so i could separate every field with a tally table
so now i got a 1 column table with everything i need separated by file type, but i need to pivot it back into actual columns so i can create 5 temp tables to work with.
im not sure this is the correct approach, perhaps something simpler can be done but at this moment i cant think of any and would love some ideas
Thiago Dantas
--
Thiago Dantas
@DantHimself
November 7, 2009 at 8:25 pm
See the following articles and keep in mind that the MAX of a single row will always be the value of that row.;-)
http://www.sqlservercentral.com/articles/T-SQL/63681/
http://www.sqlservercentral.com/articles/Crosstab/65048/
--Jeff Moden
Change is inevitable... Change for the better is not.
November 7, 2009 at 8:36 pm
November 7, 2009 at 9:01 pm
I agree... some readily consumable sample data and a CREATE TABLE statement would be nice.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 7, 2009 at 10:40 pm
As a bit of a side bar... if you're using the 32 bit version of SQL Server 2005, doing such a thing as importing a file with an unknown number of columns into a table becomes a bit trivial. Here's some example code I wrote for someone else a while back. Long live "Jet". 😛
/**********************************************************************************************************************
Purpose:
Script to read from a TSV (Tab Separated Values) table having an unknown number of columns of unknown data type.
This demo is setup to read a given file called Artists_mini.txt located in C:\Temp. I'm saving the "any file"
version for an article ;-)
Author: Jeff Moden - 28 Aug 2008
**********************************************************************************************************************/
--===== Declare the local variables we'll need. Names are self-documenting
DECLARE @RootKey SYSNAME,
@key SYSNAME,
@Value_Name SYSNAME,
@Type SYSNAME,
@PreviousValue SYSNAME,
@NewValue SYSNAME
--===== Preset the "constants". These are self documenting as well
SELECT @RootKey = N'HKEY_LOCAL_MACHINE',
@key = N'SOFTWARE\Microsoft\Jet\4.0\Engines\Text',
@Value_Name = N'Format',
@Type = N'REG_SZ',
@NewValue = N'TabDelimited' --May be a character using N'Delimited(,)' where the comma is the character
--Original setting is usually N'CSVDelimited'
--===== Remember the previous value so we can set it back
EXEC Master.dbo.xp_Instance_RegRead
@RootKey = @RootKey,
@Value_Name = @Value_Name,
@Value = @PreviousValue OUTPUT
--===== Show what the original delimeter setting was set to.
-- This, of course, may be commented out
SELECT 'Previous delimiter setting = ' + @PreviousValue
--===== All set... define the new temporary delimiter
EXEC Master.dbo.xp_Instance_RegWrite
@RootKey = @RootKey,
@Value_Name = @Value_Name,
@Type = @Type,
@Value = @NewValue
--===== Read the TAB delimited file without knowing any of the columns
-- Notes: "Database =" identifies the directory the file is in
-- The FROM clause identifies the file name
SELECT *
INTO #MyHead
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Text;Database=C:\Temp;HDR=YES;FMT=Delimited',
'SELECT * FROM Artists_mini.txt')
--===== Show the content of the table we just populated
SELECT * FROM #MyHEad
--===== Restore the original delimiter setting
EXEC Master.dbo.xp_Instance_RegWrite
@RootKey = @RootKey,
@Value_Name = @Value_Name,
@Type = @Type,
@Value = @PreviousValue
--===== Cleanup after the demo
DROP TABLE #MyHead
--Jeff Moden
Change is inevitable... Change for the better is not.
November 8, 2009 at 3:52 am
im not responsible for the app that does the importing, all i know is that it calls bcp but im not sure on the parameters passed and how it imports multiple files in the same table.
i tried getting sample data yestarday but realized it wont be possible till monday, thanks for the replys so far
--
Thiago Dantas
@DantHimself
November 9, 2009 at 5:12 am
CREATE TABLE TEMP_IMPORT (DATA VARCHAR(8000), ID INT IDENTITY(1,1) PRIMARY KEY)
GO
INSERT INTO TEMP_IMPORT
SELECT 'Id_Contacto'+CHAR(9)+'IdOperacion_SIR'+CHAR(9)+'Apellido_RazonSocial'+CHAR(9)+'Tipo_Doc'+CHAR(9)+'Nro_Doc'+CHAR(9)+'Cuil_Cuit'+CHAR(9)+'Tipo_Persona'+CHAR(9)+'Relacion'+CHAR(9)+'Sexo'+CHAR(9)+'Estado_Civil'+CHAR(9)+'Fecha_Nacimiento'+CHAR(9)+'Profesion'+CHAR(9)+'Categoria_IVA'+CHAR(9)+'Actividad'+CHAR(9)+'Caracteristica_de_la_Actividad'+CHAR(9)+'Tipo_Sociedad'+CHAR(9)+'Estado'+CHAR(9)+'Fecha_Verificacion'+CHAR(9)+'Usuario_Verificador' UNION ALL
SELECT '647247'+CHAR(9)+'2690693'+CHAR(9)+'AARAO PEREIRA DE SOUZA'+CHAR(9)+'CPF'+CHAR(9)+'62965930744'+CHAR(9)+''+CHAR(9)+'Física'+CHAR(9)+'Titular'+CHAR(9)+'2'+CHAR(9)+''+CHAR(9)+'15/06/1959'+CHAR(9)+'PROPRIETARIO DE ESTABELECIMENTO COMERCIAL'+CHAR(9)+'Sin Datos'+CHAR(9)+''+CHAR(9)+''+CHAR(9)+''+CHAR(9)+'Contatado'+CHAR(9)+''+CHAR(9)+'' UNION ALL
SELECT '647251'+CHAR(9)+'2672875'+CHAR(9)+'ABADIA BARBARA AMANCIO DE SOUSA'+CHAR(9)+'CPF'+CHAR(9)+'44689799687'+CHAR(9)+''+CHAR(9)+'Física'+CHAR(9)+'Titular'+CHAR(9)+'1'+CHAR(9)+''+CHAR(9)+'25/05/1957'+CHAR(9)+'OUTROS - TRABALHADORES APOSENTADOS E PENSIONISTAS'+CHAR(9)+'Sin Datos'+CHAR(9)+''+CHAR(9)+''+CHAR(9)+''+CHAR(9)+'Contatado'+CHAR(9)+''+CHAR(9)+'' UNION ALL
SELECT 'IdContacto'+CHAR(9)+'IdTelefono'+CHAR(9)+'Tipo_Telefono'+CHAR(9)+'CodPais'+CHAR(9)+'CodArea'+CHAR(9)+'Telefono'+CHAR(9)+'Interno'+CHAR(9)+'Observaciones'+CHAR(9)+'Fecha_Verificacion'+CHAR(9)+'Usuario_Verificador' UNION ALL
SELECT '647247'+CHAR(9)+'785400'+CHAR(9)+'Residencial'+CHAR(9)+'55'+CHAR(9)+'27'+CHAR(9)+'32221741'+CHAR(9)+''+CHAR(9)+''+CHAR(9)+'02/10/2008 10:34:31 a.m.'+CHAR(9)+'serasa' UNION ALL
SELECT '647251'+CHAR(9)+'5048356'+CHAR(9)+'Residencial'+CHAR(9)+'55'+CHAR(9)+'34'+CHAR(9)+'34331543'+CHAR(9)+''+CHAR(9)+''+CHAR(9)+'02/10/2008 11:55:07 a.m.'+CHAR(9)+'serasa' UNION ALL
SELECT 'Cartera'+CHAR(9)+'Nombre Agencia/Estudio'+CHAR(9)+'IdOperacion SIR'+CHAR(9)+'OperId'+CHAR(9)+'Numero Operacion'+CHAR(9)+'Numero Caso'+CHAR(9)+'IdCaso SIR'+CHAR(9)+'Tipo Garantia'+CHAR(9)+'Garantia'+CHAR(9)+'Tipo Producto'+CHAR(9)+'Producto'+CHAR(9)+'Campania'+CHAR(9)+'Campania Gestion'+CHAR(9)+'Segmento'+CHAR(9)+'Lote'+CHAR(9)+'Tipo Cartera'+CHAR(9)+'Fecha Inicio Mora'+CHAR(9)+'Clasificacion BCRA'+CHAR(9)+'Estado Operacion'+CHAR(9)+'SubEstadoOperacion'+CHAR(9)+'AM Responsable'+CHAR(9)+'Fecha_Ultima_Asignacion' UNION ALL
SELECT 'Mercantil'+CHAR(9)+'ASSET'+CHAR(9)+'2630285'+CHAR(9)+'02000451150'+CHAR(9)+'02000451150'+CHAR(9)+'1000870430'+CHAR(9)+'1933761'+CHAR(9)+'Sem Garantia'+CHAR(9)+'Sem Garantia'+CHAR(9)+'Eprestimos'+CHAR(9)+'ADIANTAMENTO DEPOSITANTE'+CHAR(9)+'Campaña Mayor a 65 años'+CHAR(9)+''+CHAR(9)+'C-ST < 10.000 y Atraso < 1825 '+CHAR(9)+'A '+CHAR(9)+'Mercantil I'+CHAR(9)+'16/12/2002'+CHAR(9)+'Sin Datos'+CHAR(9)+'Activa'+CHAR(9)+'En Gestión'+CHAR(9)+'rarroyo'+CHAR(9)+'25/06/2009' UNION ALL
SELECT 'Mercantil'+CHAR(9)+'ASSET'+CHAR(9)+'2630302'+CHAR(9)+'0000001204388040505'+CHAR(9)+'0000001204388040505'+CHAR(9)+'10019781709'+CHAR(9)+'1933772'+CHAR(9)+'Sem Garantia'+CHAR(9)+'Sem Garantia'+CHAR(9)+'Cartao Real Visa'+CHAR(9)+'MB VISA CARD'+CHAR(9)+''+CHAR(9)+''+CHAR(9)+'C-ST < 10.000 y Atraso < 1825 '+CHAR(9)+'A '+CHAR(9)+'Mercantil I'+CHAR(9)+'05/05/2004'+CHAR(9)+'Sin Datos'+CHAR(9)+'Activa'+CHAR(9)+'En Gestión'+CHAR(9)+'rarroyo'+CHAR(9)+'25/06/2009'
--manually added the CHAR(9) (tab) cause forum formating was breaking things
UPDATE TEMP_IMPORT SET DATA = CHAR(9) + DATA + CHAR(9)
this is a bit of the data i have to import, i added the header and 2 lines of 3 different files and it should ilustrate my problem
i know how many columns each file has and i know the first column name and last column name of each, all second columns are uniquely named throughout the 5 files
i used this cursor to separate file types inside the table
ALTER TABLE TEMP_IMPORT ADD ARQ TINYINT
GO
DECLARE @TIPO VARCHAR(500), @Last VARCHAR(50) , @COUNTER INT
SELECT @TIPO = '',@LAST = '', @COUNTER = 0
DECLARE CONTRATO CURSOR FOR
SELECT DATA,ID FROM TEMP_IMPORT ORDER BY ID
OPEN CONTRATO
FETCH NEXT FROM CONTRATO INTO @TIPO,@COUNTER
WHILE @@FETCH_STATUS = 0
BEGIN
IF @TIPO LIKE '%IdOperacion_SIR%'
SET @Last = 'IdOperacion_SIR'
IF @TIPO LIKE '%IdDomicilio%'
SET @Last = 'IdDomicilio'
IF @TIPO LIKE '%IdDireccionMail%'
SET @Last = 'IdDireccionMail'
IF @TIPO LIKE '%Moneda%'
SET @Last = 'Moneda'
IF @TIPO LIKE '%IdTelefono%'
SET @Last = 'IdTelefono'
IF @TIPO LIKE '%Nombre Agencia/Estudio%'
SET @Last = 'Nombre Agencia/Estudio'
UPDATE TEMP_IMPORT SET ARQ = CASE @Last WHEN 'IdOperacion_SIR' THEN 0
WHEN 'IdDomicilio' THEN 1
WHEN 'IdDireccionMail' THEN 2
WHEN 'Moneda' THEN 3
WHEN 'IdTelefono' THEN 4
WHEN 'Nombre Agencia/Estudio' THEN 5 END
WHERE ID = @COUNTER
FETCH NEXT FROM CONTRATO INTO @TIPO,@COUNTER
END
CLOSE CONTRATO
DEALLOCATE CONTRATO
now, im not sure how to extract the data from this, as each columns from the .csv files need to go to different places and the files have relations between themselves.
i used this to turn everything into a single column table
SELECT TOP 50000 IDENTITY(int,1,1) AS N
INTO #TALLY
FROM SYSCOLUMNS,SYSOBJECTS
ALTER TABLE #TALLY ADD CONSTRAINT PK_TALLY PRIMARY KEY (N ASC)
GO
SELECT SUBSTRING(DATA,N+1,CHARINDEX(CHAR(9),DATA,N+1)-N-1)
FROM #TALLY
RIGHT OUTER JOIN TEMP_IMPORT ON ISNUMERIC(ID) = 1
WHERE N < LEN(DATA)
AND SUBSTRING(DATA,N,1) = CHAR(9)
now i dont know how to proceed
Thiago Dantas
--
Thiago Dantas
@DantHimself
November 9, 2009 at 3:19 pm
as a crappy solution, i created 5 tables with the correct amount of columns, exported 5 new files from the main table, re-imported the files to the correct tables and carried on with my life.
i would still like to learn if theres a way to avoid this whole re-export-import deal as im eager to change it all again.
--
Thiago Dantas
@DantHimself
November 12, 2009 at 9:35 pm
Ok... I'm confused. I thought you said you weren't the one responsible for importing the data.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 13, 2009 at 3:17 am
im not the one that clicks the "Import" button on the app
im the one responsible for what the "Import" button will do
the company i work for has a credit recovery system and our clients receive shipments in different layouts from different banks/stores/etc and my job is to make it so that everything the bank/store/etc sends get importerd
its mostly a SUBSTRING fest hehe
--
Thiago Dantas
@DantHimself
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply