July 17, 2014 at 7:10 am
Good day,
I have been told to import 100G data generated in SSMS and in .rpt format (10+ files) back to another SQL2008R2 server, I am wondering what would be the best practice to do this in a one time basis.
Thank you in advance, any suggestion is appreciated.
July 17, 2014 at 8:53 am
I should have mentioned in the OP that I can't ask for a different format at this moment, I need to get this done for this batch of data.
Thanks.
July 17, 2014 at 9:04 am
I would have to see the file format before even thinking about options.
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]
July 17, 2014 at 10:19 am
Since it's from SSMS, it's likely just a standard text file .rpt, rather than a Crystal Reports .rpt.
I suggest using SSIS to import that data. It will of course be a fairly long process to read, interpret and load 100GB of text data.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 17, 2014 at 10:22 am
if it is plain text, i'd just use bulk insert instead of building an SSIS package;
for me, that would be faster than anything else.
Lowell
July 17, 2014 at 1:33 pm
Lowell (7/17/2014)
if it is plain text, i'd just use bulk insert instead of building an SSIS package;for me, that would be faster than anything else.
I'm thinking you're gonna have to do some serious manipulation on that .rpt file to load it to SQL tables, in which case I'd want the transformation capabilities of SSIS.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 17, 2014 at 1:51 pm
ScottPletcher (7/17/2014)
Since it's from SSMS, it's likely just a standard text file .rpt, rather than a Crystal Reports .rpt.I suggest using SSIS to import that data. It will of course be a fairly long process to read, interpret and load 100GB of text data.
It is plain text although the file extension is very misleading .rpt
I decide to take your approach and here is what I've done and what I am having issue:
raw data (test.csv):
326567 98 1026 11 35638468 3232282837 2259294439 2886995029 1 395 0
281578 9 1026 11 48210209 3232282839 3247785220 2886998653 1 119 0
Here is the format file:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="17"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="6" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="7" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="8" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="9" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="10" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="11" xsi:type="CharFixed" LENGTH="20"/>
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r"
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Field1" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="2" NAME="Field2" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="3" NAME="Field3" xsi:type="SQLINT" LENGTH="17"/>
<COLUMN SOURCE="4" NAME="Field4" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="5" NAME="Field5" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="6" NAME="Field6" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="7" NAME="Field7" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="8" NAME="Field8" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="9" NAME="Field9" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="10" NAME="Field10" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="11" NAME="Field11" xsi:type="SQLINT" LENGTH="20"/>
</ROW>
</BCPFORMAT>
Here is the table's schema:
CREATE TABLE [dbo].[LogDB](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Field1] [int] NULL,
[Field2] [bigint] NULL,
[Field3] [bigint] NULL,
[Field4] [bigint] NULL,
[Field5] [bigint] NULL,
[Field6] [bigint] NULL,
[Field7] [bigint] NULL,
[Field8] [bigint] NULL,
[Field9] [bigint] NULL,
[Field10] [bigint] NULL,
[Field11] [bigint] NULL,
CONSTRAINT [PK_LogDB] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Here is my script:
BULK INSERT Ticket.dbo.LogDB
FROM 'C:\temp\test.csv'
WITH
(
FORMATFILE = 'C:\temp\format.xml'
);
It prompts me this error:
Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to bigint.
How do I fix it? Thank you so much for your help.
BTW: the original raw data file .rpt contains column names in its first line and separated line as the second line. Is there an option in Bulk input that can automatically generate the table based on the raw data file's name as well as the table's column names?
July 17, 2014 at 1:58 pm
halifaxdal (7/17/2014)
ScottPletcher (7/17/2014)
Since it's from SSMS, it's likely just a standard text file .rpt, rather than a Crystal Reports .rpt.I suggest using SSIS to import that data. It will of course be a fairly long process to read, interpret and load 100GB of text data.
It is plain text although the file extension is very misleading .rpt
I decide to take your approach and here is what I've done and what I am having issue:
raw data (test.csv):
326567 98 1026 11 35638468 3232282837 2259294439 2886995029 1 395 0
281578 9 1026 11 48210209 3232282839 3247785220 2886998653 1 119 0
Here is the format file:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="17"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="6" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="7" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="8" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="9" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="10" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="11" xsi:type="CharFixed" LENGTH="20"/>
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r"
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Field1" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="2" NAME="Field2" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="3" NAME="Field3" xsi:type="SQLINT" LENGTH="17"/>
<COLUMN SOURCE="4" NAME="Field4" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="5" NAME="Field5" xsi:type="SQLINT" LENGTH="12"/>
<COLUMN SOURCE="6" NAME="Field6" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="7" NAME="Field7" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="8" NAME="Field8" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="9" NAME="Field9" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="10" NAME="Field10" xsi:type="SQLINT" LENGTH="21"/>
<COLUMN SOURCE="11" NAME="Field11" xsi:type="SQLINT" LENGTH="20"/>
</ROW>
</BCPFORMAT>
Here is the table's schema:
CREATE TABLE [dbo].[LogDB](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Field1] [int] NULL,
[Field2] [bigint] NULL,
[Field3] [bigint] NULL,
[Field4] [bigint] NULL,
[Field5] [bigint] NULL,
[Field6] [bigint] NULL,
[Field7] [bigint] NULL,
[Field8] [bigint] NULL,
[Field9] [bigint] NULL,
[Field10] [bigint] NULL,
[Field11] [bigint] NULL,
CONSTRAINT [PK_LogDB] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Here is my script:
BULK INSERT Ticket.dbo.LogDB
FROM 'C:\temp\test.csv'
WITH
(
FORMATFILE = 'C:\temp\format.xml'
);
It prompts me this error:
Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to bigint.
How do I fix it? Thank you so much for your help.
BTW: the original raw data file .rpt contains column names in its first line and separated line as the second line. Is there an option in Bulk input that can automatically generate the table based on the raw data file's name as well as the table's column names?
I suspect the termination is \r rather than just \r. But also verify all the data lengths, just to be sure:
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r"
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 17, 2014 at 2:10 pm
Thanks ScottPletcher, I removed and also removed the last line which shows some info like "totally xxx rows"
The script returns me error:
Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to bigint.
Where is the varchar from?
The raw data does have lots extra big numbers and I just cannot use varchar for those fields.
July 17, 2014 at 2:32 pm
I got some hints when I tried another file, it seems the first couple characters in the raw data file has some non-numeric character so the script throws error like:
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value '326567 ' to data type int.
Those three characters are EF BB EF, what are they for? the file is in ANSI/ASCII format.
I also tried to save the raw file in a different format UTF-8, same error.
If I save the file as Unicode, EF BB EF were removed, but then I got error:
Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to bigint.
As I can inspect each character in hex code, I suspect that the hex 20 (which means blank space) is causing the issue
July 18, 2014 at 11:05 am
The format file is:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="17"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="12"/>
<FIELD ID="6" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="7" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="8" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="9" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="10" xsi:type="CharFixed" LENGTH="21"/>
<FIELD ID="11" xsi:type="CharFixed" LENGTH="20"/>
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r" LENGTH="2">
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Field1" xsi:type="SQLCHAR" LENGTH="12"/>
<COLUMN SOURCE="2" NAME="Field2" xsi:type="SQLCHAR" LENGTH="12"/>
<COLUMN SOURCE="3" NAME="Field2" xsi:type="SQLCHAR" LENGTH="17"/>
<COLUMN SOURCE="4" NAME="Field2" xsi:type="SQLCHAR" LENGTH="12"/>
<COLUMN SOURCE="5" NAME="Field2" xsi:type="SQLCHAR" LENGTH="12"/>
<COLUMN SOURCE="6" NAME="Field2" xsi:type="SQLCHAR" LENGTH="21"/>
<COLUMN SOURCE="7" NAME="Field2" xsi:type="SQLCHAR" LENGTH="21"/>
<COLUMN SOURCE="8" NAME="Field2" xsi:type="SQLCHAR" LENGTH="21"/>
<COLUMN SOURCE="9" NAME="Field2" xsi:type="SQLCHAR" LENGTH="21"/>
<COLUMN SOURCE="10" NAME="Field2" xsi:type="SQLCHAR" LENGTH="21"/>
<COLUMN SOURCE="11" NAME="Field2" xsi:type="SQLCHAR" LENGTH="20"/>
</ROW>
</BCPFORMAT>
Why SQL throws error:
Msg 9436, Level 16, State 48, Line 1
XML parsing: line 16, character 9, end tag does not match start tag
It points to </RECORD> as the error location.
Thanks.
July 18, 2014 at 11:18 am
Now I think the issue happens in the last field which presumably is 20 characters but in the case the value is single digit it uses only 1 character, however, there are cases that it has a large number as its value, like 1631571 instead of 0
That will automatically push the 0D 0A.
That also means the csv is not completely fix-width for all columns, there is an exception to be handled, either in bulk insert format or in SSIS design.
The SSIS design works fine until it hits the special line and screws up after that line.
Any idea for this?
I am attaching the csv here for easier to understand. I changed .csv to .txt so it can be uploaded.
Thanks to all replies.
July 18, 2014 at 11:24 am
I put a "\" (backslash) immediately followed by "n" in my previous comments, after the \r, but the site must remove it as some type of special editing character.
Try setting the end terminator to:
TERMINATOR="\r\~~n"
but removing the ~~ first, so the \ and the n are directly next to each other.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 18, 2014 at 12:23 pm
ScottPletcher (7/18/2014)
I put a "\" (backslash) immediately followed by "n" in my previous comments, after the \r, but the site must remove it as some type of special editing character.Try setting the end terminator to:
TERMINATOR="\r\~~n"
but removing the ~~ first, so the \ and the n are directly next to each other.
I tried your reply ( removing the ~~):
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r\~~n" LENGTH="2">
It still says
Msg 9436, Level 16, State 48, Line 1
XML parsing: line 16, character 9, end tag does not match start tag
July 18, 2014 at 12:59 pm
halifaxdal (7/18/2014)
ScottPletcher (7/18/2014)
I put a "\" (backslash) immediately followed by "n" in my previous comments, after the \r, but the site must remove it as some type of special editing character.Try setting the end terminator to:
TERMINATOR="\r\~~n"
but removing the ~~ first, so the \ and the n are directly next to each other.
I tried your reply ( removing the ~~):
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r\~~n" LENGTH="2">
It still says
Msg 9436, Level 16, State 48, Line 1
XML parsing: line 16, character 9, end tag does not match start tag
REMOVE THE ~~ FROM THE STRING.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 15 posts - 1 through 15 (of 58 total)
You must be logged in to reply to this topic. Login to reply