January 7, 2019 at 8:26 am
Jeff Moden - Saturday, January 5, 2019 9:05 AMRGP - Saturday, January 5, 2019 3:37 AMJeff Moden - Friday, January 4, 2019 11:55 PMp.p.s. BTW, what's the data type of that JOB_INFO column?Thanks Jeff, & I really appreciate your time on this, I'll test this out soon ...[JOB_INFO] [nvarchar](400) NULL, for your reference DDL of table is:
.[JOB](
[JOB_ID] [bigint] IDENTITY(1,1) NOT NULL,
[JOB_DEFINITION_ID] [bigint] NOT NULL,
[PROCESS_ID] [uniqueidentifier] NULL,
[PARENT_JOB_ID] [bigint] NULL,
[PREVIOUS_BATCH_ID] [uniqueidentifier] NULL,
[FROM_ID] [nvarchar](400) NOT NULL,
[FROM_SOURCE_ID] [bigint] NULL,
[FROM_SESSION_ID] [bigint] NULL,
[TO_ID] [nvarchar](400) NULL,
[TO_SOURCE_ID] [bigint] NULL,
[TO_SESSION_ID] [bigint] NULL,
[JOB_STATUS] [bigint] NOT NULL,
[JOB_INFO] [nvarchar](400) NULL,
[LOCKED_BY] [uniqueidentifier] NULL,
[UDATE] [datetime] NOT NULL,
[CDATE] [datetime] NOT NULL
) ON [PRIMARY]GO
ALTER TABLE [dbo].[JOB] ADD DEFAULT (getdate()) FOR [UDATE]
GOALTER TABLE [dbo].[JOB] ADD DEFAULT (getdate()) FOR [CDATE]
GO"]
Ok... thanks. Even more concerning than the Job_Info column is the fact that the FROM_ID and TO_ID columns are defined as NVARCHAR(400) and not just for copy performance. It's unusual for "ID" columns to have such a wide variable width and if they really are used as "IDs" in joins and the like, I can see there being some real performance issues in many other areas.
Shifting gears back to the original problem, obviously the data will take longer to transfer than my 24 second example simply because the byte size of most columns in the real table is at least double that which I tested with. Because of the NVARCHAR(400) columns, I wouldn't be surprised if the copy took up to 4 minutes or so. Of course, that's a total swag on my part but just want to set expectations.
For the recommended query, I got a message like:
Msg 544, Level 16, State 1, Line 15
Cannot insert explicit value for identity column in table 'JOB' when IDENTITY_INSERT is set to OFF.
Do you think I need to enable the Indentity_Insert ON option ?
Query :
--=====================================================================================================================
-- We're ready for the faster way to do inserts from the source table.
-- This to 00:00:24 (hh:mi:ss) on my laptop. Yeah... you're reading that correctly... 24 seconds.
--=====================================================================================================================
--===== Ensure that we're in the Bulk-Logged Recovery Model.
-- You'll need to change the name of the database here.
ALTER DATABASE [Test] SET RECOVERY BULK_LOGGED WITH NO_WAIT
;
--===== You said the target table would be empty. Let's make sure that's true although it's not necessary. You can
-- still get "Minimal Logging" even with rows in the table and a Clustered Index (only) since 2008.
TRUNCATE TABLE Test.dbo.JOB
;
--===== Now, it's simple and fast and "Minimally Logged" when we transfer data to the target table.
INSERT INTO Test.dbo.JOB WITH (TABLOCK) --Required for "Minimal Logging"
(
JOB_ID, JOB_DEFINITION_ID, PROCESS_ID, PARENT_JOB_ID, PREVIOUS_BATCH_ID, FROM_ID, FROM_SOURCE_ID
, FROM_SESSION_ID, TO_ID, TO_SOURCE_ID, TO_SESSION_ID, JOB_STATUS, JOB_INFO, LOCKED_BY, UDATE, CDATE
)
SELECT JOB_ID
,JOB_DEFINITION_ID
,PROCESS_ID
,PARENT_JOB_ID
,PREVIOUS_BATCH_ID
,FROM_ID
,FROM_SOURCE_ID
,FROM_SESSION_ID
,TO_ID
,TO_SOURCE_ID
,TO_SESSION_ID
,JOB_STATUS
,JOB_INFO
,LOCKED_BY
,UDATE
,CDATE
FROM Admin.dbo.Job
ORDER BY JOB_ID --Because the Clustered Indexes are identical, this isn't necessary. It's just a safeguard for "Minimal Logging".
OPTION (RECOMPILE) --Helps guarantee "Minimal Logging"
;
--===== If your database started out in the FULL Recovery Model, change it back to that.
-- You'll need to change the name of the database here.
ALTER DATABASE [Test] SET RECOVERY FULL WITH NO_WAIT
;
January 7, 2019 at 8:58 am
RGP - Monday, January 7, 2019 8:26 AMJeff Moden - Saturday, January 5, 2019 9:05 AMRGP - Saturday, January 5, 2019 3:37 AMJeff Moden - Friday, January 4, 2019 11:55 PMp.p.s. BTW, what's the data type of that JOB_INFO column?Thanks Jeff, & I really appreciate your time on this, I'll test this out soon ...[JOB_INFO] [nvarchar](400) NULL, for your reference DDL of table is:
.[JOB](
[JOB_ID] [bigint] IDENTITY(1,1) NOT NULL,
[JOB_DEFINITION_ID] [bigint] NOT NULL,
[PROCESS_ID] [uniqueidentifier] NULL,
[PARENT_JOB_ID] [bigint] NULL,
[PREVIOUS_BATCH_ID] [uniqueidentifier] NULL,
[FROM_ID] [nvarchar](400) NOT NULL,
[FROM_SOURCE_ID] [bigint] NULL,
[FROM_SESSION_ID] [bigint] NULL,
[TO_ID] [nvarchar](400) NULL,
[TO_SOURCE_ID] [bigint] NULL,
[TO_SESSION_ID] [bigint] NULL,
[JOB_STATUS] [bigint] NOT NULL,
[JOB_INFO] [nvarchar](400) NULL,
[LOCKED_BY] [uniqueidentifier] NULL,
[UDATE] [datetime] NOT NULL,
[CDATE] [datetime] NOT NULL
) ON [PRIMARY]GO
ALTER TABLE [dbo].[JOB] ADD DEFAULT (getdate()) FOR [UDATE]
GOALTER TABLE [dbo].[JOB] ADD DEFAULT (getdate()) FOR [CDATE]
GO"]
Ok... thanks. Even more concerning than the Job_Info column is the fact that the FROM_ID and TO_ID columns are defined as NVARCHAR(400) and not just for copy performance. It's unusual for "ID" columns to have such a wide variable width and if they really are used as "IDs" in joins and the like, I can see there being some real performance issues in many other areas.
Shifting gears back to the original problem, obviously the data will take longer to transfer than my 24 second example simply because the byte size of most columns in the real table is at least double that which I tested with. Because of the NVARCHAR(400) columns, I wouldn't be surprised if the copy took up to 4 minutes or so. Of course, that's a total swag on my part but just want to set expectations.
For the recommended query, I got a message like:
Msg 544, Level 16, State 1, Line 15
Cannot insert explicit value for identity column in table 'JOB' when IDENTITY_INSERT is set to OFF.Do you think I need to enable the Indentity_Insert ON option ?
Query :
--=====================================================================================================================
-- We're ready for the faster way to do inserts from the source table.
-- This to 00:00:24 (hh:mi:ss) on my laptop. Yeah... you're reading that correctly... 24 seconds.
--=====================================================================================================================
--===== Ensure that we're in the Bulk-Logged Recovery Model.
-- You'll need to change the name of the database here.
ALTER DATABASE [Test] SET RECOVERY BULK_LOGGED WITH NO_WAIT
;
--===== You said the target table would be empty. Let's make sure that's true although it's not necessary. You can
-- still get "Minimal Logging" even with rows in the table and a Clustered Index (only) since 2008.
TRUNCATE TABLE Test.dbo.JOB
;
--===== Now, it's simple and fast and "Minimally Logged" when we transfer data to the target table.
INSERT INTO Test.dbo.JOB WITH (TABLOCK) --Required for "Minimal Logging"
(
JOB_ID, JOB_DEFINITION_ID, PROCESS_ID, PARENT_JOB_ID, PREVIOUS_BATCH_ID, FROM_ID, FROM_SOURCE_ID
, FROM_SESSION_ID, TO_ID, TO_SOURCE_ID, TO_SESSION_ID, JOB_STATUS, JOB_INFO, LOCKED_BY, UDATE, CDATE
)
SELECT JOB_ID
,JOB_DEFINITION_ID
,PROCESS_ID
,PARENT_JOB_ID
,PREVIOUS_BATCH_ID
,FROM_ID
,FROM_SOURCE_ID
,FROM_SESSION_ID
,TO_ID
,TO_SOURCE_ID
,TO_SESSION_ID
,JOB_STATUS
,JOB_INFO
,LOCKED_BY
,UDATE
,CDATE
FROM Admin.dbo.Job
ORDER BY JOB_ID --Because the Clustered Indexes are identical, this isn't necessary. It's just a safeguard for "Minimal Logging".
OPTION (RECOMPILE) --Helps guarantee "Minimal Logging"
;
--===== If your database started out in the FULL Recovery Model, change it back to that.
-- You'll need to change the name of the database here.
ALTER DATABASE [Test] SET RECOVERY FULL WITH NO_WAIT
;
Perfectly completed the insert in 5 mins!! ... I enabled set identity_insert ON...
Thank you Jeff for your continuous support in helping resolve this and making my Insert faster π
January 7, 2019 at 9:47 am
RGP - Monday, January 7, 2019 8:58 AMPerfectly completed the insert in 5 mins!! ... I enabled set identity_insert ON...
Thank you Jeff for your continuous support in helping resolve this and making my Insert faster π
How long was it taking before you rewrote it?
January 7, 2019 at 10:22 am
Jonathan AC Roberts - Monday, January 7, 2019 9:47 AMRGP - Monday, January 7, 2019 8:58 AMPerfectly completed the insert in 5 mins!! ... I enabled set identity_insert ON...
Thank you Jeff for your continuous support in helping resolve this and making my Insert faster πHow long was it taking before you rewrote it?
It was taking around 15 mins
January 7, 2019 at 10:39 am
RGP - Monday, January 7, 2019 10:21 AMJonathan AC Roberts - Monday, January 7, 2019 9:47 AMRGP - Monday, January 7, 2019 8:58 AMPerfectly completed the insert in 5 mins!! ... I enabled set identity_insert ON...
Thank you Jeff for your continuous support in helping resolve this and making my Insert faster πHow long was it taking before you rewrote it?
It was taking around 15 mins
That's great, a 3 times faster improvement.
January 7, 2019 at 11:44 am
RGP - Monday, January 7, 2019 10:21 AMJonathan AC Roberts - Monday, January 7, 2019 9:47 AMRGP - Monday, January 7, 2019 8:58 AMPerfectly completed the insert in 5 mins!! ... I enabled set identity_insert ON...
Thank you Jeff for your continuous support in helping resolve this and making my Insert faster πHow long was it taking before you rewrote it?
It was taking around 15 mins
If there were no IDENTITY column, it would be faster still because there would be no whole table sort in TempDB.
You could still have the data from the identity column but, since it's a copy of a table, you don't need the identity property on the target table.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 7, 2019 at 12:14 pm
p.s. You can also keep the IDENTITY column without a sort by loading up an identical staging table that has no IDENTITY column and then SWITCH it into the table that does have the IDENTITY column, provide that the target table is empty when you do the SWITCH or the target "table" is a partition of a larger table. That will avoid the penalty of the sort in TempDB.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 7, 2019 at 12:18 pm
Thanks, thatβs a good suggestion to try out!
Viewing 8 posts - 16 through 22 (of 22 total)
You must be logged in to reply to this topic. Login to reply