March 5, 2013 at 8:52 am
Hi,
I have two tables (one is the main table and the other is a staging table).
Ex: tableA (contains 1.2 billion rows, a trigger that sets an arrival timestamp on insert and update, and a non-clustered index).
tableA_staging (92 million rows, no trigger and no index)
I am currently trying the transfer; it has run for around 10 hours and the data still has not been transferred.
Can you please suggest the best way to transfer data from the staging table to the main table?
Thanks in advance.
March 5, 2013 at 10:47 am
monilps (3/5/2013)
Hi, I have two tables (one is the main table and the other is a staging table).
Ex: tableA (contains 1.2 billion rows, a trigger that sets an arrival timestamp on insert and update, and a non-clustered index).
tableA_staging (92 million rows, no trigger and no index)
I am currently trying the transfer; it has run for around 10 hours and the data still has not been transferred.
Can you please suggest the best way to transfer data from the staging table to the main table?
Thanks in advance.
In which direction exactly are you transferring the data? Staging to main?
Please post complete DDL for your tables and trigger.
March 5, 2013 at 11:02 am
/****** Object: Table [dbo].[pp] ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[pp](
[e_id] [bigint] NULL,
[p_id] [bigint] NULL,
[pr_id] [int] NULL,
[s_name] [varchar](256) NULL,
[pro_id] [varchar](512) NULL,
[timestamp] [varchar](30) NULL,
[extra] [varchar](max) NULL,
[checked] [bit] NULL,
[timestamp_ts] [datetime] NULL,
[arrival_timestamp] [datetimeoffset](7) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
CREATE TABLE [dbo].[pp_staging](
[e_id] [bigint] NULL,
[p_id] [bigint] NULL,
[pr_id] [int] NULL,
[s_name] [varchar](256) NULL,
[pro_id] [varchar](512) NULL,
[timestamp] [varchar](30) NULL,
[extra] [varchar](max) NULL,
[checked] [bit] NULL,
[timestamp_ts] [datetime] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER TRIGGER [dbo].[patient_property_arrival] ON [dbo].[patient_property]
AFTER INSERT, UPDATE
AS
UPDATE dbo.pp
SET arrival_timestamp = SYSDATETIMEOFFSET()
FROM inserted i
WHERE dbo.pp.p_id = i.p_id
AND dbo.pp.s_name = i.s_name
AND dbo.pp.pro_id = i.pro_id
AND dbo.pp.timestamp_ts = i.timestamp_ts
AND dbo.pp.checked = i.checked
Let me know if you need any additional info. Thanks.
March 6, 2013 at 12:28 am
Please supply the indexes on pp (patient_property) and pp_staging, as well as the INSERT statement used to copy the data.
One thing I can say is that if you were to alter your trigger to skip the update when arrival_timestamp is supplied, and supply "arrival_timestamp = SYSDATETIMEOFFSET()" in your INSERT statement, you would save yourself a ton of I/O.
You would need to do some impact analysis, however, to ensure a change like this would not compromise your data should other inserts or updates supply invalid values for that column, thereby circumventing the usefulness of the trigger. If that were a concern, there are other things you could do with CONTEXT_INFO to skip the work in the trigger for only your batch process.
ALTER TRIGGER [dbo].[patient_property_arrival] ON [dbo].[patient_property]
AFTER INSERT, UPDATE
AS
BEGIN
IF NOT UPDATE(arrival_timestamp)
BEGIN
UPDATE dbo.pp
SET arrival_timestamp = SYSDATETIMEOFFSET()
FROM inserted i
WHERE dbo.pp.p_id = i.p_id
AND dbo.pp.s_name = i.s_name
AND dbo.pp.pro_id = i.pro_id
AND dbo.pp.timestamp_ts = i.timestamp_ts
AND dbo.pp.checked = i.checked;
END
END
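If you go that route, the batch insert itself supplies the column. A minimal sketch (column list taken from the posted DDL; this assumes the load stays as a single INSERT...SELECT):

```sql
-- The guarded trigger skips rows whose INSERT already supplies
-- arrival_timestamp, so setting it here avoids a second UPDATE
-- against the 1.2-billion-row table.
INSERT INTO dbo.pp ([e_id], [p_id], [pr_id], [s_name], [pro_id],
       [timestamp], [extra], [checked], [timestamp_ts], [arrival_timestamp])
SELECT [e_id], [p_id], [pr_id], [s_name], [pro_id],
       [timestamp], [extra], [checked], [timestamp_ts], SYSDATETIMEOFFSET()
FROM dbo.pp_staging
```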
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
March 6, 2013 at 8:56 am
monilps (3/5/2013)
CREATE TABLE [dbo].[pp]...CREATE TABLE [dbo].[pp_staging]...
ALTER TRIGGER [dbo].[patient_property_arrival] ON [dbo].[patient_property]...
Let me know if you need any additional info. Thanks.
So we have 3 tables involved here; what are the source and destination tables of the data being transferred? Are the tables in the same database? Are you transferring the data in one big gulp, or is it broken out into smaller batches of, say, 1000 to 5000 records?
March 6, 2013 at 9:18 am
So we have 3 tables involved here; what are the source and destination tables of the data being transferred? Are the tables in the same database?
-- Same database
Are you transferring the data in one big gulp or is it broken out into smaller batches of say 1000 to 5000 records?
-- One big gulp
** Correction ** ALTER TRIGGER [dbo].[patient_property_arrival] ON [dbo].[pp]...
CREATE NONCLUSTERED INDEX [NonClusteredIndex_pp] ON [dbo].[pp]
(
[p_id] ASC,
[s_name] ASC,
[pro_id] ASC,
[checked] ASC,
[timestamp_ts] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
March 6, 2013 at 11:08 am
monilps (3/6/2013)
2 tables involved here...Same database...
One big gulp...
OK, trying to move a large number of records in one big batch is probably why it looks like no records have been put into your destination table. They won't be committed until the statement completes, so the new records are only visible to the session doing the insert. This is a bit of a wild guess, but if you need to remove the records from the staging table and put them into the permanent table, something like this would do (I use a script like this for one of my systems):
DECLARE @moved TABLE (
    [e_id] bigint, [p_id] bigint, [pr_id] int, [s_name] varchar(256), [pro_id] varchar(512),
    [timestamp] varchar(30), [extra] varchar(max), [checked] bit, [timestamp_ts] datetime)
WHILE EXISTS(SELECT TOP 1 NULL FROM [dbo].[pp_staging])
BEGIN
-- dbo.pp has an enabled trigger, so it cannot be the direct target of
-- OUTPUT ... INTO; route the deleted rows through a table variable instead
DELETE TOP (5000) pps
OUTPUT DELETED.[e_id], DELETED.[p_id], DELETED.[pr_id], DELETED.[s_name], DELETED.[pro_id],
DELETED.[timestamp], DELETED.[extra], DELETED.[checked], DELETED.[timestamp_ts]
INTO @moved
FROM [dbo].[pp_staging] pps
INSERT INTO [dbo].[pp] ([e_id], [p_id], [pr_id], [s_name], [pro_id],
[timestamp], [extra], [checked], [timestamp_ts], [arrival_timestamp])
SELECT [e_id], [p_id], [pr_id], [s_name], [pro_id],
[timestamp], [extra], [checked], [timestamp_ts], SYSDATETIMEOFFSET()
FROM @moved
DELETE FROM @moved
END
If you need to keep the records in the staging table, it gets a bit trickier, because then you need to somehow keep track of which rows in the staging table you've already done and haven't done.
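One hypothetical way to do that (assuming you can add a BIGINT IDENTITY column to the staging table, called row_id here — it is not in your posted DDL) is to carry a watermark between batches:

```sql
-- Hypothetical sketch: [row_id] is an assumed BIGINT IDENTITY column on
-- pp_staging, used to track which rows have already been copied.
DECLARE @last bigint = 0, @max bigint
WHILE 1 = 1
BEGIN
    -- find the upper bound of the next batch of up to 5000 rows
    SELECT @max = MAX(b.row_id)
    FROM (SELECT TOP (5000) row_id
          FROM dbo.pp_staging
          WHERE row_id > @last
          ORDER BY row_id) AS b
    IF @max IS NULL BREAK   -- nothing left to copy
    INSERT INTO dbo.pp ([e_id], [p_id], [pr_id], [s_name], [pro_id],
           [timestamp], [extra], [checked], [timestamp_ts], [arrival_timestamp])
    SELECT [e_id], [p_id], [pr_id], [s_name], [pro_id],
           [timestamp], [extra], [checked], [timestamp_ts], SYSDATETIMEOFFSET()
    FROM dbo.pp_staging
    WHERE row_id > @last AND row_id <= @max
    SET @last = @max   -- watermark; persist it if the job must survive restarts
END
```

An index on row_id would keep each batch seek cheap.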
March 6, 2013 at 12:24 pm
When you run your process, you can track progress by checking the row count of the destination table in sys.partitions under the READ UNCOMMITTED isolation level.
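For example, from a second session (the rows column in sys.partitions is approximate while the load is running, which is fine for progress tracking):

```sql
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

SELECT SUM(p.rows) AS approx_row_count
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.pp')
  AND p.index_id IN (0, 1)   -- 0 = heap, 1 = clustered index
```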
Did you understand what I meant about offloading the work the trigger does onto the INSERT...SELECT?
If it is locking the table for too long you may need to break the insert into smaller batches.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato