I'm pulling data from a table on a Linked Server and would like to know if there is a more efficient way to do it given this table design. I need to do a full reload, so I can't use "t_stamp >= DATEDIFF_BIG(MS, '19691231 20:00:00', GETDATE()) - (1000 * 60 * 30)" in the where clause.
I know the min and max of t_stamp for the table, but that is such a large pull that I was wondering if I could break up the select
and grab, say, 50,000 records at a time between the min and max until there are no more records to process. The select does an Insert into a staging table for other processing.
Thanks, and let me know if more info is needed.
Select
distinct ql.Line,ql.Tag_Description,sd.floatvalue as 'CaptureValue',DATEADD(s,t_stamp/1000,'1969-12-31 20:00:00') as 'DateRecorded'
from
SM_Query_Lookups ql,
rem.db1.dbo.sqlt_data_1_2021_08 sd
where
(t_stamp >= DATEDIFF_BIG(MS, '19691231 20:00:00', GETDATE()) - (1000 * 60 * 30))
and ql.Tag_ID = sd.tagid
CREATE TABLE [dbo].[sqlt_data_1_2021_08](
[tagid] [int] NOT NULL,
[intvalue] [bigint] NULL,
[floatvalue] [float] NULL,
[stringvalue] [nvarchar](255) NULL,
[datevalue] [datetime] NULL,
[dataintegrity] [int] NULL,
[t_stamp] [bigint] NOT NULL,
PRIMARY KEY CLUSTERED
(
[tagid] ASC,
[t_stamp] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
December 7, 2022 at 10:12 pm
Take a look at https://techcommunity.microsoft.com/t5/sql-server-blog/sql-server-2016-minimal-logging-and-impact-of-the-batchsize-in/ba-p/385537. You may be able to take advantage of minimal logging to bulk load the data.
BCP or BULK INSERT is another option: export the data from the source database to a file with bcp, then load it. If the table is very large, you can use a batch size to control transaction log growth.
How many rows are in the source table?
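For illustration only, a sketch of that pattern -- the file path, staging table name, and server name here are all placeholders:

-- Export from the source database with bcp (run from a command prompt; placeholders throughout):
--   bcp "SELECT tagid, floatvalue, t_stamp FROM db1.dbo.sqlt_data_1_2021_08" queryout C:\temp\tagdata.dat -n -T -S SourceServer

-- Load it in 50,000-row batches so each batch commits separately and the log can clear between them
BULK INSERT dbo.Tag_Data_Capture_staging_raw
FROM 'C:\temp\tagdata.dat'
WITH (DATAFILETYPE = 'native', BATCHSIZE = 50000, TABLOCK);

Under the right recovery model and with TABLOCK, those batches can be minimally logged, which is the point of the article above.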
December 7, 2022 at 11:11 pm
The source table has 2 billion rows, but not all records are chosen, since it does a lookup on ql.Tag_ID = sd.tagid.
Any example of a script to do it in batches of 50K, using the min/max of t_stamp as the driver?
Thanks.
December 8, 2022 at 1:31 am
I don't know which table is the linked server table, but joining to it means that all of its rows must be examined. You need more selective criteria.
You apparently have an "accidental" many-to-many join going on, which is why you need the DISTINCT.
Consider using OPENQUERY so the query runs on the remote server, and give it some decent criteria to pull the data with. It'll run a lot faster.
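As an example of the OPENQUERY idea, something along these lines (linked server and table names lifted from your posts above; the t_stamp cutoff is just an example value, and it has to be a literal because OPENQUERY only accepts a constant string):

SELECT sd.tagid,
       sd.floatvalue,
       DATEADD(s, sd.t_stamp/1000, '1969-12-31 20:00:00') AS DateRecorded
FROM OPENQUERY(rem, '
         SELECT tagid, floatvalue, t_stamp
         FROM   db1.dbo.sqlt_data_1_2021_08
         WHERE  t_stamp >= 1638316800000
           AND  floatvalue IS NOT NULL') sd;

The point is that the WHERE clause executes on the remote box, so only the rows you actually want come across the pipe.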
I also recommend that you avoid the old-fashioned comma-join style in whatever fix you come up with.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 8, 2022 at 1:55 am
This is the table name from the Linked Server -- rem.db1.dbo.sqlt_data_1_2021_08.
The SM_Query_Lookups table is on the local server. The remote server contains all of the values I'm using for my insert into the staging table on the local server.
December 8, 2022 at 2:17 am
I don't know the data well enough to say with conviction, but there has to be some criteria that you could use in an OPENQUERY, other than a join, to bring the linked server data over in a hurry and then further filter it with a local join.
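Just as a rough sketch of the pattern I mean (object names are guesses from your posts, so adjust to suit):

-- 1. Pull only the filtered rows across, into a local temp table
SELECT *
INTO   #remote_rows
FROM   OPENQUERY(rem, '
           SELECT tagid, floatvalue, t_stamp
           FROM   db1.dbo.sqlt_data_1_2021_08
           WHERE  floatvalue IS NOT NULL
             AND  t_stamp >= 1638316800000');   -- example cutoff in ms

-- 2. Do the join locally, where it is cheap
SELECT DISTINCT
       ql.Line,
       ql.Tag_Description,
       rr.floatvalue AS CaptureValue,
       DATEADD(s, rr.t_stamp/1000, '1969-12-31 20:00:00') AS DateRecorded
FROM   #remote_rows rr
JOIN   dbo.SM_Query_Lookups ql ON ql.Tag_ID = rr.tagid;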
Of course, with the way things have been going today, I might just be low on caffeine.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 8, 2022 at 2:35 am
In my original query I was using this on the linked server, along with the local join:
t_stamp >= DATEDIFF_BIG(MS, '19691231 20:00:00', GETDATE()) - (1000 * 60 * 30)
December 8, 2022 at 1:53 pm
I'm having syntax problems doing OPENQUERY while pulling values like this:
DATEADD(s, t_stamp/1000, '1969-12-31 20:00:00') as 'DateRecorded'
December 8, 2022 at 5:14 pm
Gosh, Bruin... how do you expect anyone to help with so little information?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 8, 2022 at 7:07 pm
Sorry Jeff.
I have been working with OPENQUERY, trying to get the values I need for my insert into the staging table. The query below returns some of the values I need, but not the local table values.
Insert Into Tag_Data_Capture_staging (EquipID, TagName, CaptureValue, DateRecorded)
Values needed from the query:
ql.Line, ql.Tag_Description, sd.floatvalue as 'CaptureValue', DATEADD(s, t_stamp/1000, '1969-12-31 20:00:00') as 'DateRecorded'
Also, since I know the min and max of t_stamp for the remote table, and the remote table is large, I was wondering if I could break up the select and grab, say, 50,000 records at a time between the min and max until there are no more records to process. I'm not sure how to incorporate that into the script -- a rough sketch of what I'm picturing is below the query.
SELECT *
FROM OPENQUERY (srv1, 'SELECT tagid, floatvalue, DATEADD(s, t_stamp/1000, ''1969-12-31 20:00:00'') as ''DateRecorded''
FROM rem.dbo.sqlt_data_4_2022_12') sd
LEFT JOIN process.dbo.SM_Query_Lookups ql
ON ql.Tag_ID = sd.tagid
OPTION (MAXDOP 1);
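Roughly, the batching I have in mind is something like this sketch (untested -- the min/max numbers and window size are placeholders, and the object names are copied loosely from my queries above). The idea is to walk t_stamp in windows sized to return around 50,000 rows each:

CREATE TABLE #batch (tagid int, floatvalue float, t_stamp bigint);

DECLARE @from bigint, @to bigint, @max bigint;
DECLARE @step bigint = 1000 * 60 * 60;              -- window size in ms; tune so a window is ~50K rows
DECLARE @sql  nvarchar(max);

SELECT @from = 1627790400000, @max = 1630468800000; -- known min/max of t_stamp (placeholder values)

WHILE @from <= @max
BEGIN
    SET @to = @from + @step;

    SET @sql = N'SELECT tagid, floatvalue, t_stamp
                 FROM dbo.sqlt_data_4_2022_12
                 WHERE t_stamp >= ' + CAST(@from AS nvarchar(20))
             + N' AND t_stamp < '   + CAST(@to   AS nvarchar(20))
             + N' AND floatvalue IS NOT NULL';

    INSERT INTO #batch (tagid, floatvalue, t_stamp)
    EXEC (@sql) AT srv1;                            -- run the filter on the remote server (needs RPC Out)

    INSERT INTO Tag_Data_Capture_staging (EquipID, TagName, CaptureValue, DateRecorded)
    SELECT ql.Line, ql.Tag_Description, b.floatvalue,
           DATEADD(s, b.t_stamp/1000, '1969-12-31 20:00:00')
    FROM #batch b
    JOIN process.dbo.SM_Query_Lookups ql ON ql.Tag_ID = b.tagid;

    TRUNCATE TABLE #batch;
    SET @from = @to;
END;

Does that look like a reasonable approach, or is there a better way to drive the batches?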
December 8, 2022 at 7:42 pm
I'm sure you have heard of ...
All of the above can help you with the remaining work (e.g. building the process); it's really up to you, as you are the one getting paid to do this work.
December 8, 2022 at 9:57 pm
You're still joining to the OPENQUERY. That means the OPENQUERY has to read all the rows from the linked server. There are also no criteria on the linked table. How many rows are you retrieving from the linked table?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 8, 2022 at 11:04 pm
How many relevant unique tag_ids are there? Are they relatively static, or are new ones inserted continually?
Can't you insert those into a table on the linked server (or just hardcode them in an IN clause if there are very few) so that you can avoid the join back to the local table? Even if they change rapidly, pushing them to the linked server could probably still be relatively quick if indexed correctly.
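As a rough sketch of that idea (SM_Tag_Filter is a made-up table that would have to be created on the linked server first, keyed on tagid, and the linked server has to allow remote inserts):

-- Push the local tag_ids across once (or whenever they change)
INSERT INTO rem.db1.dbo.SM_Tag_Filter (tagid)
SELECT DISTINCT Tag_ID
FROM   process.dbo.SM_Query_Lookups;

-- Now the filtering happens remotely and only matching rows come back
SELECT *
FROM OPENQUERY(rem, '
         SELECT d.tagid, d.floatvalue, d.t_stamp
         FROM   db1.dbo.sqlt_data_1_2021_08 d
         JOIN   db1.dbo.SM_Tag_Filter f ON f.tagid = d.tagid
         WHERE  d.floatvalue IS NOT NULL');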
December 8, 2022 at 11:34 pm
There are over 500 unique tag_ids.
December 13, 2022 at 12:24 pm
Insert Into Tag_Data_Capture_staging
(TagName,CaptureValue,DateRecorded)
select *
from OPENQUERY(srv1, 'select tagid,floatvalue,DATEADD(s,t_stamp/1000,''1969-12-31 20:00:00'') from rem.ignition.dbo.sqlt_data_1_2021_12
where DATEADD(s,t_stamp/1000,''1969-12-31 20:00:00'') between ''12-09-2021'' AND ''12-10-2021'' and floatvalue is not null');
I did the insert into the local table and used criteria to narrow down the data that is retrieved. Do you see any better way to do the OPENQUERY?
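One thing I wasn't sure about: the WHERE clause above runs DATEADD on t_stamp for every row before comparing. Would it be better to convert the date boundaries to milliseconds once and compare against the raw t_stamp instead? Something like this (untested; the names are from my query above and the three-part name may need adjusting):

DECLARE @from bigint = DATEDIFF_BIG(MS, '1969-12-31 20:00:00', '20211209');
DECLARE @to   bigint = DATEDIFF_BIG(MS, '1969-12-31 20:00:00', '20211210');
DECLARE @sql  nvarchar(max) =
      N'SELECT tagid, floatvalue, DATEADD(s, t_stamp/1000, ''1969-12-31 20:00:00'')
        FROM ignition.dbo.sqlt_data_1_2021_12
        WHERE t_stamp >= ' + CAST(@from AS nvarchar(20))
    + N' AND t_stamp < '   + CAST(@to   AS nvarchar(20))
    + N' AND floatvalue IS NOT NULL';

INSERT INTO Tag_Data_Capture_staging (TagName, CaptureValue, DateRecorded)
EXEC (@sql) AT srv1;   -- dynamic string because OPENQUERY only accepts a constant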
Thanks for your replies.