October 14, 2020 at 12:32 pm
Hi,
We have two servers. Server A is the source server, which has a table with five columns (2 int, 2 varchar(50), 1 datetime) and 50 million records in it. On a daily basis, an SSIS package on Server B reads the table and simply inserts the rows into a destination table on Server B.
Due to business logic, I am unable to explain the reason for not using a lookup or any other options right now. Currently, the table is truncated daily before the load. "SELECT Col1, Col2, Col3, Col4, Col5 FROM my70million" is the SQL statement in the SSIS source SQL task.
The query usually takes about 25 minutes to complete, but sometimes it takes 35-45 minutes, which might be because other processes run in parallel. How can I reduce the time consumption and keep the runtime consistent across days, with only a minimal increase if the record count grows?
Any suggestions to reduce the time consumption would be appreciated. There is no WHERE clause or any joins on the table.
October 14, 2020 at 12:56 pm
Hardware.
Since you're not filtering the data in any way, the only solution is great big giant hardware. Spend lots of money and this will go faster.
Now, you could look at breaking the query down into chunks, processing, say, 100,000 rows at a time. You may find this is faster. It will use less transaction log space, allow a little less locking and blocking, and use fewer resources overall for each given transaction. All of that may add up to a performance improvement.
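A keyset-based chunking loop might look something like the sketch below. The table name comes from the thread; treating Col1 as an indexed int key and the DestDB/SourceDB database names are my assumptions. It's written as if both tables are reachable from one connection; in SSIS you would parameterize the source query per chunk instead.

```sql
-- Sketch: copy in 100,000-row chunks keyed on an int column (assumed Col1).
-- Each chunk commits on its own, keeping the transaction log small.
DECLARE @LastKey int = 0,
        @Rows    int = 1;

WHILE @Rows > 0
BEGIN
    INSERT INTO DestDB.dbo.my70million (Col1, Col2, Col3, Col4, Col5)
    SELECT TOP (100000) Col1, Col2, Col3, Col4, Col5
    FROM SourceDB.dbo.my70million
    WHERE Col1 > @LastKey
    ORDER BY Col1;

    SET @Rows = @@ROWCOUNT;

    -- Advance the key to the highest value copied so far
    SELECT @LastKey = MAX(Col1)
    FROM DestDB.dbo.my70million;
END;
```

The ORDER BY plus the `Col1 > @LastKey` predicate is what makes each chunk a cheap index seek rather than a repeated scan, so an index on Col1 at the source is essential for this to pay off.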
Otherwise, spend money.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 14, 2020 at 1:34 pm
As it's a full table scan, there's not much you can do.
You could read this and then experiment with the DefaultBufferMaxRows and DefaultBufferSize settings.
Definitely use Fast Load and destination table lock.
If there are constraints on the target table, or a PK or indexes, consider removing them before doing the load.
Fine-tune the window and perform the run when systems are not busy with other tasks.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
October 14, 2020 at 6:18 pm
1. If the varchar columns repeat values frequently, encode the varchar(50) columns as ints using a separate encoding/lookup table. This would be, by far, the biggest performance improvement.
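A minimal sketch of that encoding, assuming one of the varchar columns is named Col2 (the lookup table name and identifiers here are illustrative, not from the thread):

```sql
-- Sketch: replace a repetitive varchar(50) column with an int code.
CREATE TABLE dbo.Col2_Lookup
(
    Col2Id    int IDENTITY(1,1) PRIMARY KEY,
    Col2Value varchar(50) NOT NULL UNIQUE
);

-- Load distinct values once; re-run to pick up new values as they appear.
INSERT INTO dbo.Col2_Lookup (Col2Value)
SELECT DISTINCT src.Col2
FROM dbo.my70million src
WHERE NOT EXISTS (SELECT 1 FROM dbo.Col2_Lookup l
                  WHERE l.Col2Value = src.Col2);
```

The main table then carries a 4-byte Col2Id instead of up to 50 bytes per row, which shrinks both the scan at the source and the data moved across the network.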
2. Page compress the table on the originating server, if it saves significant space (it wouldn't save much after encoding, but until then it could). You can check this using EXEC sys.sp_estimate_data_compression_savings. Check whether compressing the table on the destination server helps or not.
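The estimate can be run like this (table name taken from the thread's query):

```sql
-- Estimate space savings from page compression before committing to it.
EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'my70million',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- If the savings look worthwhile:
-- ALTER TABLE dbo.my70million REBUILD WITH (DATA_COMPRESSION = PAGE);
```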
3. Use a trigger (or triggers) to mark adds/changes/deletes on the original table so you don't have to copy the whole table every time. The triggers need to be very efficient. Don't try to do this without triggers, since it's too easy to miss modifications. This assumes you don't have processes that load data in a way that avoids firing triggers.
Edit: Any bulk inserts that bypass triggers will have to make sure they load the columns that mark modifications.
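A sketch of such a trigger, assuming Col1 is the table's key (the change-log table and trigger names are illustrative):

```sql
-- Change log: record only the keys that changed, not the full rows.
CREATE TABLE dbo.my70million_Changes
(
    Col1       int      NOT NULL,
    ChangeType char(1)  NOT NULL,   -- 'I' insert, 'U' update, 'D' delete
    ChangedAt  datetime NOT NULL DEFAULT GETDATE()
);
GO

CREATE TRIGGER dbo.trg_my70million_Track
ON dbo.my70million
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Inserts and updates: the key appears in 'inserted'
    INSERT INTO dbo.my70million_Changes (Col1, ChangeType)
    SELECT i.Col1,
           CASE WHEN d.Col1 IS NULL THEN 'I' ELSE 'U' END
    FROM inserted i
    LEFT JOIN deleted d ON d.Col1 = i.Col1;

    -- Deletes: the key appears only in 'deleted'
    INSERT INTO dbo.my70million_Changes (Col1, ChangeType)
    SELECT d.Col1, 'D'
    FROM deleted d
    WHERE NOT EXISTS (SELECT 1 FROM inserted i WHERE i.Col1 = d.Col1);
END;
```

The nightly package would then read only the keys logged since the last run, instead of all 50 million rows.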
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 14, 2020 at 9:55 pm
In the OLE DB destination, change the commit and batch sizes from the default values. The defaults essentially send all 70 million rows as a single transaction, which then has to be committed in one go after all rows have been loaded.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
October 15, 2020 at 2:48 am
Thanks Phil. As suggested by you, I can see the buffer setting has already been increased to 50% more than the default, the fast load option is already enabled, and a table lock on the destination table is already in place. Are there any other options to look into, other than the link you shared? As far as I can see, most of the measures are already in place.
October 15, 2020 at 1:55 pm
The advice given by Jeffrey Williams is also worth a try.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin