Table Optimization for fast retrieval

  • Hi,

    I am using SQL Server 2008 R2.

    I have a table with 10 million records (around 40 fields). If I fetch 1 million records with a simple SELECT statement, it takes 25-30 seconds to retrieve the result. I have created proper indexes, statistics, everything.

    Is there any way I can reduce the retrieval time to 5 seconds at most? How can I optimize this?

  • rvasanth (2/26/2012)


    Hi,

    I am using SQL Server 2008 R2.

    I have a table with 10 million records (around 40 fields). If I fetch 1 million records with a simple SELECT statement, it takes 25-30 seconds to retrieve the result. I have created proper indexes, statistics, everything.

    Is there any way I can reduce the retrieval time to 5 seconds at most? How can I optimize this?

    No. It can't be optimized to run in 5 seconds simply because you're returning so much information. I also have to ask why you're returning a million rows to the screen? Who's going to read a million rows?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I am not actually showing it on the screen.

    I am just selecting the data into a temp table and doing further processing. Sometimes I export it to a text file.

  • You didn't really provide enough information for us to help you. That said - check for internal (index) and external (filesystem, SAN) fragmentation, check your query plan (you said you had indexes and statistics; I assume they're up to date and being used?), and check your bottlenecks all the way down to physical spindles. You're already pulling only the bare minimum of data required, correct?

    If you really want to have a million rows returned very quickly, then I'd suggest creating a table structure such that you're selecting a solid, contiguous set of values out of a clustered index with a high fillfactor, and if necessary, off of extremely fast storage. Perhaps see what SQL Server compression (Enterprise+) does. Put the index/tables to be read from on one filegroup on dedicated spindles, and the one to be written on a separate filegroup/database on dedicated spindles. Perhaps investigate solid state storage. Perhaps select fewer fields for your output, and lower the amount of data you're dealing with.
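
    As a rough sketch of the fillfactor and compression suggestions (the index and table names here are placeholders, not from this thread), on 2008 R2 Enterprise this might look like:

    ALTER INDEX PK_BigTable ON dbo.BigTable
    REBUILD WITH (FILLFACTOR = 95, DATA_COMPRESSION = PAGE);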

    Doing quick math: 1 million rows, at a wild guess (since you didn't tell us any lengths) of 500 bytes/row, is 500 megabytes of data. Doing this in 5 seconds is in excess of 100MB/s read, plus a separate 100MB/s write (you mentioned putting these in a temp table, I believe). If it's sequential, this is easily within the range of even modern consumer hardware, though I wouldn't try it on a 2Gb/s SAN if both spindle sets are on the same single fiber link. If it's random, then you're looking at a lot of spindles, solid state storage, reading/writing less data, making it sequential, or changing your requirements to suit your budget.

    If you want 1 million rows at 8,000 bytes/row in 5 seconds, that's 8GB in 5 seconds, so you're looking at 1.6GB/s read, plus 1.6GB/s write; this is within the realm of the fastest SSD storage available, such as modern FusionIO ioDrive2 Duos and OCZ Z-Drive R4s.

  • rvasanth (2/26/2012)


    I am not actually showing it on the screen.

    I am just selecting the data into a temp table and doing further processing. Sometimes I export it to a text file.

    Do an I/O stall and wait stats analysis while your query is running. I bet you will find that your server hardware is not up to moving that amount of data in 5 seconds. Make sure your system is optimally configured. If it still isn't fast enough, you likely have no recourse other than to buy better hardware (CPU maybe, more RAM perhaps, and almost certainly a better I/O subsystem).
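
    For the I/O stall and wait stats analysis, something along these lines against the standard DMVs (available on 2008 R2) should show where the time is going:

    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           vfs.num_of_reads, vfs.io_stall_read_ms,
           vfs.num_of_writes, vfs.io_stall_write_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
      ON vfs.database_id = mf.database_id
     AND vfs.file_id = mf.file_id
    ORDER BY vfs.io_stall_read_ms DESC;

    SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;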

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • rvasanth (2/26/2012)


    I am not actually showing it on the screen.

    I am just selecting the data into a temp table and doing further processing. Sometimes I export it to a text file.

    Ok... that's some information we didn't have before. I believe what you're going to need to do is see the second link in my signature line below so folks can help figure out what the performance problem may be.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rvasanth (2/26/2012)


    Hi,

    I am using SQL Server 2008 R2.

    I have a table with 10 million records (around 40 fields). If I fetch 1 million records with a simple SELECT statement, it takes 25-30 seconds to retrieve the result. I have created proper indexes, statistics, everything.

    Is there any way I can reduce the retrieval time to 5 seconds at most? How can I optimize this?

    It would help if the table is clustered on the same key you're using to select this subset of 1 million records. If you're only selecting a handful of columns, then consider a non-clustered "covering" index that includes all the columns contained in the SELECT.
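
    For example (the table and column names here are hypothetical), a covering index for a query that selects three columns filtered on a date might look like:

    CREATE NONCLUSTERED INDEX IX_BigTable_OrderDate_Covering
    ON dbo.BigTable (OrderDate)
    INCLUDE (CustomerID, Amount, Status);

    A query such as SELECT CustomerID, Amount, Status FROM dbo.BigTable WHERE OrderDate >= '20120101' AND OrderDate < '20120201' could then be satisfied entirely from the index, with no key lookups.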

    Also, keep in mind the following when selecting a large interim result into a temp table. If you SELECT INTO the temp table, thus creating it dynamically, then it's only an insert operation and there will generally be little or no transaction logging.

    select (...) into #t from (...);

    However, if you select into a pre-created temp table like in the example below, the inserts will be logged to the tempdb transaction log, because you are essentially performing a data modification operation on an existing object. Between the temp table and the transaction log, you have maybe 2x more I/O than the first example.

    create table #t (...);

    insert into #t (...) select (...);

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (2/28/2012)


    Also, keep in mind the following when selecting a large interim result into a temp table. If you SELECT INTO the temp table, thus creating it dynamically, then it's only an insert operation and there will generally be little or no transaction logging.

    select (...) into #t from (...);

    However, if you select into a pre-created temp table like in the example below, the inserts will be logged to the tempdb transaction log, because you are essentially performing a data modification operation on an existing object. Between the temp table and the transaction log, you have maybe 2x more I/O than the first example.

    create table #t (...);

    insert into #t (...) select (...);

    On 2008, both SELECT ... INTO and INSERT ... SELECT can be minimally logged; INSERT ... SELECT requires that the table is empty to start with and that a TABLOCK hint is specified for the target table. Both INSERT ... SELECT and SELECT ... INTO will be logged in the transaction log (in tempdb if it's a temp table).
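
    As a sketch of that second pattern (column and table names are placeholders), the minimally-logged form of INSERT ... SELECT on 2008 would be:

    CREATE TABLE #t2 (col1 int, col2 varchar(50));

    INSERT INTO #t2 WITH (TABLOCK) (col1, col2)
    SELECT col1, col2
    FROM dbo.SourceTable;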

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (2/28/2012)


    On 2008, both SELECT ... INTO and INSERT ... SELECT can be minimally logged; INSERT ... SELECT requires that the table is empty to start with and that a TABLOCK hint is specified for the target table. Both INSERT ... SELECT and SELECT ... INTO will be logged in the transaction log (in tempdb if it's a temp table).

    The other day, I was tasked with refactoring a stored procedure that inserts millions of rows into several temp tables before returning its result set. I will say that the problem I was troubleshooting was not reducing total I/O but rather a tempdb log file that kept filling the drive.

    When inserting a SELECT into an empty pre-created temp table, if tempdb.mdf grew by 50MB, tempdb.ldf would grow by something like 150MB. When inserting into an empty temp table with two indexes on it, the ldf would grow by 5x as much as the mdf.

    However, when I used SELECT ... INTO the temp table, the mdf would grow by 50MB and the ldf only about 10MB.

    I'll dig up my unit test script and see if placing a TABLOCK hint on the INSERT INTO ... SELECT reduces the growth of the tempdb.ldf file. However, it seems odd that I would have to place a TABLOCK hint on a private (non-global) *temporary* table.
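
    A quick way to snapshot log usage before and after each test run is DBCC SQLPERF(LOGSPACE), which reports log size and percent used for every database:

    DBCC SQLPERF(LOGSPACE);
    -- run the SELECT ... INTO or INSERT ... SELECT under test here
    DBCC SQLPERF(LOGSPACE);
    -- compare the tempdb rows from the two snapshots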

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • It's the rules of minimal logging; it doesn't matter what table it is, the rules are syntactical.

    See 'Operations that can be minimally logged' and 'Insert' in BoL. Neither is unlogged, but both can (in 2008) be minimally logged.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
