Table Scan

Question

Table Scan

Rog Saber

SSCrazy

Points: 2548
More actions
March 3, 2011 at 2:50 pm

#232937

I'm selecting from a table and a view. I'm trying to reduce the timeit takes to return the data. Here is an example of my select and a definition of the view. The table has quite a few indexes on it.
When I run an 'execution plan', it states: Table scan 85%. Will a query always have to do a table scan? My query currently takes about 45 seconds. RT_TABLE has 986,129 rows.
SELECT DISTINCT RT_TABLE.col_init,
RT_TABLE.col_no,
RT_TABLE.ship_date_time,
more columns.....
RT_TABLE.trouble_yorn,
'N' as free_runner,
RT_TABLE.divert_yorn,
RT_TABLE.shop_move,
RT_TABLE.trip_status,
RT_TABLE.rqst_dlv_date_time,
datediff(day,RT_TABLE.trip_cls_date_time, getdate()),
datediff(dd,getdate(),RT_TABLE.orig_eda_eta_origin),
datediff(dd, getdate(),RT_TABLE.orig_eda_eta_cust)
FROM RT_TABLE,
RTV_CURRENT_TABLE
WHERE
( RT_TABLE.col_init = RTV_CURRENT_TABLE.col_init ) and
( RT_TABLE.col_no = RTV_CURRENT_TABLE.col_no ) and
( RT_TABLE.ship_date_time = RTV_CURRENT_TABLE.maxRTTRIPshipdatetime ) and
( RT_TABLE.free_runner = 'N')
View
SELECT RT_TABLE.col_init, RT_TABLE.col_no, max(RT_TABLE.ship_date_time) maxRTTRIPshipdatetime
FROM RT_TABLE
GROUP BY RT_TABLE.col_init,
RT_TABLE.col_no

Viewing 15 posts - 1 through 15 (of 39 total)

You must be logged in to reply to this topic. Login to reply

LutzM SSC Guru Points: 107049 More actions · Answer 1

Can you post the actual execution plan together will both table definition including any index definition if they exist? (I'd expect no index on the tables since you mention a table scan rather than a clustered index scan.)

Regarding your view definition:

Since there is nothing to limit the number of rows (meaning no WHERE clause), you're actually asking for all rows. Hence the table scan.

Depending on the way the view is called and the indexing of the referenced tables you might see a index seek or a seek and a lookup or still a table scan. It depends.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Rog Saber SSCrazy Points: 2548 More actions · Answer 2

There are 15 Non-Unique, Non-Clustered indexes on RT_TRIP. I don't know how to get the execution plan in text - in only shows up graphically.

***** RT_TRIP DEF

CREATE TABLE [dbo].[RT_TRIP](

[car_init] [char](4) NOT NULL,

[car_no] [char](6) NOT NULL,

[ship_date_time] [datetime] NOT NULL,

[cust_no] [char](13) NULL,

[route_code] [varchar](20) NULL,

[trip_no] [char](15) NULL,

[orig_city] [char](9) NULL,

[orig_state] [char](2) NULL,

[dest_city] [char](9) NULL,

[dest_state] [char](2) NULL,

[release_city] [char](9) NULL,

[release_state] [char](2) NULL,

[return_city] [char](9) NULL,

[return_state] [char](2) NULL,

[car_type] [char](1) NULL,

[comm_code] [char](10) NULL,

[fleet_id] [char](5) NULL,

[orig_eda_eta_cust] [datetime] NULL,

[orig_eda_eta_origin] [datetime] NULL,

[bad_order_days] [decimal](3, 0) NULL,

[car_condition] [char](1) NULL,

[car_status] [char](1) NULL,

[trip_status] [char](1) NULL,

[sight_code] [char](1) NULL,

[clm_date_time] [datetime] NULL,

[loc_city] [char](9) NULL,

[loc_state] [char](2) NULL,

[le] [char](1) NULL,

[road] [char](4) NULL,

[dest_arr_date_time] [datetime] NULL,

[dest_arr_type] [char](3) NULL,

[cust_dlv_date_time] [datetime] NULL,

[cust_dlv_type] [char](3) NULL,

[cust_rel_date_time] [datetime] NULL,

[cust_rel_type] [char](3) NULL,

[trip_cls_date_time] [datetime] NULL,

[trip_cls_type] [char](3) NULL,

[business_group] [char](8) NULL,

[divert_yorn] [char](1) NULL,

[shop_move] [char](1) NULL,

[one_way_round] [char](1) NULL,

[proc_date_time] [datetime] NULL,

[free_runner] [char](1) NULL,

[supplier_shipment] [char](1) NULL,

[prev_trip_close_date_time] [datetime] NULL,

[freight_pmt_code] [char](1) NULL,

[comments] [char](120) NULL,

[shipment_type] [char](2) NULL,

[bol_no] [char](15) NULL,

[shipper_no] [char](13) NULL,

[off_route_status] [decimal](3, 0) NULL,

[off_route_city] [char](9) NULL,

[off_route_state] [char](2) NULL,

[cust_po_no] [char](25) NULL,

[rqst_dlv_date_time] [datetime] NULL,

[extract_yorn_1] [char](1) NULL,

[original_eta_to_cust] [datetime] NULL,

[original_eta_to_origin] [datetime] NULL,

[eta_manual_close] [char](1) NULL,

[eta_proc_date] [datetime] NULL,

[supplier_code] [char](5) NULL,

[supplier_name] [varchar](30) NULL,

[csr_no] [varchar](25) NULL,

[unit_train_id] [char](15) NULL,

[bill_detention_yorn] [char](1) NULL,

[detention_amount] [decimal](6, 2) NULL,

[total_time_at_dest] [decimal](6, 0) NULL,

[charge_time_at_dest] [decimal](6, 0) NULL,

[free_time] [decimal](3, 0) NULL,

[extract_date] [datetime] NULL,

[detention_extract_yorn] [char](1) NULL,

[det_uom] [char](1) NULL,

[hot_list_flag] [char](1) NULL,

[trip_ref1] [char](20) NULL,

[trip_ref2] [char](20) NULL,

[trip_ref3] [char](20) NULL,

[trip_ref4] [char](20) NULL,

[hot_list_flag_mech] [char](1) NULL,

[hot_list_flag_leas] [char](1) NULL,

[sales_rep] [varchar](25) NULL,

[master_trip_no] [varchar](25) NULL,

[master_origin] [char](5) NULL,

[gps_alert_flag] [char](1) NULL,

[import_source] [char](25) NULL,

[trouble_yorn] [char](1) NULL,

[trip_ref5] [char](25) NULL,

[trip_ref6] [char](25) NULL

) ON [PRIMARY]

****VIEW DEF

CREATE VIEW [dbo].[RTV_CURRENT_TRIP] (car_init, car_no, maxMYTABLEshipdatetime) AS

SELECT RT_TRIP.car_init, RT_TRIP.car_no, max(RT_TRIP.ship_date_time) maxMYTABLEshipdatetime

FROM RT_TRIP

GROUP BY RT_TRIP.car_init,

RT_TRIP.car_no

*****SELECT

SELECT DISTINCT RT_TRIP.car_init,

RT_TRIP.car_no,

RT_TRIP.ship_date_time,

RT_TRIP.cust_no,

RT_TRIP.route_code,

RT_TRIP.fleet_id,

RT_TRIP.business_group,

RT_TRIP.orig_city,

RT_TRIP.orig_state,

RT_TRIP.dest_city,

RT_TRIP.dest_state,

RT_TRIP.car_status,

RT_TRIP.bol_no,

RT_TRIP.trip_no,

RT_TRIP.cust_po_no,

RT_TRIP.csr_no,

RT_TRIP.sales_rep,

RT_TRIP.hot_list_flag,

RT_TRIP.trip_ref1,

RT_TRIP.trip_ref2,

RT_TRIP.trip_ref3,

RT_TRIP.trip_ref4,

RT_TRIP.dest_arr_date_time,

RT_TRIP.dest_arr_type,

RT_TRIP.cust_dlv_date_time,

RT_TRIP.cust_dlv_type,

RT_TRIP.cust_rel_date_time,

RT_TRIP.cust_rel_type,

RT_TRIP.trip_cls_date_time,

RT_TRIP.trip_cls_type,

RT_TRIP.orig_eda_eta_cust,

RT_TRIP.orig_eda_eta_origin,

RT_TRIP.trip_ref5,

RT_TRIP.trip_ref6,

RT_TRIP.trouble_yorn,

'N' as free_runner,

RT_TRIP.divert_yorn,

RT_TRIP.shop_move,

RT_TRIP.trip_status,

RT_TRIP.rqst_dlv_date_time,

datediff(day,RT_TRIP.trip_cls_date_time, getdate()),

datediff(dd,getdate(),RT_TRIP.orig_eda_eta_origin),

datediff(dd, getdate(),RT_TRIP.orig_eda_eta_cust)

FROM RT_TRIP,

RTV_CURRENT_TRIP

WHERE

( RT_TRIP.car_init = RTV_CURRENT_TRIP.car_init ) and

( RT_TRIP.car_no = RTV_CURRENT_TRIP.car_no ) and

( RT_TRIP.ship_date_time = RTV_CURRENT_TRIP.maxMYTABLEshipdatetime ) and

( RT_TRIP.free_runner = 'N')

Union

SELECT DISTINCT RT_TRIP.car_init,

RT_TRIP.car_no,

RT_TRIP.ship_date_time,

RT_TRIP.cust_no,

RT_TRIP.route_code,

RT_TRIP.fleet_id,

RT_TRIP.business_group,

RT_TRIP.orig_city,

RT_TRIP.orig_state,

RT_TRIP.dest_city,

RT_TRIP.dest_state,

RT_TRIP.car_status,

RT_TRIP.bol_no,

RT_TRIP.trip_no,

RT_TRIP.cust_po_no,

RT_TRIP.csr_no,

RT_TRIP.sales_rep,

RT_TRIP.hot_list_flag,

RT_TRIP.trip_ref1,

RT_TRIP.trip_ref2,

RT_TRIP.trip_ref3,

RT_TRIP.trip_ref4,

RT_TRIP.dest_arr_date_time,

RT_TRIP.dest_arr_type,

RT_TRIP.cust_dlv_date_time,

RT_TRIP.cust_dlv_type,

RT_TRIP.cust_rel_date_time,

RT_TRIP.cust_rel_type,

RT_TRIP.trip_cls_date_time,

RT_TRIP.trip_cls_type,

RT_TRIP.orig_eda_eta_cust,

RT_TRIP.orig_eda_eta_origin,

RT_TRIP.trip_ref5,

RT_TRIP.trip_ref6,

RT_TRIP.trouble_yorn,

'Y' as free_runner,

RT_TRIP.divert_yorn,

RT_TRIP.shop_move,

RT_TRIP.trip_status,

RT_TRIP.rqst_dlv_date_time,

datediff(day,RT_TRIP.trip_cls_date_time, getdate()),

datediff(dd,getdate(),RT_TRIP.orig_eda_eta_origin),

datediff(dd, getdate(),RT_TRIP.orig_eda_eta_cust)

FROM RT_TRIP,

RTV_CURRENT_TRIP

WHERE

( RT_TRIP.car_init = RTV_CURRENT_TRIP.car_init ) and

( RT_TRIP.car_no = RTV_CURRENT_TRIP.car_no ) and

( RT_TRIP.ship_date_time = RTV_CURRENT_TRIP.maxMYTABLEshipdatetime ) and

( RT_TRIP.free_runner = 'Y') and

( RT_TRIP.trip_status = 'O')

LutzM SSC Guru Points: 107049 More actions · Answer 3

Rog Saber (3/3/2011)
There are 15 Non-Unique, Non-Clustered indexes on MY_TABLE. I don't know how to get the execution plan in text - in only shows up graphically.

15 indexes? Can you post those?

Regarding the xecution plan:

Run the query with IncludeActualExecutionPlan activated.

Right click on the plan and choose "Save Execution Plan as..."

Enter a name and save it as sqlplan.

Attach the file. Done.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Rog Saber SSCrazy Points: 2548 More actions · Answer 4

I've attached the exec plan. Is there a way to do the indexes easily or do I have to do them 1 by one?

Rog Saber SSCrazy Points: 2548 More actions · Answer 5

OK, maybe I'm off on the wrong track here and someone can help me. The query I am running is coming from a proc. When I first ran the proc this AM it took 1:09. When I ran it right after that it took about 42 seconds. I just ran it again and it takes about 8 seconds but it is almost 9 PM and there's very few resources hitting the server. So, even if I make the query faster, looks like I am at the mercy of the server and all of the people on it.

Does this sound logical?

If so, any suggestions?

When a proc is first run it has to compile that's why it takes the longest right? If so, when does the 'clock start over again'? In other words, right after I ran it this morning, it was faster. When will I take that first time 'hit' again?

WayneS SSC Guru Points: 95461 More actions · Answer 6

Rog Saber (3/3/2011)
I've attached the exec plan. Is there a way to do the indexes easily or do I have to do them 1 by one?

Try this index out:

USE [BUNGE_TEST]

GO

CREATE NONCLUSTERED INDEX [IX_RT_TRIP_free_runner_cover]

ON [dbo].[RT_TRIP] ([free_runner])

INCLUDE ([car_init],[car_no],[ship_date_time],[cust_no],[route_code],[trip_no],[orig_city],[orig_state],[dest_city],[dest_state],[fleet_id],[orig_eda_eta_cust],[orig_eda_eta_origin],[car_status],[trip_status],[dest_arr_date_time],[dest_arr_type],[cust_dlv_date_time],[cust_dlv_type],[cust_rel_date_time],[cust_rel_type],[trip_cls_date_time],[trip_cls_type],[business_group],[divert_yorn],[shop_move],[bol_no],[cust_po_no],[rqst_dlv_date_time],[csr_no],[hot_list_flag],[trip_ref1],[trip_ref2],[trip_ref3],[trip_ref4],[sales_rep],[trouble_yorn],[trip_ref5],[trip_ref6])

GO

SQL estimates that this index will improve performance of this query by some 46+%...

Edit: This execution plan also shows that the query would benefit (~20%) by modifying the index [RT_TRIP_STATUS_IDX2] to include these columns:

car_init, car_no, ship_date_time, cust_no, route_code, trip_no, orig_city, orig_state, dest_city, dest_state, fleet_id, orig_eda_eta_cust, orig_eda_eta_origin, dest_arr_date_time, dest_arr_type, cust_dlv_date_time, cust_dlv_type, cust_rel_date_time, cust_rel_type, trip_cls_date_time, trip_cls_type, business_group, divert_yorn, shop_move, free_runner, bol_no, cust_po_no, rqst_dlv_date_time, csr_no, hot_list_flag, trip_ref1, trip_ref2, trip_ref3, trip_ref4, sales_rep, trouble_yorn, trip_ref5, trip_ref6

I assume that you realize that adding all of these columns to these two indexes might cause issues with insert/update/delete operations on this table - if it's a heavily used table, adding this might not be your best bet.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

WayneS SSC Guru Points: 95461 More actions · Answer 7

Rog Saber (3/3/2011)
OK, maybe I'm off on the wrong track here and someone can help me. The query I am running is coming from a proc. When I first ran the proc this AM it took 1:09. When I ran it right after that it took about 42 seconds. I just ran it again and it takes about 8 seconds but it is almost 9 PM and there's very few resources hitting the server. So, even if I make the query faster, looks like I am at the mercy of the server and all of the people on it.
Does this sound logical?
If so, any suggestions?
When a proc is first run it has to compile that's why it takes the longest right? If so, when does the 'clock start over again'? In other words, right after I ran it this morning, it was faster. When will I take that first time 'hit' again?

If there isn't an execution plan for this query in SQL's cache, it has to create it. If it gets dropped from cache, the next time it's run it will need to be built again. This is more prone to happen when SQL is busy, and under memory pressure, and it clears out the caches not being used to free up memory.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Rog Saber SSCrazy Points: 2548 More actions · Answer 8

I added the index you suggested and no noticeable difference. I will drop it then add it again tomorrow and test when the users are hitting the DB harder to see.

Thank you

LutzM SSC Guru Points: 107049 More actions · Answer 9

Why not changing the query to get rid of the UNION?

The only difference I see between the two UNIONed SELECTs is the value of free_runner. And this would be identical to the value of the related column.

So I would change the query to

SELECT DISTINCT

...

RT_TRIP.free_runner as free_runner,

...

FROM RT_TRIP,

RTV_CURRENT_TRIP

WHERE

( RT_TRIP.car_init = RTV_CURRENT_TRIP.car_init ) and

( RT_TRIP.car_no = RTV_CURRENT_TRIP.car_no ) and

( RT_TRIP.ship_date_time = RTV_CURRENT_TRIP.maxRTTRIPshipdatetime ) and

( RT_TRIP.car_init = RTV_CURRENT_TRIP.car_init ) and

( RT_TRIP.car_no = RTV_CURRENT_TRIP.car_no ) and

( RT_TRIP.ship_date_time = RTV_CURRENT_TRIP.maxRTTRIPshipdatetime ) and

(( RT_TRIP.free_runner = 'Y') and

( RT_TRIP.trip_status = 'O')

OR ( RT_TRIP.free_runner = 'N')

)

The index Wayne recommended should help additionally.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Rog Saber SSCrazy Points: 2548 More actions · Answer 10

Rog Saber

SSCrazy

Points: 2548

March 4, 2011 at 9:26 am

#1294744

OK Thank you - I'll give it a try.

WayneS SSC Guru Points: 95461 More actions · Answer 11

LutzM (3/4/2011)
Why not changing the query to get rid of the UNION?
The only difference I see between the two UNIONed SELECTs is the value of free_runner. And this would be identical to the value of the related column.
So I would change the query to
SELECT DISTINCT
...
RT_TRIP.free_runner as free_runner,
...
FROM RT_TRIP,
RTV_CURRENT_TRIP
WHERE
( RT_TRIP.car_init = RTV_CURRENT_TRIP.car_init ) and
( RT_TRIP.car_no = RTV_CURRENT_TRIP.car_no ) and
( RT_TRIP.ship_date_time = RTV_CURRENT_TRIP.maxRTTRIPshipdatetime ) and
( RT_TRIP.car_init = RTV_CURRENT_TRIP.car_init ) and
( RT_TRIP.car_no = RTV_CURRENT_TRIP.car_no ) and
( RT_TRIP.ship_date_time = RTV_CURRENT_TRIP.maxRTTRIPshipdatetime ) and
(( RT_TRIP.free_runner = 'Y') and
( RT_TRIP.trip_status = 'O')
OR ( RT_TRIP.free_runner = 'N')
)
The index Wayne recommended should help additionally.

Good catch Lutz!

One other thing... it appears that the two selects will return mutually exclusive sets of data - no rows will be the same between the two. If this is truly the case, then replace the UNION with a UNION ALL. A union forces SQL to sort both queries, and to weed out any duplicates. If you know that there won't be any duplicates, then use UNION ALL and the expensive sorts and dupe checking won't be performed. As Paul would say, "don't use a UNION when you can use a UNION ALL"

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 12

To add to the evaluation, the 230 megabytes you're scanning in from RT_Trip can't be helping. The Scan is happening for the Free_Runner = 'N' component, but it's working on a seek for the trip_status of the second component.

This makes sense, since the first unioned query doesn't have the same WHERE requirements, but it would appear that Free_Runner is merely a Y/N column, and won't have the selectivity to force a seek unless it's very heavily weighted in the Y direction.

Without further restrictions placed on the query, I don't see you being able to avoid the table scan. I'm eyeballing here, but it looks like you've got ~100 columns in the original RT_TRIP, and ~40 in the select statement you're calling.

If you can handle the insert/update/delete overhead, I'd recommend an overriding index that would shrink the necessary data pull down. What this would encompass would be Using the fields car_init, Car_no, and ship_date_time as your indexing, and use the rest as includes (including free_runner). If this table is touched regularly by changes, however, I wouldn't do this. It's just too danged wide.

This would switch this to an index scan, and help reduce your overhead. That first union is killing you though, and you would be better off doing a single pass as already pointed out by Lutz by modifying your WHERE clause.

If that's not a choice, I'd work off of simply a 3 column index with no includes and see if we can't get to a bookmark lookup, it seems reasonable, but I haven't seen your data disparity.

Really, we need to see those index definitions to see what you're currently working with. It'll help give us a better idea of what the optimizer is ignoring.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

LutzM SSC Guru Points: 107049 More actions · Answer 13

To squeeze even more performance out of it you could consider changing the data types from DATETIME to SMALLDATETIME (if the business case would allow for it).

This would reduce the index by another 28MB (7 DATETIME values * 4 Byte reduction * approx. 1 mill rows.). And a smaller size of an index can be searched faster.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 14

It looks to me as if RTV_CURRENT_TABLE is the name of the view. That means that THIS...

( RT_TABLE.ship_date_time = RTV_CURRENT_TABLE.maxRTTRIPshipdatetime )

... is a join an an aggregated column in the view and that makes the code non-SARGable because the entire view must be rendered/materialized before it can make the correct selection.

My recommendation would be to use an "indexed view".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)