February 27, 2009 at 6:38 am
***PLEASE NOTE THAT I'VE POSTED IN THE WRONG FORUM - THIS IS A SQL SERVER 2000 QUESTION***
Hi, I've hit a brick wall with this problem and I'm hoping someone can help me out.
I have two tables:
Table A has 50 million rows
Table B has 17 million rows
Table B is a sub-set of table A. It pointers to table A's primark key (clustered index) of significant events.
If I query table B only on date range the index on this table is hit and it returns 5000 rows in zero seconds.
If I now change my query so that I do a LEFT OUTER JOIN to join table A to table B and name a field in the SELECT from table A I get horrendous peformance problems :
SELECT B.Field1, B.Field2, B.Field3, A.Field1
FROM TableB
LEFT OUTER JOIN TableA ON TableB.Field1 = TableA.Field1
WHERE B.Field2 BETWEEN 'mystartdate' AND 'myenddate'
ORDER BY B.Field2
Bear in mind that TableA's field 1 is a clustered index so looking this up should be lightning quick.
The execution plan says that it hits my index on table B with 0% cost to recover the subset of data
it then goes to Table B to get the other day and has a cost of 99% on a Clustered Index Seek
Followed by 0% nested loops and 1% sort
Table A's clustered index is 28% fragmented.
***PLEASE NOTE THAT I'VE POSTED IN THE WRONG FORUM - THIS IS A SQL SERVER 2000 QUESTION***
Can anyone give me any suggestion?
Thanks for your time.
Windows 2008 Server | SQL Server 2008
February 27, 2009 at 7:44 am
Can you post the query plan?
Why are you doing an Outer Join if tableB is a subset of tableA? Wouldn't this mean that every row in tableB would have a matching row in tableA?
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
February 27, 2009 at 8:02 am
Hi, thanks for taking a look at my problem:
This is the actual query and exectution plan:
SELECT E.FleetGroup,E.Time_Stamp,E.Reason,E.Vehicle,R.ID
FROM RoutePositionEvents AS E
LEFT OUTER JOIN RoutePosition AS R ON E.RPID = R.ID
WHERE E.FleetGroup=@iFleetGroup AND E.TIME_STAMP BETWEEN @FromDate AND @ToDate AND E.REASON IN('Z','I')
|--Nested Loops(Left Outer Join, OUTER REFERENCES:([E].[RPID]) WITH PREFETCH)
|--Index Seek(OBJECT:([ASWebDBDataCentre].[dbo].[RoutePositionEvents].[IX_FleetGroupTime_StampReason] AS [E]), SEEK:([E].[FleetGroup]=[@iFleetGroup] AND [E].[Time_Stamp] >= [@FromDate] AND [E].[Time_Stamp] <= [@ToDate]), WHERE:([E].[Reason]='I' OR [E].[Reason]='Z')
|--Clustered Index Seek(OBJECT:([ASWebDBDataCentre].[dbo].[RoutePosition].[PK_RoutePosition] AS [R]), SEEK:([R].[ID]=[E].[RPID]) ORDERED FORWARD)
Windows 2008 Server | SQL Server 2008
February 27, 2009 at 8:04 am
Sorry about the icons, they've taken the place of the colon and open braket characters.
Windows 2008 Server | SQL Server 2008
February 27, 2009 at 9:22 am
kmarshall (2/27/2009)
Sorry about the icons, they've taken the place of the colon and open braket characters.
Right click the plan, select save as. Save as a .sqlplan file, zip and attach
That's assuming is is SQL 2005, which I would expect seeing as it's posted in the SQL 2005 forum.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 27, 2009 at 9:41 am
I have to ask, why even do the join? Maybe I'm missing something, but the only thing I see being pulled from the join table is the join column, which means you don't need to get it from that table.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 27, 2009 at 10:25 am
Damn, I'm sorry. I didn't realise that I'd posted in the 2k5 forum although it's pretty obvious when I look at the top of my page! It it possible to move this thread or should I re-post?
Although I've only named one field from the joined table, in reality I want to add several more. I was just trying to keep it simple.
Windows 2008 Server | SQL Server 2008
February 27, 2009 at 11:23 am
Steve can move it, but there's no real need to. Edit the original post, add something about it being SQL 2000 right at the beginning. That'll make it clear enough.
Got it on the join and the columns that aren't there yet.
Why an outer join? Why not inner?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 27, 2009 at 12:16 pm
kmarshall (2/27/2009)
I was just trying to keep it simple.
Can you post the full query? Simplifying it does not help when it's a performance issue.
Since it's 2000, the method to post the full plan is a lot more complex.
Put SET STATISTICS PROFILE ON before the query and run it. There will be a second resultset that is the exec plan. Copy the entire resultset (results to grid), paste in excel, zip and attach.
Can you also post table definitions and index definitions please?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 28, 2009 at 2:00 am
Table defs and index, query and execution plans below:
CREATE TABLE [RoutePosition] (
[ID] [int] IDENTITY (1, 1) NOT FOR REPLICATION NOT NULL ,
[Vehicle] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Time_Stamp] [datetime] NULL ,
[Longitude] [float] NULL ,
[Latitude] [float] NULL ,
[PostCode] [varchar] (8) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Speed] [int] NULL ,
[Heading] [int] NULL ,
[Odometer] [float] NULL ,
[CybitDateTime] [datetime] NULL ,
[OurDateTime] [datetime] NULL CONSTRAINT [DF_RoutePosition_OurDateTime] DEFAULT (getdate()),
[CompanyID] [int] NULL CONSTRAINT [DF_RoutePosition_CompanyID] DEFAULT (0),
[Ignition] [smallint] NULL CONSTRAINT [DF_RoutePosition_Ignition] DEFAULT (0),
[Idling] [smallint] NULL CONSTRAINT [DF_RoutePosition_Idling] DEFAULT (0),
[IdlingOld] [smallint] NULL CONSTRAINT [DF_RoutePosition_IdlingOld] DEFAULT (0),
[Reason] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Street] [nvarchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Town_Part] [nvarchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Town] [nvarchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Region] [nvarchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Country] [nvarchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
CONSTRAINT [PK_RoutePosition] PRIMARY KEY CLUSTERED
(
[ID]
) WITH FILLFACTOR = 70 ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [RoutePositionEvents] (
[RPID] [int] NOT NULL ,
[Vehicle] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CompanyID] [int] NULL ,
[FleetGroup] [int] NULL ,
[Time_Stamp] [datetime] NULL ,
[Reason] [nvarchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[GeofenceID] [int] NULL ,
[FenceType] [nvarchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
CREATE INDEX [IX_FleetGroupTime_StampReasonVehicleRPID] ON [dbo].[RoutePositionEvents]([FleetGroup], [Time_Stamp], [Reason], [Vehicle], [RPID]) ON [PRIMARY]
GO
SELECT E.FleetGroup,E.Time_Stamp,E.Reason,E.Vehicle,R.Street, R.Town, R.Town_Part, R.Region, R.Country, R.PostCode
FROM RoutePositionEvents AS E
LEFT OUTER JOIN RoutePosition AS R ON E.RPID = R.ID
WHERE E.FleetGroup=@iFleetGroup AND E.TIME_STAMP BETWEEN @FromDate AND @ToDate AND E.REASON IN('Z','I')
ORDER BY E.Vehicle,E.Time_Stamp
The execution plan xls file can be downloaded here: http://www.savefile.com/files/2020391
Thanks again for your time gentlemen.
Windows 2008 Server | SQL Server 2008
March 2, 2009 at 2:55 am
I've spent quite a lot of time on this over Friday and Saturday and have got some new information.
1. The RoutePosition table contains 128 million rows.
2. The query I'm executing returns 9500 rows from the RoutePositionEvents table which it needs to join to the RoutePosition table. The records I need to get from the RoutePosition table are not sequential.
I believe it is simply the overhead in looking up the records whilst making the join to the RoutePosition table. I'm really not sure if there's anything I can do about this.
One of the experiments I conducted was executing the following query:
SELECT ID FROM ROUTEPOSITION WHERE ID BETWEEN 100000 AND 109500
This returned the records very quickly indeed because it was a direct clustered index seek and the records were all sequential.
The real query I'm trying to execute returns a similar number of records but they are not sequential in the RoutePosition table.
Windows 2008 Server | SQL Server 2008
March 2, 2009 at 7:15 am
Is there a column in RoutePosition that would satisfy the range of the query? Perhaps an index with a column like that would help.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 2, 2009 at 7:39 am
It's funny that you should suggest that.
I've been considering the two possibilities:
1) An indexed view
OR
2) Adding an index to the RoutePosition table that would index the data more closely in the order that the results of the queryies I'm executing it against and doing a JOIN against that instead of the Clusterd Index seek.
I've got a copy of the data so I can try things out without causing any trouble.
Windows 2008 Server | SQL Server 2008
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply