View with many tables and poor performance

  • My problem is terrible performance from a view created over 25 tables.

    Basically one header file, one detail file and many master files with primary keys. In addition the header and detail file are attached twice to get a previous item of data. The SQL is included below (if it makes any sense?).

    All the correct indexes exist (as I think are required). I wanted to do an extract from the view into a totally new table and create indexes, all as part of an overnight job. My logic being the table is then much more responsive than the underlying view.

    Problem is the extract ie; select * from viewname into newtable takes hours and hours. The mstrace table contains 126263 rows and MstRun contains 1402391.

    What am I doing wrong, or is this expected performance? I have 2gig ram, intel dual core, reasonable spec machine, raid sata drives.

    The MstRace and MstRun tables contain mostly numeric data, not overly comples, no computed columns or anything exotic.

    Any help greatly appreciated.

    tia.

    SELECT TOP (100) PERCENT dbo.MstRace.ID AS RaceID, dbo.MstRace.RaceType1, dbo.MstRace.RaceType2, dbo.MstRace.MaidenRace, dbo.MstRace.NH,

    dbo.MstRace.AW, dbo.MstRace.CourseID, dbo.MstRace.AgeID, dbo.MstRace.MeetingID, dbo.MstRace.RaceTitle, dbo.MstRace.Prize1,

    dbo.MstRace.HcpTo, dbo.MstRace.Class, dbo.MstRace.DistanceID, dbo.MstRace.FilliesRace, dbo.MstRace.WinningTime, dbo.MstRace.NumberRan,

    dbo.MstRace.BHBAvg, dbo.MstRace.GroupListed, dbo.MstRace.StallsPosition, dbo.MstRace.RNum, dbo.MstRace.Handicap, dbo.MstRace.Selling,

    dbo.MstRace.Claiming, dbo.MstRace.Novice, dbo.MstRace.RaceDate, dbo.MstRace.RFRaceNumber, dbo.MstRace.SP1, dbo.MstRace.SP2,

    dbo.MstRace.SP3, dbo.MstRace.RiderID, dbo.MstRace.GoingAllowance, dbo.MstRace.VariationFromStd, dbo.MstRace.GoingAllow,

    dbo.MstRace.GoingID, dbo.MstRace.Comment, dbo.MstRace.WinSpeedFig, dbo.MstRace.WinningHorseID, dbo.MstRace.DistanceRndID,

    dbo.MstRace.IsHcp, DATEPART(month, dbo.MstRace.RaceDate) AS RaceMonth, DATEPART(year, dbo.MstRace.RaceDate) AS RaceYear,

    DATEPART(day, dbo.MstRace.RaceDate) AS RaceDay, DATEPART(week, dbo.MstRace.RaceDate) AS RaceWeek, DATEPART(weekday,

    dbo.MstRun.RunDate) AS DOW, MstRaceType1_1.Name AS RaceTp1, MstRaceType2_1.Name AS RaceTp2, MstRaceType2_1.Abv AS RaceTp2Abv,

    dbo.MstCourse.ID, dbo.MstGoing.GoingDescFull, dbo.MstGoing.GoingDescAbv1, MstAge_1.Name AS Age, MstCourse_1.Direction,

    MstCourse_1.FinishType, MstCourse_1.RunIn, MstCourse_1.Sharp, MstCourse_1.Undulating, MstCourse_1.Galloping, MstCourse_1.Name AS Course,

    MstCourse_1.Country, MstCourse_1.NameAlt1 AS CourseAlt, MstDistance_1.DistanceActText, MstDistance_1.DistanceYardsAct,

    MstDistance_1.DistanceYardsRnd, MstDistance_1.DistanceRndText, trainer_1.tstylename AS Trainer, ri_master.dbo.horse.hname AS Horse,

    ri_master.dbo.owner.ostylename AS Owner, ri_master.dbo.jockey.jstylename AS Jockey, ri_master.dbo.dam.dhname AS Dam,

    ri_master.dbo.sire.shname AS Sire, dbo.MstRun.ID AS RunID, dbo.MstRun.SaddleCloth AS RaceCard, dbo.MstRun.RunDate, dbo.MstRun.Position,

    dbo.MstRun.SPPlaced, dbo.MstRun.Placed, dbo.MstRun.SPDecimal, dbo.MstRun.SpBfW, dbo.MstRun.AgeAtRaceDate, dbo.MstRun.SexAtRaceDate,

    dbo.MstRun.TrainerID, dbo.MstRun.DaysSinceRun, dbo.MstRun.JockeyID, dbo.MstRun.OwnerID, dbo.MstRun.HorseID, dbo.MstRun.WeightPounds,

    dbo.MstRun.Headgear, dbo.MstRun.Penalty, dbo.MstRun.Overweight, dbo.MstRun.Allowance, dbo.MstRun.Draw, dbo.MstRun.BHBRating,

    dbo.MstRun.LengthsBehindPrev, dbo.MstRun.LengthsBehindWinner, dbo.MstRun.SPPosition, dbo.MstRun.NotebookComment,

    dbo.MstRun.InrunningComment, dbo.MstRun.TrainerString, dbo.MstRun.PosCond, dbo.MstRun.WeightRank1, dbo.MstRun.WeightRank2,

    dbo.MstRun.OutOfHandicap, dbo.MstRun.MdnAtRaceDate, dbo.MstRun.SireID, dbo.MstRun.DamID, dbo.MstRun.LengthsInFrontOfNext,

    dbo.MstRun.RaceformRating, dbo.MstRun.RaceformRatingPlus, dbo.MstRun.RaceformSpeedRating, dbo.MstRun.OwnSpeedRating,

    dbo.MstRun.RunCountTrainer, dbo.MstRun.RunCount, dbo.MstRun.RunCountAW, dbo.MstRun.RunCountFL, dbo.MstRun.RunCountNH,

    dbo.MstRun.HeadGearID, dbo.MstRun.RFRBOL6, dbo.MstRun.RFSBOL6, dbo.MstRun.RunsThisYear, CAST(dbo.MstRun.WeightPounds AS int)

    - CAST(MstRun_1.WeightPounds AS int) AS WeightChgPounds, CAST(dbo.MstRace.BHBAvg AS int) - CAST(MstRace_1.BHBAvg AS int) AS BHBAvgChg,

    CAST(8 - dbo.MstRace.Class AS int) - CAST(8 - MstRace_1.Class AS int) AS ClassChg, CAST(dbo.MstRun.BHBRating AS int)

    - CAST(MstRun_1.BHBRating AS int) AS BHBChg, CAST(dbo.MstRun.RaceformRating AS int) - CAST(MstRun_1.RaceformRating AS int) AS RaceformChg,

    CAST(dbo.MstRun.RaceformSpeedRating AS int) - CAST(MstRun_1.RaceformSpeedRating AS int) AS RaceformSpeedChg,

    CAST(dbo.MstRun.OwnSpeedRating AS int) - CAST(MstRun_1.OwnSpeedRating AS int) AS OwnSpeedRatingChg, MstRace_1.Class AS P1Class,

    MstRace_1.Handicap AS P1Handicap, MstRace_1.RaceType1 AS P1RaceType1, MstRace_1.RaceType2 AS P1RaceType2,

    MstRace_1.CourseID AS P1CourseID, MstRace_1.MaidenRace AS P1MaidenRace, MstRace_1.AgeID AS P1AgeID,

    MstRace_1.GroupListed AS P1GroupListed, MstRace_1.FilliesRace AS P1FilliesRace, MstRace_1.BHBAvg AS P1BHBAvg,

    MstRace_1.HcpTo AS P1HandicapTo, MstRace_1.RiderID AS P1RiderID, dbo.MstRaceType1.Name AS P1RaceTp1,

    dbo.MstRaceType2.Name AS P1RaceTp2, dbo.MstCourse.Name AS P1Course, dbo.MstDistance.DistanceRndText AS P1Distance,

    MstRun_1.Position AS P1Position, MstRun_1.SPDecimal AS P1SPDecimal, MstRun_1.BHBRating AS P1BHBRating,

    MstRun_1.LengthsBehindWinner AS P1LenBehindWinner, MstRun_1.LengthsInFrontOfNext AS P1LIF, MstRun_1.SPPosition AS P1SPPosition,

    MstRun_1.WeightPounds AS P1WeightPounds, MstRun_1.Penalty AS P1Penalty, MstRun_1.PosCond AS P1PosCond,

    MstRun_1.RaceformRatingPlus AS P1RaceformRatingPlus, MstRun_1.RaceformRating AS P1RaceformRating,

    MstRun_1.TrainerString AS P1TrainerString, MstRun_1.Overweight AS P1Overweight, MstRun_1.Allowance AS P1Allowance,

    MstRun_1.Draw AS P1Draw, MstRun_1.Headgear AS P1Headgear, MstRun_1.InrunningComment AS P1InrunningComment,

    MstRun_1.DaysSinceRun AS P1DaysSinceRun, MstRun_1.RaceformSpeedRating AS P1RaceformSpeedRating,

    MstRun_1.OwnSpeedRating AS P1OwnSpeedRating, MstRun_1.MdnAtRaceDate AS P1MdnAtRaceDate, MstRun_1.WeightRank1 AS P1WeightRank1,

    MstRun_1.WeightRank2 AS P1WeightRank2, MstRun_1.MdnAtRaceDate AS MaidenARD, MstRun_1.HeadGearID AS P1HeadGearID,

    MstRun_1.JockeyID AS P1JockeyID, MstRun_1.OwnerID AS P1OwnerID, MstRun_1.AgeAtRaceDate AS P1AgeAtRaceDate,

    MstRun_1.SexAtRaceDate AS P1SexAtRaceDate, MstRun_1.SpBfW AS P1SPBfWb, MstRun_1.OutOfHandicap AS P1OutOfHandicap,

    trainer_1.tstylename AS P1Trainer, trainer_1.tid AS P1TrainerID, MstRun_1.RunCountFL + dbo.MstRun.RunCountAW AS FlatRuns,

    dbo.MstCourse.Country AS P1Country, MstRider_1.Name AS RiderType, dbo.MstRider.Name AS P1RiderType,

    CAST(dbo.MstRun.RaceformRating AS int) - CAST(MstRun_1.RaceformRating AS int) AS RFChange,

    dbo.MstDistance.DistanceYardsRnd AS P1DistanceYardsRnd, dbo.MstDistance.DistanceYardsAct AS P1DistanceYardsAct,

    CAST(MstDistance_1.DistanceYardsRnd AS int) - CAST(dbo.MstDistance.DistanceYardsRnd AS int) AS DistChangeRnd, dbo.MstAge.Name AS P1Age,

    DATEPART(weekday, MstRace_1.RaceDate) AS P1DOW, DATEPART(week, MstRace_1.RaceDate) AS P1RaceWeek, DATEPART(year,

    MstRace_1.RaceDate) AS P1RaceYear, DATEPART(month, MstRace_1.RaceDate) AS P1RaceMonth, DATEPART(day, MstRace_1.RaceDate)

    AS P1RaceDay, dbo.MstRun.HcpRunCount, MstRace_1.RaceTitle AS P1RaceTitle, MstRace_1.NumberRan AS P1NumberRan,

    MstRace_1.Comment AS P1Comment, MstRace_1.WinSpeedFig AS P1WinSpeedFig, MstRace_1.IsHcp AS P1IsHcp,

    MstRace_1.RaceDate AS P1RaceDate, MstRace_1.GoingID AS P1GoingID, MstRace_1.ID AS P1RaceID, ri_master.dbo.owner.oid AS RaceformOwnerID,

    dbo.MstRace.RFGoingU, dbo.MstRace.RFGoingAllowanceU, 0 AS Satus, MstRun_1.ID AS P1RunID

    FROM ri_master.dbo.trainer AS trainer_1 CROSS JOIN

    dbo.MstRaceType2 INNER JOIN

    dbo.MstCourse INNER JOIN

    dbo.MstRaceType1 INNER JOIN

    dbo.MstRace AS MstRace_1 ON dbo.MstRaceType1.ID = MstRace_1.RaceType1 INNER JOIN

    dbo.MstRider ON MstRace_1.RiderID = dbo.MstRider.ID INNER JOIN

    dbo.MstDistance ON MstRace_1.DistanceID = dbo.MstDistance.ID INNER JOIN

    dbo.MstAge ON MstRace_1.AgeID = dbo.MstAge.ID ON dbo.MstCourse.ID = MstRace_1.CourseID ON

    dbo.MstRaceType2.ID = MstRace_1.RaceType2 RIGHT OUTER JOIN

    ri_master.dbo.trainer INNER JOIN

    dbo.MstRun AS MstRun_1 ON trainer_1.tid = MstRun_1.TrainerID RIGHT OUTER JOIN

    ri_master.dbo.jockey INNER JOIN

    dbo.MstAge AS MstAge_1 INNER JOIN

    dbo.MstRaceType1 AS MstRaceType1_1 INNER JOIN

    dbo.MstRace INNER JOIN

    dbo.MstRun ON dbo.MstRace.ID = dbo.MstRun.RaceID INNER JOIN

    ri_master.dbo.sire ON dbo.MstRun.SireID = ri_master.dbo.sire.shid INNER JOIN

    ri_master.dbo.dam ON dbo.MstRun.DamID = ri_master.dbo.dam.dhid INNER JOIN

    dbo.MstRider AS MstRider_1 ON dbo.MstRace.RiderID = MstRider_1.ID INNER JOIN

    dbo.MstCourse AS MstCourse_1 ON dbo.MstRace.CourseID = MstCourse_1.ID ON MstRaceType1_1.ID = dbo.MstRace.RaceType1 INNER JOIN

    dbo.MstGoing ON dbo.MstRace.GoingID = dbo.MstGoing.ID INNER JOIN

    dbo.MstRaceType2 AS MstRaceType2_1 ON dbo.MstRace.RaceType2 = MstRaceType2_1.ID INNER JOIN

    dbo.MstDistance AS MstDistance_1 ON dbo.MstRace.DistanceID = MstDistance_1.ID ON MstAge_1.ID = dbo.MstRace.AgeID INNER JOIN

    ri_master.dbo.horse ON dbo.MstRun.HorseID = ri_master.dbo.horse.hid ON ri_master.dbo.jockey.jid = dbo.MstRun.JockeyID INNER JOIN

    ri_master.dbo.owner ON dbo.MstRun.OwnerID = ri_master.dbo.owner.oid ON MstRun_1.ID = dbo.MstRun.PrvRun1 ON

    MstRace_1.ID = dbo.MstRun.PrvRace1

  • Too many joins...

    cartesian product...

    No test data...

    Unformatted code...

    Help us help you, divide the code into smaller logical units and try to formulate a more specific question.

  • Here is a great article to get started:

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • OK, will read the article first and try and make the question clearer. I did read some posts on views and performance so at least attempted to help myself.

    Apologies, as rather a novice sql user, I tend to imagine you gurus can take a quick look at a piece of SQL code and now what the problem is in an instant:blush:

    cheers.

  • at a glance? you should replace the queries with TWO CTE's

    you are joining the same tables over and over again (search for "_1" int e statement provided for examples)

    as already identified, there's a Cartesian product.(CROSS JOIN) that should be dropped if not really needed;

    it looks like you had a query that maybe was working,and you needed to add parallel data to it, so you just cloned all the tables and added them in a second time. as identified, that cartesian product of the two is probably horrible.

    cannot tell if it is doing a cross database join,

    FROM ri_master.dbo.trainer AS trainer_1

    CROSS JOIN dbo.MstRaceType2 --which db context is this? ri_master or another db?

    if it is agaisnt two database, performance would suck real bad for that reason as well.

    there's so many joins,and you don't bother to put the ON next to the table it joins too, so it makes it nearly unreadable to me.

    without seeing it cleaned up, I can't see volunteering any time looking at it.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Again, thanks for your reply. Apologies for my ill considered plea for help. I will try and do some cleaning up first. You are in fact correct, I did have a roughly working view and added more tables.

    Also, yes the view now uses tables from two different databases, maybe I need to bring the tables into the same db as a first step.

    thanks for at least replying and putting me straight as to what is required to get specific help.

    jake.

  • maybe thinking outside of the box can help;

    show us the original working query, and explain what you wanted to do that spawned the query you posted; might be a simpler way to get the data.

    what's the additional layer of data you are trying to include?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • When tuning queries the execution plan is a must. I know you said you have indexes, but we can't tell whether or not they're being used without the execution plan. Can you capture and post that?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • ahhh just posted long reply and lost it!

    Overview:

    The db is an operational one containing data for all horse races in europe over the past 10 years. Minute details for each race and each runner in every race. MstRace holds 126313 rows, MstRun holds 1402814 rows. The data is normalised (to the extent I believe is feasible). All Master files with ID and relevant data etc.

    The view was working as a big collection of all master files joined together. I recently revamped the view to include more data from a souped up MstRace and MstRun table(s) and also to include one previous instance of MstRace and MstRun or each row. In theory allowing direct comparison of the current (most recent) row and the previous row (of the same runner (MstRun)). The performance problems started when I added the PrvRace1 and PrvRun1 join to the MstRace and MstRun tables to link the previous row.

    Previously I have simply relied on query designer but it seemed to be doing strange things, or maybe I was asking it to do strange things. Anyway, I have modified the SQL manually and the view now does work properly but when I try and do a SELECT * FROM view INTO mytable (I did this overnight with the old view to enable easy creation of indexes on the temp table for speedier research etc), it runs for a few hours before I have to cancel before anything is created. To generate an execution plan (I had never heard of this before), I ran a simple query of SELECT * from view WHERE RaceID <= 10000. This took 1 minute 49 seconds, much slower than my previous working view, which it's not possible to re-create as the MstRace and MstRun tables have changed (more columns added). I wuold expect a performance hit for this but the magnitude of the hit seems high. The MstRace and MstRun tables grow each day by approx 30 and 300 rows respectively.

    Attached are the table designs, view sql and excution plan for anyone fool hardy enough to have a look!!

    I have done what I can at the moment without a lot more learning, the execution plan tells me 77% of the workload is from the join with MstRunP1 on PrvRun1 = ID part of the view?

    Any help is greatly appreciated, I hope this post is more readable, I have included the SQL etc as attachments so it is not so messy

    Many thanks, jake.

  • Hi jake,

    Whats the difference between a race and a run? Is it a 1 to 1 relationship or do you have many runs per race?

  • One Race has many runners (average 10 runners per race).

    Another point I missed; each MstRun row does not always have a previous MstRace and MstRun, hence my outer join to return all from MstRun and only those matching from MstRunP1.

  • Hi again

    I suggest we chop this down to:

    SELECT

    *

    FROM

    dbo.MstRace

    INNER JOIN

    dbo.MstRun ON

    dbo.MstRun.RaceID = dbo.MstRace.ID

    LEFT OUTER JOIN

    dbo.MstRace AS MstRaceP1 ON

    MstRaceP1.ID = dbo.MstRun.PrvRace1

    LEFT OUTER JOIN

    dbo.MstRun AS MstRunP1 ON

    MstRunP1.ID = dbo.MstRun.PrvRun1

    WHERE

    dbo.MstRace.ID < 10000

    This will make it easier for others to jump in without having to parse through all the detail tables.

    Can you post the execution plan for this query please?

    It seems the statistics are way off, was this database migrated or something like that?

    Maxim

  • Your help is much appreciated. I am learning a lot along the way into the bargain.

    The new execution plan is attached, took 59 seconds, that's 50 seconds quicker without all the detail tables.

    I built the db from scratch a few years ago, it has evolved haphazardly over time since. One thing I have only just noticed the keys for some of the master tables I have recently imported from foxpro are actually defined as decimal (7,0) ie; horse. The corresponding column I use in MstRun are defined as int. Would this impact performance?

  • dji (3/24/2011)


    Your help is much appreciated. I am learning a lot along the way into the bargain.

    The new execution plan is attached, took 59 seconds, that's 50 seconds quicker without all the detail tables.

    I built the db from scratch a few years ago, it has evolved haphazardly over time since. One thing I have only just noticed the keys for some of the master tables I have recently imported from foxpro are actually defined as decimal (7,0) ie; horse. The corresponding column I use in MstRun are defined as int. Would this impact performance?

    Ideally keep data type coherent from a table to another. In this case I don't think it's causing type conversion.

    Is this a production system or just something you are playing with?

    Could you please run this and dump the result in a .txt file

    USE RacingSystem

    GO

    DBCC SHOW_STATISTICS('dbo.MstRace', 'PK_MstRace')

    GO

    DBCC SHOW_STATISTICS('dbo.MstRun', 'PK_MstRun')

    GO

    DBCC SHOW_STATISTICS('dbo.MstRun', 'IX_PrvRace1')

    GO

    DBCC SHOW_STATISTICS('dbo.MstRun', 'IX_PrvRun1')

  • The db is production to the extent I rely on it to make my living. I have a vb.net written front end for using the whole data set with gui's. The more data and the more complex data sets I can query as quickly as possible helps me keep ahead of the crowd, hence my recent 'overhaul' of this particular view.

    The output is attached, as is a screen shot of the resultant query screens as it wasn't clear (to me!) what output was from which query as some of them had no output (I copied and pasted exactly as you posted).

    Jake

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply