need help to tune a query

  • Please help me re-write this query. I am sure we can get rid of multiple self joins. Data table in my query is HUGE ( 1 Billion + rows)

    Thank you

    SELECT COUNT(*)

    FROM unit u

    INNER JOIN Data_Entity Outtsed ON Outtsed.category = 'unit outage' AND Outtsed.category_id = u.unit_id

    INNER JOIN Data_model mr ON mr.model_id = u.model_id

    INNER JOIN Data Outtsd ON Outtsed.entity_id = Outtsd.entity_id AND Outtsd.model_id = u.model_id

    LEFT JOIN Data_Entity ThUCtsed ON ThUCtsed.category = 'thermal UC' AND ThUCtsed.category_id = u.unit_id

    LEFT JOIN Data ThUCtsd ON ThUCtsed.entity_id = ThUCtsd.entity_id AND ThUCtsd.model_id = u.model_id AND ThUCtsd.date = Outtsd.date

    LEFT JOIN Data_Entity ThSStsed ON ThSStsed.category = 'thermal SS' AND ThSStsed.category_id = u.unit_id

    LEFT JOIN Data ThSStsd ON ThSStsed.entity_id = ThSStsd.entity_id AND ThSStsd.model_id = u.model_id AND ThSStsd.date = Outtsd.date

    LEFT JOIN Data_Entity HFtsed ON HFtsed.category = 'hydro forecast' AND HFtsed.category_id = u.unit_id

    LEFT JOIN Data HFtsd ON HFtsed.entity_id = HFtsd.entity_id AND HFtsd.model_id = u.model_id AND HFtsd.date = Outtsd.date

    LEFT JOIN Data_Entity PStsed ON PStsed.category = 'pumps forecast' AND PStsed.category_id = u.unit_id

    LEFT JOIN Data PStsd ON PStsed.entity_id = PStsd.entity_id AND PStsd.model_id = u.model_id AND PStsd.date = Outtsd.date

    LEFT JOIN Data_Entity SOLtsed ON SOLtsed.category = 'solar forecast' AND SOLtsed.category_id = u.unit_id

    LEFT JOIN Data SOLtsd ON SOLtsed.entity_id = SOLtsd.entity_id AND SOLtsd.model_id = u.model_id AND SOLtsd.date = Outtsd.date

    LEFT JOIN Data_Entity WINtsed ON WINtsed.category = 'wind forecast' AND WINtsed.category_id = u.unit_id

    LEFT JOIN Data WINtsd ON WINtsed.entity_id = WINtsd.entity_id AND WINtsd.model_id = u.model_id AND WINtsd.date = Outtsd.date

  • If you take all those LEFT JOINs away, do you get a different result?

    I mean, what happens if you try this?

    select count(*)

    from unit u

    join Data_Entity de on de.category = 'unit outage'

    and de.category_id = u.unit_id

    join Data_model dm on dm.model_id = u.model_id

    join Data d on de.entity_id = d.entity_id

    and d.model_id = u.model_id

    Best practice suggests that you should schema-qualify your object names.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • There are 2 issues here:

    Joins in general can generate duplicate rows which might affect the count. If you remove any join it could reduce the row count.

    Inner joins can reduce the number of rows which will affect the count as well.

    Basically, we can't tell you which joins aren't necessary because that's defined by the design of your database and the requirements for this particular query.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks Phil

    U r the best. I dont know what i was thinking when i wrote that query.over thinking i guess.

    It worked.

    Thank you all.

  • ekant_alone (9/22/2015)


    Thanks Phil

    U r the best. I dont know what i was thinking when i wrote that query.over thinking i guess.

    It worked.

    Thank you all.

    Glad to help. But please remember, as Luis pointed out, that the results between the two versions could be quite different, depending on your data.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply