Creating a view with 20 outer joins - best approach

  • Hi,

    Thanks for your help in advance, being snowed in has made this problem more annoying.

    Basically I would search for keywords associated to a book abstract, eg title, author, country (lots of other fields of info) etc

    To do this I was going to create a free text index on a view which is a conglomeration of about 20 tables linked by left outer joins (because a book may or may not be associated with a country or countries). When I do this however the view takes over 1 hour to create and is about a 100 rows large

    the sql looks like so:SELECT ABST.ABSTRACT_ID, ABST.USER_ABSTRACT_ID, dbo.R_REF_TYPE.REF_TYPE, ABST.AUTHORS, ABST.EDITORS,

    ABST.ARTCHAP_TITLE, ABST.BOOKPROC_TITLE, dbo.R_PUBLISHER.PUBLISHER, dbo.R_JOURNAL.JOURNAL, ABST.YEAR,

    ABSTOUTLIT.OUTCROP_LITH_LOG, ABSTPALCON.PALAEORECONSTRUCTION, ABSTWELL.LOG_NAME, ABSTISO.ISOPACH,

    ABSTPALGEO.PALAEOGEOGRAPHIC, ABSTLIT.LITHOSTRATIGRAPHY, ABSTFLD.FIELD, ABSTBAS.BASIN, ABST.ABSTRACT_ID AS Expr1,

    dbo.R_REGION.REGION, CTRYCODE.COUNTRY, TECH.TECHNIQUE, USGSBAS.PROVINCE_NAME, dbo.R_PERIOD.PERIOD_NAME, dbo.R_ERA.ERA,

    dbo.R_STAGE.STAGE, dbo.R_EPOCH.EPOCH

    FROM dbo.ABSTRACT AS ABST LEFT OUTER JOIN

    dbo.R_REF_TYPE ON ABST.REFTYPE_ID = dbo.R_REF_TYPE.ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_ISOPACH AS ABSTISO ON ABST.ABSTRACT_ID = ABSTISO.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_LITHOSTRAT AS ABSTLIT ON ABST.ABSTRACT_ID = ABSTLIT.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_OUTCROP_LITH AS ABSTOUTLIT ON ABST.ABSTRACT_ID = ABSTOUTLIT.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_PALAEOGEOGRAPHIC AS ABSTPALGEO ON ABST.ABSTRACT_ID = ABSTPALGEO.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_WELL_LOG AS ABSTWELL ON ABST.ABSTRACT_ID = ABSTWELL.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_FIELD AS ABSTFLD ON ABST.ABSTRACT_ID = ABSTFLD.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_PALAEORECONSTRUCTION AS ABSTPALCON ON ABST.ABSTRACT_ID = ABSTPALCON.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_BASIN AS ABSTBAS ON ABST.ABSTRACT_ID = ABSTBAS.ABSTRACT_ID LEFT OUTER JOIN

    dbo.R_JOURNAL ON ABST.JOURNAL_ID = dbo.R_JOURNAL.ID LEFT OUTER JOIN

    dbo.R_PUBLISHER ON ABST.PUBLISHER_ID = dbo.R_PUBLISHER.ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_GEO_TIME AS ABSTGEOT RIGHT OUTER JOIN

    dbo.R_PERIOD ON ABSTGEOT.PERIOD_ID = dbo.R_PERIOD.ID RIGHT OUTER JOIN

    dbo.R_EPOCH ON ABSTGEOT.EPOCH_ID = dbo.R_EPOCH.ID RIGHT OUTER JOIN

    dbo.R_STAGE ON ABSTGEOT.STAGE_ID = dbo.R_STAGE.ID RIGHT OUTER JOIN

    dbo.R_ERA ON ABSTGEOT.ERA_ID = dbo.R_ERA.ID ON ABST.ABSTRACT_ID = ABSTGEOT.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_USGS AS ABSTUSGS RIGHT OUTER JOIN

    dbo.R_USGS_BASIN AS USGSBAS ON ABSTUSGS.ID = USGSBAS.ID ON ABST.ABSTRACT_ID = ABSTUSGS.ABSTRACT_ID LEFT OUTER JOIN

    dbo.R_TECHNIQUE AS TECH LEFT OUTER JOIN

    dbo.REL_ABSTRACT_TECHNIQUE AS ABSTTECH ON TECH.ID = ABSTTECH.TECHNIQUE_ID ON

    ABST.ABSTRACT_ID = ABSTTECH.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_COUNTRY AS ABSTCTRY RIGHT OUTER JOIN

    dbo.R_COUNTRY_CODE AS CTRYCODE ON ABSTCTRY.ID = CTRYCODE.ID ON ABST.ABSTRACT_ID = ABSTCTRY.ABSTRACT_ID LEFT OUTER JOIN

    dbo.REL_ABSTRACT_REGION AS ABSTREG ON ABST.ABSTRACT_ID = ABSTREG.ABSTRACT_ID RIGHT OUTER JOIN

    dbo.R_REGION ON ABSTREG.REGION_ID = dbo.R_REGION.ID

    Any recommendations as to what to do would be really greatly appreciated. One idea I did have was to combine all the column fields into on nvarchar(max) column and free text index this.

    Thanks for your opinions, really appreciated.

    Oliver

  • Hi,

    Do not combine the data into 1 column, you'll break the first normal form and this beats the entire purpose of the database.

    I would start by analysing the first two joins and work on transforming rows into columns or in other words flatten the data. You'll probably end up with hundreds of columns, but it'll be easier for you to analyse the data in a long term run. Flattening the data is also a time consuming process (depending on how much data you are working with).

  • Many Thanks for the reply, I have used the rows to row code and then created views for each of the tables and brought this into a single view.

    SELECT

    t1.abstract_ID,

    List = substring((SELECT ( ', ' + ISOPACH )

    FROM dbo.REL_ABSTRACT_ISOPACH t2

    WHERE t1.abstract_ID = t2.abstract_ID

    ORDER BY

    abstract_ID

    FOR XML PATH( '' )

    ), 3, 1000 )FROM dbo.REL_ABSTRACT_ISOPACH t1

    GROUP BY abstract_ID

    Seems to have cut down the query process to under 20 sec's for 237000 rows.

    I understand you concerns about breaking the normalisation rule, but this is only for producing a full-text search on the data.

    Cheers for the help,

    Oliver

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply