Getting count from millions of records and fulltext search

  • I have a query I ideally need to get to run in seconds but currently runs in minutes.  It may be that the size of the data is just too large but,
    (This is being run on sql azure 2014 (v12.0.2000.8) on  the package that gives 1000 dtus.)


    SELECT    TOP 20
            Id,
            COUNT(1) as tCount
    FROM    dbo.List as l
    INNER JOIN    dbo.DimText as dim
    ON        l.Id = dim.Id
    WHERE    FREETEXT((dim.TextOne, dim.TextTwo), 'Diode')
    GROUP BY Id
    ORDER BY COUNT(1) DESC
    GO

    query plan shows 97% cost on clustered index seek on DimText PK which is just the int id column.

    dbo.List has a row count of 48,203,897

    dbo.DimText has a row count of 50,499,378 ( simple select count(1) FROM dbo.DimText takes over minutes, so not sure if something is completely right with this table)

    dbo.DimText where FREETEXT((dim.TextOne, dim.TextTwo), 'Diode') = 9202030 (nearly 4 minutes to run)  100% cost on clustered index seek / 0 cost on table valued function [Fulltextmatch]

    I have tried inner loop join, merge join normal inner join but not getting close to the time I need on this.

    Are there any ways to better index this? structure the join to get the best performance?

  • Can you post the DDL for the two tables, as well as the execution plan?
    It looks like you want the count of records in DimText that contain the string 'Diode' that match the ID of the table List?

    Is there an implicit conversion occurring in the FREETEXT?  These are nvarchar, you may need to preface the 'Diode' with the N, like this N'Diode'
    Is FREETEXT actually what you need to do? CONTAINS may provide better performance if you only need to match the exact word. 
    EXISTS may also return a better plan, assuming I have your logic correct.

    SELECT TOP 20
       Id,
       COUNT(1) as tCount
    FROM dbo.List as l
    WHERE EXISTS
        (SELECT ID
        FROM dbo.DimText as DIM
        WHERE FREETEXT((DIM.TextOne, DIM.TextTwo), N'Diode')
            AND l.Id = DIM.Id)
    GROUP BY l.ID

    And, as an example, a SELECT COUNT(1) on a table in my system with 346,324,378 rows takes 2+ minutes.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply