Must be a better way?

  • I wrote the following query to extract unique IDs based on a key word search of the text column from what is basically a running log table that could contain 100+ entries for every ID. The query works and gets me what i need but i would like to know if there is a better way to do this besides running 3 sub-selects one for each keyword?

    SELECT Notes.ID,

    Notes.DisplayTime,

    Notes.NEText,

    Case2.Other_ID

    FROM Case2 Join Notes on Case2.ID = Notes.ID

    Where DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59'

    and NeText Like '%ethanol%'

    and DisplayTime in

    (Select

    MAX(DisplayTime)as time

    FROM Case2 Join Notes on Case2.ID = Notes.ID

    Where DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59' and NeText Like '%ethanol%'

    Group BY Notes.ID)

    or DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59'

    and NeText Like '%etoh%'

    and DisplayTime in

    (Select

    MAX(DisplayTime)as time

    FROM Case2 Join Notes on Case2.ID = Notes.ID

    Where DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59' and NeText Like '%etoh%'

    Group BY Notes.ID)

    or DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59'

    and NeText Like '%Alcohol%'

    and DisplayTime in

    (Select

    MAX(DisplayTime)as time

    FROM Case2 Join Notes on Case2.ID = Notes.ID

    Where DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59' and NeText Like '%Alcohol%'

    Group BY Notes.ID)

    Thanks

    E

  • It highly depends on your expected output, but this should do the trick.

    DECLARE @Case2 TABLE (

    ID int PRIMARY KEY CLUSTERED,

    Other_ID int NOT NULL

    )

    DECLARE @Notes TABLE (

    ID int PRIMARY KEY CLUSTERED,

    DisplayTime datetime NOT NULL,

    NEText nvarchar(500) NOT NULL

    )

    INSERT INTO @Case2 VALUES (1, 1)

    INSERT INTO @Notes VALUES (1, '20111001', 'This row contains ethanol and must be returned')

    INSERT INTO @Case2 VALUES (2, 2)

    INSERT INTO @Notes VALUES (2, '20111001', 'This row contains etoh and must be returned')

    INSERT INTO @Case2 VALUES (3, 3)

    INSERT INTO @Notes VALUES (3, '20111001', 'This row contains Alcohol and must be returned')

    INSERT INTO @Case2 VALUES (4, 4)

    INSERT INTO @Notes VALUES (4, '20120101', 'This row is out of the date range')

    INSERT INTO @Case2 VALUES (5, 5)

    INSERT INTO @Notes VALUES (5, '20111001', 'This row contains Alcohol and must be returned')

    INSERT INTO @Case2 VALUES (6, 6)

    INSERT INTO @Notes VALUES (6, '20111001', 'This row contains no valid pattern')

    SELECT Notes.ID,

    Notes.DisplayTime,

    Notes.NEText,

    Case2.Other_ID

    FROM @Case2 AS Case2

    JOIN @Notes AS Notes

    on Case2.ID = Notes.ID

    WHERE DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59'

    AND EXISTS (

    SELECT 1

    FROM (

    SELECT 'ethanol'

    UNION ALL

    SELECT 'etoh'

    UNION ALL

    SELECT 'Alcohol'

    ) AS src (pattern)

    WHERE NeText LIKE '%' + pattern + '%'

    )

    and DisplayTime in (

    Select MAX(DisplayTime)as time

    FROM @Case2 AS InnerCase2

    Join @Notes AS InnerNotes

    on InnerCase2.ID = InnerNotes.ID

    Where DisplayTime between '2004-12-21 00:00:00' and '2011-12-21 23:59:59'

    and NeText = Notes.NeText -- Outer reference

    )

    -- Gianluca Sartori

  • If I understood what you're trying to do, this is a more efficient approach.

    SELECT n.[ID], n.DisplayTime, n.NEText, c.Other_ID

    FROM @Case2 AS c

    CROSS APPLY (

    SELECT TOP (1) n.ID, n.DisplayTime, n.NEText

    FROM @Notes AS n

    INNER JOIN (

    SELECT 'Ethanol'

    UNION ALL

    SELECT 'Etoh'

    UNION ALL

    SELECT 'Alcohol'

    ) AS p(Pattern)

    ON NEText LIKE '%' + pattern + '%'

    WHERE c.[ID] = n.[ID]

    AND n.DisplayTime BETWEEN '2004-12-21 00:00:00' AND '2011-12-21 23:59:59'

    ORDER BY n.DisplayTime DESC

    ) AS n

    Here are the stats:

    CROSS APPLY

    Table '#2145C81B'. Scan count 0, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#1F5D7FA9'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    MAX

    Table '#2145C81B'. Scan count 0, logical reads 60, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#1F5D7FA9'. Scan count 5, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Drew

    Edited: Added ORDER BY to produce the MAX for the criteria.

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • This is the kind of situation where full text indexing and a custom thesaurus file will come in very, very handy.

    I've seen this kind of query many, many times, and on large tables, full text indexing will usually return results in milliseconds where Like can take minutes or even hours on the same data.

    One I recently implemented needs to compare input values to street addresses. It uses a full text index and thesaurus to compare things like "123 First Street" to "123 1st St", in a 5-million row table. LIKE and a table of synonyms ("1st" for "First") would usually return results in about a minute (nowhere near good enough for a public-facing web page), where the full-text version returns matches in under a millisecond.

    Definitely look into it for this.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks for the replies and a few things I have not seen before that I will need to read up on. i will see if I can get the code to work in my DB and do a time test.

    Thanks

    E

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply