AND/OR problems...

  • This will probably be a bit longwinded, so bear with me.

    [request]

    I have been tasked with building a search interface (in ASP) that will allow our employees to search from a list of documents stored in our SQL database

    [tables]

    for the query I need help with, there are 6 tables involved:

    Proposals (contains ProposalID, ProjectName, ProjectCode, ProposalTypeID, IndustryID, CustomerID)

    Customers (contains CustomerID, CustomerName)

    ProposalTypes (contains ProposalTypeID, ProposalTypeName)

    ProposalFiles (contains FileID, ProposalID, File, Filename)

    Keywords (contains KeywordID, Keyword)

    ProposalKeywords (contains ProposalID, KeywordID)

    Proposals is the "main" table, and each of the others are tied to it via an ID field.

    [problem]

    in an attempt to be more efficient, I have broken up some of the data (specifically Keywords), into a new table. previously, in the Proposals table, there was a column called Keywords. it housed information like 12, 23, 51 (representing the IDs of the Keywords associated with this proposal).

    i didnt like that approach, so i created a ProposalKeywords table that contains an individual entry for every Proposal and the ID(s) associated with it.

    now comes my problem. my users have to be able to search for Proposals based on Keywords (among other things), which wouldnt be a problem if it were a simple search (im currently using an IN to grab all keywords and proposals, etc).

    the problem arrises when i need to allow AND vs. OR searches. i cant seem to find an elegant way to basically say "Give me all records where Keyword = 12 AND Keyword = 23"

    this is the current query i have that will give me ALL the informtion i need, without any sort of WHERE filtering involved (it looks worse that it is :

    ------------------------------------------------------------

    SELECT Customers.CustomerName, Proposals.ProjectCode, ProposalTypes.ProposalTypeName, ProposalFiles.FileID, ProposalFiles.Filename, Proposals.ProposalID, Keywords.KeywordID

    FROM Keywords INNER JOIN ((ProposalTypes INNER JOIN ((Customers INNER JOIN Proposals ON Customers.CustomerID = Proposals.CustomerID) INNER JOIN ProposalFiles ON Proposals.ProposalID = ProposalFiles.ProposalID) ON ProposalTypes.ProposalTypeID = Proposals.ProposalTypeID) INNER JOIN ProposalKeywords ON Proposals.ProposalID = ProposalKeywords.ProposalID) ON Keywords.KeywordID = ProposalKeywords.KeywordID;

    ------------------------------------------------------------

    is there -any- way i can use a variation of this query, AND provide a WHERE component that grabs a specific set of Keywords?

    as i mentioned, i am using a version of the above with IN, but i need the query to be more flexible and allow for AND/OR variations on the KeywordID values.

    just some additional info.

    old Proposals table example (formatting will most likely be foobared :

    ProposalID | CustomerID | IndustryID | Keywords

    1 | 12 | 7 | 12, 23, 51

    2 | 18 | 2 | 5, 19

    new Proposals table example:

    ProposalID | CustomerID | IndustryID

    1 | 12 | 7

    2 | 18 | 2

    ProposalKeywords table

    ProposalID | KeywordID

    1 | 12

    1 | 23

    1 | 51

    2 | 5

    2 | 19

    and i am trying to write a query that will allow for "Keyword = 12 AND Keyword = 19" (the OR version is basically handled with my IN piece)

    oh, and in code, i have built a loop that will generate the "Keyword = XX AND Keyword = YY" piece... the problem is that i get no resultset.

    if any additional clarification is needed, please let me know... thanks!

  • It's better to have more information than less.

    Here's an example of what you need using the system tables...

    --find all the tables that have the columns indid, id :

    Select O.Name from SysObjects O where Exists (Select 1 from dbo.SysColumns C where C.id = O.id and C.Name in ('indid', 'id') GROUP BY id having count(*) = 2)

    --should return at least those 2 tables : sysindexes and sysindexkeys

    --this can be converted to in ('indid', 'id') = in (Select * from dbo.SplitString(@ItemsToFind)) and count(*) = Select count(*) from dbo.SplitString(@ItemsToFind)

  • Just to clarify the problem with your current code :

    If you have Tablea A inner join TableB B on A.id = 1 and A.id = 2 it will always fail because you guessed it : 1 2 . That's why you have to use the in clause again. By adding the group by id having count(*) = number of items. You force sql server to return only the Objects that match ALL keywords, no matter how many there are.

    Btw I used the split function to count the number of items, but it would be wise to pass the number of items as a parameter so you don't have to execute the split twice.

  • thanks Remi!

    just for my own clarification (and perhaps due to the lack of food in my belly), could you please equate your example to the tables/columns in my setup?

    this is my attempt at reproducing your example, with my information:

    Select O.ProposalID from Proposals O where Exists (Select 1 from ProposalKeywords C where C.ProposalID = O.ProposalID and C.KeywordID in (12, 23) GROUP BY ProposalID having count(*) = 2)

    is that even close?

  • Assuming you got the column names right then yes. The only thing now is to modify this code to handle a list of keywords passed in parameter. Do you know how to use a split function?

    P.S. you can search the script section of this site to find one.

    Post back if you need more help.

  • well, ive written some ASP code that can create a comma dilimited list of the KeywordIDs i need... basically, i just need to build the query you provided, insert my variable with the comma dilimted values, and then provide the number of parameters... correct?

  • If you use dynamic sql, then yes... but I was assuming that you were gonna use a stored proc as best practices suggest.

  • thanks Remi, you have been incredibly helpful... you are definitely a sql guru in my book..

  • The gurus are recognized as MVPs (like Frank Kalis)... I've still got a verrrrrry long way to go before I can learn half the stuff he masters. I'd put myself in the good average in the sql server users (don't know squat about dts, backup/restore which is kind of important... but then again I'm only a developper )

  • This is an interesting problem.  I think I have a better solution.  Your solution only finds proposals that have KeywordID of 12 and 23 exactly, and no other keywords are allowed.

    The easy part is for proposals that have Keywords of 12 or 23.

    SELECT * FROM Proposals a INNER JOIN ProposalKeywords b ON a.ProposalID = b.ProposalID

    WHERE b.KeywordID IN (12,23)

    The hard part is for proposals that have Keywords of at least 12 and 23, and other Keywords are allowed.

    SELECT * FROM Proposals

    WHERE ProposalID IN ((SELECT DISTINCT ProposalID FROM ProposalKeywords

            WHERE KeywordID = 12) a INNER JOIN

           (SELECT DISTINCT ProposalID FROM ProposalKeywords

            WHERE Keyword = 23) b ON a.ProposalID = b.ProposalID)

    The trick is to create two derived tables in the subquery.  The first derived table has all distinct ProposalID's from ProposalKeywords that have a KeywordID of 12, and the second derived table has all distinct ProposalID's from ProposalKeywords that have a KeywordID of 23.  When you join these two derived tables you have the distinct ProposalID's that have KeywordID's of both 12 and 23, but also other possible keywords. 

    What do you think?

    --Jeff

     

  • Nice thinking... but what happens if the list of keywords is dynamic and you still want to use static sql?

  • Remi:

    Thanks for the reply.  I just thought of a much simpler solution that involves the use of a trivial user defined function (UDF).

    Here is the UDF:

    CREATE FUNCTION dbo.fnCountKeywords(@KeywordID int)

    RETURNS int

    AS

    BEGIN

    DECLARE @Count int

    SET @Count = (SELECT COUNT(*) FROM ProposalKeywords WHERE KeywordID = @KeywordID)

    RETURN @Count

    END

    GO

    Here is the SELECT statement:

    SELECT * FROM Proposals

    WHERE dbo.fnCountKeywords(12) > 0 AND dbo.fnCountKeywords(23) > 0

    Conclusion

    Now you can easily create a dynamic SQL statement by construcing the WHERE clause dynamically by concatenating a call to dbo.fnCountKeywords with an AND for each keyword.

     

  • I said I didn't want dynamic sql in my solution.

    Besides if you were to have Millions of lines in either table (especially the Proposals), this would run slower than a snail walking backward, uphill on his hands. Also you don't make refference to the proposal in the function so it would always return a match.

    I think you should reread my proposed solution and try to understand how it works.

    P.S. Not trying to insult you in any way by these posts.

  • OOPS!!!!!!!!!!!!

    My mind went blank!  Ignore the above "simple solution".

    Unfortunately, the only way to solve this using my technique with derived tables is to use VB.NET or C# to construct a dynamic SQL statement.  For example, if there were 5 keywords, then there would be 5 derived tables joined to each other by 4 inner joins in the subquery.

    The logic to construct the correct SQL syntax for the subquery would be somewhat messy, but it would be straightforward. 

     

  • There's never only 1 way to do something in sql server. However there's often a best way. And here's my version of it :

    IF Object_id('Split') > 0

    DROP FUNCTION SPLIT

    GO

    CREATE FUNCTION [dbo].[Split] (@vcDelimitedString nVarChar(4000),

    @vcDelimiternVarChar(100) )

    /**************************************************************************

    DESCRIPTION: Accepts a delimited string and splits it at the specified

    delimiter points. Returns the individual items as a table data

    type with the ElementID field as the array index and the Element

    field as the data

    PARAMETERS:

    @vcDelimitedString- The string to be split

    @vcDelimiter- String containing the delimiter where

    delimited string should be split

    RETURNS:

    Table data type containing array of strings that were split with

    the delimiters removed from the source string

    USAGE:

    SELECT ElementID, Element FROM Split('11111,22222,3333', ',') ORDER BY ElementID

    AUTHOR:Karen Gayda

    DATE: 05/31/2001

    MODIFICATION HISTORY:

    WHODATEDESCRIPTION

    ----------------------------------------------------------------

    ***************************************************************************/

    RETURNS @tblArray TABLE

    (

    ElementIDsmallintIDENTITY(1,1), --Array index

    ElementnVarChar(1200)--Array element contents

    )

    AS

    BEGIN

    DECLARE

    @siIndexsmallint,

    @siStartsmallint,

    @siDelSizesmallint

    SET @siDelSize= LEN(@vcDelimiter)

    --loop through source string and add elements to destination table array

    WHILE LEN(@vcDelimitedString) > 0

    BEGIN

    SET @siIndex = CHARINDEX(@vcDelimiter, @vcDelimitedString)

    IF @siIndex = 0

    BEGIN

    INSERT INTO @tblArray (Element) VALUES(@vcDelimitedString)

    BREAK

    END

    ELSE

    BEGIN

    INSERT INTO @tblArray (Element) VALUES(SUBSTRING(@vcDelimitedString, 1,@siIndex - 1))

    SET @siStart = @siIndex + @siDelSize

    SET @vcDelimitedString = SUBSTRING(@vcDelimitedString, @siStart , LEN(@vcDelimitedString) - @siStart + 1)

    END

    END

    RETURN

    END

    GO

    IF NOT Object_id('SearchColumns') IS NULL

    DROP PROCEDURE SearchColumns

    GO

    CREATE PROCEDURE dbo.SearchColumns @Items as varchar(8000), @ItemsCount as smallint --avoid calling split twice and allows for any or all type searches

    AS

    SET NOCOUNT ON

    SELECT

    O.id

    , O.Name

    FROMdbo.SysObjects O

    WHERE Exists(

    SELECT 1

    FROMdbo.SysColumns C

    WHERE C.id = O.id and C.Name in (Select Element from dbo.Split(@Items, ','))

    GROUP BY id having count(*) >= @ItemsCount

    )

    ORDER BYName

    SET NOCOUNT OFF

    GO

    --find all items

    Exec dbo.SearchColumns 'name,id', 2-- 4 row(s) affected

    Exec dbo.SearchColumns 'name,id,indid', 3-- 1 row(s) affected

    Exec dbo.SearchColumns 'id', 1-- 12 row(s) affected

    --find at least 2

    Exec dbo.SearchColumns 'name,id,indid', 2-- 5 row(s) affected (instead of 1)

    --find at least 1

    Exec dbo.SearchColumns 'name', 1-- 10 row(s) affected

    Exec dbo.SearchColumns 'name,indid', 1-- 11 row(s) affected

    GO

    DROP PROCEDURE SearchColumns

    --DROP FUNCTION Split

    GO

    As you can see, you can search for ANY number of items at any time without touching the code, nor the number of joins and it's all in static sql.

    Even better, you can search for any amount of matches you wish, from 1 to all, STILL without touching the code.

    See my point now???????

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply