SQL query question

  • I'm trying to create a "smart agent" feature in my website.

    A user can search for ads by some criteria, and then save his search as a "smart agent".

    When a new ad is added that matches his criteria he will get an email.

     

    For example, I have 3 tables:

    'Ads' table:

    columns: recordID(int), price(int), size(int)

    'SmartAgents' table:

    columns: userID(int), priceFrom(int), priceTo(int), sizeFrom(int), sizeTo(int)

    'SmartAgents_toSend' table:

    columns: adID(int) => this contains the recordID data from the Ads table when a new ad is added.


    This is what I have to far:

    SELECT userID
    FROM smartAgents_toSend as a WITH (nolock)
    INNER JOIN ads as b WITH (nolock)
    ON a.adID = b.recordID


     

    Now how do I match the ads with the criteria in the SmartAgents table and get a list of userIDs ?

    • All columns in SmartAgents table are optional so need to use ISNULL probably on each column comparison
    • Assume that I have around 50 criteria columns in my real tables.
    • How do I even begin creating indexes for that?

    • This topic was modified 4 years, 2 months ago by  Section One.
    • This topic was modified 4 years, 2 months ago by  Section One.
  • In regards to points 1 and 3, an Index will be useless to the RDBMS is you're using something like WHERE Column1 = ISNULL(@Param1, Column1). You need to use proper boolean logic in the format WHERE (Column1 = @Param1 OR @Param1 IS NULL).

    Obviously you won't be using parameters here, you'll be using a JOIN with such syntax and a lot of OR clauses in the ON. Honestly, the indexes aren't likely to even then be much help, especially if you have things like leading wildcards or something. Something more tangible than an idea would really help us understand here, as you mention there are 50 criteria column, but your example shows us only 1.

    Also why are you using the NOLOCK query hint? Do you understand the consequences of using this? I suspect that using that hint is going to only cause more harm than good in a query like this.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Something like this

    select sa.userID, a.RecordID
    from SmartAgents_toSend sas
    join Ads a on sas.adID=a.RecordID
    join SmartAgents sa on a.price between sa.priceFrom and sa.priceTo
    and a.size between sa.sizeFrom and sa.sizeTo;

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Thom A wrote:

    Also why are you using the NOLOCK query hint? Do you understand the consequences of using this? I suspect that using that hint is going to only cause more harm than good in a query like this.

    Well, on this specific query it might not be necessary indeed to use NOLOCK.

     

    Steve Collins wrote:

    Something like this

    select sa.userID, a.RecordID
    from SmartAgents_toSend sas
    join Ads a on sas.adID=a.RecordID
    join SmartAgents sa on a.price between sa.priceFrom and sa.priceTo
    and a.size between sa.sizeFrom and sa.sizeTo;

     

    Thanks

    I was thinking maybe there is a better way using WHERE EXIST or something similar I'm not aware of.

    But ON would work just as well I guess

  • It could be done with a correlated subquery too.  Stylewise I prefer >= and <= to BETWEEN AND.  Anything that's shorter, I like.  The more framework words you use the better self-documenting the code is, I guess.

    select sa.userID
    from SmartAgents sa
    where exists(select 1
    from SmartAgents_toSend sas
    join Ads a on sas.adID=a.RecordID
    where a.price between sa.priceFrom and sa.priceTo
    and a.size between sa.sizeFrom and sa.sizeTo);

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • I've made some progress with my query and this is how it is now:

     

    SET NOCOUNT ON

    select sa.userID
    from SmartAgents sa
    where exists(select 1
    from SmartAgents_toSend sas
    join Ads a on sas.adID=a.RecordID
    where a.price <= ISNULL(sa.price, a.price)
    and a.listingCategoryID = ISNULL(sa.apartmentType, a.listingCategoryID)
    and a.[floor] BETWEEN ISNULL(sa.floorMin, a.[floor]) AND ISNULL(sa.floorMax, a.[floor])
    and a.numberOfRooms between ISNULL(sa.numberOfRoomsMin, a.numberOfRooms) and ISNULL(sa.numberOfRoomsMax, a.numberOfRooms)
    and a.numberOfBathrooms = ISNULL(sa.numberOfBathrooms, a.numberOfBathrooms)
    and a.availableFrom >= ISNULL(sa.availableFrom, a.availableFrom)
    )

     

    But, I have a new problem now.

    This gives me a list of userIDs  that have "smart agents" that I need to send messages to. (I of course will get a DISTINCT list in the end)

    But how can I know which adIDs in 'SmartAgents_toSend' table, each userID on my list needs to be notified about?

     

    Optimally, I would like to get a record-set like this:

    userID (int)   |   adIDs (varchar)

    35                         201, 4930, 2120

    46                         102, 302

  • All right that's nice progress.  Thanks for following up.  Well, an adID is the same thing as a RecordID.  If you recall the earlier query had RecordID in the select list.  That's because instead of testing for the existence of a match it did the inner join between the tables.  With the correlated subquery the columns in the inner query aren't accessible to the outer query, or at least I don't think so.  Sample data would be nice :)/   If it were altered to the earlier joins then it seems like we could GROUP BY userID and aggregate with STRING_AGG(adID, ',') .  Something like that

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply