Has anyone written a Faceted Search like the ElasticUI interface ... all in SQL?

  • If you go to dubdubdub elasticui dotcom and look at the bottom-most screenshot, I need to do the equivalent, all in a t-SQL stored proc (addt'l support functions acceptable), and needs to be reasonable portable as we MAY have to drop back to mariadb if client won't pay for SS like we want.

    Basically in the UI, there'd be two 'ranged searches' like the 'Tweet Reach' example, followed by 6 boolean checks. 

    If you think of something like home listings, users could check as many of these 7 options as they like:
     Price
    1) __ 0-249K  (20K)
    2) __ 250-499K (30K)
    3) __ 500-749K (10K)
    4) __ >750 (5K)

    Bathrooms
    5) __ 0-2 (30K)
    6) __ 2-4 (30K)
    7) __ 5+ (5K)

    And as many of these boolean options as they like.  Note the 'unchecked' means 'ignore the field', not 'this boolean must be false'
    8) __ Has Central Air (30K)
    9) __ Has HOA (20K)
    10) __ Has Indoor Plumbing (65K)
    11) __ Has Backyard (40K)
    12) __ Has Fridge (60K)
    13) __ Has Dishwasher (45K)

    The 6 booleans would be booleans in the table as well, then there's a price field and bathroom count field. 

    Selected booleans would be (AND) with each other, chosen range values within each of the two ranges would be (OR) with one another, but (AND) when compared to the other range values. 

    So if 1, 2 and 7, 10, 11 were selected, we'd be looking for 0-$499K homes with 5+ bathrooms, indoor plumbing, and backyard.  Make sense?

    Imagine the 'user selected filter' data will be in a wide bit table like the below, and values will be 0 if unselected by user, 1 if selected.
    create table tblFilter
    (
    user_id int
    ,range_1 bit -- price 0-249 is true
    ,range_2 bit
    ,range_3 bit
    etc
    ,bool_8 bit -- has central air = 1 if selected, ignore central air field if unselected by user
    ,bool_9 bit
    )
    etc.

    NOW, here's where it gets REALLY HARD ... I need to know the count of how many will be subtracted OR added to the 'total selected', if the state of each (1-13) is toggled from what it currently is, checked or unchecked.

    So using the above example, assuming those  5 options are checked, how many listings will be added or lost from 'total selected' if each of those 5 options is UNchecked, and how many will be added or subtracted for each of the other 8 selectors, if they were to GET checked.

    Will be operating on about 10-100K records, and need it to return results within about 2-3 seconds.  Results would be 'total selected', and then counts (+/-) for the 13 selectors.

    Anyone ever attempted something like this with pure SQL?

    Any help appreciated, I can easily figure out 'which listings meet the criteria', but the counts are blowing my mind.

    TIA and thx for reading!

  • bvaljalo-1000038 - Wednesday, May 9, 2018 4:30 PM

    If you go to dubdubdub elasticui dotcom and look at the bottom-most screenshot, I need to do the equivalent, all in a t-SQL stored proc (addt'l support functions acceptable), and needs to be reasonable portable as we MAY have to drop back to mariadb if client won't pay for SS like we want.

    Basically in the UI, there'd be two 'ranged searches' like the 'Tweet Reach' example, followed by 6 boolean checks. 

    If you think of something like home listings, users could check as many of these 7 options as they like:
     Price
    1) __ 0-249K  (20K)
    2) __ 250-499K (30K)
    3) __ 500-749K (10K)
    4) __ >750 (5K)

    Bathrooms
    5) __ 0-2 (30K)
    6) __ 2-4 (30K)
    7) __ 5+ (5K)

    And as many of these boolean options as they like.  Note the 'unchecked' means 'ignore the field', not 'this boolean must be false'
    8) __ Has Central Air (30K)
    9) __ Has HOA (20K)
    10) __ Has Indoor Plumbing (65K)
    11) __ Has Backyard (40K)
    12) __ Has Fridge (60K)
    13) __ Has Dishwasher (45K)

    The 6 booleans would be booleans in the table as well, then there's a price field and bathroom count field. 

    Selected booleans would be (AND) with each other, chosen range values within each of the two ranges would be (OR) with one another, but (AND) when compared to the other range values. 

    So if 1, 2 and 7, 10, 11 were selected, we'd be looking for 0-$499K homes with 5+ bathrooms, indoor plumbing, and backyard.  Make sense?

    Imagine the 'user selected filter' data will be in a wide bit table like the below, and values will be 0 if unselected by user, 1 if selected.
    create table tblFilter
    (
    user_id int
    ,range_1 bit -- price 0-249 is true
    ,range_2 bit
    ,range_3 bit
    etc
    ,bool_8 bit -- has central air = 1 if selected, ignore central air field if unselected by user
    ,bool_9 bit
    )
    etc.

    NOW, here's where it gets REALLY HARD ... I need to know the count of how many will be subtracted OR added to the 'total selected', if the state of each (1-13) is toggled from what it currently is, checked or unchecked.

    So using the above example, assuming those  5 options are checked, how many listings will be added or lost from 'total selected' if each of those 5 options is UNchecked, and how many will be added or subtracted for each of the other 8 selectors, if they were to GET checked.

    Will be operating on about 10-100K records, and need it to return results within about 2-3 seconds.  Results would be 'total selected', and then counts (+/-) for the 13 selectors.

    Anyone ever attempted something like this with pure SQL?

    Any help appreciated, I can easily figure out 'which listings meet the criteria', but the counts are blowing my mind.

    TIA and thx for reading!

    This is what is known as a "Catch All Query" and the definitive article on the subject may be found at the following URL.  Done correctly, the 2-3 seconds will be more a 100 milliseconds or so.

    https://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

    There is a an update to the article to help performance and possibly eliminate the need for dynamic SQL.

    https://sqlinthewild.co.za/index.php/2018/03/13/revisiting-catch-all-queries/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply