NOT LIKE Alternatives in WHERE clause

  • Hello,

    I need help to improve an existing (hence I cannot change table's structure nor the content's logic) WHERE clause, please.

    The table consists of returned products with their respective reason(s). In case there is more than one reason, they are concatenated. The request as shown in the mock-up below, has to retrieve a list of returned products without specific reasons. In my example I retrieve returned products without reasons 2 or 3 by using NOT LIKE in the WHERE clause. The result itself is correct. Yet it is inefficient and I would like your suggestion if it is possible to improve, please?

     

    DROP TABLE IF EXISTS returned_products;
    GO
    CREATE TABLE returned_products(
    returned_products_id INT IDENTITY(1,1),
    returned_products_reason VARCHAR(1000))


    ALTER TABLE returned_products
    ADD CONSTRAINT returned_products_PK PRIMARY KEY (returned_products_id);





    DECLARE @c_reason_1 VARCHAR(100) = 'reason_1',
    @c_reason_2 VARCHAR(100) = 'reason_2',
    @c_reason_3 VARCHAR(100) = 'reason_3';





    INSERT INTO returned_products(returned_products_reason)
    VALUES(@c_reason_1),
      (@c_reason_2),
      (@c_reason_1+@c_reason_3),
      (@c_reason_1+@c_reason_3+@c_reason_2)





    SELECT rp.* 
    FROM returned_products rp
    WHERE rp.returned_products_reason NOT LIKE '%'+@c_reason_3+'%'
    AND   rp.returned_products_reason NOT LIKE '%'+@c_reason_2+'%'

     

     

  • First off string comparisons are never going to be highly efficient just by the nature of that beast which is why numerics are used most of the time. For instance, in this table structure if there was a table that held the ProductReturnReason descriptions with a key value and then that key value is what is stored in the other table your WHERE clause could be made much simpler and more efficient.  Aka no sloppy concatentation required and no string searches needed.

    So I would suggest building staging tables that stages that data in a more friendly manner. You are not changing the existing structure but adding a new table that reformats that ugly table into what it should have looked like. You can even do this on the fly, using a temp table but then you are going to lose time building the temp table each time where building it ahead of time on a schedule or via a Trigger would be the best.

    TABLE tbReturnedProductReasons
    (
         ReturnedProdReasonId INT IDENTITY(1,1)
        ,ProductId INT -- (FK)
        ,ReasonCode INT
        ,DateReturned    DATETIME2(0)
    )
    -- Create a Non-Unique Clustered Index(es) based on how you plan to query the data

    TABLE returned_products
    (
    ReasonCode INT IDENTITY(1,1),
    ReasonDesc VARCHAR(100)
    )
    -- This table holds the individual Reasons 'c_reason_1', 'c_reason_2', 'c_reason_3'

    -- New WHERE clause
    -- WHERE ReasonCode NOT IN (2, 3)

    Again this assumes you can create these new staging tables, otherwise, I am not aware of any way to more efficiently query a poorly designed string column. The only other thing that might be helpful is to make sure there are quality Indexes on the table you are actually querying. Sorry I could not be of much more help, perhaps someone else will have a better solution.

    • This reply was modified 1 year, 7 months ago by  Dennis Jensen. Reason: fix typos
  • Thank you for the reply, yet I think it will not work if the reason column, as I mentioned in my original request, will have several concatenated reasons.

  • Yes the solution covers the concatenated reasons -- so would you be able to create the staging tables and if yes then I can guide you through populating them so you will not need to use the current source table and still get the appropriate results.

  • Maybe something using STRING_SPLIT?

    -- e.g.,

    SELECT #returned_products.#returned_products_id,CONCAT('reason_',value) AS returned_products_reason
    FROM #returned_products
    --Assumes reason always in the form "reason_#. Using REPLACE to treat underscore as the delimiter.
    CROSS APPLY STRING_SPLIT(REPLACE(#returned_products.returned_products_reason,'reason',''),'_') reasons
    WHERE reasons.value <> ''
    AND NOT EXISTS (SELECT* FROM STRING_SPLIT(REPLACE(#returned_products.returned_products_reason,'reason',''),'_') reasons
    WHERE reasons.value IN ('2','3'))

    Need a lot more data to compare performance fairly.

  • BOR15K wrote:

    Thank you for the reply, yet I think it will not work if the reason column, as I mentioned in my original request, will have several concatenated reasons.

    Do those reasons have a unique character that separates them?  Your original post doesn't show one but I had to ask.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Very valid point. No, there is no separator, but after having an internal chat, it may be possible to add a pipe (|) as a separator.

  • Jeff Moden wrote:

    Do those reasons have a unique character that separates them?  Your original post doesn't show one but I had to ask.

    Ghads Jeff I know this is a quality question to ask but if I had to deal with this database as seems to have been designed -- I would be very scared of the possible answer to this.

    The shiver of fear goes down my spin at the answer

    So BOR15K can you create extra tables within this database -- tables that would have no affect on what they may be doing with them? Seriously pulling those concatenated strings out and storing them in a staging table for cross referencing purposes would do you a whole world of good. You would still need that | separator but your subsequent queries would fly and they might see what you are doing and make quality changes to their stuff.

  • Dennis Jensen wrote:

    Jeff Moden wrote:

    Do those reasons have a unique character that separates them?  Your original post doesn't show one but I had to ask.

    Ghads Jeff I know this is a quality question to ask but if I had to deal with this database as seems to have been designed -- I would be very scared of the possible answer to this.

    The shiver of fear goes down my spin at the answer

    So BOR15K can you create extra tables within this database -- tables that would have no affect on what they may be doing with them? Seriously pulling those concatenated strings out and storing them in a staging table for cross referencing purposes would do you a whole world of good. You would still need that | separator but your subsequent queries would fly and they might see what you are doing and make quality changes to their stuff.

    Totally understood but BOR15K has already identified that the underlying database is making sucking sounds and they either can't or won't make changes... at least not now.  Sometimes, you play the cards your dealt and, sometimes, you not allowed to simply fold and walk away.

    And totally agreed on the idea of a "staging" table of sorts.  In reality, it would be a pre-split table to do easy lookups on.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • BOR15K wrote:

    Very valid point. No, there is no separator, but after having an internal chat, it may be possible to add a pipe (|) as a separator.

    That would help for what I have in mind.  Can you provide a couple of ACTUAL sample of these concatenated code as they would actually look in real life so that we don't have to play musical code to get around the data, please?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you, Jeff. The real data will have various reasons, but I need to exclude only three out of the rest.

    I can confirm now the pipe (|) can be added as a separator if you think it is required.

    DECLARE @c_reason_1 VARCHAR(100) = 'voided.expired.date',
    @c_reason_2 VARCHAR(100) = 'voided.damaged.ink.print',
    @c_reason_3 VARCHAR(100) = 'voided.illegal.in.country';

     

     

    • This reply was modified 1 year, 7 months ago by  BOR15K.
  • Jeff Moden wrote:

    Sometimes, you play the cards your dealt and, sometimes, you not allowed to simply fold and walk away.

    Oh I totally understand that concept, and have had to deal with it more than once myself. Still I was not under the impression that new tables could not be added just that old tables could not be changed.  This is thus quite a bit different than, nothing can be added to the database either, so yeah if they cannot add anything to the database guess its back to a more clunky solution.

    I have to say I am quite interested in seeing what you do come up with for this.

  • BOR15K wrote:

    Thank you, Jeff. The real data will have various reasons, but I need to exclude only three out of the rest. I can confirm now the pipe (|) can be added as a separator if you think it is required.

    DECLARE @c_reason_1 VARCHAR(100) = 'voided.expired.date',
    @c_reason_2 VARCHAR(100) = 'voided.damaged.ink.print',
    @c_reason_3 VARCHAR(100) = 'voided.illegal.in.country';

    Ok... so we have 3 outstanding items to be address.

    1. Can we add an extra table as Dennis Jensen alluded to?  Would it be ok to add two tables to make this type of thing scream with performance?
    2. Is the addition of a pipe delimiter going to happen or not?
    3. Can be add a trigger to the original  table? (Which is where I believe that Dennis Jensen may have been headed to, as well).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 1. Only #temp or ##temp tables.

    2. Yes

    3. Sadly no.

     

    Massive thanks.

  • BOR15K wrote:

    1. Only #temp or ##temp tables.

    2. Yes

    3. Sadly no.

    Massive thanks.

    Heh... they really don't want this to be fast, do they? 😉

    Instead of TempDB, how about a very small database in the SIMPLE Recovery Model that requires no backups nor any index maintenance to contain the two small tables?

    If the answer is also "No" there, then I'm afraid that the answer will be "Sorry... until {they} allow for a bit a change, {they'll} need to live with the slowness that {they've} built into this table".  And, to be sure... that's not directed at you.  It's directed at "them/they", whoever they are that are providing such restrictions.

    So far, this is the equivalent of someone pouring sugar into a car's gas tank and saying "make the car run fast" when almost everyone else knows that they shouldn't even start the car because of the damage that will be caused. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply