May 30, 2023 at 2:13 pm
Hello,
I need help to improve an existing (hence I cannot change table's structure nor the content's logic) WHERE clause, please.
The table consists of returned products with their respective reason(s). In case there is more than one reason, they are concatenated. The request as shown in the mock-up below, has to retrieve a list of returned products without specific reasons. In my example I retrieve returned products without reasons 2 or 3 by using NOT LIKE in the WHERE clause. The result itself is correct. Yet it is inefficient and I would like your suggestion if it is possible to improve, please?
DROP TABLE IF EXISTS returned_products;
GO
CREATE TABLE returned_products(
returned_products_id INT IDENTITY(1,1),
returned_products_reason VARCHAR(1000))
ALTER TABLE returned_products
ADD CONSTRAINT returned_products_PK PRIMARY KEY (returned_products_id);
DECLARE @c_reason_1 VARCHAR(100) = 'reason_1',
@c_reason_2 VARCHAR(100) = 'reason_2',
@c_reason_3 VARCHAR(100) = 'reason_3';
INSERT INTO returned_products(returned_products_reason)
VALUES(@c_reason_1),
(@c_reason_2),
(@c_reason_1+@c_reason_3),
(@c_reason_1+@c_reason_3+@c_reason_2)
SELECT rp.*
FROM returned_products rp
WHERE rp.returned_products_reason NOT LIKE '%'+@c_reason_3+'%'
AND rp.returned_products_reason NOT LIKE '%'+@c_reason_2+'%'
May 30, 2023 at 3:10 pm
First off string comparisons are never going to be highly efficient just by the nature of that beast which is why numerics are used most of the time. For instance, in this table structure if there was a table that held the ProductReturnReason descriptions with a key value and then that key value is what is stored in the other table your WHERE clause could be made much simpler and more efficient. Aka no sloppy concatentation required and no string searches needed.
So I would suggest building staging tables that stages that data in a more friendly manner. You are not changing the existing structure but adding a new table that reformats that ugly table into what it should have looked like. You can even do this on the fly, using a temp table but then you are going to lose time building the temp table each time where building it ahead of time on a schedule or via a Trigger would be the best.
TABLE tbReturnedProductReasons
(
ReturnedProdReasonId INT IDENTITY(1,1)
,ProductId INT -- (FK)
,ReasonCode INT
,DateReturned DATETIME2(0)
)
-- Create a Non-Unique Clustered Index(es) based on how you plan to query the data
TABLE returned_products
(
ReasonCode INT IDENTITY(1,1),
ReasonDesc VARCHAR(100)
)
-- This table holds the individual Reasons 'c_reason_1', 'c_reason_2', 'c_reason_3'
-- New WHERE clause
-- WHERE ReasonCode NOT IN (2, 3)
Again this assumes you can create these new staging tables, otherwise, I am not aware of any way to more efficiently query a poorly designed string column. The only other thing that might be helpful is to make sure there are quality Indexes on the table you are actually querying. Sorry I could not be of much more help, perhaps someone else will have a better solution.
May 30, 2023 at 3:15 pm
Thank you for the reply, yet I think it will not work if the reason column, as I mentioned in my original request, will have several concatenated reasons.
May 30, 2023 at 3:24 pm
Yes the solution covers the concatenated reasons -- so would you be able to create the staging tables and if yes then I can guide you through populating them so you will not need to use the current source table and still get the appropriate results.
May 30, 2023 at 3:37 pm
Maybe something using STRING_SPLIT?
-- e.g.,
SELECT #returned_products.#returned_products_id,CONCAT('reason_',value) AS returned_products_reason
FROM #returned_products
--Assumes reason always in the form "reason_#. Using REPLACE to treat underscore as the delimiter.
CROSS APPLY STRING_SPLIT(REPLACE(#returned_products.returned_products_reason,'reason',''),'_') reasons
WHERE reasons.value <> ''
AND NOT EXISTS (SELECT* FROM STRING_SPLIT(REPLACE(#returned_products.returned_products_reason,'reason',''),'_') reasons
WHERE reasons.value IN ('2','3'))
Need a lot more data to compare performance fairly.
May 30, 2023 at 6:33 pm
Thank you for the reply, yet I think it will not work if the reason column, as I mentioned in my original request, will have several concatenated reasons.
Do those reasons have a unique character that separates them? Your original post doesn't show one but I had to ask.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2023 at 8:09 pm
Very valid point. No, there is no separator, but after having an internal chat, it may be possible to add a pipe (|) as a separator.
May 30, 2023 at 8:34 pm
Do those reasons have a unique character that separates them? Your original post doesn't show one but I had to ask.
Ghads Jeff I know this is a quality question to ask but if I had to deal with this database as seems to have been designed -- I would be very scared of the possible answer to this.
The shiver of fear goes down my spin at the answer
So BOR15K can you create extra tables within this database -- tables that would have no affect on what they may be doing with them? Seriously pulling those concatenated strings out and storing them in a staging table for cross referencing purposes would do you a whole world of good. You would still need that | separator but your subsequent queries would fly and they might see what you are doing and make quality changes to their stuff.
May 30, 2023 at 9:05 pm
Jeff Moden wrote:Do those reasons have a unique character that separates them? Your original post doesn't show one but I had to ask.
Ghads Jeff I know this is a quality question to ask but if I had to deal with this database as seems to have been designed -- I would be very scared of the possible answer to this.
The shiver of fear goes down my spin at the answer
So BOR15K can you create extra tables within this database -- tables that would have no affect on what they may be doing with them? Seriously pulling those concatenated strings out and storing them in a staging table for cross referencing purposes would do you a whole world of good. You would still need that | separator but your subsequent queries would fly and they might see what you are doing and make quality changes to their stuff.
Totally understood but BOR15K has already identified that the underlying database is making sucking sounds and they either can't or won't make changes... at least not now. Sometimes, you play the cards your dealt and, sometimes, you not allowed to simply fold and walk away.
And totally agreed on the idea of a "staging" table of sorts. In reality, it would be a pre-split table to do easy lookups on.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2023 at 9:07 pm
Very valid point. No, there is no separator, but after having an internal chat, it may be possible to add a pipe (|) as a separator.
That would help for what I have in mind. Can you provide a couple of ACTUAL sample of these concatenated code as they would actually look in real life so that we don't have to play musical code to get around the data, please?
--Jeff Moden
Change is inevitable... Change for the better is not.
May 30, 2023 at 9:27 pm
Thank you, Jeff. The real data will have various reasons, but I need to exclude only three out of the rest.
I can confirm now the pipe (|) can be added as a separator if you think it is required.
DECLARE @c_reason_1 VARCHAR(100) = 'voided.expired.date',
@c_reason_2 VARCHAR(100) = 'voided.damaged.ink.print',
@c_reason_3 VARCHAR(100) = 'voided.illegal.in.country';
May 30, 2023 at 9:36 pm
Sometimes, you play the cards your dealt and, sometimes, you not allowed to simply fold and walk away.
Oh I totally understand that concept, and have had to deal with it more than once myself. Still I was not under the impression that new tables could not be added just that old tables could not be changed. This is thus quite a bit different than, nothing can be added to the database either, so yeah if they cannot add anything to the database guess its back to a more clunky solution.
I have to say I am quite interested in seeing what you do come up with for this.
May 30, 2023 at 11:41 pm
Thank you, Jeff. The real data will have various reasons, but I need to exclude only three out of the rest. I can confirm now the pipe (|) can be added as a separator if you think it is required.
DECLARE @c_reason_1 VARCHAR(100) = 'voided.expired.date',
@c_reason_2 VARCHAR(100) = 'voided.damaged.ink.print',
@c_reason_3 VARCHAR(100) = 'voided.illegal.in.country';
Ok... so we have 3 outstanding items to be address.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 31, 2023 at 12:33 am
1. Only #temp or ##temp tables.
2. Yes
3. Sadly no.
Massive thanks.
May 31, 2023 at 2:00 am
1. Only #temp or ##temp tables.
2. Yes
3. Sadly no.
Massive thanks.
Heh... they really don't want this to be fast, do they? 😉
Instead of TempDB, how about a very small database in the SIMPLE Recovery Model that requires no backups nor any index maintenance to contain the two small tables?
If the answer is also "No" there, then I'm afraid that the answer will be "Sorry... until {they} allow for a bit a change, {they'll} need to live with the slowness that {they've} built into this table". And, to be sure... that's not directed at you. It's directed at "them/they", whoever they are that are providing such restrictions.
So far, this is the equivalent of someone pouring sugar into a car's gas tank and saying "make the car run fast" when almost everyone else knows that they shouldn't even start the car because of the damage that will be caused. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply