Conditional Search

Question

Conditional Search

CaseyP 58642

SSC Enthusiast

Points: 131
More actions
September 13, 2017 at 11:06 am

#146470

Its been a while since I have worked with SQL. So much that I am struggling to ask this appropriately so I will try best to spell out what I am If I have two such tables like:
create table inv ( inv_id3 int, inv_desc varchar(200), column1 char(6), column2 char(200) );
create table item_stores ( item_id int, item_size_id varchar(200), item_size2_id varchar(200), column1 char(6), column2 char(6), column3 char(6) );
where their contents look like this:
insert into inv values (208798,'PANTS1'), (108829,'PANTS2'), (108828,'PANTS3');

insert into item_stores values (208798,'34','34'), (108829,'34','34'), (108829,'34','i34'), (108829,'34','i32'), (108829,'34','i30'), (108829,'34','i36'), (108829,'34','36'), (108828,'34','34');
What I am wanting to do is expose a list like "inv.inv_id3 , inv.inv_desc" of those records where situations like this exist
(108829,'34','34'), (108829,'34','i34'),
Where someone has entered a size incorrectly. The letter I at the beginning is the correct format. The trick is that it isn't always the correct format. In other cases within the same data set the lack of the letter I is correct depending on how the item was first entered into the system. So, I am left to spot where this double occurrence occurs is that makes sense and call out that item_id.
My questions is short of ssomeone being over generous and providing a sample. Could you in layman's terms provide some guidance on the best way to proceed. I have to accomplish this in the query without the use of any other tooling. A select statement of sorts,

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 247218 More actions · Answer 1

Well done on taking the time to post all of your set-up scripts ... this is a great start.
But I am having trouble understanding what is so special about the 'situations' which you wish to see. Can you please have another go at explaining the logic required to identify a 'situation'?

CaseyP 58642 SSC Enthusiast Points: 131 More actions · Answer 2

I need to identify all the IDs that that have a letter "I" in one of their size combinations. Then determine if for that size combination containing the letter "I" if there is a like size combination without the letter I. Then return a list of bad IDs so that I can do later work moving inventory out of the bad size combinations into the correct ones and then purge the DB of all the mistakes.

The thing I worry about is if there are a couple million combinations, how to grab all of the size combinations for a given ID and then act upon that subset rather than comparing every record against the sum of all records

Does this make sense?

Phil Parkin SSC Guru Points: 247218 More actions · Answer 3

CaseyP 58642 - Wednesday, September 13, 2017 11:32 AM
I need to identify all the IDs that that have a letter "I" in one of their size combinations. Then determine if for that size combination containing the letter "I" if there is a like size combination without the letter I. Then return a list of bad IDs so that I can do later work moving inventory out of the bad size combinations into the correct ones and then purge the DB of all the mistakes.
The thing I worry about is if there are a couple million combinations, how to grab all of the size combinations for a given ID and then act upon that subset rather than comparing every record against the sum of all records
Does this make sense?

Makes some sense 🙂
So can you tell me why you would be interested in this pair
(108829,'34','34'),
(108829,'34','i34')

but not in this pair
(108829,'34','i36'),
(108829,'34','36'),
?

CaseyP 58642 SSC Enthusiast Points: 131 More actions · Answer 4

You are correct. I am interested in any pairing that occurs. With all of the possible sizes and combinations I wanted to keep the post as brief as possible but as far as the example data is concerned, the output would be 108829, PANTS2 because there is at a minimum one pair. In this example there are 2 matches.

It would be cool to output results that showed the ID and the conflicting sizes so that cleanup would be easier and the scope of the cleanup better known but I am having this mental block on how just to identify which IDs are the problem to begin with

Phil Parkin SSC Guru Points: 247218 More actions · Answer 5

Try running this and see what you think. I'm not surprised that you had trouble coding it. There must be a better way than what I've come up with.
CREATE TABLE #item_stores ( item_id INT , item_size_id VARCHAR(200) , item_size2_id VARCHAR(200) );

INSERT #item_stores ( item_id , item_size_id , item_size2_id ) VALUES ( 208798, '34', '34' ) ,( 108829, '34', '34' ) ,( 108829, '34', 'i34' ) ,( 108829, '34', 'i32' ) ,( 108829, '34', 'i30' ) ,( 108829, '34', 'i36' ) ,( 108829, '34', '36' ) ,( 108828, '34', '34' );

SELECT * FROM #item_stores ist;

WITH PrevValues AS ( SELECT ist.item_id , ist.item_size_id , Prev_item_size_id = LAG(ist.item_size_id) OVER (PARTITION BY ist.item_id ORDER BY ist.item_id , Cleaned.Clean_item_size_id , Cleaned.Clean_item_size2_id ) , ist.item_size2_id , Prev_item_size2_id = LAG(ist.item_size2_id) OVER (PARTITION BY ist.item_id ORDER BY ist.item_id , Cleaned.Clean_item_size_id , Cleaned.Clean_item_size2_id ) , Cleaned.Clean_item_size_id , Cleaned.Clean_item_size2_id FROM #item_stores ist CROSS APPLY ( SELECT Clean_item_size_id = REPLACE(ist.item_size_id, 'i', '') , Clean_item_size2_id = REPLACE(ist.item_size2_id, 'i', '') ) Cleaned ) SELECT pv.item_id , pv.item_size_id , pv.item_size2_id FROM PrevValues pv WHERE pv.Prev_item_size_id IS NOT NULL AND ( ( pv.item_size_id = pv.Clean_item_size_id AND pv.item_size_id <> pv.Prev_item_size_id ) OR ( pv.item_size2_id = pv.Clean_item_size2_id AND pv.item_size2_id <> pv.Prev_item_size2_id ) );

drew.allen SSC Guru Points: 77014 More actions · Answer 6

Phil Parkin - Wednesday, September 13, 2017 12:58 PM
Try running this and see what you think. I'm not surprised that you had trouble coding it. There must be a better way than what I've come up with.
CREATE TABLE #item_stores ( item_id INT , item_size_id VARCHAR(200) , item_size2_id VARCHAR(200) ); INSERT #item_stores ( item_id , item_size_id , item_size2_id ) VALUES ( 208798, '34', '34' ) ,( 108829, '34', '34' ) ,( 108829, '34', 'i34' ) ,( 108829, '34', 'i32' ) ,( 108829, '34', 'i30' ) ,( 108829, '34', 'i36' ) ,( 108829, '34', '36' ) ,( 108828, '34', '34' ); SELECT * FROM #item_stores ist;
WITH PrevValues AS ( SELECT ist.item_id , ist.item_size_id , Prev_item_size_id = LAG(ist.item_size_id) OVER (PARTITION BY ist.item_id ORDER BY ist.item_id , Cleaned.Clean_item_size_id , Cleaned.Clean_item_size2_id ) , ist.item_size2_id , Prev_item_size2_id = LAG(ist.item_size2_id) OVER (PARTITION BY ist.item_id ORDER BY ist.item_id , Cleaned.Clean_item_size_id , Cleaned.Clean_item_size2_id ) , Cleaned.Clean_item_size_id , Cleaned.Clean_item_size2_id FROM #item_stores ist CROSS APPLY ( SELECT Clean_item_size_id = REPLACE(ist.item_size_id, 'i', '') , Clean_item_size2_id = REPLACE(ist.item_size2_id, 'i', '') ) Cleaned ) SELECT pv.item_id , pv.item_size_id , pv.item_size2_id FROM PrevValues pv WHERE pv.Prev_item_size_id IS NOT NULL AND ( ( pv.item_size_id = pv.Clean_item_size_id AND pv.item_size_id <> pv.Prev_item_size_id ) OR ( pv.item_size2_id = pv.Clean_item_size2_id AND pv.item_size2_id <> pv.Prev_item_size2_id ) );

I would just do a count by item number and find any that have more than 1 value.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

CaseyP 58642 SSC Enthusiast Points: 131 More actions · Answer 7

First, mind blowing. That someone would help out in such a way is phenomenal. Truly appreciate it. I have been testing this on:

http://sqlfiddle.com/#!6/d6f3d6/2

because the source system is heavily used and I am loathe to run ad hock queries against it until I am ready and sure of the results. I'm not getting the same results you are. That may just be the site and I may just have to wait until after hours to test what you have provided. I am super stoked though at the potential. I'm really going to have to work through what you have done. Thanks so much.

Phil Parkin SSC Guru Points: 247218 More actions · Answer 8

drew.allen - Wednesday, September 13, 2017 1:10 PM
I would just do a count by item number and find any that have more than 1 value.
Drew

Wouldn't that find too many rows? Eg, that would identify

(1,1,32)
and
(1,1,36)

which (as far as I can see) should not be considered invalid.

CaseyP 58642 SSC Enthusiast Points: 131 More actions · Answer 9

Oh snap, I caught my problem with sqlfiddle. It didn't like the # sign. It works awesome !!! Again, simply awesome.

Phil Parkin SSC Guru Points: 247218 More actions · Answer 10

CaseyP 58642 - Wednesday, September 13, 2017 1:16 PM
First, mind blowing. That someone would help out in such a way is phenomenal. Truly appreciate it. I have been testing this on:
http://sqlfiddle.com/#!6/d6f3d6/2
because the source system is heavily used and I am loathe to run ad hock queries against it until I am ready and sure of the results. I'm not getting the same results you are. That may just be the site and I may just have to wait until after hours to test what you have provided. I am super stoked though at the potential. I'm really going to have to work through what you have done. Thanks so much.

No problem, it was a fun diversion.
Here are the results I see, by the way:

Which appears to be what you are looking for, more or less.
But I've just noticed that we're in a 2005 forum, so that is an issue (I apologise). The LAG() function was introduced in 2012 and without it, the query will get even more complex.

CaseyP 58642 SSC Enthusiast Points: 131 More actions · Answer 11

Sadly I have to either export out the data anyway to SQL which was what I was thinking to do as the system is an older INFORMIX release or work it into INFORMIX's limited scope of SQL.The logic is what matters. The key thing was a starting point in logic which you have amply provided. I get the same results and you are absolutely on target and Ive an express instance where I think I can copy the data set to. But that made my head hurt so you idea of an interesting diversion and mine differ just slightly, LMAO.

Phil Parkin SSC Guru Points: 247218 More actions · Answer 12

CaseyP 58642 - Wednesday, September 13, 2017 1:32 PM
Sadly I have to either export out the data anyway to SQL which was what I was thinking to do as the system is an older INFORMIX release or work it into INFORMIX's limited scope of SQL.The logic is what matters. The key thing was a starting point in logic which you have amply provided. I get the same results and you are absolutely on target and Ive an express instance where I think I can copy the data set to. But that made my head hurt so you idea of an interesting diversion and mine differ just slightly, LMAO.

I'm glad you found it helpful and thanks for posting back.
Just in case you don't know, SQL Server 2016 Developer Edition is free, so you can easily download and install it somewhere to test this stuff out on a 'proper' SQL instance local to you.