Conditional Search

  • Its been a while since I have worked with SQL. So much that I am struggling to ask this appropriately so I will try best to spell out what I am If I have two such tables like:

    create table inv (
    inv_id3 int,
    inv_desc varchar(200),
    column1 char(6),
    column2 char(200)
    );


    create table item_stores (
    item_id int,
    item_size_id varchar(200),
    item_size2_id varchar(200),
    column1 char(6),
    column2 char(6),
    column3 char(6)
    );

    where their contents look like this:


    insert into inv values
    (208798,'PANTS1'),
    (108829,'PANTS2'),
    (108828,'PANTS3');



    insert into item_stores values
    (208798,'34','34'),
    (108829,'34','34'),
    (108829,'34','i34'),
    (108829,'34','i32'),
    (108829,'34','i30'),
    (108829,'34','i36'),
    (108829,'34','36'),
    (108828,'34','34');

    What I am wanting to do is  expose a list like "inv.inv_id3 , inv.inv_desc" of those records where situations like this exist


    (108829,'34','34'),
    (108829,'34','i34'),

    Where someone has entered a size incorrectly. The letter I at the beginning is the correct format. The trick is that it isn't always the correct format. In other cases within the same data set the lack of the letter I is correct depending on how the item was first entered into the system. So,  I am left to spot where this double occurrence occurs is that makes sense and call out that item_id.

    My questions is short of ssomeone being over generous and providing a sample. Could you in layman's terms provide some guidance on the best way to proceed. I have to accomplish this in the query without the use of any other tooling. A select statement of sorts,

  • Well done on taking the time to post all of your set-up scripts ... this is a great start.
    But I am having trouble understanding what is so special about the 'situations' which you wish to see. Can you please have another go at explaining the logic required to identify a 'situation'?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I need to identify all the IDs that that have a letter "I" in one of their size combinations. Then determine if for that size combination containing the letter "I" if  there is a like size combination without the letter I. Then return a list of bad IDs so that I can do later work moving inventory out of the bad size combinations into the correct ones and then purge the DB of all the mistakes.

    The thing I worry about is if there are a couple million combinations, how to  grab all of the  size combinations for a given ID and then act upon that subset  rather than comparing every record against the sum of all records

    Does this make sense?

  • CaseyP 58642 - Wednesday, September 13, 2017 11:32 AM

    I need to identify all the IDs that that have a letter "I" in one of their size combinations. Then determine if for that size combination containing the letter "I" if  there is a like size combination without the letter I. Then return a list of bad IDs so that I can do later work moving inventory out of the bad size combinations into the correct ones and then purge the DB of all the mistakes.

    The thing I worry about is if there are a couple million combinations, how to  grab all of the  size combinations for a given ID and then act upon that subset  rather than comparing every record against the sum of all records

    Does this make sense?

    Makes some sense 🙂
    So can you tell me why you would be interested in this pair
    (108829,'34','34'),
    (108829,'34','i34')

    but not in this pair
    (108829,'34','i36'),
    (108829,'34','36'),
    ?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • You are correct. I am interested in any pairing that occurs. With all of the possible sizes and combinations I wanted to keep the post as brief as possible but as far as the example data is concerned, the output would be 108829, PANTS2 because there is at a minimum one pair. In this example there are 2 matches.

    It would be cool to output results that showed the ID and the conflicting sizes so that cleanup would be easier and the scope of the cleanup better known but I am having this mental block on how just to identify which IDs are the problem to begin with

  • Try running this and see what you think. I'm not surprised that you had trouble coding it. There must be a better way than what I've come up with.
    CREATE TABLE #item_stores
    (
      item_id   INT
    , item_size_id VARCHAR(200)
    , item_size2_id VARCHAR(200)
    );

    INSERT #item_stores
    (
      item_id
    , item_size_id
    , item_size2_id
    )
    VALUES
    (
      208798, '34', '34'
    )
    ,(
      108829, '34', '34'
    )
    ,(
      108829, '34', 'i34'
    )
    ,(
      108829, '34', 'i32'
    )
    ,(
      108829, '34', 'i30'
    )
    ,(
      108829, '34', 'i36'
    )
    ,(
      108829, '34', '36'
    )
    ,(
      108828, '34', '34'
    );

    SELECT *
    FROM #item_stores ist;

    WITH PrevValues
    AS
    (
      SELECT
       ist.item_id
      , ist.item_size_id
      , Prev_item_size_id = LAG(ist.item_size_id) OVER (PARTITION BY ist.item_id
                       ORDER BY
                         ist.item_id
                       , Cleaned.Clean_item_size_id
                       , Cleaned.Clean_item_size2_id
                     )
      , ist.item_size2_id
      , Prev_item_size2_id = LAG(ist.item_size2_id) OVER (PARTITION BY ist.item_id
                        ORDER BY
                         ist.item_id
                        , Cleaned.Clean_item_size_id
                        , Cleaned.Clean_item_size2_id
                      )
      , Cleaned.Clean_item_size_id
      , Cleaned.Clean_item_size2_id
      FROM
       #item_stores ist
      CROSS APPLY
      (
       SELECT
        Clean_item_size_id = REPLACE(ist.item_size_id, 'i', '')
       , Clean_item_size2_id = REPLACE(ist.item_size2_id, 'i', '')
      )      Cleaned
    )
    SELECT
      pv.item_id
    ,  pv.item_size_id
    ,  pv.item_size2_id
    FROM PrevValues pv
    WHERE
      pv.Prev_item_size_id IS NOT NULL
      AND (
         (
          pv.item_size_id   = pv.Clean_item_size_id
          AND pv.item_size_id  <> pv.Prev_item_size_id
       )
         OR (
           pv.item_size2_id  = pv.Clean_item_size2_id
           AND pv.item_size2_id <> pv.Prev_item_size2_id
         )
      );

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin - Wednesday, September 13, 2017 12:58 PM

    Try running this and see what you think. I'm not surprised that you had trouble coding it. There must be a better way than what I've come up with.
    CREATE TABLE #item_stores
    (
      item_id   INT
    , item_size_id VARCHAR(200)
    , item_size2_id VARCHAR(200)
    );

    INSERT #item_stores
    (
      item_id
    , item_size_id
    , item_size2_id
    )
    VALUES
    (
      208798, '34', '34'
    )
    ,(
      108829, '34', '34'
    )
    ,(
      108829, '34', 'i34'
    )
    ,(
      108829, '34', 'i32'
    )
    ,(
      108829, '34', 'i30'
    )
    ,(
      108829, '34', 'i36'
    )
    ,(
      108829, '34', '36'
    )
    ,(
      108828, '34', '34'
    );

    SELECT *
    FROM #item_stores ist;

    WITH PrevValues
    AS
    (
      SELECT
       ist.item_id
      , ist.item_size_id
      , Prev_item_size_id = LAG(ist.item_size_id) OVER (PARTITION BY ist.item_id
                       ORDER BY
                         ist.item_id
                       , Cleaned.Clean_item_size_id
                       , Cleaned.Clean_item_size2_id
                     )
      , ist.item_size2_id
      , Prev_item_size2_id = LAG(ist.item_size2_id) OVER (PARTITION BY ist.item_id
                        ORDER BY
                         ist.item_id
                        , Cleaned.Clean_item_size_id
                        , Cleaned.Clean_item_size2_id
                      )
      , Cleaned.Clean_item_size_id
      , Cleaned.Clean_item_size2_id
      FROM
       #item_stores ist
      CROSS APPLY
      (
       SELECT
        Clean_item_size_id = REPLACE(ist.item_size_id, 'i', '')
       , Clean_item_size2_id = REPLACE(ist.item_size2_id, 'i', '')
      )      Cleaned
    )
    SELECT
      pv.item_id
    ,  pv.item_size_id
    ,  pv.item_size2_id
    FROM PrevValues pv
    WHERE
      pv.Prev_item_size_id IS NOT NULL
      AND (
         (
          pv.item_size_id   = pv.Clean_item_size_id
          AND pv.item_size_id  <> pv.Prev_item_size_id
       )
         OR (
           pv.item_size2_id  = pv.Clean_item_size2_id
           AND pv.item_size2_id <> pv.Prev_item_size2_id
         )
      );

    I would just do a count by item number and find any that have more than 1 value.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • First, mind blowing. That someone would help out in such a way is phenomenal. Truly appreciate it. I have been testing this on:

    http://sqlfiddle.com/#!6/d6f3d6/2

    because the source system is heavily used and I  am loathe to run ad hock queries against it until I am ready and sure of the results. I'm not getting the same results you are. That may just be the site and I may just have to wait until after hours to test what you have provided. I am super stoked though at the potential. I'm really going to have to work through what you have done. Thanks so much.

  • drew.allen - Wednesday, September 13, 2017 1:10 PM

    I would just do a count by item number and find any that have more than 1 value.

    Drew

    Wouldn't that find too many rows? Eg, that would identify

    (1,1,32)
    and
    (1,1,36)

    which (as far as I can see) should not be considered invalid.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Oh snap, I caught my problem with sqlfiddle. It didn't like the # sign. It works awesome !!! Again, simply awesome.

  • CaseyP 58642 - Wednesday, September 13, 2017 1:16 PM

    First, mind blowing. That someone would help out in such a way is phenomenal. Truly appreciate it. I have been testing this on:

    http://sqlfiddle.com/#!6/d6f3d6/2

    because the source system is heavily used and I  am loathe to run ad hock queries against it until I am ready and sure of the results. I'm not getting the same results you are. That may just be the site and I may just have to wait until after hours to test what you have provided. I am super stoked though at the potential. I'm really going to have to work through what you have done. Thanks so much.

    No problem, it was a fun diversion.
    Here are the results I see, by the way:

    Which appears to be what you are looking for, more or less.
    But I've just noticed that we're in a 2005 forum, so that is an issue (I apologise). The LAG() function was introduced in 2012 and without it, the query will get even more complex.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Sadly I have to either export out the data anyway to SQL which was what I was thinking to do as the system is an older  INFORMIX release or work it into INFORMIX's limited scope of SQL.The logic is what matters. The key thing was a starting point in logic which you have amply provided. I get the same results and you are absolutely on target and Ive an express instance where I think I can copy the data set to. But that made my head hurt so you idea of an interesting diversion and mine differ just slightly, LMAO.

  • CaseyP 58642 - Wednesday, September 13, 2017 1:32 PM

    Sadly I have to either export out the data anyway to SQL which was what I was thinking to do as the system is an older  INFORMIX release or work it into INFORMIX's limited scope of SQL.The logic is what matters. The key thing was a starting point in logic which you have amply provided. I get the same results and you are absolutely on target and Ive an express instance where I think I can copy the data set to. But that made my head hurt so you idea of an interesting diversion and mine differ just slightly, LMAO.

    I'm glad you found it helpful and thanks for posting back.
    Just in case you don't know, SQL Server 2016 Developer Edition is free, so you can easily download and install it somewhere to test this stuff out on a 'proper' SQL instance local to you.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply