Identify missing rows

  • Hello,

    I am trying to figure out the following:

    I have two queries:

    1. Select * from tableA where colA <> ''

    2. Select * from TableB where colA <> ''

    For #1 there is a returned records of 2345, for #2 there is a returned records of 1462 - resulting in a difference of 883.

    So what I want to do is update TableB with those missing 883 records. Keep in mind the rows are identical there is just that column that the data is corrupt, all other columns are correct.

    What I tried to do was use a left join to view that incorrect data(883 rows), however, it returned far more rows as being different than the simple select statements:

    Select a.cola,b.cola from TableA

    left join tableB on a.pricol = b.pricol

    where b.cola = '' and a.cola <> ''

    (2336 rows)

    What am I doing wrong here?

    Thanks for your help.

  • mintsql (5/8/2012)


    Hello,

    I am trying to figure out the following:

    I have two queries:

    1. Select * from tableA where colA <> ''

    2. Select * from TableB where colA <> ''

    For #1 there is a returned records of 2345, for #2 there is a returned records of 1462 - resulting in a difference of 883.

    So what I want to do is update TableB with those missing 883 records. Keep in mind the rows are identical there is just that column that the data is corrupt, all other columns are correct.

    What I tried to do was use a left join to view that incorrect data(883 rows), however, it returned far more rows as being different than the simple select statements:

    Select a.cola,b.cola from TableA

    left join tableB on a.pricol = b.pricol

    where b.cola = '' and a.cola <> ''

    (2336 rows)

    What am I doing wrong here?

    Thanks for your help.

    You are looking at different values in tableb.

    #2 above gets rows where colA <> '' and in your above query you are getting tableB where colA = ''

    Pretty hard to know from your example but maybe just changing the second query to have the same criteria would be the answer?

    Select a.cola,b.cola from TableA

    left join tableB on a.pricol = b.pricol

    where b.cola <> '' and a.cola <> ''

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Maybe my intuition was wrong, but my thought process was:

    1. If I query two tables for their non blank columnZ, and tableA has 883 more non rows of that non blank columnZ than tableB, then I should be able to do a left join selecting only those blank columnZ from tableB that are not blank in tableA. (hence my criteria for = '' from table B. columnZ)

    does that make sense?

    I tried your suggestion before I posted but unfortunately it just returns 9 rows, not the 883 that would be blank in TableB but populated in tableA.

  • mintsql (5/8/2012)


    Maybe my intuition was wrong, but my thought process was:

    1. If I query two tables for their non blank columnZ, and tableA has 883 more non rows of that non blank columnZ than tableB, then I should be able to do a left join selecting only those blank columnZ from tableB that are not blank in tableA. (hence my criteria for = '' from table B. columnZ)

    does that make sense?

    I would have to say that no that doesn't make sense to me. You wanted the rows from TableB where ColA <> '' then you go back and say you want the others. Again I may be missing something given the simple example.

    Also, there is nothing in your top two queries that had anything to do with the joining criteria. From your example you have some rows in TableA where Cola <> '' and some in TableB where colA <> '' but nothing to indicate they have any relation to each other.

    Again your actual problem is so masked it is hard to give much advice. You might take a look at INTERSECT and EXCEPT to see if that might help.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • yes , i would go with except or the MERGE statment to idetify and insert the delta

    Jayanth Kurup[/url]

  • I'll try to be more clear:

    We have a table, lets call it Automobiles. We also have a backup table of Automobiles before it was modified, called AutoB. In both tables, there are 57,000 records.

    In the table Automobiles, there is a column named Engine. In Automobiles, there are 8000 non blank rows of the column Engine. In AutoB, there are 4,000 nonblank rows of the column Engine. Resulting in a 4,000 difference.

    I am attempting to display those 4,000 records in AutoB that are not in the current table Automobiles. Bearing in mind, that there are other columns in the both tables that are exactly the same, and only the 4,000 rows of column Engine got modified.

    I want to restore it so that there are only 4,000 non blank rows of Column Engine in Automobiles again.

  • Something along these lines?

    select b.*

    from AutosA a

    left join AutosB b on a.PriKey = b.PriKey AND a.Engine <> b.Engine

    where b.Engine <> ''

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • mintsql (5/8/2012)


    I'll try to be more clear:

    We have a table, lets call it Automobiles. We also have a backup table of Automobiles before it was modified, called AutoB. In both tables, there are 57,000 records.

    In the table Automobiles, there is a column named Engine. In Automobiles, there are 8000 non blank rows of the column Engine. In AutoB, there are 4,000 nonblank rows of the column Engine. Resulting in a 4,000 difference.

    I am attempting to display those 4,000 records in AutoB that are not in the current table Automobiles. Bearing in mind, that there are other columns in the both tables that are exactly the same, and only the 4,000 rows of column Engine got modified.

    I want to restore it so that there are only 4,000 non blank rows of Column Engine in Automobiles again.

    Based on the above, the following does not necessarily mean that you have 4,000 rows difference based solely on the engine column.

    select * from dbo.Automobiles where Engine <> ''; -- 8,000 non-blank rows where Engine <> ''

    select * from dbo.AutoB where Engine <> ''; -- 4,000 non-blank rows where Engine <> ''

    You could, in fact, be missing 4,000 rows from dbo.AutoB.

    So, the question before you is simple, what are you actually attempting to accomplish.

  • What originally happened was the current table was updated without using the proper join constraint, this wiped out all of the records in one of the columns.

    Right after that, the table was updated with new column records, resulting in the table only having the new column records.

    I was just trying to figure out an easy way to insert the old column records into the table that only contained the new column records. No rows were added or deleted from the table.

    However, I have figured it out using backups and without the join statements I was having trouble with:

    1. I first set those columns back to how they were before the update:

    begin tran

    update current

    set current.X = b.X

    from old b

    inner join current a

    on a.PRI = b.PRI

    commit tran

    2. I then reran the update using the proper constraint:

    begin tran

    update current

    set current.X = new.X

    from Newdata new

    inner join current a

    on a.PRI = NEW.PRI

    commit tran

    And the data seems to be correct now, I believe I was just making things too complicated for myself considering I had backups of the data at all points, and the data that was inserted was never deleted.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply