Need assistance avoiding rbar

  • I'm struggling with !rbar in my latest project. Maybe someone can point me in the right direction.

    My source table has person and address data combined.

    I'm trying to take that data and put it into a person table and address table. Because there could be parents and kids in the source table, the person table has a foreign key to the address table.

    I'm stuck in cursor logic - read the record, see if the address already exists in the address table, if it does, then grab that AddrID and use it when inserting into the person table. If the address doesn't exist, insert the address, grab the scope_identity, and then insert the person. On to the next record.

    I would think this could be done as a set, but I just can't wrap my head around it. Any pointers?

    Source table:

    CREATE TABLE [dbo].[Source] (

    ID int IDENTITY (1, 1) NOT NULL,

    FirstName VARCHAR(50) NULL,

    LastName VARCHAR(50) NULL,

    Phone CHAR(15) NULL,

    Address VARCHAR(100) NULL,

    Address2 VARCHAR(100) NULL,

    City VARCHAR(50) NULL,

    State CHAR(2) NULL,

    ZipCode VARCHAR(5) NULL,

    -- etc fields

    CONSTRAINT [PK_RegistrationHighSchoolMailingAddress] PRIMARY KEY CLUSTERED ([ID])

    ) ON [PRIMARY]

    GO

    CREATE TABLE Person(

    PersonID INT IDENTITY(1,1) NOT NULL,

    FirstName VARCHAR(50) NOT NULL,

    LastName varchar(50),

    AddrID INT NULL,

    CONSTRAINT [PK_Person_PersonID] PRIMARY KEY CLUSTERED (PersonID)

    ) ON [PRIMARY];

    go

    ALTER TABLE Person

    ADD CONSTRAINT [UNQ_Person_Names] UNIQUE (FirstName, LastName)

    CREATE TABLE Address(

    AddrID INT IDENTITY(1,1) NOT NULL,

    AddressLine1 VARCHAR(50),

    AddressLine2 VARCHAR(50),

    City VARCHAR(25),

    State CHAR(2) NULL,

    ZIP CHAR(5),

    CONSTRAINT [PK_Address_AddrID] PRIMARY KEY CLUSTERED (AddrID)

    ) ON [PRIMARY];

    ALTER TABLE Address

    ADD CONSTRAINT [UNQ_tbtAddress_ThreeCol] UNIQUE (AddressLine1, AddressLine2, Zip)

    ALTER TABLE [dbo].[Person] WITH CHECK ADD CONSTRAINT [FK_Address_Person] FOREIGN KEY([AddrID])

    REFERENCES [dbo].[Address] ([AddrID])

    GO

  • First, insert into the address table any addresses from the source table that aren't already in the address table. Now everyone in your source table will have a row in the address table so you can join to it and get the address ID.

    When you load the person table some of the rows from Source might already exist. You could use a similar approach to only insert new people from Source.

    INSERT INTO dbo.Address (

    AddressLine1,

    AddressLine2,

    City,

    State,

    ZIP

    )

    SELECT

    sa.Address

    ,sa.Address2

    ,sa.City

    ,sa.State

    ,sa.ZipCode

    FROM

    (

    SELECT

    Address

    ,Address2

    ,City

    ,State

    ,ZipCode

    FROM

    dbo.Source

    GROUP BY

    Address

    ,Address2

    ,City

    ,State

    ,ZipCode

    ) sa -- source addresses

    LEFT OUTER JOIN

    dbo.Address a

    ON sa.Address = a.AddressLine1

    AND sa.Address2 = a.AddressLine2

    AND sa.ZipCode = a.ZIP

    WHERE

    a.AddrID IS NULL -- source row does not exist in address table

  • Some source test data, for anyone else who might want to play.

    INSERT INTO dbo.Source (

    FirstName,

    LastName,

    Phone,

    Address,

    Address2,

    City,

    State,

    ZipCode

    )

    SELECT 'Joe','Smith',NULL,'123 Spruce Street','Apt #1','Madison','WI','12345' UNION ALL

    SELECT 'Sally','Smith',NULL,'123 Spruce Street','Apt #1','Madison','WI','12345' UNION ALL

    SELECT 'Bill','Jones',NULL,'789 5th Ave','Suite #25','Honolulu','HI','54321'

  • Heh... you've just gotta know that any post with "RBAR" in the title is going to catch my eye. Thought I'd stop by and see what's up. Looks like you have it handled just fine, Eric. Nice job.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff.

    Glad I didn’t get hit by any flying pork chops. 😀

  • Heh... Oh my... everybody knows about the pork chops!

    Nope... no pork chops for you today. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hey, thanks for the help!

    I still get a unq constraint violation error, but I'm a heck of a lot farther than I was before.

  • The unique constraint violation might be caused by nulls in your join columns. There are probably rows with a null Address2 line that might already exist in addresses and when you join on a null column the rows from the two sets don't match so the "insert what doesn't exist" logic fails and you get a unique constraint violation .

    Unless you take nulls into account in the join you will be trying to insert rows that violate your unique constraint.

    CREATE TABLE #Source(

    AddressLine CHAR(50) NOT NULL

    ,AddressLine2 VARCHAR(50) NULL -- see the effect nulls have with the final insert into #Address

    )

    CREATE TABLE #Address(

    AddressLine CHAR(50) NOT NULL

    ,AddressLine2 VARCHAR(50) NULL

    )

    -- now some test data

    INSERT INTO #Source (

    AddressLine,

    AddressLine2

    )

    SELECT '123 Spruce Street',NULL UNION ALL -- no address2 value

    SELECT '789 5th Ave','Suite #25'

    -- try this with both versions of AddressLine2 join

    INSERT INTO #Address (

    AddressLine,

    AddressLine2

    )

    SELECT t1.AddressLine, t1.AddressLine2

    FROM #Source t1

    LEFT OUTER JOIN #Address t2

    ON t1.AddressLine = t2.AddressLine

    -- 2 versions of AddressLine2 join

    AND ISNULL(t1.AddressLine2,'') = ISNULL(t2.AddressLine2,'') -- handles nulls

    --AND t1.AddressLine2 = t2.AddressLine2 -- does not handle nulls

    WHERE

    t2.AddressLine IS NULL -- does not already exist in #Address

    -- clean up

    DROP TABLE #Source

    DROP TABLE #Address

  • That makes sense. Darn nulls! Thanks. I couldn't have done this without your help.

    I'm going to have to practice lots more with the !rbar.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply