Merge Multiple Records into one w/o Cursor

  • To All:

    I need a performant approach to merging multiple records into one. This will need to merge about 3 mill records. I thought of a cursor but is there a better way?

    INSERT INTO #CustomerInfo

    ( CustRowID

    ,CustFuzzyGroup_Key_Out_ID

    ,CustLocID

    ,CustNo

    ,FirstName

    ,MiddleName

    ,LastName

    ,AddressLine1

    ,City

    ,State

    ,ZipCode

    ,CellPhone

    ,BusinessPhone

    ,EmailAddress

    ,CreateDate

    )

    VALUES

    ( 25, 3, 204, '127489', 'Tommy', 'A', 'James', '123 Any Street', 'New York', 'NY', '10011', '2121231234', '', '', '2011-08-09 16:23:45' ),

    ( 18, 3, 255, '36987', 'Tommy', '', 'James', '123 Any Street', 'New York', 'NY', '10011', '', '2124441212

    ', 'corpemail@somecorp.com', '2011-09-09 00:00:00' ),

    ( 1, 3,173, '68974', 'Tommy', '', 'James', '123 Any Street', 'New York', 'NY', '10011', '', '2124155000', 'email@somemail.com', '2011-07-09 23:23:23' ),

    ( 31, 4, 89, '789657', 'Paula', 'P', 'Lady', '45 Main Street', 'Milwaukee', 'WI', '53111', '4148889797', '4148256000', 'email@somemail.com', '2011-08-07 23:00:00' ),

    ( 22, 4,58, '9897865T', 'Paula', '', 'Lady', '45 Main Street', 'Milwaukee', 'WI', '53112', '4148011234', '', '', '2011-09-03 23:00:00' )

    GO

    SELECT * FROM [#CustomerInfo]

    GO

    --DROP TABLE [#CustomerInfo]

    -- Desired Results

    /*

    MasterCustIDFirstNameMiddleNameLastNameAddressLine1CityStateZipCodeCellPhoneBusinessPhoneEmailAddress CreateDate

    1TommyA James123 Any StreetNew YorkNY1001121211112122124441212corpemail@somecorp.com9/9/2011

    2PaulaPLady45 Main StreetMilwaukeeWI5311241480112344148256000email@somemail.com 9/3/2011

    */

    /*

    --CustInfoKeyLU

    MasterCustIDCustRowIDCustLocIDCustNoCustFuzzyGroup_Key_Out_ID

    1252041274893

    118255369873

    11173689743

    231897896574

    221589897865T4

    */

    Notice I want the MAX NonNULL values based on the CreateDate if there is a possibility of 2 or more values.

    Any ideas as to the best approach? I can use TSQL, SSIS or both.

    Thanks in advance.

  • First of all, please provide the table def for #CustomerInfo so your script can actually be used (we would have to guess the data type you're using).

    Also, please note that an empty string is different than NULL. In the given example, "'Tommy', '', 'James'" would lead to an empty middle name since '' is not equal null.

    Regarding the business logic itself: How do you know the two persons T. James and T. A. James are not two different people just sharing the same address? How do you guarantee to correctly identify rows that can be merged?



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Sorry About that. Bad pasting job

    CREATE TABLE [#CustomerInfo](

    [CustRowID] [int] NULL,

    [CustFuzzyGroup_Key_Out_ID] [int] NULL,

    [CustLocID] [int] NULL,

    [CustNo] [varchar](17) NULL,

    [FirstName] [varchar](25) NULL,

    [MiddleName] [varchar](25) NULL,

    [LastName] [varchar](40) NULL,

    [AddressLine1] [varchar](45) NULL,

    [City] [varchar](35) NULL,

    [State] [varchar](3) NULL,

    [ZipCode] [varchar](10) NULL,

    [CellPhone] [varchar](10) NULL,

    [BusinessPhone] [varchar](15) NULL,

    [EmailAddress] [varchar](40) NULL,

    [CreateDate] [datetime] NULL

    ) ON [PRIMARY]

    GO

  • Just leaves a few more questions to answer 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • First of all you are correct, the '' in the middle name should be nulls.

    In regards to identifying an individual I have to make some choices here. So if it is the same name with the same address I have to assume that it is the same person. I have spoken with the business and they are ok with this.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply