How to ‘shuffle’ column data along to remove empty strings.

  • Hi,

    I need to cleanse some address data that has empty strings in the middle of the some columns. I have 7 address lines in columns Add1 to Add7 some of these have no data and when this occurs in the middle of columns with data I want to shuffle the data along to remove the blanks.

    Example:

    Before

    Add1: My House

    Add2: A Road

    Add3:

    Add4: Some Place

    Add5: Some Town

    Add6:

    Add7: Post Code

    After:

    Add1: My House

    Add2: A Road

    Add3: Some Place

    Add4: Some Town

    Add5: Post Code

    Add6:

    Add7:

    Any suggestions or pointers will be appreciated.

    Thanks, Paul.

  • Hi Buddy,how about you going through this following article and helping us help you?? 🙂

    FORUM POSTING ETIQUETTES - JEFF MODEN[/url]

    When u do so, i am sure a lot of us will help u instantly...

    So please post

    1. Table Structures - CREATE TABLE scripts

    2. Constraints or Indexes in the table, - CREATE SCRIPTS

    3. SAMPLE DATA - INSERT INTO TABLE scripts

    4. Desired output - some visual representation of this.

    🙂

  • This might help. I'd recommend rejigging it to return all address elements.

    ALTER FUNCTION [dbo].[udf_AddressShuffle]

    (

    @AddressElementID INT = 1,

    @Address1 varchar(80),

    @Address2 varchar(80),

    @Address3 varchar(80),

    @Address4 varchar(80),

    @Address5 varchar(80),

    @Address6 varchar(80),

    @Address7 varchar(80),

    @Address8 varchar(80)

    )

    RETURNS VARCHAR(80)

    AS

    BEGIN

    DECLARE @AddressElement VARCHAR(60)

    ;WITH Stringer AS (

    SELECT ROW_NUMBER() OVER (ORDER BY Ind) AS RowNum, *

    FROM (

    SELECT 1 AS Ind, NULLIF(@Address1, '') AS [Address] UNION ALL

    SELECT 2, NULLIF(@Address2, '') UNION ALL

    SELECT 3, NULLIF(@Address3, '') UNION ALL

    SELECT 4, NULLIF(@Address4, '') UNION ALL

    SELECT 5, NULLIF(@Address5, '') UNION ALL

    SELECT 6, NULLIF(@Address6, '') UNION ALL

    SELECT 7, NULLIF(@Address7, '') UNION ALL

    SELECT 8, NULLIF(@Address8, '')

    ) d WHERE [Address] IS NOT NULL )

    SELECT @AddressElement = [Address]

    FROM Stringer WHERE RowNum = @AddressElementID

    RETURN ISNULL(@AddressElement, '')

    END

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • This is quite efficient too:

    DECLARE @Example

    TABLE (

    row_id INTEGER IDENTITY PRIMARY KEY,

    address1 VARCHAR(50) NULL,

    address2 VARCHAR(50) NULL,

    address3 VARCHAR(50) NULL,

    address4 VARCHAR(50) NULL,

    address5 VARCHAR(50) NULL,

    address6 VARCHAR(50) NULL,

    address7 VARCHAR(50) NULL

    );

    INSERT @Example

    (address1, address2, address3, address4, address5, address6, address7)

    VALUES ('House 1', 'A Road', '', 'Some Place', 'Some Town', '', 'Post Code 1');

    INSERT @Example

    (address1, address2, address3, address4, address5, address6, address7)

    VALUES ('House 2', '', 'Another Road', '', 'Some Other Town', 'Post Code 2', '');

    WITH Relational

    AS (

    -- Unpivot the data into a more relational form

    -- Only return non-empty address entries

    -- Renumber the addresses in non-empty order

    SELECT U.row_id,

    name = 'address' + CONVERT(CHAR(1),

    ROW_NUMBER() OVER (PARTITION BY U.row_id ORDER BY U.name)),

    U.value

    FROM @Example E

    UNPIVOT (

    value

    FOR name

    IN (E.address1, E.address2, E.address3, E.address4, E.address5, E.address6, E.address7)

    ) U

    WHERE U.value <> SPACE(0)

    )

    -- Pivot back

    SELECT P.row_id,

    address1 = ISNULL(P.address1, SPACE(0)),

    address2 = ISNULL(P.address2, SPACE(0)),

    address3 = ISNULL(P.address3, SPACE(0)),

    address4 = ISNULL(P.address4, SPACE(0)),

    address5 = ISNULL(P.address5, SPACE(0)),

    address6 = ISNULL(P.address6, SPACE(0)),

    address7 = ISNULL(P.address7, SPACE(0))

    FROM Relational R

    PIVOT (

    MAX(R.value)

    FOR R.name

    IN (address1, address2, address3, address4, address5, address6, address7)

    ) P

    ORDER BY

    P.row_id ASC;

  • This time with a bit more effort on my part! 🙂

    The code is below and it uses 9 address lines not 7 as per my original post. The code shows two tables one with data as it is now and another with how I would like the data to look after the update. Thanks for the replies so far I will take a closer look at them shortly when I get another couple of jobs out of the way.

    Many Thanks, Paul.

    USE TEMPDB

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'#AddressTempPreUpd') AND type in (N'U'))

    DROP TABLE #AddressTempPreUpd

    GO

    CREATE TABLE #AddressTempPreUpd(

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [AdLine1] [varchar](60) NULL,

    [AdLine2] [varchar](90) NULL,

    [AdLine3] [varchar](104) NULL,

    [AdLine4] [varchar](114) NULL,

    [AdLine5] [varchar](40) NULL,

    [AdLine6] [varchar](40) NULL,

    [AdLine7] [varchar](30) NULL,

    [AdLine8] [varchar](40) NULL,

    [AdLine9] [varchar](8) NULL,

    CONSTRAINT [PK_AddressTempPreUpd] PRIMARY KEY CLUSTERED

    (

    [ID] ASC

    )

    ) ON [PRIMARY]

    GO

    INSERT INTO #AddressTempPreUpd

    ([AdLine1]

    ,[AdLine2]

    ,[AdLine3]

    ,[AdLine4]

    ,[AdLine5]

    ,[AdLine6]

    ,[AdLine7]

    ,[AdLine8]

    ,[AdLine9])

    (

    select '', '', '', '1 A Street', '', 'A District', 'A Town', 'A County', 'AA1 1BB' UNION ALL

    select '', '', '', '2 A Street', '', 'A District', 'A Town', 'A County', 'AA1 1BB' UNION ALL

    select '', '', '', '3 A Street', '', 'A District', 'A Town', 'A County', 'AA1 1BB' UNION ALL

    select 'Business Name', '', 'Unit 1', 'A Street', '', 'A District', 'A Town', 'A County', 'AA2 2BB' UNION ALL

    select '', '', '', '1 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB' UNION ALL

    select '', '', '', '2 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB' UNION ALL

    select '', '', '', '3 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB' UNION ALL

    select '', '', 'A Farm', 'A Street', '', 'A District', 'A Town', 'A County', 'AA4 4BB'

    )

    GO

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'#AddressTempPostUpd') AND type in (N'U'))

    DROP TABLE #AddressTempPostUpd

    GO

    CREATE TABLE #AddressTempPostUpd(

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [AdLine1] [varchar](60) NULL,

    [AdLine2] [varchar](90) NULL,

    [AdLine3] [varchar](104) NULL,

    [AdLine4] [varchar](114) NULL,

    [AdLine5] [varchar](40) NULL,

    [AdLine6] [varchar](40) NULL,

    [AdLine7] [varchar](30) NULL,

    [AdLine8] [varchar](40) NULL,

    [AdLine9] [varchar](8) NULL,

    CONSTRAINT [PK_AddressTempPostUpd] PRIMARY KEY CLUSTERED

    (

    [ID] ASC

    )

    ) ON [PRIMARY]

    GO

    INSERT INTO #AddressTempPostUpd

    ([AdLine1]

    ,[AdLine2]

    ,[AdLine3]

    ,[AdLine4]

    ,[AdLine5]

    ,[AdLine6]

    ,[AdLine7]

    ,[AdLine8]

    ,[AdLine9])

    (

    select '1 A Street', 'A District', 'A Town', 'A County', 'AA1 1BB','','', '', '' UNION ALL

    select '2 A Street', 'A District', 'A Town', 'A County', 'AA1 1BB','','', '', '' UNION ALL

    select '3 A Street', 'A District', 'A Town', 'A County', 'AA1 1BB','','', '', '' UNION ALL

    select 'Business Name', 'Unit 1', 'A Street', 'A District', 'A Town', 'A County', 'AA2 2BB','','' UNION ALL

    select '1 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB','','', '' UNION ALL

    select '2 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB','','', '' UNION ALL

    select '3 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB','','', '' UNION ALL

    select 'A Farm', 'A Street', 'A District', 'A Town', 'A County', 'AA4 4BB','','', ''

    )

    GO

    SELECT [ID]

    ,[AdLine1]

    ,[AdLine2]

    ,[AdLine3]

    ,[AdLine4]

    ,[AdLine5]

    ,[AdLine6]

    ,[AdLine7]

    ,[AdLine8]

    ,[AdLine9]

    FROM #AddressTempPreUpd

    GO

    SELECT [ID]

    ,[AdLine1]

    ,[AdLine2]

    ,[AdLine3]

    ,[AdLine4]

    ,[AdLine5]

    ,[AdLine6]

    ,[AdLine7]

    ,[AdLine8]

    ,[AdLine9]

    FROM #AddressTempPostUpd

    GO

  • Thanks Paul and Chris for your replies. I have decided to use Paul's example with pivot and unpivot. Code using my example data below:

    USE TEMPDB

    IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'#AddressTempPreUpd') AND type in (N'U'))

    DROP TABLE #AddressTempPreUpd

    GO

    CREATE TABLE #AddressTempPreUpd(

    [ID] [int] IDENTITY(1,1) NOT NULL,

    [AdLine1] [varchar](100) NULL,

    [AdLine2] [varchar](100) NULL,

    [AdLine3] [varchar](100) NULL,

    [AdLine4] [varchar](100) NULL,

    [AdLine5] [varchar](100) NULL,

    [AdLine6] [varchar](100) NULL,

    [AdLine7] [varchar](100) NULL,

    [AdLine8] [varchar](100) NULL,

    [AdLine9] [varchar](100) NULL,

    CONSTRAINT [PK_AddressTempPreUpd] PRIMARY KEY CLUSTERED

    (

    [ID] ASC

    )

    ) ON [PRIMARY]

    GO

    INSERT INTO #AddressTempPreUpd

    ([AdLine1]

    ,[AdLine2]

    ,[AdLine3]

    ,[AdLine4]

    ,[AdLine5]

    ,[AdLine6]

    ,[AdLine7]

    ,[AdLine8]

    ,[AdLine9])

    (

    select '', '', '', '1 A Street', '', 'A District', 'A Town', 'A County', 'AA1 1BB' UNION ALL

    select '', '', '', '2 A Street', '', 'A District', 'A Town', 'A County', 'AA1 1BB' UNION ALL

    select '', '', '', '3 A Street', '', 'A District', 'A Town', 'A County', 'AA1 1BB' UNION ALL

    select 'Business Name', '', 'Unit 1', 'A Street', '', 'A District', 'A Town', 'A County', 'AA2 2BB' UNION ALL

    select '', '', '', '1 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB' UNION ALL

    select '', '', '', '2 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB' UNION ALL

    select '', '', '', '3 A Lane', 'A Locality', 'A District', 'A Town', 'A County', 'AA3 3BB' UNION ALL

    select '', '', 'A Farm', 'A Street', '', 'A District', 'A Town', 'A County', 'AA4 4BB'

    )

    GO

    WITH Relational

    AS (

    -- Unpivot the data into a more relational form

    -- Only return non-empty address entries

    -- Renumber the addresses in non-empty order

    SELECT u.id,

    name = 'address' + CONVERT(CHAR(1),

    ROW_NUMBER() OVER (PARTITION BY u.id ORDER BY u.name)),

    value

    FROM #AddressTempPreUpd E

    UNPIVOT (

    value

    FOR name

    IN

    (

    E.[AdLine1]

    ,E.[AdLine2]

    ,E.[AdLine3]

    ,E.[AdLine4]

    ,E.[AdLine5]

    ,E.[AdLine6]

    ,E.[AdLine7]

    ,E.[AdLine8]

    ,E.[AdLine9]

    )

    ) U

    WHERE U.value <> SPACE(0)

    )

    -- Pivot back

    SELECT P.id,

    address1 = ISNULL(P.address1, SPACE(0)),

    address2 = ISNULL(P.address2, SPACE(0)),

    address3 = ISNULL(P.address3, SPACE(0)),

    address4 = ISNULL(P.address4, SPACE(0)),

    address5 = ISNULL(P.address5, SPACE(0)),

    address6 = ISNULL(P.address6, SPACE(0)),

    address7 = ISNULL(P.address7, SPACE(0))

    FROM Relational R

    PIVOT (

    MAX(R.value)

    FOR R.name

    IN (address1, address2, address3, address4, address5, address6, address7, address8, address9)

    ) P

    ORDER BY

    P.id ASC;

  • You just need to add address8 and address9 into the PIVOT's SELECT...;-)

  • Thanks Paul. I have got it coded into my procedures it is working very well and the performance is good.

    Paul White NZ (4/9/2010)


    You just need to add address8 and address9 into the PIVOT's SELECT...;-)

  • Cool - thanks!

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply