Populating a sort order column based on string data

  • We have a table that stores building and unit numbers in an apartment building. What I need to do is populate a sort_order column that will be used to display the units within the application. Generally, the data will look something like this:

    building unit sort_order

    ------- ---- ----------

    1 101 1

    1 102 2

    1 103 3

    2 201 4

    2 202 5

    3 301 6

    10 1001 7

    Sorting this data is a no-brainer:

    UPDATE units

    SET sort_order = CAST(building AS VARCHAR) + unit

    It's worth mentioning that the unit column is a VARCHAR. Occasionally, however, units will contain letters as well:

    building unit sort_order

    ------- ---- ----------

    1 101 1

    1 102 2

    1 103 3

    10 1001 ?

    2 A201 ?

    2 B202 ?

    3 A301 ?

    In this situation, my sorting scheme fails. Additionally, unit 1001 is sorted improperly and ends up displaying before the units in building 2. The ultimate goal is for the data to be sorted as follows:

    building unit sort_order

    ------- ---- ----------

    1 101 1

    1 102 2

    1 103 3

    2 A201 4

    2 A202 5

    2 B202 6

    3 A301 7

    10 1001 8

    So the challenge is... how would I go about writing a T-SQL statement(s) that will sort the data properly while taking into account letters?

  • You might go for a two part sort:

    SELECT *

    FROM units

    ORDER BY CASE WHEN unit LIKE '[A-Z]___' THEN CAST(RIGHT(unit,3) AS int) ELSE CAST(unit AS int) END

    ,CASE WHEN unit LIKE '[A-Z]___' THEN LEFT(unit,1) ELSE '' END

    But I do not have the full specification of your unit codes. Therefore I used the assumption that it is one of the following forms:

    1. 1 letter followed by 3 numeric digits (e.g. X123)

    2. 3 numeric digits (e.g. 123)

    3. 4 numeric digits (e.g. 1234)

    Best Regards,

    Chris Büttner

  • @Christian

    Thanks for the reply. Your solution will indeed work; however, I'm looking for a solution that will:

    1) populate the sort_order column appropriately

    2) work regardless of the format of the units

    I attached a simple create/populate table statement for your convenience.

  • You need to split the Unit values apart. Sort on the Numeric part first, then the Alpha part.

    SELECT Building,

    unit,

    CONVERT(INT,RIGHT (unit, (len(unit) - patindex('%[0-9]%',UNIT)) + 1)),

    LEFT (unit, patindex('%[0-9]%',UNIT)-1)

    , ROW_NUMBER() OVER (ORDER BY Building, CONVERT(INT,RIGHT (unit, (len(unit) - patindex('%[0-9]%',UNIT)) + 1)), LEFT (unit, patindex('%[0-9]%',UNIT)-1)) SortOder

    FROM #units

    ORDER BY Building, CONVERT(INT,RIGHT (unit, (len(unit) - patindex('%[0-9]%',UNIT)) + 1)), LEFT (unit, patindex('%[0-9]%',UNIT)-1)

    Note that this is assuming the alpha portion is always first and the numeric portion always last. If that isn't the case then you will have more work to do.

    Gary Johnson
    Sr Database Engineer

  • @gary

    Unfortunately, there is no consistency to the format of the unit numbers. So whatever solution the final solution, it has to work for any pattern.

    One of the options I was exploring was converting each character of the unit number to its ASCII equivalent. This would result in a string like this:

    unit 116 ==> 494954

    Since the ASCII values for numbers is less than that for letters, I should be able to use it as the sort order.

  • something like this?

    DROP TABLE #buildings

    CREATE TABLE #buildings (building int, unit varchar(4), sort_order int)

    INSERT INTO #buildings (building, unit, sort_order)

    SELECT 1, '101', 1 UNION ALL

    SELECT 1, '102', 2 UNION ALL

    SELECT 1, '103', 3 UNION ALL

    SELECT 2, 'A201', 4 UNION ALL

    SELECT 2, 'A202', 5 UNION ALL

    SELECT 2, 'B202', 6 UNION ALL

    SELECT 3, 'A301', 7 UNION ALL

    SELECT 10, '1001', 8

    SELECT *,

    RIGHT('00000' + CAST(building AS VARCHAR), 5) +

    RIGHT('00000' + unit, 5)

    FROM #buildings ORDER BY 4

    Results:

    building unit sort_order

    ----------- ---- ----------- ----------

    1 101 1 0000100101

    1 102 2 0000100102

    1 103 3 0000100103

    2 A201 4 000020A201

    2 A202 5 000020A202

    2 B202 6 000020B202

    3 A301 7 000030A301

    10 1001 8 0001001001

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • OK.. Well this is a little more dynamic 🙂

    IF OBJECT_ID('tempdb..#units') IS NOT NULL

    DROP TABLE #units

    CREATE TABLE #units (

    buildingINT,

    unitVARCHAR(25),

    sort_orderINT

    );

    INSERT INTO#units

    ( building, unit )

    SELECT 1, 'A' UNION ALL

    SELECT 1, 'AA' UNION ALL

    SELECT 1, 'B' UNION ALL

    SELECT 1, 'C' UNION ALL

    SELECT 1, 'CC' UNION ALL

    SELECT 1, '102' UNION ALL

    SELECT 1, '103' UNION ALL

    SELECT 1, '104' UNION ALL

    SELECT 1, '105A' UNION ALL

    SELECT 1, 'B106' UNION ALL

    SELECT 1, '107C' UNION ALL

    SELECT 1, '108' UNION ALL

    SELECT 2, '404' UNION ALL

    SELECT 2, '405' UNION ALL

    SELECT 2, '406' UNION ALL

    SELECT 1, '115' UNION ALL

    SELECT 3, 'L102' UNION ALL

    SELECT 3, 'L103' UNION ALL

    SELECT 3, 'L104' UNION ALL

    SELECT 3, 'L105' UNION ALL

    SELECT 3, 'L106' UNION ALL

    SELECT 3, 'L107' UNION ALL

    SELECT 1, '211' UNION ALL

    SELECT 1, '212' UNION ALL

    SELECT 1, '301' UNION ALL

    SELECT 1, '317' UNION ALL

    SELECT 2, '112' UNION ALL

    SELECT 2, '201' UNION ALL

    SELECT 2, '202' UNION ALL

    SELECT 1, '318' UNION ALL

    SELECT 1, '319' UNION ALL

    SELECT 2, '100' UNION ALL

    SELECT 2, '101' UNION ALL

    SELECT 2, '102' UNION ALL

    SELECT 2, '104' UNION ALL

    SELECT 1, '214' UNION ALL

    SELECT 2, '105' UNION ALL

    SELECT 2, '205' UNION ALL

    SELECT 2, '301' UNION ALL

    SELECT 2, '302' UNION ALL

    SELECT 2, '401' UNION ALL

    SELECT 2, '402' UNION ALL

    SELECT 2, '403' UNION ALL

    SELECT 2, '501' UNION ALL

    SELECT 2, '502' UNION ALL

    SELECT 2, '601' UNION ALL

    SELECT 2, '602' UNION ALL

    SELECT 2, '701' UNION ALL

    SELECT 2, '702' UNION ALL

    SELECT 2, '801' UNION ALL

    SELECT 2, '802' UNION ALL

    SELECT 2, '901' UNION ALL

    SELECT 2, '902' UNION ALL

    SELECT 2, '903' UNION ALL

    SELECT 2, 'PH01' UNION ALL

    SELECT 2, 'PH02' UNION ALL

    SELECT 2, 'PH03' UNION ALL

    SELECT 2, 'PH04' UNION ALL

    SELECT 3, 'L101' UNION ALL

    SELECT 3, 'L108' UNION ALL

    SELECT 3, 'L109' UNION ALL

    SELECT 3, 'L110' UNION ALL

    SELECT 3, 'L111' UNION ALL

    SELECT 3, 'L112' UNION ALL

    SELECT 3, 'TH1' UNION ALL

    SELECT 3, 'TH2' UNION ALL

    SELECT 3, 'TH3' UNION ALL

    SELECT 3, 'TH4' UNION ALL

    SELECT 3, 'TH5' UNION ALL

    SELECT 3, 'TH6';

    ;WITH Sorter(Building, unit, NumericCol, AlphaCol,SortOrder) AS (

    SELECT Building

    ,unit

    ,NumericCol = CASE patindex('%[0-9]%',UNIT)

    WHEN 0 THEN ''

    ELSE SUBSTRING (unit, patindex('%[0-9]%',UNIT) , CASE (patindex('%[a-zA-Z]%',UNIT) -1)

    WHEN 0 THEN LEN(unit)

    WHEN -1 THEN LEN(Unit)

    ELSE ABS(patindex('%[a-zA-Z]%',UNIT) -1 )

    END)

    END

    ,AlphaCol = CASE patindex('%[a-zA-Z]%',UNIT)

    WHEN 0 THEN ''

    ELSE SUBSTRING (unit, patindex('%[a-zA-Z]%',UNIT) , CASE (patindex('%[0-9]%',UNIT) -1)

    WHEN 0 THEN LEN(unit)

    WHEN -1 THEN LEN(Unit)

    ELSE ABS(patindex('%[0-9]%',UNIT) -1 )

    END)

    END

    , SortOrder = ROW_NUMBER() OVER(ORDER BY Building,

    CASE patindex('%[0-9]%',UNIT) -- Numeric section

    WHEN 0 THEN ''

    ELSE SUBSTRING (unit, patindex('%[0-9]%',UNIT) , CASE (patindex('%[a-zA-Z]%',UNIT) -1)

    WHEN 0 THEN LEN(unit)

    WHEN -1 THEN LEN(Unit)

    ELSE ABS(patindex('%[a-zA-Z]%',UNIT) -1 )

    END)

    END,

    CASE patindex('%[a-zA-Z]%',UNIT) -- Alpha Section

    WHEN 0 THEN ''

    ELSE SUBSTRING (unit, patindex('%[a-zA-Z]%',UNIT) , CASE (patindex('%[0-9]%',UNIT) -1)

    WHEN 0 THEN LEN(unit)

    WHEN -1 THEN LEN(Unit)

    ELSE ABS(patindex('%[0-9]%',UNIT) -1 )

    END)

    END)

    FROM #units

    )

    UPDATE u

    SET Sort_Order = s.SortOrder

    FROM #Units u

    JOIN Sorter s

    ON u.Building = s.Building

    AND u.unit = s.unit

    SELECT * FROM #units

    Gary Johnson
    Sr Database Engineer

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply