Use Alpha characters instead of expanding field

  • We have a situation where back many, many years ago it was decided to merge three columns of data into one column on the DW tables.  The field is currently defined as a CHAR(9).  The first 3 digits represent the risk, next 3 are the location, the final 3 are the building.  So this looks like 001001001 for risk 1, location 1 and building 1.  These values always appear like this in this order.  So each section of this field has a max value of 999.  We have recently ran into issues where there are more than 999 locations.   One thought would be to increase the length of the field to 10.  But with coding in who knows how many places that are expecting it to be the 3, 3, 3 format that could cause issues since the location is in the middle.  And changing the order these appear isn't practical either.

    Now we could look at splitting these back out into three separate columns, but again there would be a lot of code to touch.

    So my thought went to using Alpha characters for values greater then 999 and keeping the same format of 3, 3, 3.  I was thinking it might be best to set up a conversion table to use for 1) loading the data and 2) converting back to a numeric value if it's needed.

    I guess if we go with the conversion table it doesn't matter what values we use above 999.  But what do you think makes the most since?

    Start 1000 as 00A(thats zero zero A)  go up to 00Z then to 0B0(zero B zero) or 0BA and so on...

    Or start with 1000 as A00(A zero zero) then to A01 for 1001, up to A99, then start with B00(B zero zero)...

    Or start with AAA for 1000, then AAB for 1001 and so on...

    Just looking to get some ideas before we make the final decision.

    Thanks in advance for your thoughts.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • I would definitely use a conversion table.

    Personally I prefer the "A[0-9][0-9]" format.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • There are many articles covering "Composite columns" (/ "composite keys") vs individual columns (/ surrogate key ) pro's and cons.

    You are now at a junction point handling one of the downsides.

    Sooner or later, these always bite you in the back

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • You mention that there would be a lot of changes required if you change the format of the column. But wouldn't there also be lots of changes required to use a conversion table? If not, why not?

    If you have many similar chunks of code which handle the shredding and concatenation of the column's data, now is the time to think of ways to centralise that code & follow DRY principles. If you abstract the column-handling code to one or two functions, you'd easily be able to change the structure of the data at table level (along with the function definitions) at a later date without requiring code changes elsewhere.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I'm only in charge of a portion of the tables and processes that are affected by this.  I haven't dug into each job that references this field but I'm assuming over 90% of them just take the field as is.  So I assume I won't need to reference this conversion.  The are a few reporting jobs I think might need to be touched.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • Seriously, it sounds like whatever option you choose will be a lot of work.

    Maybe it's time to do it with three separate columns.  You can add the three columns to the table, leaving the existing column as-is.  That may not break any existing code (select * anyone?), and you can then tackle the most important pieces first, and work your way thru the system.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • The bigger problem I just found out is that this column is part of the key structure on some of our financial tables.  So splitting it out makes it more work.

    This is coming from the source systems as 3 separate columns.  So we are just going to convert those greater than 999 to the alpha values, it will be one conversion table.  Then those that need it broken back out will use the conversion table.

    1000 - A00

    1001 - A01 and so on to Z99

    Then go to AA0 to ZZ9 then AAA to ZZZ this should be more than enough.

    Now it's just building all of that in a conversion table.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • That should be fairly easy using tally tables.

    The trickier part would be excluding certain letter combinations, meaning of course you have to determine which ones to omit.  For example: 'ASS', 'FUx', etc..

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • I hadn't thought of that yet.  Might exclude 'I' and 'O' since this code be mistaken for 1 and zero.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • below86 wrote:

    The bigger problem I just found out is that this column is part of the key structure on some of our financial tables.  So splitting it out makes it more work.

    This is coming from the source systems as 3 separate columns.  So we are just going to convert those greater than 999 to the alpha values, it will be one conversion table.  Then those that need it broken back out will use the conversion table.

    1000 - A00

    1001 - A01 and so on to Z99

    Then go to AA0 to ZZ9 then AAA to ZZZ this should be more than enough.

    Now it's just building all of that in a conversion table.

    This is a "Who was first: the chicken or the egg" situation.

    Ofcourse it depends on the efforts you are allowed to perform to correct this situation.

    Splitting the column correctly and adding a (persisted) derived column ( stuff the 3 new columns adding a separator character + index it ) may be a way to the correct path.

    Anyway you need to figure out the impact of this (OP) midification on related systems and work from there on.

    How about reports? Is this composite key shown in reports etc ( do they need re-rendering to accomodate te newly needed space )

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I know splitting them out would be the best solution.  But as with a lot of things we don't have the time or resources to get this done with everything else on our plates.

    -------------------------------------------------------------
    we travel not to escape life but for life not to escape us
    Don't fear failure, fear regret.

  • Here's my thought.

    While Johan is right, and you know splitting makes sense, it's not practical. I'd kick the can and assume we will need to split this, but try the A01 for now. That buys you 1000 more slots, and you'll see how many places the A instead of [0-9] causes issues.

    I'd track that, or at least get a list of code to touch because that will give you the idea of work to split this into another column. You could also add a computed column with no space (and no issues if query tools work well) that decodes this into 4 digits and potentially helps you start to evolve other systems. So I might have

    original_col comp_col
    ------------ --------
    A01001001 1001001001

    If I can slowly get systems to use the computed column (or 3 new ones), I can start to fix code over time.

     

  • This was removed by the editor as SPAM

  • The inventive suggestion to represent numbers bigger than 999 with alphanumeric letters is a good one and might work as a workable solution to the issue you're having. All of the strategies you suggested might be effective, but they also have pros and cons.

    Thanks
    Raavikant

  • Building a new logic may take time plus avoiding certain characters, or keeping a conversion table and when there no resource available, why don't you expand the field itself.  Try to expand all the three at once 4,4,4 so the length will be 12.

    -- Keep it simple

    -- Keep incrementing numbers

    In my opinion it may not come back and bite until either of the 3 values need a 5 digit number.  Check for impact pros & cons.

    =======================================================================

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply