Sorting AlphaNumeric non standard field

  • I am working on streamlining some work that another group here is doing. They have a series of several hundred documents (entered into a table). The document naming structure is not really "standard" meaning that there is a set pattern. Other than the first two characters are Alpha (ex: SE).

    I have stripped out those characters as the first level of my order by gets them all together. The problem comes when I try and sort the following:

    11e

    11EOa

    11f

    15c

    1a

    1a

    1b

    After these strings there is a space and then the rest of the file name. I just need to get these in order. I thought about trying to split the numeric from the alpha, but have not been very successful.

    my order by is based on:

    substring(eso.documentTitle, 3, CHARINDEX(' ', eso.documentTitle)-2)

    As you can see there are some that have a single digit, and some that have multiple characters after the numbers. Any thoughts on the best way to attack this? I am trying to do this in my TSQL, as I am grabbing the data and throwing it into a temp table to redo the sort order (IE: update the original table as it is based on the order in which these files where entered into the DB.)

    Hope this makes sense....

    TIA.

  • ...

    Hope this makes sense....

    ...

    It would make full sense, if you could show the exact expected result based on your sample. It would help to understand how you want your data be sorted...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Here is a subset of the data showing the current problem. notice that 2a comes after 27n. there unfortunately is no pattern or set number of files that can or will be present. Some may have OO27a-g or they may go to a-n. Hope that is clearer.

    OO27k Informed Consent from Persons with Limited

    OO27l Expedited Review

    OO27m IRB Review of Modifications

    OO27n Recruiting Potential Research Participants

    OO2a Tim Parker CV 2011 10 25

    OO3a Nursing Quality Plan 2011

    OO3b 2011 Safety Mgmt Plan

    OO3c Health System 2010 Community Benefit

  • in a similar situation, i've used a function dbo.StripNonNumeric, which leaves just the string 0-9, which i could convert to an integer;

    then i could do

    ORDER BY

    CONVERT(int,dbo.StripNonNumeric(SomeColumn)),,

    SomeColumn

    As long as you understand the sorting ramifications, like

    that "11AB23" will get converted to 1123 for the order,

    that's one way to tackle it.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Removing all digits may not help for cases like:

    OO1a 2011 Zero

    OO1a 2012 Non-Zero

    as OP, if I understand correctly, would want to sort them as above, and stripping digits would sort them as

    OO1a 2012 Non-Zero

    OO1a 2011 Zero

    Question: what is the maximum number of digits between your OO prefix and the first Alpha-character you want to be significant for sorting?

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • YUCK!!! Can you go to the to other group and tell him how horrible their naming convention is? Add an underscore, space, anything...

    OO_27_a

    Fixing this at the source will be far better than trying to unravel it later.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean - I couldn't agree more...I am going to be working on that! I just started here 3 weeks ago and this is the second step of a cleanup process that they are spending quite a bit of time on. these file names are output to an HTML page and they a sort order that is when they were placed in the project.

    Step one was cleaning up files that were associated improperly with the 5 categories. That was easy.

    As for the max number of digits. So far 2. and I am really only worried about sorting on anything up to that first space in the file name. The rest of the document name isn't a problem (at least currently).

  • I love it!

    declare @t table (id int identity(1,1), val varchar(200))

    insert @t

    select 'OO27k Informed Consent from Persons with Limited'

    union select 'OO27l Expedited Review'

    union select 'OO27m IRB Review of Modifications'

    union select 'OO27n Recruiting Potential Research Participants'

    union select 'OO2a Tim Parker CV 2011 10 25'

    union select 'OO3a Nursing Quality Plan 2011'

    union select 'OO3b 2011 Safety Mgmt Plan'

    union select 'OO3c Health System 2010 Community Benefit'

    ;with remd

    as

    (

    select id, LEN(MAX(rn)) rd

    from

    (

    select id, substring(val,3,200) v, rn

    from @t

    join (select top (999) cast(ROW_NUMBER() over (order by (select null)) as varchar) rn

    from sys.columns) n

    on n.rn = LEFT(substring(val,3,200),LEN(rn))

    ) q

    group by id

    )

    select t.val

    from @t t

    join remd r on r.id = t.id

    order by substring(val,3 + rd,200)

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • I'm sure someone may find a more elegant way, but this is forcing the sort.

    with c (doc_name) as (

    select 'OO27k' union all

    select 'OO27l' union all

    select 'OO27m' union all

    select 'OO27n' union all

    select 'OO2a' union all

    select 'OO3a' union all

    select 'OO3b' union all

    select 'OO3c'

    )

    select *

    from c

    order by

    case when isnumeric(substring(doc_name, 3, 1))=0 then ASCII(substring(doc_name, 3, 1)) else ASCII(substring(doc_name, 3, 1))+200 end,

    case when isnumeric(substring(doc_name, 4, 1))=0 then ASCII(substring(doc_name, 4, 1)) else ASCII(substring(doc_name, 4, 1))+200 end,

    case when isnumeric(substring(doc_name, 5, 1))=0 then ASCII(substring(doc_name, 5, 1)) else ASCII(substring(doc_name, 5, 1))+200 end

    ______________________________________________________________________________
    How I want a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.

  • ..., but this is forcing the sort.

    Yeah, but it will only work for known values.

    It's as well as doing it manually...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Can you provide the DDL (CREATE TABLE statement) where this data is stored (and there by selected from). It wold help to know more about the data.

  • Eugene Elutin (5/15/2012)


    ..., but this is forcing the sort.

    Yeah, but it will only work for known values.

    It's as well as doing it manually...

    Not at all. It's sorting by column 3, then 4, then 5. It answers the example given.

    ______________________________________________________________________________
    How I want a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.

  • TimParker (5/15/2012)


    Here is a subset of the data showing the current problem. notice that 2a comes after 27n. there unfortunately is no pattern or set number of files that can or will be present. Some may have OO27a-g or they may go to a-n. Hope that is clearer.

    OO27k Informed Consent from Persons with Limited

    OO27l Expedited Review

    OO27m IRB Review of Modifications

    OO27n Recruiting Potential Research Participants

    OO2a Tim Parker CV 2011 10 25

    OO3a Nursing Quality Plan 2011

    OO3b 2011 Safety Mgmt Plan

    OO3c Health System 2010 Community Benefit

    You say there is no set pattern but, let me ask, will that first blob of text always contain at least 1 alpha character followed by at least 1 numeric digit followed by at least 1 alpha character follwed by at least 1 space?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Assuming Jeff's assumptions are correct (except that the first blob is always 2 alpha characters), you can construct an ugly, nasty ORDER BY clause:

    DECLARE @t TABLE ([name] VARCHAR(100))

    INSERT INTO @t

    SELECT 'OO27k Informed Consent from Persons with Limited'

    UNION ALL SELECT 'OO27l Expedited Review'

    UNION ALL SELECT 'OO27 Another Review'

    UNION ALL SELECT 'OO27m IRB Review of Modifications'

    UNION ALL SELECT 'OO27n Recruiting Potential Research Participants'

    UNION ALL SELECT 'OO2a Tim Parker CV 2011 10 25'

    UNION ALL SELECT 'OO3a Nursing Quality Plan 2011'

    UNION ALL SELECT 'OO3b 2011 Safety Mgmt Plan'

    UNION ALL SELECT 'OO3c Health System 2010 Community Benefit'

    SELECT [name]

    FROM (

    SELECT [name], PATINDEX('%[0-9]%', [name]) AS FirstNo

    ,PATINDEX('%[a-zA-Z]%', SUBSTRING([name],3,LEN([name]))) As Suffix

    FROM @t) x

    ORDER BY SUBSTRING([name], 1, 2)

    ,CAST(SUBSTRING([name], 3, 2+Suffix-FirstNo) AS INT)

    ,SUBSTRING([name], 2+Suffix, CHARINDEX(' ', [name]))

    Resulting in this:

    OO2a Tim Parker CV 2011 10 25

    OO3a Nursing Quality Plan 2011

    OO3b 2011 Safety Mgmt Plan

    OO3c Health System 2010 Community Benefit

    OO27 Another Review

    OO27k Informed Consent from Persons with Limited

    OO27l Expedited Review

    OO27m IRB Review of Modifications

    OO27n Recruiting Potential Research Participants


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply