COALESCE to replace multiple CASE statements

  • Luis Cazares (5/27/2014)


    This is just an idea that might work and a shot in the dark at most. Maybe a nested replace will perform better.

    Something like this:

    DECLARE @CSV varchar(100) = 'cricket, soccer, baseball, golf'

    SELECT REPLACE(REPLACE(REPLACE(@CSV,'cricket','ck'),'soccer','sc'),'baseball','bb')

    This will not give the result the OP is looking for. The concatenation in the original code was conditional. This concatenates all the values, whether or not they were referred to by the data or not.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • nimalatissa (5/27/2014)


    I'm very confused. These tables are normalized to 3rd normal form. There shouldn't be anything to do with that. the column in this table have comma separated values and I need to return a specif value for those comma separated values. This was working fine, but we are seeing slowness and trying get rid of these CASE statements.

    So we have determined that the reason for your request for help is because the query is slow. Getting rid of case expressions is highly unlikely to help much if any. What you are really asking for help with is to make your query run faster. We can help with that. We will need to see an actual execution plan at the very least. Additionally providing table and index definitions would be a big help.

    I have to agree with Luis that some normalization would help here. You state that your tables are 3NF yet you also state that one of the columns contains comma separated values. That means you are storing multiple values in a single column. This is in violation of 1NF. However, let's focus on the issue at hand and get your query performance improved.

    If you need some instruction on how to find the details I requested please take a few minutes and read this article.

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Alvin Ramard (5/28/2014)


    ChrisM@Work (5/28/2014)


    Alvin Ramard (5/28/2014)


    Luis Cazares (5/27/2014)


    Alvin,

    The code makes clear that there's a concatenation based on multiple values on a single column.

    CASE WHEN A.[COL_1] LIKE '%cricket%' THEN 'ck' + ',' ELSE '' END +

    CASE WHEN A.[COL_1] LIKE '%soccer%' THEN 'sc' + ',' ELSE '' END +

    ....

    That means that the table is not properly normalized. I'm not sure why nimalatissa would say that they are normalized and at the same time there's a column with comma separated values.

    Luis, the code says NOTHING about the tables! Stop making assumptions!!

    The comma separated values are the output of the query. Did you ever consider they might not be inserted into a table?

    nimalatissa (5/27/2014)


    ...the column in this table have comma separated values...

    Alvin, quit giving Luis a hard time!

    Sorry Chris, but Luis is doing nothing but making this more confusing for the OP. There's nothing to back up his assumptions and the suggestion he gave is WRONG!

    Alvin,

    I'm sorry, but the code is clear on what it does (at least for me). The code is concatenating short values based on values that are in a single column and row. That means that either the code is nonsense as you'll always have one value with a comma at the end or the column used has multiple values in it which violate the first normal form (and all others in consequence). The OP confirmed that the column has comma separated values so I wasn't wrong and my analysis was accurate. The code won't tell me the complete picture, but it can give me an idea of the requirement.

    You, on the other side, are saying that my suggestion is wrong and that really is an assumption without anything to validate it.

    I gave an alternative that might or might not work. I'm aware that it's not exactly the same functionality. The original code will suffer from silent truncation while mine won't. Either code will have to change if a non-considered value is added to the column and they want to include it in the output. Normalizing the table will prevent any future code changes by having a lookup table (or catalog as you which) and a table with a row for each value.

    Limitations on the use of UDFs (of any type) and the lack of DDL, sample data, expected results and other considerations for performance enhancements prevent us from giving the best options, but that doesn't mean that we can't analyse the problems. That's my everyday job and I'm proud to be good at it.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • nimalatissa (5/27/2014)


    I'm very confused. These tables are normalized to 3rd normal form. There shouldn't be anything to do with that. the column in this table have comma separated values and I need to return a specif value for those comma separated values. This was working fine, but we are seeing slowness and trying get rid of these CASE statements.

    That's why people are saying the table isn't normalized. You have more than one value in a column.

    Please, do us all a favor... post a couple of rows of that column so we can come up with a known solution for a known problem before two of my friends kick the crap out of each other on this thread. πŸ˜›

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Alvin Ramard (5/28/2014)


    ChrisM@Work (5/28/2014)


    Alvin Ramard (5/28/2014)


    Luis Cazares (5/27/2014)


    Alvin,

    The code makes clear that there's a concatenation based on multiple values on a single column.

    CASE WHEN A.[COL_1] LIKE '%cricket%' THEN 'ck' + ',' ELSE '' END +

    CASE WHEN A.[COL_1] LIKE '%soccer%' THEN 'sc' + ',' ELSE '' END +

    ....

    That means that the table is not properly normalized. I'm not sure why nimalatissa would say that they are normalized and at the same time there's a column with comma separated values.

    Luis, the code says NOTHING about the tables! Stop making assumptions!!

    The comma separated values are the output of the query. Did you ever consider they might not be inserted into a table?

    nimalatissa (5/27/2014)


    ...the column in this table have comma separated values...

    Alvin, quit giving Luis a hard time!

    Sorry Chris, but Luis is doing nothing but making this more confusing for the OP. There's nothing to back up his assumptions and the suggestion he gave is WRONG!

    Its quite clear that a csv column is not only not normalised, but given the leading % in the search condition (doing a LIKE '%someTerm%' search) will create a performance issue. I credit Luis for sniffing this out from the beginning.

    ----------------------------------------------------

  • MMartin1 (8/1/2014)


    Its quite clear that a csv column is not only not normalised, but given the leading % in the search condition (doing a LIKE '%someTerm%' search) will create a performance issue. I credit Luis for sniffing this out from the beginning.

    Whether it's normalised depends on whose definition of 1NF you believe. Personally I use the original 1NF, but there are a lot of people who believed Date when he claimed that Codd had got it wrong and his atomicity condition was not needed and could be replaced by something else (although it was never altogether clear what the something else was).

    This page will give you a flavour of Date's version of 1NF; he explicitly allows, for example, repeating groups and pointers to arrays. It is also an example of how he succeeded in being called a co-counder (with Codd) of the Relational Model, despite not being named as a co-author in any of the early papers, nor being referenced in any of those papers - which of course is not surprising as until 1974 Date was an instructor in IBM's European education program and after that was involved in technical planning of SQL and DB2 before he got involved in anything actually relational.

    Tom

  • MMartin1 (8/1/2014)

    .

    .

    .

    Its quite clear that a csv column is not only not normalised, but given the leading % in the search condition (doing a LIKE '%someTerm%' search) will create a performance issue. I credit Luis for sniffing this out from the beginning.

    The OP provides no evidence that LIKE is used anywhere other than the output list. As Jeff pointed out earlier, we are completely in the dark regarding the performance issue.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • TomThomson (8/2/2014)


    MMartin1 (8/1/2014)


    Its quite clear that a csv column is not only not normalised, but given the leading % in the search condition (doing a LIKE '%someTerm%' search) will create a performance issue. I credit Luis for sniffing this out from the beginning.

    Whether it's normalised depends on whose definition of 1NF you believe. Personally I use the original 1NF, but there are a lot of people who believed Date when he claimed that Codd had got it wrong and his atomicity condition was not needed and could be replaced by something else (although it was never altogether clear what the something else was).

    This page will give you a flavour of Date's version of 1NF; he explicitly allows, for example, repeating groups and pointers to arrays. It is also an example of how he succeeded in being called a co-counder (with Codd) of the Relational Model, despite not being named as a co-author in any of the early papers, nor being referenced in any of those papers - which of course is not surprising as until 1974 Date was an instructor in IBM's European education program and after that was involved in technical planning of SQL and DB2 before he got involved in anything actually relational.

    Thank you Tom, I'll have a look at that. I probably fall into the class of a follower of Codd, but its good to keep an open mind.

    ----------------------------------------------------

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply