Table-based Functions vs. Views

  • SQLKnowItAll (5/9/2012)


    Whether you reference all the fields explicitly from ViewX in your SP or simply do a SELECT * FROM ViewX, you will be forced to refresh those SPs any time you make a change to ViewX.

    I think the problem is that you are requiring all data to be queried through views. Query the table directly from the SP and you avoid this issue. For me, you did not make a good argument for having all classes reference a view.

    There's really no justification for hitting the table directly. Any change to the table then forces you to refresh all your SPs. Forcing your data access to the table to go through a Function means that you make the change reflecting your table change once (in your function), and then you don't have to touch anything else that does a SELECT against that data. You're just leaving the system with too many failure points when changes are made.

    Also, it makes more sense from a programming side. If you know that ViewX has everything you need to populate ClassX (even possible helper-table or related data necessary for that class that could come from multiple tables), then your life becomes much easier on that side of things.
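
    As a rough sketch of the layering being argued for here - with object and column names that are purely hypothetical, not taken from any real schema - an inline table-valued function wrapping the table might look like this:

    -- Hypothetical inline table-valued function that becomes the single
    -- point of change for reads against TableX.
    CREATE FUNCTION dbo.fn_ClassX ()
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT Id, ColumnA        -- columns are added or removed here, once
        FROM   dbo.TableX
    );
    GO

    Stored procedures would then SELECT from dbo.fn_ClassX() instead of from dbo.TableX directly.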

  • Putts (5/9/2012)


    Any change to the table then forces you to refresh all your SPs.

    Not at all. The only time a procedure might need refreshing after the table is changed is when SELECT * is used.

    I'm not seeing a good reason for the multiple levels of indirection. Any changes to the base table should be preceded by impact analysis and followed by proper testing, so there should be minimal chance of anything getting missed, and if something does get missed it gets picked up in testing.
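
    For the impact-analysis step, one quick way (a sketch - the table name here is just a placeholder) to list the modules that reference a table before you change it is SQL Server's dependency DMF:

    -- List the procedures, views and functions that reference dbo.TableX,
    -- so the affected modules can be reviewed (and refreshed if needed).
    SELECT referencing_schema_name,
           referencing_entity_name
    FROM   sys.dm_sql_referencing_entities(N'dbo.TableX', N'OBJECT');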

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Putts (5/9/2012)


    Currently, our Data Access relies heavily on SPs - we do not call Tables or Views directly from code. If we have a reason to query the database then we create a specific SP to reflect that need. Then all data requests for that data go through that one SP and if a change is needed then it's done once in the SP and we don't have to hunt down all code references looking for that (sure, you could say "well why not force all such calls to just go through one programming method" but we've found that people don't stick to that very well).

    This sounds like a case of over-engineering. I can understand the reasoning behind it, but the reality - as you're discovering - is that you're just giving yourself more to deal with any time there is a change.

    Personally, I would just write my stored procedures to pull directly from the tables. This way you avoid the issue entirely. If you find that there are things you do all the time that are complex enough to warrant a view, then by all means create a view and use it; but make it clear to your programmers that whenever these views change, it gets treated as a change to production - i.e. it is their responsibility to find out what else is affected (for example, which stored procedures need to be refreshed) and to take the appropriate steps. And as for people not sticking to it... that's nonsense. It's their job. If they break something, hold them responsible.
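
    A minimal sketch of that direct approach (table, column and parameter names are hypothetical):

    -- Stored procedure reading straight from the table with an explicit
    -- column list; adding a column to TableX later does not break this
    -- proc, and the proc only changes when you decide to expose the column.
    CREATE PROCEDURE dbo.GetClassX
        @Id int
    AS
    BEGIN
        SET NOCOUNT ON;

        SELECT Id, ColumnA, ColumnB
        FROM   dbo.TableX
        WHERE  Id = @Id;
    END
    GO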

  • There's really no justification for hitting the table directly.

    That makes no sense. There's really no justification for hitting the table indirectly.

    Any change to the table then forces you to refresh all your SPs.

    False:

    CREATE TABLE selecttest (id int, name varchar(20));
    GO

    INSERT INTO selecttest
    SELECT 1, 'jared';
    GO

    CREATE PROCEDURE proctest
    AS
    BEGIN
        SELECT * FROM selecttest;
    END
    GO

    EXEC proctest;
    GO

    ALTER TABLE selecttest ADD phone bigint;
    GO

    EXEC proctest;
    GO

    DROP TABLE selecttest;
    GO

    DROP PROCEDURE proctest;
    GO
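
    As a side note, if you do want the procedure's cached metadata brought back in line after the ALTER TABLE - the "refresh" under discussion here - sp_refreshsqlmodule does that without recreating the proc (assuming the demo objects sit in the dbo schema):

    -- Refresh the metadata of the non-schema-bound stored procedure.
    EXEC sys.sp_refreshsqlmodule @name = N'dbo.proctest';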

    Forcing your data access to the table to go through a Function means that you make the change reflecting your table change once (in your function), and then you don't have to touch anything else that does a SELECT against that data. You're just leaving the system with too many failure points when changes are made.

    Moot point, with select *... Lazy otherwise. A proper code review should take precedence over this.

    Also, it makes more sense from a programming side. If you know that ViewX has everything you need to populate ClassX (even possible helper-table or related data necessary for that class that could come from multiple tables), then your life becomes much easier on that side of things.

    It is also easier to fix holes in a boat with duct tape than to have the hull repaired. However, when you try to get someone to take a ride in that boat they will probably cringe. They certainly won't buy it from you.

    Jared
    CE - Microsoft

  • cphite (5/9/2012)


    Putts (5/9/2012)


    Currently, our Data Access relies heavily on SPs - we do not call Tables or Views directly from code. If we have a reason to query the database then we create a specific SP to reflect that need. Then all data requests for that data go through that one SP and if a change is needed then it's done once in the SP and we don't have to hunt down all code references looking for that (sure, you could say "well why not force all such calls to just go through one programming method" but we've found that people don't stick to that very well).

    This sounds like a case of over-engineering. I can understand the reasoning behind it, but the reality - as you're discovering - is that you're just giving yourself more to deal with any time there is a change.

    Personally, I would just write my stored procedures to pull directly from the tables. This way you avoid the issue entirely. If you find that there are things you do all the time that are complex enough to warrant a view, then by all means create a view and use it; but make it clear to your programmers that whenever these views change, it gets treated as a change to production - i.e. it is their responsibility to find out what else is affected (for example, which stored procedures need to be refreshed) and to take the appropriate steps. And as for people not sticking to it... that's nonsense. It's their job. If they break something, hold them responsible.

    +1

    I agree totally. This type of over-engineering causes development to be so slow that the potential gains to productivity are lost because the devs have to spend so much extra time trying to figure out all the loose ends they have to tie together.

    Make your procs pull from the tables directly. There is no need for an n-tier data layer. It will do nothing but cause you grief in the long run.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • GilaMonster (5/9/2012)


    Putts (5/9/2012)


    Any change to the table then forces you to refresh all your SPs.

    Not at all. The only time a procedure will need refreshing after the table is changed is when SELECT * is used.

    I'm not seeing a good reason for the multiple levels of indirection. Any changes to the base table should be preceded by impact analysis and followed by proper testing, so there should be minimal chance of anything getting missed, and if something does get missed it gets picked up in testing.

    Typically, our changes are "this column needs to be added" or "this column needs to be removed."

    Walking through a sample scenario of "ColumnB is added to TableX"

    First, we add the column to the table.

    Then there'll be one method in code that converts data from TableX to an instance of ClassX. So we'll update that.

    We'll also find the 2 or 3 SPs that Insert/Update and Delete rows from TableX and modify them accordingly.

    We'll then find the corresponding code methods that call those SPs and modify them to send the new value for ColumnB.

    No matter how we do the rest, those steps are consistent. What remains is a choice between:

    Modify one Function to now include ColumnB in the Select.

    ... or ...

    Find all references in any SP to that table and then go through each of those SPs and make the necessary change. ... And hope you don't miss one.

    I'm not following why people would prefer option B in that scenario.
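
    In T-SQL terms, option A is a single statement - shown here against the same hypothetical fn_ClassX shape sketched earlier:

    -- Option A: one change, made in one place.
    ALTER FUNCTION dbo.fn_ClassX ()
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT Id, ColumnA, ColumnB   -- ColumnB added here, once
        FROM   dbo.TableX
    );
    GO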

  • Putts (5/9/2012)


    GilaMonster (5/9/2012)


    Putts (5/9/2012)


    Any change to the table then forces you to refresh all your SPs.

    Not at all. The only time a procedure will need refreshing after the table is changed is when SELECT * is used.

    I'm not seeing a good reason for the multiple levels of indirection. Any changes to the base table should be preceded by impact analysis and followed by proper testing, so there should be minimal chance of anything getting missed, and if something does get missed it gets picked up in testing.

    Typically, our changes are "this column needs to be added" or "this column needs to be removed."

    Walking through a sample scenario of "ColumnB is added to TableX"

    First, we add the column to the table.

    Then there'll be one method in code that converts data from TableX to an instance of ClassX. So we'll update that.

    We'll also find the 2 or 3 SPs that Insert/Update and Delete rows from TableX and modify them accordingly.

    We'll then find the corresponding code methods that call those SPs and modify them to send the new value for ColumnB.

    No matter how we do the rest, those steps are consistent. What remains is a choice between:

    Modify one Function to now include ColumnB in the Select.

    ... or ...

    Find all references in any SP to that table and then go through each of those SPs and make the necessary change. ... And hope you don't miss one.

    I'm not following why people would prefer option B in that scenario.

    Because it is better for performance and does not use functions incorrectly.

    Jared
    CE - Microsoft

  • Putts (5/9/2012)


    GilaMonster (5/9/2012)


    Putts (5/9/2012)


    Any change to the table then forces you to refresh all your SPs.

    Not at all. The only time a procedure will need refreshing after the table is changed is when SELECT * is used.

    I'm not seeing a good reason for the multiple levels of indirection. Any changes to the base table should be preceded by impact analysis and followed by proper testing, so there should be minimal chance of anything getting missed, and if something does get missed it gets picked up in testing.

    Typically, our changes are "this column needs to be added" or "this column needs to be removed."

    Walking through a sample scenario of "ColumnB is added to TableX"

    First, we add the column to the table.

    Then there'll be one method in code that converts data from TableX to an instance of ClassX. So we'll update that.

    We'll also find the 2 or 3 SPs that Insert/Update and Delete rows from TableX and modify them accordingly.

    We'll then find the corresponding code methods that call those SPs and modify them to send the new value for ColumnB.

    No matter how we do the rest, those steps are consistent. What remains is a choice between:

    Modify one Function to now include ColumnB in the Select.

    ... or ...

    Find all references in any SP to that table and then go through each of those SPs and make the necessary change. ... And hope you don't miss one.

    I'm not following why people would prefer option B in that scenario.

    You still have to identify every stored procedure that now returns the new column, so you can modify the code that calls it to account for that column. So how does that save time?

  • You are just making statements now with no backing.

    What is the point of a table-valued UDF if not to return a resultset? Is that not the same purpose of a View? What's the difference and how is that not using it the same way?

    Is there some fact behind the claimed loss of performance? That is what I am looking for. Everything I've read says that a single-statement (inline) UDF should perform the same as a View. If you have some information that goes against that, then that's exactly what I'm looking for.
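
    For reference, these are the two object shapes being compared (names are hypothetical); both are defined by a single SELECT, and the function additionally accepts parameters, which a view cannot:

    -- A view over the table...
    CREATE VIEW dbo.vw_ClassX
    AS
    SELECT Id, ColumnA
    FROM   dbo.TableX;
    GO

    -- ...and an inline (single-statement) table-valued function over the
    -- same query, with a filter parameter.
    CREATE FUNCTION dbo.fn_ClassX_ById (@Id int)
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT Id, ColumnA
        FROM   dbo.TableX
        WHERE  Id = @Id
    );
    GO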

    I think the "real" problem here, which you are avoiding, is that the database schema keeps changing. If it had been designed properly to begin with, you would only have a minimal number of columns being added and you would not have any of these problems. I've worked on just shy of 500 projects with .NET developers and cannot think of one case where what you're doing would have been a good solution to anything.

    Jared
    CE - Microsoft

  • Putts (5/9/2012)


    You are just making statements now with no backing.

    What is the point of a table-valued UDF if not to return a resultset? Is that not the same purpose of a View? What's the difference and how is that not using it the same way?

    Is there some fact behind the claimed loss of performance? That is what I am looking for. Everything I've read says that a single-statement (inline) UDF should perform the same as a View. If you have some information that goes against that, then that's exactly what I'm looking for.

    Your example of putting these functions in the WHERE clause makes it non-sargable.
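
    A quick illustration of the sargability point (table and column names are made up for the example):

    -- Non-sargable: wrapping the column in a function means the predicate
    -- cannot use an index seek on OrderDate; every row has to be evaluated.
    SELECT OrderID
    FROM   dbo.Orders
    WHERE  YEAR(OrderDate) = 2012;

    -- Sargable rewrite of the same filter: an index on OrderDate can be
    -- used for a seek.
    SELECT OrderID
    FROM   dbo.Orders
    WHERE  OrderDate >= '20120101'
      AND  OrderDate <  '20130101';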

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Lynn Pettis (5/9/2012)


    Putts (5/9/2012)


    GilaMonster (5/9/2012)


    Putts (5/9/2012)


    Any change to the table then forces you to refresh all your SPs.

    Not at all. The only time a procedure will need refreshing after the table is changed is when SELECT * is used.

    I'm not seeing a good reason for the multiple levels of indirection. Any changes to the base table should be preceded by impact analysis and followed by proper testing, so there should be minimal chance of anything getting missed, and if something does get missed it gets picked up in testing.

    Typically, our changes are "this column needs to be added" or "this column needs to be removed."

    Walking through a sample scenario of "ColumnB is added to TableX"

    First, we add the column to the table.

    Then there'll be one method in code that converts data from TableX to an instance of ClassX. So we'll update that.

    We'll also find the 2 or 3 SPs that Insert/Update and Delete rows from TableX and modify them accordingly.

    We'll then find the corresponding code methods that call those SPs and modify them to send the new value for ColumnB.

    No matter how we do the rest, those steps are consistent. What remains is a choice between:

    Modify one Function to now include ColumnB in the Select.

    ... or ...

    Find all references in any SP to that table and then go through each of those SPs and make the necessary change. ... And hope you don't miss one.

    I'm not following why people would prefer option B in that scenario.

    You still have to identify every stored procedure that now returns the new column, so you can modify the code that calls it to account for that column. So how does that save time?

    Only if you code it the most painful way possible 😀

    You set up your code to know that it's expecting a given set of X columns. As long as all SPs return all X columns, then you're fine. Hence, if you have all your SPs returning what comes out of the Function, then you just have to make sure that the Function is returning the same X columns that your code is expecting.

    You don't write code to extract data individually for each SP. You code around a template that each SP follows.
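
    A sketch of that template, again using the hypothetical fn_ClassX from earlier: each proc returns whatever the function exposes, so a column added to the function flows through to all of them (with the SELECT * refresh caveat raised earlier in the thread):

    -- Every data-access proc follows the same template: filter, then
    -- return everything the function exposes.
    CREATE PROCEDURE dbo.GetClassX_ById
        @Id int
    AS
    BEGIN
        SELECT *                      -- result shape is defined by the function
        FROM   dbo.fn_ClassX()
        WHERE  Id = @Id;
    END
    GO

    CREATE PROCEDURE dbo.GetClassX_All
    AS
    BEGIN
        SELECT *
        FROM   dbo.fn_ClassX();
    END
    GO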

  • Lynn Pettis (5/9/2012)


    Putts (5/9/2012)


    GilaMonster (5/9/2012)


    Putts (5/9/2012)


    Any change to the table then forces you to refresh all your SPs.

    Not at all. The only time a procedure will need refreshing after the table is changed is when SELECT * is used.

    I'm not seeing a good reason for the multiple levels of indirection. Any changes to the base table should be preceded by impact analysis and followed by proper testing, so there should be minimal chance of anything getting missed, and if something does get missed it gets picked up in testing.

    Typically, our changes are "this column needs to be added" or "this column needs to be removed."

    Walking through a sample scenario of "ColumnB is added to TableX"

    First, we add the column to the table.

    Then there'll be one method in code that converts data from TableX to an instance of ClassX. So we'll update that.

    We'll also find the 2 or 3 SPs that Insert/Update and Delete rows from TableX and modify them accordingly.

    We'll then find the corresponding code methods that call those SPs and modify them to send the new value for ColumnB.

    No matter how we do the rest, those steps are consistent. What remains is a choice between:

    Modify one Function to now include ColumnB in the Select.

    ... or ...

    Find all references in any SP to that table and then go through each of those SPs and make the necessary change. ... And hope you don't miss one.

    I'm not following why people would prefer option B in that scenario.

    You still have to identify every stored procedure that now returns the new column, so you can modify the code that calls it to account for that column. So how does that save time?

    Why do this:

    Table --> View (or iTVF) --> stored procedure --> App

    When you can do this:

    Table --> stored procedure --> App

    You haven't justified the need for the extra layer of indirection.

  • SQLKnowItAll (5/9/2012)


    I think the "real" problem here, which you are avoiding, is that the database schema keeps changing. If it had been designed properly to begin with, you would only have a minimal number of columns being added and you would not have any of these problems. I've worked on just shy of 500 projects with .NET developers and cannot think of one case where what you're doing would have been a good solution to anything.

    ... then I need to work where you work. Cuz where I work we build the applications according to the specs and then, 2 months later, the specs change and we have to adapt.

    At this point you guys are apparently just trying to make me feel inferior about the environment in which I work and are not actually addressing the question. Are there any negatives to replacing Views with single-statement UDFs?

  • Putts (5/9/2012)


    You are just making statements now with no backing.

    What is the point of a table-valued UDF if not to return a resultset? Is that not the same purpose of a View? What's the difference and how is that not using it the same way?

    Is there some fact behind the claimed loss of performance? That is what I am looking for. Everything I've read says that a single-statement (inline) UDF should perform the same as a View. If you have some information that goes against that, then that's exactly what I'm looking for.

    Maybe you missed the post in which Gail said that using a table-valued UDF causes a table scan every time. There are plenty of articles on how to use TVFs efficiently, so you can google that on your own. The short answer to "What is the point of a table-valued UDF if not to return a resultset? Is that not the same purpose of a View?" is that you are reducing each object to its output and assuming that tells you its purpose. Just because they both return a result set and can be treated as a table does not mean they are meant to be used for the same purpose.

    Jared
    CE - Microsoft

