May 4, 2009 at 8:47 am
I've recently joined a company where it seems like they make excessive use of functions (table, scalar). When should a function be used.
The same is true of the stored procedures.
Both the stored procedures and the functions have at least 6 joins and I've seen them have around 12 joins. Everything I've ever learned has told me to stay away from so many joins.
Are views the answer?
If not, can anyone give me any direction on this. I don't want to continue something if that is isn't the correct way to go about it. I often get lost in all of the joins.
Thanks,
Debbie
May 4, 2009 at 8:59 am
The views won't save you from performing the joins either. They'll just hide the fact that there are so many joins to produce that recordset. Joins aren't a bad thing as long as the table design (normalization, indexing) is sound.
I do get nervous when I see prolific use of functions, especially with lots of table access. This is often a sign of massive RBAR processing.
I would be much more concerned about how they are using functions then SPs. Creating views may clean up your SP code, but from what I've experienced, creating views to 'hide' your joins makes it easy for other developers to then start joining your views to each other or using view references in your functions. This can lead to a big mess.
I would go through some of the more heavily used functions and look at the queries that call them and their execution plans. You may find quite a bit of room for improvement. I realize that some of what you are seeing may be too ingrained into the system to rewrite, but you may find room for improvement with indexing.
May 4, 2009 at 9:00 am
"Too many joins" is when either (a) it gets the wrong results, (b) you're pulling data you don't actually need and it impacts performance.
I have a query that uses 18 joins right now. It does what it's supposed to and performs well (proper indexing makes a huge difference for this), and it's not getting anything it doesn't actually need for the job it's doing.
I don't use UDFs for much, because, other than inline table UDFs, they usually perform horribly. Inline table UDFs can usually be accomplished more easily with a temp table, CTE, or an inline cross-apply, and will often perform better with one of those solutions, so I don't even use those very often.
Views are a useful shortcut to having to write the same joins over and over again, but they often end up as performance killers because devs will tend to build "one view to rule them all" solutions, and then use a ten-table join with 50 columns selected, to pull data from two tables that can be joined directly. (The next line would be, "one view to bring them all and in the darkness bind them", which seems to be a common issue.)
So, basically, it depends on what you're doing. But overuse of UDFs is usually caused by the misconception that "they're better than views because you can pass parameters to them", and "you can reuse them for all of your procs, which makes those easier to program".
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 4, 2009 at 12:27 pm
Just to pile on a bit & reinforce what everone else said, when possible, avoid table valued functions like the plague. They are serious performance killers because SQL Server cannot maintain statistics on them. Because of this, they're treated as single row tables in the execution plans. As long as they only move a few rows, this doesn't seriously impact performance, but as soon as they start moving hundreds, or sometimes even tens, of rows, performance goes right out the window.
12 table joins... We had a query that was joining 80 tables. That was because of some bad design. That's been cleaned up and now it only joins 40 tables. It runs sub-second most of the time. Good indexing and proper code will see you through. Don't count the number of joins, look to see that they're using indexed columns and are avoiding key lookups where possible. That's more important.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 4, 2009 at 12:32 pm
Just to add a little to my dilemma, what about if we are creating data models for using Report Builder?
May 4, 2009 at 2:56 pm
Whatever data model is better than no model at all. I would try to review data model and understand what developer wants to get using function/store procedure. I’ve seen many times that the same result may be achieved much easier. However, while learning data model you could find that the data model is not good for the database goal. Usually this is the root of the excessive joins…
Alex Prusakov
May 4, 2009 at 6:25 pm
Like Grant said... "Just to pile it on"...
I agree with what the others have said. While the use of UDF's can make for some very RAD (Rapid Application Development) and have the added benefit of getting all the Developers doing the same things the same way, there can be great danger so far as performance goes.
For example, having a function that strips the time off a date might sound like a good idea, at first... that's until you do an analysis and find out just how much overhead that costs instead of just teaching the developers how to do it right with...
DATEADD(dd,DATEDIFF(dd,0,somedate),0)
I'd also have to say a function that calls another function is a severe warning sign that things have gotten out of hand. Same for views of views or aggregated views. Aggregated views can be particularly nasty if someone joins to one of the aggregated columns or uses it in a WHERE clause as in...
AND someaggregatedcolumn 0
The proper use of Temp Tables, CTE's, and the other things mentioned are usually much more effective in the "Divide'n'Conquer" methods of effectively reducing the number of joins and the hidden RBAR of internally generated rows to support what would otherwise be classified as "monster join queries".
Doing everything in a single query is NOT the definition of "good set-based programming". 😉 Neither is using a pot wad of UDF's or views.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply