October 17, 2007 at 8:39 pm
A lot of answers are mentioning that LEFT OUTER JOINs are slow and sometimes indeed they are.
But what if we really have to write a query like this:
SELECT A.some_fields, B.some_fields
FROM A
LEFT OUTER JOIN B
ON A.f1=B.f1 AND func(A.something)=true
is there any other way to write this query in order to make it as fas as the same query with INNER JOIN could be?
Usually this kind of query could be faster than the first one :
SELECT A.some_fields, B.some_fields
FROM (SELECT some_fields FROM A WHERE func(A.something)=true)
LEFT OUTER JOIN B
ON A.f1=B.f1
But it's still not as fast as the equivalent INNER JOIN query.
Any help would be greatly appreciated - I have to write tons of queries based on LEFT OUTER JOINs .... and in 90% of them I can't use an INNER JOIN ... :-((
By the way, the theory says that any query containing sub-queries could be re-written without subqueries and we are strongly advised to do so in order to improve performance. But for god's sake, how could we re-write this second query without subqueries and make it even faster?
Thank you again for any help!
October 17, 2007 at 9:42 pm
SELECT A.some_fields, B.some_fields
FROM A
LEFT OUTER JOIN B ON A.f1=B.f1
WHERE func(A.something)=true
Clause "func(A.something)=true" is big NO-NO in queries.
Results in table scan every time.
Rethink your query to avoid such calls.
_____________
Code for TallyGenerator
October 17, 2007 at 9:55 pm
Sergiy (10/17/2007)
SELECT A.some_fields, B.some_fieldsFROM A
LEFT OUTER JOIN B ON A.f1=B.f1
WHERE func(A.something)=true
Clause "func(A.something)=true" is big NO-NO in queries.
Results in table scan every time.
Rethink your query to avoid such calls.
So what if you have an index on the column, would it not just be an index scan, which isn't that bad?
October 17, 2007 at 11:20 pm
Max Yasnytskyy (10/17/2007)
So what if you have an index on the column, would it not just be an index scan, which isn't that bad?
Index contains values from the column.
And it's not the values in the column you evaluate in such WHERE clause.
There is no way index can help here.
So, it will be table scan anyway (or clustered index scan - it's effectively the same because leaf level of clustered index contains entire table).
_____________
Code for TallyGenerator
October 17, 2007 at 11:22 pm
Depends... index scans can be worse than a table scan because even after the index scan, it still has to get the data.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 18, 2007 at 2:13 am
Just being picky, but...
There's no such thing as a LEFT JOIN being 'equivalent' to an INNER JOIN.
Those are quite different, different questions asking for different answers and works in different ways.
Also, LEFT JOINS per se aren't inherently slower than anything else.
For a given individual query, LEFT JOIN may perform better than a NOT EXISTS/EXISTS or correlated subquery construct, for others not. And over time, what used to be 'best' may change when your data grows or it's density changes. Databases are living breathing things, and what is 'good' today may be 'bad' tomorrow.
Query-tuning and index strategies is an evergoing process as time passes. (for most db's anyway)
In this case, the function seems to be 'the bad guy'.
Try to write your query without the function, incorporate it's functionalty into the query directly and see if it improves?
/Kenneth
October 18, 2007 at 7:57 am
I was a little misunderstood - func(A.something) is not an UDF ... just something like A.some_field = a_value or something like that, I know that UDF are not a very good idea in a WHERE clause ... 😉
October 18, 2007 at 1:58 pm
System functions are as bad here as UDFs.
But it does not mean you cannot use functions in WHERE clause at all.
You should not use it to modify values in table columns.
_____________
Code for TallyGenerator
October 19, 2007 at 1:38 am
I have the same problem with a query which has many LEFT JOINs. And for some LEFT JOINs I have bad seek index. For example:
SELECT s.nrintstoc, s.tipdoc FROM
stocuri s
left join nir n on s.nrintdoc=n.nrintnir and s.tipdoc='NR'
left join bon b on s.nrintdoc=b.nrintbon and s.tipdoc in ('BP', 'bc', 'BT', 'BA')
From Execution Plan I deducted that I have problems on seeking column tipdoc - char(2). nrintdoc, nrintnir and nrinbon are integers (int).
How to rethink this query or structure of tables?
October 19, 2007 at 2:08 am
Is the 'problem column' indexed in any way?
If it is, is the statistics up to date?
How many rows are expected to have this value? ('NR' was it?)
If it's too common (ie too low density) then an index won't be used anyway, because in such cases a scan is cheaper than repeatedly accessing the index.
/Kenneth
October 19, 2007 at 2:19 am
Yes, i already update statistics for stocuri table and have indexes on pairs of indexes like nrintdoc and tipdoc and nothing changes.
October 19, 2007 at 2:20 am
Sorin,
Your query returns columns only from table "s".
Why you need those joins at all?
_____________
Code for TallyGenerator
October 19, 2007 at 2:25 am
I put only few columns, you right, I have many columns to return, but here I show you only involving columns in this bad issue.
Also, the count of values for tipdoc are the following:
'NR' - 248.000
'BP' - 51.000
'bc' - 401
'BT' - 567.000
'BA' - 0
October 19, 2007 at 2:34 am
Well, according to these counts (they are the complete table?) it looks like your qyery is hitting just about every row. I'd expect this table to be scanned, since you're obviously matching the entire table anyway.
The performance will then be dictated on how fast your drives are (for the scan) and how large your data cahce is (RAM on the server)
btw, how slow is 'slow'..? What's the current timings that you want to improve.
Also, what is the total rowcount returned from this query?
/Kenneth
October 19, 2007 at 2:42 am
I tested the execution of stored proc which contains this query on my workstation (1GB RAM) and time is about 5 minutes and returns about 180 rows.
Also, rowcounts for tables are :
stocuri - 870.000
bon - 650.000
nir - 92.000
When it is executed on production server (4 processors and 4 GB RAM but spindles are slow - 7200 rpm) the execution is very slow according with concurential access of db server (about 250 users) - about 20 minutes or slowest.
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply