September 3, 2017 at 6:00 pm
Two tables each with 6.4 million records in SQL Server 2012. Let's say A and B. I have to join two tables based on two columns. Lets say A.C1 = B.C1 and also A.C2 <> B.C2. Both C1 and C2 are nvarchar columns. Table B does not have cluster index. I have the query as show below:
Select B.C1...A.C12, A.C2...A.C5 from A inner join B on A.C1 = B.C1 Where A.C2 <> B.C2
This is taking a more than 16 min to 30 min. It does a full table scan on B as other columns from B. When it takes more than 30 mins, the query timeouts. Can you please let me know what is the best way to write this query.
September 3, 2017 at 10:21 pm
With no indexes to help it, SQL has to do a full table scan.
Add an index to B
Without the full query and table DDL, it is almost impossible to suggest a solution.
Based on the information at hand, this would be a startCREATE NONCLUSTERED INDEX [IndexName] ON B(C1, C2)
September 4, 2017 at 1:35 am
Since there are no filters other than the join, the most efficient way to execute that is probably with two full table scans. Do you really have no further restrictions on the data you want returned?
And 30 minutes???
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 5, 2017 at 8:36 am
As previous answer said there's not much you can improve without using indexes. The problem is in the very business rule and in the naure of the data.
But maybe there's one untried solution. You can try to create a persited view (with schemabinding). Note creating this view will increase the DB size (causing potentially problems) and take an anwfull time to complete. The idea here is to "caching" the results so saving your CPU/ RAM at the expense of your IO/HD.
Note this can help the select but can negative impact insert/update/delete in that two tables.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply