Performance when using UNION in a view

Question

Post reply

Performance when using UNION in a view

Out For Justice

SSCrazy

Points: 2733
More actions
November 23, 2006 at 10:16 am

#116350

Hi SQL Gurus,
I have 2 databases which have identical schemas. They contain client data that must remain segragated.
We are running some analysis that goes across both these databases where I hit on a performance question.
In Query Analyser if I run the query
USE database3
SELECT * FROM database1.MyTable
UNION
SELECT * FROM database2.MyTable
the query takes around 300 seconds.
I also have a view which is defined as
USE database3
CREATE VIEW MyView AS
SELECT * FROM database1.MyTable
UNION
SELECT * FROM database2.MyTable
If I run the query
USE database3
SELECT * FROM MyView
the query takes 360 seconds, around a minute slower. Why is this? I would have expected the view to be in some way optimised. I certainly wouldn't have expected it to be slower. I have run some timing tests around 20 times and get consistent results.
Does anyone know why this is, or can point me to any relevant documentation?
Each table has around 10 million rows. We are using SQL 2000 Enterprise running on Windows 2000
Thanks in advance!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

Ninja's_RGR'us

SSC Guru

Points: 294069

November 23, 2006 at 11:13 am

#673729

What's the difference in the execution plans?

andrewkane17 SSCertifiable Points: 7529 More actions · Answer 2

Do each of the seperate queries that the union combine return values distinct from each other, if so you should use 'union all' this will just union the results, whereas 'union' alone will sort the result sets and eliminate duplicates. This sort takes a lot of processing.

Out For Justice SSCrazy Points: 2733 More actions · Answer 3

UNION ALL fixes the problem! I hadn't appreciated the difference between UNION and UNION ALL. I knew I must be doing something wrong!

Thanks for very much for your help.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 4

Ninja's_RGR'us

SSC Guru

Points: 294069

November 24, 2006 at 6:42 am

#673787

So what are the final run times?

Out For Justice SSCrazy Points: 2733 More actions · Answer 5

Final times were basically all the same at around the 300 seconds mark. It would be nice if it were faster, but as this is for our own internal purposes, it doesn't really matter.

Joe,

We have separate databases because each database represents the data from a different client. We process data for a large number of external clients. If any client saw another clients data, we as a company would be royally screwed! Without going into any detail about how our app works, we regularly need to import client data, process and back up, but often we will need to restore from a previous back up a day or so after processing; if all data was in the same database, any restore would overwrite any updates from other clients. Secondly, we have something in the region of a terabyte of client databases, they are much easier to manage at the file level as individual databases. We have more than one SQL Server, so small databases allows us to do load balancing when required; we can physically move a database from one machine to another if we want to. Thirdly, we have also found that as the databases grow larger, their performance degrades, so separating them out makes sense from that angle also.

We only go across the databases for our own internal auditing purposes, which is not part of the process for our clients. No data appears in 2 places (I'm not sure why you think it is) this is the whole point of using a view!