March 1, 2007 at 5:17 am
I am in the process of moving a large amount of data from an Oracle database to SQL 2000. Some of the datatypes after the DTS to the staging area are larger than they need to be, so I am looking into converting them to smaller types. For instance, the PK on one of the tables is a decimal(30,0) (17 bytes of storage), and a bigint would be enough (8 bytes). So, I created 3 tables of subsets (10,000,000 rows each) for testing. Table1 has no conversions, Table2 has only the PK converted from decimal(30,0) to bigint, and Table3 has the PK converted along with some other numerics, and any varchars shorter than 5 converted to chars. I then clustered each on the PK and ran a basic count distinct on a couple of columns, and summed a couple of columns (just to check table access speed).
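In case it helps, the conversions were done during the select into, roughly like the sketch below; the table and column names here are placeholders, not the real ones:

-- Rough sketch of how Table2 was built (all names are placeholders).
-- The PK is cast from decimal(30,0) down to bigint during the SELECT INTO.
SELECT
    CAST(pk_col AS bigint) AS pk_col,  -- 17 bytes -> 8 bytes
    field1,
    field2,
    field3,
    field4
INTO Table2
FROM StagingTable

-- Each test table was then clustered on its PK the same way.
CREATE UNIQUE CLUSTERED INDEX IX_Table2_pk ON Table2 (pk_col)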
As expected, the query ran much faster on Table2 than Table1. However, the query ran slightly slower on Table3 than Table2... not what I expected, since the row size of Table3 is smaller...
The costs of the queries in the estimator are
Table1 1540
Table2 1517
Table3 1366
and the run time of the queries were
Table1 4:46
Table2 3:49
Table3 3:57
I've repeated these tests multiple times with the same results. All tables are in the same db and in the same filegroup. Any thoughts on why this might be is appreciated.
Thanks...
March 1, 2007 at 7:15 am
It's hard to answer that one without seeing the test script you used. Can you provide it to us?
March 1, 2007 at 7:19 am
If you altered the varchar columns to char 'in situ' you might want to create a new table (Table4) with the columns in their final datatypes and copy the data from the original (Table3) to the new table.
If you dropped varchar columns, DBCC CLEANTABLE might help on the original Table3.
It 'feels' like [now] empty space in the table has not been reclaimed, so the volume of data read has not changed.
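Roughly, the two options look like this (the names are just placeholders, since we haven't seen the real table definition):

-- Option 1: rebuild into a new table with the final datatypes, then drop/rename.
SELECT
    CAST(pk_col AS bigint) AS pk_col,
    CAST(code_col AS char(4)) AS code_col,  -- formerly varchar(4)
    other_col
INTO Table4
FROM Table3

CREATE UNIQUE CLUSTERED INDEX IX_Table4_pk ON Table4 (pk_col)

-- Option 2: reclaim space left behind by dropped variable-length columns.
DBCC CLEANTABLE ('YourTestDB', 'Table3')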
HTH
March 1, 2007 at 7:42 am
To create the 3 tables, I just used select into from the source database to 3 new tables in a new database. The first had no conversions, the second had 1, and the last quite a few more. Then, created clustered indexes on each. And ran this on each:
select count(distinct field1), count(distinct field2), sum(field3), sum(field4) from new_table
An sp_spaceused shows the following data size for each table:
Table1 5077672 KB
Table2 5034664 KB
Table3 4490568 KB
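For reference, those figures are the 'data' column from sp_spaceused, run as below; the 'reserved' and 'unused' columns in the same output should show whether there is unreclaimed space hanging around:

EXEC sp_spaceused 'Table1'  -- repeated for Table2 and Table3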
March 1, 2007 at 7:50 am
There's a lot more to it than that when it comes to performance testing.
Are you using a SQL Server on a local PC (so the network can't be an issue)?
Are you emptying the procedure and memory cache on the server before each run (see the commands after this list)?
Are you sure you have the exact same data in all tables (except for the datatypes)?
Have you defragmented the db before doing those tests (I'd be surprised if it had a huge impact... but it may explain a few secs)?
Are you selecting the data or simply reading it on the drives (local memory becomes a factor when presenting the data)?
I'm sure there are more explanations, but you get the idea.
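For the cache point, something along these lines on SQL 2000 should do it (test server only, and you need sysadmin for DROPCLEANBUFFERS):

-- Throw away cached execution plans.
DBCC FREEPROCCACHE

-- Flush dirty pages to disk first, then empty the buffer pool so the next run starts cold.
CHECKPOINT
DBCC DROPCLEANBUFFERS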
March 1, 2007 at 8:10 am
Thanks for the tips... I did not empty the mem cache; I will try that. However, I did create a new db just for this purpose, used the exact same data, and am not returning data, just counts and sums. I ran it a few different times, and only when no one else was using the server (and no jobs running). It is over a network, but each run returns the exact same run time.