July 8, 2021 at 2:38 pm
Hello Forum,
Inside my database server I have 3 harddisks. Sadly there is no more free space to mount additional disks. So I set up:
* Disk 1: OS, SQL-Server executables incl. master, model, msdb and tempdb
* Disk 2: mdf for User DB
* Disk 3: ldf for User DB
For testing I installed Northwind completely on Disk1 (I mean: mdf and ldf both on disk1)
and a copy of the same DB to Northwind2 with mdf on disk2 and ldf on disk3.
I created a Stored Procedure with 2 loops:
1) Loop through all tables
2) Loop through all columns in this table and execute "SELECT * from table ORDER BY column" for each column
I exec this SP multiple times. My expectation was that Northwind2 with mdf and ldf on different disks should be faster, but I realize the opposite: the queries runs for about 2-3 times longer than the queries on Northwind (the database which lives entirely on disk1).
Is this due to old hardware (perhaps disk1 is faster than the other two?), or what can be reason?
Or is my test nonsense? ...
BTW I run this on Express-Edition
July 8, 2021 at 3:16 pm
Performance has a lot to do with page caching and how much memory has been allocated for use by SQL Server. Logical reads (prefetched from memory) are much faster than non-cached physical reads from disk, and you want a high cache hit ratio. However, looping across all tables, selecting * for all rows, that's not a real world data access pattern and it wouldn't leverage cache efficiently because it would constantly be swapping out pages. Actually, if your current test is just selecting data, not writing data, then the log files don't even come into play in any significant way. I'd expect the log drive to be barely utilized.
You may find that your new database configuration is faster than the original when performing a test like measuring time required to insert 100,000 rows into a table or executing a procedure call 100x using a randomized set of parameters. Choose a stored procedure from that Northwind sample database that performs both query and insert / update operations, so you're measuring the combination of both things.
The following link is for a tool you can use to perform query stress tests, where you configure number of executions and parallel threads and then it provides metrics for seconds duration and total logical read count.
https://www.mssqltips.com/sqlservertip/2730/sql-query-stress-tool/
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
July 8, 2021 at 8:55 pm
I would like to add to what Eric said.
I think your test case is flawed. You are not simply testing if having the LDF and MDF on different disks is "faster", you are also checking if Disk1 or Disk2 is faster. A better test case for you would be to make sure the MDF is on the same disk for both test cases. What I mean is put both MDF files on disk 2. Then with test case 1, you have the LDF on disk 2. With test case 2 you have the LDF on disk 3. This way any performance differences due to hardware will not have as much of an impact.
Comparing performance of SQL across different disks is going to give you different performance. What I mean is if disk 1 is a 10,000 RPM high performance disk, it will perform better than a 5400 RPM "eco friendly" disk (My understanding is eco friendly disks are usually quick to sleep and slow to wake with the idea that it saves power, whereas high performance disks will rarely sleep). OR, if it is 1 disk with 3 partitions, that will create odd overhead too. Or if disk 1 has near 0% fragmentation and disk 2 is highly fragmented you may have performance differences. Now if they are the same spec and same age of disk, then performance on the disks should be similar. I would do a test with HDTUNE to rule out (or prove) disk I/O as being the cause of the performance issues.
I am also assuming that when you say "hard disk" you are meaning a HDD and not an SSD. HDTUNE can still do benchmarks on your disks though so you can try to rule out disk I/O.
But, assuming the disk is not the cause of performance issues, it could be any number of things. If you are running this on from a client connecting to the server (ie a 2 machine setup), it could be something else is hogging some of the bandwidth while you are running your tests. If it is running on the same physical box, it could be that the antivirus kicked in during the second query run. Depending on how the query is ordering the tables it is gathering in the loop, it could be that they are coming back in a different order. What I mean is how the loops are ordering the data that comes back. It MAY be that in disk 1 version it is pulling it back in an efficient manner based on the order it is on disk resulting in less head movement but on disk 2 it is grabbing tables from the start and end of the disk. It MIGHT not even be SQL Server that is slower; it could be SSMS if you are presenting the results to grid. SSMS can be a bit of a resource hog and can cause odd performance issues if you are dumping a lot of data to grid. results to text is better, but again, if you have too many results it can get clunky there too.
If you really want to narrow down which is "faster", you are going to need a lot more metrics and will need to determine what you are wanting to check for being "faster". Are you comparing SELECT performance only? What about INSERT, UPDATE, and DELETE? As Eric said, if you are ONLY checking SELECT, your log file could be on any disk and the performance shouldn't really change.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
July 13, 2021 at 9:14 am
Thank you both.
As always very helpfull comments - I like this forum 🙂
July 13, 2021 at 9:36 am
Of course, when I was doing INSERT, UPDATE or DELETE then I see the difference when ldf is on different disk
I should have thought more on my loop-select-procedure before writing my first post 😀
While a transaction log captures DML Operations, Selects are not logged to transaction log since SELECT don't change the state of data and also there's no way to rollback a select statement. And so my loop() with doing SELECT is not very meaningful.
Sorry for the noise
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply