October 13, 2017 at 8:37 am
Hello experts,
Could anyone recommend what kinds of training I should look into to improve my understanding of SQL Server from the Windows and networking side?
Until now I have been more of an accidental/development DBA, with our systems side handling most of the networking, OS, and clustering work - including installing the server OS and doing the clustering and network configuration. But I want to learn more about those aspects as they relate to SQL Server because, frankly, I feel ignorant about too much of that stuff, and I want to be able to better investigate issues even when I have to report them to the other team for them to resolve.
Right now I have a basic understanding of navigating the Failover Cluster Manager and a somewhat better understanding of the SQL Server Configuration Manager, but I want to be better trained in depth on how clustering and networking work when employed for SQL Server.
Thanks for any pointers - including training sites or courses that would be useful.
- webrunner
-------------------
A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html
October 20, 2017 at 11:17 am
That is a lot to bite off. If you want basic IT, I'd recommend looking at getting your A+ certification. If you want to specialize in servers, then I think you would be looking at MTA, MCSA or MCSE.
https://www.microsoft.com/en-us/learning/browse-all-certifications.aspx
That is a good place to look, but it is a LOT to learn. The other option would be to talk to your server guys and ask them to train you on simple tasks and slowly get more advanced tasks.
Where I work, I tend to keep clear of the actual OS side of things UNLESS I need to tweak something, like opening a port in the firewall or installing an update. The actual setup of the server is handled by our Server guys, who do it based off of a WIM image that they built up so new physical systems can be up in a few hours with all of the proper patches and settings applied to them. Then we use GPOs to apply the remaining settings that can't be captured in a WIM image or that have changed since the image was captured. I get the system after it is patched and set up with the default software set. But the network config is handled by IT; the clustering is handled by IT; installing the OS is handled by IT.
In the event I have issues, my usual first spot to check is logs, but it depends on the issue. Sometimes I use DMVs, sometimes perfmon, sometimes logs... sometimes I run over to our IT department to have them check things out. Recently we had VMware update virtual network drivers while those systems were in use. After the update, the network interface still answered pings, so our heartbeat monitors said things were peachy. RDP and IIS stopped accepting connections though, and the systems had to be rebooted. But that fell onto our IT department, not me. I noticed the problem and alerted the proper people to investigate it.
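To illustrate why a ping-only heartbeat can miss the kind of outage described above, here is a minimal Python sketch of a service-level check: it attempts an actual TCP connection to each port instead of relying on ICMP. The function names and the `DEFAULT_SERVICES` mapping are my own illustration, not anything from our monitoring setup; the ports shown are only the usual defaults (3389 for RDP, 80 for IIS over HTTP, 1433 for a default SQL Server instance) and may differ in your environment.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake, which a ping does not.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(host: str, services: dict, timeout: float = 3.0) -> dict:
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: tcp_port_open(host, port, timeout) for name, port in services.items()}


# Assumed default ports; adjust for named instances or non-standard configurations.
DEFAULT_SERVICES = {"rdp": 3389, "iis_http": 80, "sql_server": 1433}

# Example call (hypothetical host name):
# check_services("sqlnode1.example.local", DEFAULT_SERVICES)
```

A result where every port reports False while the host still answers ping is the sort of evidence that is useful to hand to the team that owns the server. Note that an open port only shows the listener accepted a connection, not that the service behind it is healthy.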
Nothing wrong with knowing more, but if you pile too much onto your plate, you will find yourself overburdened rather quickly and work will become a lot more stressful. I know how to do some of the IT-related tasks, and I even have permissions to do them, but we try to keep separation of duties. That is, if it is an IT problem, I talk to IT to get it fixed, not patch it myself. If I know the solution, I will recommend it to IT, such as "can I get you to reboot this server? I am unable to RDP or access it through the web interface." or "is something wrong with one of the DCs? I seem to have lost the ability to log on to the web portal which connects to <domain controller>." Both of these have happened to me. I find it is good to know what to look for, which is where an A+ certification would be beneficial. But it is still better to know who to talk to when things go sideways.
Likewise, if the database starts behaving strangely and your systems guys notice it first, they should talk to you, not go in and disable "optimize for ad hoc workloads" (for example).
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there are negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
October 20, 2017 at 1:23 pm
Thank you for this info! It's exactly the high-level breakdown that I needed. Our setup sounds a lot like yours. I definitely am not expected to handle the server builds or other OS-level things that our systems team handles. But I would like to be conversant in what is happening in a Windows server, in a cluster, and generally on our network so I understand more than just ping and telnet usage.
I really appreciate your feedback because it doesn't seem that there are ready-made network and OS-related "curriculum guides" - for lack of a better term - with regard to DBA-type work.
Thanks again.
- webrunner
October 20, 2017 at 2:24 pm
Yeah, I would agree with that.
My knowledge of the IT world comes because I worked in IT prior to moving to being a software developer (mostly web based) and finally a DBA. One of the biggest things with debugging server issues is permissions. There have been several times where I go in to try to debug an issue only to find out I have no permissions to even connect to the server in question.
The Windows-related stuff that I do is quite limited. I'll ask IT about firewall ports (as we have had GPOs close off SQL ports before) or, in the case of SSRS, about adding something to trusted sites.
My biggest problem with having a wide skillset is that I feel I am not specialized in any one thing. And the more I learn about one, the more I forget about another. Since I am a DBA at work, my primary focus is on DBA-related things. If something feels like it MIGHT not be a DBA problem, I will ask IT to help with it while investigating the DBA angle too. The more eyes on a problem, the faster it can get solved... sometimes. I've also had cases where we step on each other's toes OR the problem gets resolved and not everyone is notified.
Communication is key in the technology field, and everyone in the IT/IS department needs to work as one unit as much as possible. The other day, I overheard one of the IT guys say he couldn't connect to VMware's vCenter. My first step was to check that the database was up and that there was nothing crazy going on on that server. That way, if they suspect it might be a database issue, I can rule it out. OR if I determine it IS a database issue, I can let them know and they can focus on other tasks while I resolve the problem. Thankfully it was something simple (the SSL certificate had expired and had since been renewed, but he hadn't installed the new certificate on his machine).
With IT problems, be it server or software or database, a lot of times it is hard to get proper training on the problem. I have my mental note of things to check when things go bad with a database. But it all depends on the problem. IT problems are very similar.
As far as I know, there are no "Windows Server for the DBA" guides anywhere. What I would suggest is that if you suspect the network may be the problem, send a notice to your networking guys telling them that you are investigating it, and ask them to look into it from their end as well. Worst case, you waste a little of someone's time. Best case, they catch the problem before you dig too deeply. I know some wait stats can be indications of hardware problems such as CPU, memory, and network bandwidth, so having some of those numbers to toss at the server guys could be helpful. BUT, if I remember right, they don't ALWAYS indicate a problem in that area. For example, if you are using a linked server to pull 1 TB of data across the network to process and store in a data warehouse, the wait stats may look like a network bottleneck. That makes sense, as it is a lot of data, but maybe there is a better method of pulling that data across. If that process were better optimized, you might only be pulling a few hundred MB at a time, and the network bottleneck would go away.
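As a rough sketch of how wait stats could be turned into numbers to hand to the server guys, the Python below groups wait types into broad categories and reports each category's share of total wait time. It assumes you have already exported `wait_type` and `wait_time_ms` from `sys.dm_os_wait_stats` as (name, milliseconds) pairs. The category mapping is my own simplified and deliberately incomplete assumption, and the sample figures in the usage comment are made up. As noted above, a large share is a hint about where to look, not proof of a hardware problem: `ASYNC_NETWORK_IO`, for instance, often just means the client is slow to consume rows.

```python
from collections import defaultdict

# Assumed, incomplete mapping of a few well-known wait types to broad areas.
WAIT_CATEGORIES = {
    "SOS_SCHEDULER_YIELD": "cpu",
    "RESOURCE_SEMAPHORE": "memory",
    "ASYNC_NETWORK_IO": "network",
    "PAGEIOLATCH_SH": "disk",
    "PAGEIOLATCH_EX": "disk",
    "WRITELOG": "disk",
}


def summarize_waits(rows):
    """Return {category: percent of total wait time}, largest share first.

    rows is an iterable of (wait_type, wait_time_ms) pairs.
    Wait types missing from WAIT_CATEGORIES are counted under "other".
    """
    totals = defaultdict(int)
    for wait_type, wait_ms in rows:
        totals[WAIT_CATEGORIES.get(wait_type, "other")] += wait_ms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {category: round(100.0 * ms / grand_total, 1) for category, ms in ordered}


# Usage with made-up numbers:
# summarize_waits([("ASYNC_NETWORK_IO", 600), ("SOS_SCHEDULER_YIELD", 300), ("LCK_M_S", 100)])
```

Keeping the mapping in one small dictionary makes it easy to extend or correct as you learn which waits matter on your own servers.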
In the end, I think it turns into a group effort. The biggest thing is learning how to operate as a team and not treat the problem like a hot potato: "Database is slow?" -> "Database is fine. CPU/memory is spiking?" -> "CPU/memory is fine. Network spiking?" -> "Network is fine. Database is slow?" and repeat. The worst thing you can do in a technology field is point fingers. The blame game gets nothing solved.