Introduction
We do not have a designated SQL Server DBA for our Business Intelligence and Data Warehouse group in my company. So far the developers have done most of the DBA work and we are functioning fine. That makes me wonder if we ever need a DBA.
So actually what does a DBA do? Brian Knight wrote an article "DBA Job Description: What type of DBA are you?" He said there were production DBAs and development DBAs. A Production DBA installs SQL server, installs and deploys databases, creates the backup plan, manages security and tunes the physical layer of the database. The development DBA creates logical and physical data models, creates DTS packages and installation scripts, writes stored procedures, and looks at execution plans to improve query performance. He also mentioned there was a new type DBA called Hybrid DBA who performs both production and development DBA work together.
Then is there any difference between a production DBA, development DBA and a data warehouse DBA. What kind of activities is a data warehouse DBA suppose to do? I'd like to give my view of what this type of DBA needs to do.
Data Warehouse DBA
First of all a data warehouse DBA has to understand OLAP architecture design, data warehouse data models, star schema concepts, source data and the ERP system. If the company has a data architect, the data architect would design the data warehouse, data mart tables and the views and the DBA would implement the structure and perform the administration of the system.
The data warehouse DBA is also a production DBA - monitoring the system performance, develop and implement archiving, backup and recovery strategies for the data warehouse, planning storage capacity for scalable database growth.
One of the most important tasks of the data warehouse is the ETL processing. It is either built by in house developers or could use vendor ETL tools. The DBA should have detailed knowledge of how data is loaded into the data warehouse, supports the BI developers to develop and enhance the processes, helps out to test and implements the ETL stored procedures, triggers and views, etc. Another duty is monitored the ETL jobs, automates error handling and optimizes the process for better performance.
The data warehouse is different from the transactional systems. In the transactional systems, the system availability and the response time is critical. However the data warehouse is an ever-growing decision support environment, which contains multiple integrated and reconciled databases for multiple usages. The stability of the data warehouse is very important to the users. The data warehouse usually is opened for read only transactions by the users, but the users may write long, complex queries or transactions to get the data. The DBA may have to understand how the end users access the data and build indexes in certain tables specifically for those long queries.
Since the data warehouse is a decision making system and contains the company's historical data, the data quality is very important. The company depends on it to make future decisions. Since the data comes from the transaction system or external data, if the data in the transaction system or the external system is wrong, it would affect the data warehouse data. An audit process should be used to validate data against external sources and monitor data warehouse quality. Sometimes the DBA may need to work with the developers for any troubleshooting problems and issues, to ensure reliable data availability in the data warehouse.
Most companies also have BI analysis tools and reporting tools to generate reports from the data warehouse. The DBA should be familiar with those tools and have a fully understanding how the tools access the data warehouse data.
Conclusion
A data warehouse DBA performs both production and application DBA work, plus is required to know the company business, the data warehouse architecture and be an expert in the SQL language. Also the job requires excellent communication skills since the DBA has to talk to the data architect, ETL developers and reporting developers. A data warehouse DBA definitely is a team player and has excellent problem solving skills.
No matter if you are a production DBA, an application DBA or a data warehouse DBA, you are responsible for one the biggest assets of the company - the data. A good DBA does not just sit at the desk and get the job done. A good DBA should have a self-motivated attitude, a passion to improve to do a better job and minimize mistakes. Also since the technology world is moving so fast; SQL Server 2005 now includes a BI development studio. So the data warehouse DBA needs to learn .NET programming, how the CLR is used for stored procedure, and how to manage XML communications.