I hope you’re all enjoying the latest rev of the Microsoft Ignite conference as much as I am. I was going through the content catalog for this year’s conference and noticed that while they didn’t have as many data platform sessions as I hoped for this time, one good session did catch my eye. Two of my favorite people in the data community, SolarWinds Head Geeks Tom LaRock and Kevin Kline, presented a session called “Steps for a Successful Database Migration.”
You all know by now that I love sessions that provide real-world advice and experiences, so I watched their session and took some notes. While they focused primarily on cloud migrations, the advice they shared is relevant no matter where your data platform resides, either on-premises or in the cloud.
The primary advice they give is backed by several real-world examples: cleansing your data, assessing your data platform and making sure the destination can accommodate your architecture, establishing benchmarks, managing costs, and having the business validate the data. I migrate a lot of data each year, including a migration just this morning, and these are the tasks that will make or break your data platform migration if they’re not completed.
Cleansing your data
Too many migrations I’ve been a part of have been delayed because the data to be migrated contains historical data the business no longer needs or ever accesses but is afraid to delete. My migration this morning included an auditing table with over 850M records in it, of which over 95% were over three years old. I got the business to agree to archive this data, but only after the migration completes and settles. With the number of indexes present on this poorly designed third-party application database underneath their ERP system, that represents over 400GB of data that must be migrated but that the business will never need.
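If you’re wondering where to start with a table like that, a rough sketch of the approach looks something like the following. The table and column names (dbo.AuditLog, dbo.AuditLog_Archive, AuditDate) are placeholders, and you’d want to size the batches and the retention window for your own environment.

```sql
-- A rough sketch only: dbo.AuditLog, dbo.AuditLog_Archive, and AuditDate are placeholders.
-- First, size up how much of the table falls outside the retention window.
SELECT COUNT_BIG(*) AS RowsOlderThanThreeYears
FROM dbo.AuditLog
WHERE AuditDate < DATEADD(YEAR, -3, SYSDATETIME());

-- Copy the aged rows to an archive table, then remove them in small batches
-- so the transaction log stays manageable.
INSERT INTO dbo.AuditLog_Archive
SELECT *
FROM dbo.AuditLog
WHERE AuditDate < DATEADD(YEAR, -3, SYSDATETIME());

WHILE 1 = 1
BEGIN
    DELETE TOP (50000)
    FROM dbo.AuditLog
    WHERE AuditDate < DATEADD(YEAR, -3, SYSDATETIME());

    IF @@ROWCOUNT = 0 BREAK;
END;
```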
Their advice to cleanse your data and code is sorely needed during real-world migrations. Not only do we want to minimize the volume of data we have to transport, we also need to minimize the time it takes to perform the migration. There are several tricks we can use to make this quick, but we also don’t want to overcomplicate the migration. Additionally, cloud or on-prem, enterprise storage isn’t cheap. Keeping this kind of data around when it’s no longer needed drives up the cost of the data platform. If this is on-prem, that’s a significant upfront cost that probably has to be repurchased every five or so years. If the data platform is in the cloud, you’re paying for this space every single month.
Also, if you’re aware of any simple fixes for technical debt that would both make the data platform more efficient and improve the data architecture, the time to make those improvements is before the migration commences. You don’t want to incur additional costs once you get to the cloud if these items can be easily addressed. Even if you’re staying on-prem, improving the efficiency of the platform can go a long way toward reducing platform costs by improving your virtual machine density and making the most of your SQL Server licensing.
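One example of an easy technical-debt win is finding nonclustered indexes that are written to but never read. A rough sketch against the index usage DMV might look like this; remember that these stats reset when the instance restarts, so check your uptime before trusting the numbers.

```sql
-- A sketch only: surface nonclustered indexes with writes but no recorded reads.
-- sys.dm_db_index_usage_stats resets on restart, so confirm uptime before acting.
SELECT  OBJECT_NAME(i.object_id) AS TableName,
        i.name                   AS IndexName,
        ISNULL(us.user_seeks + us.user_scans + us.user_lookups, 0) AS TotalReads,
        ISNULL(us.user_updates, 0)                                 AS TotalWrites
FROM    sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE   i.type_desc = 'NONCLUSTERED'
  AND   OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY TotalWrites DESC;
```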
I treat compute resources in all platforms like I treat software licensing. Buy what you need, nothing more and nothing less.
Assessing your data estate
This particular task is one I see many organizations struggle with during cloud migrations. The organization wants to modernize a legacy application, so they start looking at options like database-as-a-service offerings. But the legacy application can have several bad practices built into the architecture that will only come to light through significant amounts of testing. It could be they use three-part naming to join multiple databases together. It could be they rely on linked servers as part of a routinely called feature. All of the dependencies and database coding features that can prohibit the use of a particular cloud data platform must be identified and mapped well before the migration starts. Some of these could be easy to work through, but some might be roadblocks to adoption of a particular cloud platform. If you find some of these, it’s much easier to course correct and change the platform of choice before you start migrating and back yourself into a corner.
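As a starting point for that dependency hunt, a couple of quick checks like these can surface linked servers and modules that reference other databases by three-part names. The 'OtherDatabase' string is just a placeholder for whatever database names you’re looking for.

```sql
-- A sketch only: two quick dependency checks for the items called out above.
-- Linked servers defined on the instance:
SELECT name, product, provider, data_source
FROM sys.servers
WHERE is_linked = 1;

-- Modules that appear to reference another database via three-part naming;
-- 'OtherDatabase' is a placeholder database name to search for.
SELECT OBJECT_SCHEMA_NAME(object_id) AS SchemaName,
       OBJECT_NAME(object_id)        AS ModuleName
FROM sys.sql_modules
WHERE definition LIKE '%OtherDatabase.dbo.%';
```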
Generating your benchmarks
The guys focused on getting benchmarks of how the system is performing, and I like to take it a few steps beyond this. Not only do I want benchmarks of how the systems are performing, I also want a solid baseline of the actual resource consumption and demand on the platform before I select the scale of a given cloud platform. By looking at the baseline, you could potentially uncover bottlenecks on the current platform that could be improved now and potentially reduce the scale, and therefore costs, of your destination platform. It’ll also give you a good set of key performance indicators to compare against while doing your cloud testing and after you complete your migration.
I’ve been consulting in this area for over 11 years now, and not having a good indicator of how the system is performing when things are running well gives you nothing to review and compare against if things are running poorly. It’s one of the most consistent oversights in system migrations, both on-premises and in the cloud.
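One lightweight way to start building that baseline is to snapshot the wait statistics DMV into a table on a schedule and keep the history around for comparison. This is only a sketch, and the dbo.BaselineWaitStats table is a made-up name; a real baseline should also include counters such as CPU, memory, and storage throughput.

```sql
-- A sketch only: dbo.BaselineWaitStats is a made-up table for holding snapshots.
IF OBJECT_ID('dbo.BaselineWaitStats') IS NULL
    CREATE TABLE dbo.BaselineWaitStats
    (
        CaptureTime         DATETIME2    NOT NULL DEFAULT SYSDATETIME(),
        wait_type           NVARCHAR(60) NOT NULL,
        waiting_tasks_count BIGINT       NOT NULL,
        wait_time_ms        BIGINT       NOT NULL,
        signal_wait_time_ms BIGINT       NOT NULL
    );

-- Snapshot the cumulative wait stats; differencing two snapshots gives the
-- waits accumulated between captures.
INSERT INTO dbo.BaselineWaitStats (wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms)
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0;
```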
Optimizing for cost
The next is optimizing for cost, which means reducing the compute requirements in aggregate so you can potentially reduce the scale of the platform you’re migrating to. I’d argue this still applies to on-premises migrations as well. Optimizing for cost in the cloud, which is different from performance tuning, means any reduction in the scale of a deployed service leads to decreased costs immediately. Optimizing for cost can also mean tuning to reduce resource consumption on on-premises platforms, so you can get greater virtual machine consolidation and make better use of your SQL Server licensing footprint.
This is certainly easier said than done, especially if you’re running third-party applications, but there are many little tuning tricks on the database side, such as indexes and statistics, concurrency settings, and parallelism settings, that can make a big difference to the way your application runs. And if you own the code and can make changes, there are tremendous opportunities to make the platform more efficient. Making the platform more efficient means reducing resource consumption, and a byproduct of that is making it faster. Everybody wins if you can make it faster and cheaper at the same time.
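As a small illustration of the parallelism settings I mentioned, here’s a quick sp_configure check. The example values in the comments are illustrative only; the right numbers depend entirely on your workload.

```sql
-- A sketch only: review the instance parallelism settings mentioned above.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'max degree of parallelism';      -- shows configured and running values
EXEC sp_configure 'cost threshold for parallelism'; -- the default of 5 is low for most modern workloads

-- Example change only; pick values appropriate for your workload:
-- EXEC sp_configure 'cost threshold for parallelism', 50;
-- RECONFIGURE;
```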
Testing for data fidelity
Business validation of data integrity after the migration is one of the most commonly overlooked tasks I see as well. What if your database provides financial reporting for your organization, and your migration method missed a single general ledger entry? Not only will your end-of-year finances be off, but when you do roll up reporting, your order totals won’t line up with what the general ledger lists.
I know it sounds farfetched, but I see this with migration methods and other tasks in several platforms. How many times have you validated the transactions being replicated with SQL Server transactional replication? What if one record is changed while you’re doing your migration? Is this change picked up properly by your migration method?
Having a business application owner available to validate that the numbers in the data look good during and after the migration is absolutely critical to making sure the business purpose of this data is still intact. Just because a migration method gives you a big green checkbox or a success message doesn’t mean everything is exactly perfect at the destination.
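A simple sanity check I like to run on both the source and the destination looks something like the sketch below. The dbo.GeneralLedger table and Amount column are hypothetical, and an aggregate checksum can collide, so treat this as a smoke test that supplements business validation rather than replaces it.

```sql
-- A sketch only: dbo.GeneralLedger and Amount are hypothetical. Run on source and
-- destination after the migration and compare the results. CHECKSUM can collide,
-- so this supplements business validation rather than replacing it.
SELECT  COUNT_BIG(*)                         AS TotalRows,
        SUM(CAST(Amount AS DECIMAL(38, 2)))  AS AmountTotal,
        CHECKSUM_AGG(CHECKSUM(*))            AS TableChecksum
FROM    dbo.GeneralLedger;
```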
Conclusion
If you haven’t yet watched this SolarWinds database session, the Microsoft Ignite conference is completely free and online, so you can check out this session from anywhere at any time during the conference. I welcome the real-world experience conveyed in this session. Definitely check it out if you can, and I know I can speak for them and say any of us would be happy to answer any questions you have on this topic at any time. Add this session to your Ignite conference backpack here and learn from the pros.