Production databases can get very large.
This in itself is to be expected, but when a copy of the database is needed the database must be restored, requiring additional and costly storage. For example, if you want to give each developer a full copy of your production server, you'll need n times the storage cost for your n-developer team. The same is true for any test databases that are created during the course of your project lifecycle.
If you've read my previous blog posts, you'll be aware that I've been focusing on the database continuous integration theme. In my CI setup I create a "production"-equivalent database directly from its source control representation, and use this to test my upgrade scripts. Despite this being a perfectly valid and practical thing to do as part of a CI setup, it's not the exact equivalent to running the upgrade script on a copy of the actual production database.
So why shouldn't I instead simply restore the most recent production backup as part of my CI process?
There are two reasons why this would be impractical.
1. My CI environment isn't an exact copy of my production environment. Indeed, this would be the case in a perfect world, and it is strongly recommended as a good practice if you follow Jez Humble and David Farley's "Continuous Delivery" teachings, but in practical terms this might not always be possible, especially where storage is concerned. It may just not be possible to restore a huge production database on the environment you've been allotted.
2. It's not just about the storage requirements, it's also the time it takes to do the restore. The whole point of continuous integration is that you are alerted as early as possible whether the build (yes, the database upgrade script counts!) is broken. If I have to run an hour-long restore each time I commit a change to source control I'm just not going to get the feedback quickly enough to react.
So what's the solution?
Red Gate has a technology, SQL Virtual Restore, that is able to restore a database without using up additional storage.
Although this sounds too good to be true, the explanation is quite simple (although I'm sure the technical implementation details under the hood are quite complex!) Instead of restoring the backup in the conventional sense, SQL Virtual Restore will effectively mount the backup using its HyperBac technology. It creates a data and log file, .vmdf, and .vldf, that becomes the delta between the .bak file and the virtual database. This means that both read and write operations are permitted on a virtual database as from SQL Server's point of view it is no different from a conventional database. Instead of doubling the storage requirements upon a restore, there is no 'duplicate' storage requirements, other than the trivially small virtual log and data files (see illustration below). The benefit is magnified the more databases you mount to the same backup file. This technique could be used to provide a large development team a full development instance of a large production database.
It is also incredibly easy to set up. Once SQL Virtual Restore is installed, you simply run a conventional RESTORE command to create the virtual database. This is what I have running as part of a nightly "release test" process triggered by my CI tool.
RESTORE DATABASE WidgetProduction_virtual
FROM DISK=N'C:\WidgetWF\ProdBackup\WidgetProduction.bak'
WITH
MOVE N'WidgetProduction'
TO N'C:\WidgetWF\ProdBackup\WidgetProduction_WidgetProduction_Virtual.vmdf',
MOVE N'WidgetProduction_log'
TO N'C:\WidgetWF\ProdBackup\WidgetProduction_log_WidgetProduction_Virtual.vldf',
NORECOVERY, STATS=1, REPLACE
GO
RESTORE DATABASE mydatabase WITH RECOVERY
Note the only change from what you would do normally is the naming of the .vmdf and .vldf files. SQL Virtual Restore intercepts this by monitoring the extension and applies its magic, ensuring the 'virtual' restore happens rather than the conventional storage-heavy restore.
My automated release test then applies the upgrade scripts to the virtual production database and runs some validation tests, giving me confidence that were I to run this on production for real, all would go smoothly.
For illustration, here is my 8Gb production database:
And its corresponding backup file:
Here are the .vldf and .vmdf files, which represent the only additional used storage for the new database following the virtual restore.
The beauty of this product is its simplicity. Once it is installed, the interaction with the backup and virtual database is exactly the same as before, as the clever stuff is being done at a lower level.
SQL Virtual Restore can be downloaded as a fully functional 14-day trial.
Technorati Tags: SQL Server