Skip to content

Versioning DB Metadata Changes and Performance

Here’s yet another guy with more good ideas on how to version database metadata changes. Unfortunately, the ideal method of database metadata versioning remains to be discovered. The ideas discussed in K. Scott Allen’s series are good from a version control point of view, but they will have serious, negative performance implications in the real world. In particular, this bit:

Once a script is published into source control, it cannot be changed! Once someone updates their database with an update script, they should never have to run the same script on that same database.

Now, I think that, in general, immutability is a good thing, and it goes without saying that back version since source control should be immutable. But these scripts are not backed versions of a file; they’re a list of the DDL scripts which will actually be run on the end user’s server. And there’s the rub. Let’s say that in developing feature 100, you decide that you need to add a new column to a large table:

ALTER TABLE BIG_TABLE
  ADD FEATURE_100_COLUMN MY_DOMAIN;
UPDATE BIG_TABLE
  SET FEATURE_100_COLUMN = DEFAULT_VALUE;

Naturally, you add this script to version control, with whatever method you are using to version your database metadata changes. Later, you work on feature 150, which requires another column on the same table:

ALTER TABLE BIG_TABLE
  ADD FEATURE_150_COLUMN MY_DOMAIN;
UPDATE BIG_TABLE
  SET FEATURE_150_COLUMN = DEFAULT_VALUE;

Simply concatenating these two scripts together, as you must do if the scripts are immutable, results in an update process for the end-user which takes far longer than necessary, as the table must be altered and updated twice. It would be far more efficient to merge the two scripts:

ALTER TABLE BIG_TABLE
  ADD FEATURE_100_COLUMN MY_DOMAIN,
  ADD FEATURE_150_COLUMN MY_DOMAIN;
UPDATE BIG_TABLE
  SET FEATURE_100_COLUMN = DEFAULT_VALUE,
  SET FEATURE_150_COLUMN = DEFAULT_VALUE;

This merged script should run an approximately half of the time it takes to run the two scripts separately.

And yet, as I said earlier, treating the scripts as immutable is a good idea from a source control point of view. In particular, you want to be able to extract the metadata changes used to implement a particular feature by themselves. Hence, it makes sense to use a database metadata version control strategy which treats these two goals separately, instead of trying to combine them into a single feature. It makes sense to me to store these scripts for individual features, and have a separate script used to update a customer database.

One of Scott’s goals in developing a system he advocates is to allow/require developers to work on their own copy of the development database, which they can update along with checking out the corresponding source code, instead of using a shared, centralized development database. Merging the DDL scripts from multiple feature implementations means that one can no longer update one’s personal development database to include arbitrary new features. Instead, it is necessary to start with the metadata for the last released version of the software, and then apply the tip revision of the update script. Many developers might find this inconvenient.

Like I said, the ideal system for this has yet to be invented, as far as I can see.

Post a Comment

Your email is never published nor shared. Required fields are marked *

Bad Behavior has blocked 713 access attempts in the last 7 days.

Close