Our experience with 1440 database migrations



Imagine an Oracle DBA. He is past thirty, a bit plump, wears a vest, carries the secret access to all the databases on a lanyard around his neck, and half a page of his resume is taken up by certificates. Saturday. The big release day. The climax. Time to roll the changes onto the database. He launches sqlplus, presses ENTER, and kilometers of SQL commands rush across the black screen, off somewhere up into the void. Just like in Star Wars. Five minutes later, everything is ready. An hour later, the release is complete. The work is done, the day was a success. Now for a couple of beers.

Monday is quite another matter. It turns out that some of the commands failed with errors, which, however, did not stop the script in its unrestrained rush into the black void. The already difficult task of figuring out what is broken is further complicated by a certain pressure from management. In short, Monday did not go well.

Of course, this is a made-up story. This has never happened to anyone. At least, it could not have happened if work on changing the database schema was organized through migrations.

What is a database migration tool?


The idea of managing database schema changes through migrations is extremely simple:

  1. Each change is recorded as a separate migration file.
  2. Each migration file contains both the forward change and the reverse one.
  3. Migrations are applied to the database by a dedicated utility.

The simplest example of a migration:

-- 20180618152059: create sequence for some_table
CREATE SEQUENCE some_table_seq;

--//@UNDO
DROP SEQUENCE some_table_seq;

This approach provides many advantages over keeping changes in one big SQL file. The mere absence of merge conflicts is worth a lot.

It is all the more surprising that the approach gained popularity relatively recently. It seems that broad familiarity with it came with the Ruby on Rails framework, which had a migration tool built in from the start; that was the end of 2005. A little earlier, in 2003, Martin Fowler wrote about the approach. The reason is probably that development teams only began to actively adopt version control systems at the beginning of this century. Back in 2000, the first item of the Joel Spolsky test was “Do you use source control?”, which suggests that by no means everyone was using version control then. But we digress.

Eight years with MyBatis Migrations


At Wrike, we started managing database changes through migrations on March 29, 2010, at half past twelve. Since then, we have written 1440 migrations containing 6436 forward changes and 5015 reverse ones. Overall, we have accumulated a fair amount of experience with the MyBatis Migrations tool in conjunction with PostgreSQL.

In short, we have never regretted it. If you are not using Migrations or something similar, it is time to start. Yes, and the Pentium 4 is outdated too.

But talking about the merits of something is boring, so let's get straight to the difficulties.

The specifics of working with PostgreSQL


There are no real difficulties in writing migrations for Postgres, except perhaps two: creating indexes and adding NOT NULL columns. Surely an index can be created without locking the table?

It can, just not in the most obvious way. When creating an index, you should always specify CREATE INDEX CONCURRENTLY; otherwise you will break production, because Postgres locks the table for the duration of the index build, and that can take quite a while. Of course, developers forget about this from time to time, so you constantly have to keep this subtlety in mind. One could write a test for it. Still, this is only a minor inconvenience.
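For illustration, a minimal sketch (the table and column names are made up):

-- This blocks writes to the table for the whole duration of the build:
-- CREATE INDEX some_table_some_column_idx ON some_table (some_column);

-- This builds without blocking writes (but cannot run inside a transaction block):
CREATE INDEX CONCURRENTLY some_table_some_column_idx ON some_table (some_column);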

Adding NOT NULL columns is trickier; here the change has to be made in four steps:
  1. Create the column as nullable (in Postgres this is free).
  2. Set a DEFAULT value for the column.
  3. In a loop, in batches, update the NULL values to the DEFAULT.
  4. Set NOT NULL.

The biggest catch here is step three. The NULL values need to be updated in batches, because a blanket UPDATE some_table SET some_column='' WHERE some_column IS NULL; locks the table, just as in the index case, with the same consequences. And since Migrations can only execute SQL commands, such scripts have to be rolled out to production by hand. Not much fun. Now, if it were possible to write a loop in Migrations, there would be no problem. Perhaps this could be implemented via hooks.
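Here is what the whole four-step dance might look like. This is only a sketch: the table, column, and batch size are made up, and step three assumes the table has an id primary key. Each batch must be committed separately, which is exactly what Migrations cannot do for you:

-- Step 1: adding a nullable column is free in Postgres.
ALTER TABLE some_table ADD COLUMN some_column text;

-- Step 2: set the default for newly inserted rows.
ALTER TABLE some_table ALTER COLUMN some_column SET DEFAULT '';

-- Step 3: backfill the existing NULLs in batches, committing after each one;
-- repeat this statement until it reports zero updated rows.
UPDATE some_table SET some_column = DEFAULT
WHERE id IN (SELECT id FROM some_table WHERE some_column IS NULL LIMIT 10000);

-- Step 4: with no NULLs left, set the constraint.
ALTER TABLE some_table ALTER COLUMN some_column SET NOT NULL;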

Creating a UNIQUE index or changing the PRIMARY KEY also requires a certain skill, but these operations are needed comparatively rarely.
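For example, a unique constraint can be introduced without a long lock by building the index concurrently first and then attaching it; a hypothetical sketch with made-up names:

-- Build the unique index without blocking writes:
CREATE UNIQUE INDEX CONCURRENTLY some_table_some_column_key
    ON some_table (some_column);

-- Then attach it as a constraint in one short transaction:
ALTER TABLE some_table
    ADD CONSTRAINT some_table_some_column_key
    UNIQUE USING INDEX some_table_some_column_key;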

Cluster specifics


A database migration management tool is all well and good as long as you have a single database. The fun begins when you have several. And it gets especially fun when you have several types of databases, each with several instances.

As a result, after a git pull the developer must roll the changes onto the first instance of the first database, then onto the second instance, then onto the first instance of the second database, and so on down the line. At this point it is time to write a utility to manage the database migration management utility. Total automation.

Role Juggling


It is no secret that roles, as entities, live not at the level of an individual database but at the level of the whole database server, at least in Postgres. Meanwhile, a migration may need to say something like REVOKE INSERT ON some_table FROM some_role;. One can still expect roles to be preconfigured in production, but for dev or staging that is already a stretch. And in development, of course, all the databases live on the same local server, so you cannot simply write a CREATE ROLE migration, and IF NOT EXISTS is not supported for roles. Fortunately, this is solved simply:
DO $$
BEGIN
    IF NOT EXISTS (SELECT * FROM pg_roles WHERE rolname = 'some_role') THEN
        CREATE ROLE "some_role" NOLOGIN;
    END IF;
END;
$$;

Look! I toss them up and catch them, toss and catch, it's that easy.

A bit of development reality


Developers make mistakes, and this happens even in SQL migrations. Usually errors get caught in review, but not always. As far as forward changes go, mistakes rarely reach production; there are too many verification steps along the way. But with reverse changes, incidents do happen. To avoid errors in the UNDO section, when testing a migration you need to run not just ./migrate up, but ./migrate up, then ./migrate down, then ./migrate up again. There is nothing difficult about this; you only have to make sure that forty developers always do it. Ideally, the utility would perform such a combo in the developer's environment automatically.

Test environments


If the test environment is short-lived (say, you create a container, initialize the database, and run integration tests), there should be no problems. Call ./migrate bootstrap, then ./migrate up, and you are done. Except that once the number of migrations passes a thousand, this process starts to drag. It is a shame when initializing the database takes longer than running the tests. We have had to get creative.

Long-lived environments are harder still. QA engineers, you know, do not like coming in to work and finding an immaculately clean database. I do not know why, but a fact is a fact. So the state of the databases used for manual testing has to be kept intact, and that is not always easy.

The subtlety is that when a migration is applied to a database, only the migration's identifier is recorded in it. If the migration code is changed afterwards, the database will not notice. And if the changes are not critical, the altered code can successfully make it to production. A desync. This is a disgrace, of course. The first principle of working with migrations is to never change migrations that have already been applied, but always to create new ones. But sometimes the temptation wins: I'll just tweak a little here, nothing will break, honestly. Of course! Go right ahead!

If migrations were signed off after review, applying drafts to staging could be forbidden. And it would also be worth saving in the changelog not only the identifier of the migration but a checksum as well; that, too, would be useful.
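A sketch of the idea, assuming the tool's stock changelog table and inventing a checksum column purely for illustration:

-- Extend the changelog with a checksum of the applied script text:
ALTER TABLE changelog ADD COLUMN checksum text;

-- On apply, the tool would store something like md5(script_text) here,
-- and on every subsequent run compare it against the md5 of the file
-- currently in the repository, refusing to proceed on a mismatch.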

Putting it back the way it was


An especially tricky turn of events happens when a task is canceled: they worked and worked on it, and then changed their minds. A perfectly normal situation. Since the code is no longer needed, the branch gets deleted. But there was a migration in that branch... and it had already been applied to staging... well... oops. A good occasion to check whether you can actually restore from a backup. Although, let us note, it could have been simpler.

At the same time, a migration is just text. And that text could well be saved right there, in the changelog. Then, if a migration disappears from the code, no matter for what reason, it could always be rolled back. Even automatically.
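Again a sketch, with invented column names:

-- Keep the scripts themselves next to the identifier, so the UNDO
-- section can be replayed even after the source file is gone:
ALTER TABLE changelog
    ADD COLUMN do_script text,
    ADD COLUMN undo_script text;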

Do UNDO again


The UNDO section is definitely needed. But why write it by hand? There are intricate cases, of course, but most changes are a CREATE TABLE, an ADD COLUMN, or a CREATE INDEX. For those, the utility could generate the inverse operations automatically, straight from the SQL code. True, there is some specificity: CREATE TABLE ${name} is a special kind of command, suddenly non-standard, and to generate DROP TABLE ${name} you have to be able to parse the statement all the way to the third word. Still, on the whole, this is an entirely solvable technical problem. It could work out of the box.
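The cases in question really are mechanical. A made-up migration to illustrate: every line of the UNDO section below could have been derived from the corresponding forward statement by looking at its first few words:

-- 20180618152100: example of mechanically invertible changes
CREATE TABLE some_table (id bigint PRIMARY KEY);
ALTER TABLE some_table ADD COLUMN some_column text;
CREATE INDEX some_table_some_column_idx ON some_table (some_column);

--//@UNDO
DROP INDEX some_table_some_column_idx;
ALTER TABLE some_table DROP COLUMN some_column;
DROP TABLE some_table;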

Conclusion


Of course, I am nitpicking. MyBatis Migrations was conceived as a simple and universal utility, minimally tied to the specifics of any particular database, and it more than lives up to that. But it does seem that a few small improvements would make it much better, especially for long-term use.
-
Dmitry Mamonov / Wrike

Source: https://habr.com/ru/post/414441/

