m82labs   about   posts   categories   feed

MariaDB Backups for the SQL Server DBA

by Mark Wilkinson · July 02, 2015

As a DBA your most important job is making sure you don’t lose your data in the case of catastrophic failure. Why should that be any different when your data is stored in a system you don’t know very well?

With the plethora of database systems available today most DBAs will find themselves supporting more than just SQL Server. This will be the first in a series of posts that will explore backup and restore options in a variety of relational and non-relational systems from the perspective of a SQL Server DBA. While this series will not be comprehensive by any means (there are 10+ engine options for MariaDB alone) I will try to cover the most common use cases for each system.

Prerequisites

This post, and really the whole series, is going to assume that you have some experience on the Linux command line and that you have access to a lab environment where you can follow along. While some of the systems in this series can run on Windows, they weren’t typically designed with it in mind. If you have not used Linux I highly suggest you spin up a VM and give it a try. All the command examples given are going to be executed on an Ubuntu Linux box, but the commands should work in any distribution.

MariaDB and the XtraDB Storage Engine

MariaDB is a DBMS designed to be a drop-in replacement for MySQL, even using the same binary and config file names. It was developed as a fork of MySQL by some of the original developers after concerns arose when MySQL was purchased by Oracle. For the most part, MariaDB is almost 100% compatible with existing MySQL code.

For more details on the incompatibilities, check out the MariaDB website.

MariaDB can use a variety of storage engines depending on what you are using it for. For example, if you are looking for a traditional ACID-compliant RDBMS like SQL Server you can use the XtraDB engine, if you are looking for an in-memory database the MEMORY storage engine might be a good fit, if you need to query data from CSV, XML, or JSON files you can use the CONNECT engine . Depending on your needs you can even mix it up and use different engines for different tables. All of these options can make it an extremely flexible solution, and at a price of $FREE, it’s worth giving it a try.

For the rest of this post we are going to focus on the default storage engine for MariaDB, XtraDB. XtraDB is a high-performance, backwards-compatible, fork of InnoDB developed by Percona. It’s important to note that XtraDB is compatible with both MySQL and MariaDB, so this post can work for either.

XtraBackup and Innobackupex

Xtrabackup is a tool developed by Percona to perform non-locking backups of your database. It’s a little confusing at first, but Xtrabackup is only part of the story, what you will be spending the most time with is Innobackupex. Innobackupex is a Perl wrapper script that calls Xtrabackup and adds a lot of nice functionality like automatically time-stamping your backups (more on this later). There are more options for backing up your data than just Xtrabackup/Innobackupex, but for the SQL Server DBA this option should seem the most familiar.

There are many other options when taking backups on a MariaDB or MySQL instance. I chose to cover xtrabackup in this post because it will likely be the most familiar for the typical SQL Server DBA, it gives you a lot of flexible backup options, and it allows you to take online backups with a minimal amount of setup.

Before You Run a Backup

Before you run your first backup you need to ensure you have the following:

1. Add backup system users to the mysql group.

Any users that need to perform backups should be added to this group. To add the m82labs user to it (you’ll want to add your own user for this example) execute the following:

$ sudo usermod -G mysql -a m82labs

You’ll also need to make some changes to the permissions on your MariaDB data file directories for this to work properly (here we assume your data directory is /var/lib/mysql):

$ find /var/lib/mysql -type d -exec sudo chmod 750 {} \;

These changes will give the mysql user full access to the files, users in the mysql group get read and execute (read allows us to read the file for backup, execute allows us to traverse the directories), and anyone else gets no permissions at all.

Linux permissons can be confusing at first, I highly recommend reading up on using chmod to get a good understanding of how to set permissions. Check the Resources section at the bottom of the post for more information on using chmod.

We will also need to make some changes to how MariaDB create directories for new databases so we don’t have to manually change permissions each time we create a new database. To do this you will need to modify the /etc/init.d/mysql file and add the following lines to the top of the file just below the INIT INFO header block:

UMASK_DIR=424  # = Evaluates to 750
export UMASK_DIR

After this change you will need restart the database service:

$ sudo service mysql restart

Then log out and log back in for the group membership changes to register.

2. A backup directory that is owned by the mysql user.

Make a new directory and change the permissions and ownership. This ensures that the mysqld service can write to the directory, and any user in the mysql group can also manipulate the contents of the directory.

$ sudo mkdir /opt/backups
$ sudo chown -R mysql:mysql /opt/backups
$ sudo chmod -R 770 /opt/backups

3. A database user with the correct permissions.

Typically you should create a dedicated user for backups. Here we’ll assume you named it backup. Since this user is going to be handling your backups only, it should be safe to assume the user will only connect from the local machine.

In MariaDB and MySQL a user isn’t just identified by a username, but also a hostname. As an example, you could create four different backup users with four different passwords as long as you specified that they all connect from different hosts. Though not recommended in this case, putting a % in the hostname field will allow the user to connect from any computer.

To create the user connect to your MariaDB instance and execute the following statements:

CREATE USER 'backup'@'localhost';
SET PASSWORD FOR 'backup'@'localhost' = PASSWORD('SuperSecret');

This creates a user named backup that can only connect from localhost with a password of SuperSecret. This account will need the following permissions:

  • RELOAD
  • LOCK TABLES
  • REPLICATION CLIENT
  • CREATE TABLESPACE
  • PROCESS
  • SUPER

To GRANT the required permissions, execute the following:

GRANT RELOAD,LOCK TABLES,REPLICATION CLIENT,CREATE TABLESPACE,PROCESS,SUPER
ON *.* TO 'backup'@'localhost';

Now that all of this is set up we can move onto creating some backups.

Anatomy of a Backup

Taking backups with Innobackupex is pretty straight forward:

$ innobackupex --user=backup --password='SuperSecret' /opt/backups

This command is simply connecting to your MariaDB instance with the supplied username and password, making a backup, and storing it in the directory you specified. You can supply a host and port if you like, but by default it will attempt to connect to localhost using the port specified in the servers /etc/mysql/my.cnf file. If no such file exists it will check the $MYSQL_TCP_PORT environment variable, the /etc/services file (see more about this file here), or if all else fails, the default port of 3306.

After running the above command you should see a bunch of output on the screen. Assuming you have no errors, lets go check out the /opt/backups directory and see what we have.

 $ ls -la /opt/backups
 

If you do have errors, run through all the steps again and make sure you didn’t miss anything. Typically errors during a backup are related to permissions.

One of the first things you’ll notice about backups from innobackupex is that your backup is not just a single file, it is an entire directory named with a time stamp. If you run an ls on that directory you will see several files and sub-directories in it. One of the interesting things about this backup is that it contains ALL the information needed to restore the backup.

There is no concept of msdb in the MariaDB world. All of the information that would normally be found in the various backup related DMVs is available in the files located in your new backup directory. Here is a brief description of what you will find here (visit the Percona website for more detail):

  • backup-my.cnf: After a backup is taken it still needs to be “prepared” (more on that later) before it can be used. This config file is used to start a small DB instance to prepare your backup for a restore.

  • xtrabackup_checkpoints: This file contains information about the type of backup you took along with the range of LSNs involved in the backup.

  • xtrabackup_binary: A copy of the xtrabackup binary used to perform the backup.

  • xtrabackup_logfile: This is the equivalent of the transaction log. There are no transaction log-only backups in MariaDB, the transaction log is automatically included when you run a backup.

Preparing Your Backups

Before we get into the details of the various backup types, it is very important that we discuss the process of ‘preparing’ a backup. As we touched on before, preparing a backup gets the backup ready for restoring.

When you take a backup it cannot copy all the required files into the backup directory at the exact same moment in time, some files get copied a few seconds after others. Running a prepare on the backup will get all files point-in-time consistent with each other.

The prepare also accomplishes another important thing, it tests your backup. If something is wrong with your backup, the prepare process will fail. Because of this, a lot of people like to run a prepare right after they take a backup. This is a great idea, but as we will discuss below, it’s not always an option.

Backup Types

You have a lot of options when backing up your MariaDB data. Innobackupex allows you to take full backups, incremental, partial, and compact backups. Each have pros and cons and in the end you have a lot of options for a very flexible backup schedule.

Full Backup

This is the simplest of the backups, and is still required if you decide to take incremental backups. One thing to note about the full backup is that it is a full instance backup, not just a single DB. That being said, when you take a full backup you can still restore a single database or table from that full backup, it is just a bit of work. This will be discussed in more detail later on.

Taking a Full Backup

Earlier in this post we saw the command for a full backup, but we’ll repeat it here:

$ innobackupex --user=backup --password='SuperSecret' /opt/backups

This will create a directory named with the current timestamp in the /opt/backups/ directory. You could take another full immediately after this one is done and it would create another timestamped directory for you.

Preparing a Full Backup

As we mentioned earlier, the backup has to be prepared before it can be restored. To prepare a backup you simply run the following command:

$ innobackupex --user=backup --password='SuperSecret' --apply-log --use-memory=1G /opt/backups/YourBackupDirectory

The --use-memory parameter is optional, but if you give the prepare process more memory it can definitely speed things up. You’ll have to play with this option to see what works best for your system.

Note: This step should be skipped if you plan on doing incremental backups based off of this full backup. In that case you could copy the backup to a different location and attempt a prepare there.

Restoring a Full Backup

After preparing the backup we can do an actual restore. Since a full backup is a backup of all databases on the instance, the database service needs to be stopped before a restore can begin and all existing data in the data directory needs to be deleted. I would highly recommend creating a “staging” area of sorts to hold the original contents of your data directory before you delete it. This way, if something goes wrong with the restore, you can always just copy your original files back over.

$ sudo mkdir /opt/mysql.temp
$ sudo chown mysql:mysql /opt/mysql.temp
$ sudo chmod 770 /opt/mysql.temp

To stop the database service and restore the backup:

$ sudo service mysql stop
$ sudo mv /var/lib/mysql/* /opt/mysql.temp/
$ innobackupex --copy-back /opt/backups/YourBackupDirectory

This will copy your database files to the empty data directory, now we need to fix ownership issues (the files will be owned by the user that ran the restore) and restart the database service:

$ sudo chown -R mysql:mysql /var/lib/mysql
$ sudo service mysql start

With any luck you should now have a functional instance based on the backup you restored. Make sure you remember to delete everything from your “staging” area after the restore is complete and well-tested.

Incremental Backup

Xtrabackup supports true incremental backups. Unlike SQL Servers differential backups, which store the changes made since the last full backup, incremental backups store the changes made since the last incremental backup.

Taking an Incremental Backup

The first step in taking incremental backups is to take a full backup to base your incremental on. For our purposes lets assume we already have a full backup from our steps above at the following path: /opt/backups/2015-06-28_13-42-58/

To take an incremental you would issue the following command:

$ innobackupex --user=backup --password='SuperSecret' --incremental /opt/backups --incremental-basedir=/opt/backups/2015-06-28_13-42-58/

The key to this command is the --incremental-basedir parameter. This tells innobackupex to base the incremental off of whatever backup is in the directory specified. If you recall from earlier, each backup folder contains a file named xtrabackup_checkpoints, this file tells innobackupex which LSN to start it’s incremental backup at.

After running this command you will have just another timestamped directory in your backups directory. To find your incremental backups you could execute a command like this:

$ find /opt/backups -type f | xargs grep incremental

You should see output similar to this:

/opt/backups/2015-06-28_15-10-37/xtrabackup_checkpoints:backup_type = incremental

From here you can continue taking incremental backups, each one based off of the previous incremental, for example the next incremental would have a --incremental-basedir of /opt/backups/2015-06-28_15-10-37/. There is also an option available to base your incremental simply on a known LSN. We aren’t really going to go into any detail here, but it would do roughly the same operations as basing it off of an existing backup, except you have to manually specify the starting LSN via the --incremental-lsn parameter.

Preparing an Incremental Backup

Preparing an incremental backup is a little different than preparing a full backup. Earlier we mentioned that if you choose to take incremental backups you cannot prepare your backups right after you take them. The reason for this is that when preparing an incremental backup, all incremental backups need to be applied to the base full backup before anything can be prepared.

This process is very similar to the concept of restoring differential backups and transaction logs in SQL Server, you don’t want your subsequent restores to go through recovery until all backups are applied.

To begin we need to apply our logs to the base full backup we initially created:

$ innobackupex --user=backup --password='SuperSecret' --apply-log --redo-only /opt/backups/2015-06-28_13-42-58/

This brings the full backup to a state where all logs have been applied, but uncommitted transactions have NOT been rolled back.

Now we start applying out incremental backups:

$ innobackupex --user=backup --password='SuperSecret' --apply-log --redo-only /opt/backups/2015-06-28_13-42-58/ --incremental-dir=/opt/backups/2015-06-28_15-10-37/

If we had more incremental backups we would continue to execute this command, changing the --incremental-dir parameter to apply each incremental backup in the order they were taken. When you get to your final incremental to apply, you need to omit the --redo-only option, but if you forget, it’s no big deal, the server will handle it automatically.

Once all of your incremental backups are applied, we need to prepare the final backup. This statement needs to be run on the directory the base full backup resides in, the one we based all of our incrementals on:

$ innobackupex --user=backup --password='SuperSecret' --apply-log --use-memory=1G /opt/backups/2015-06-28_13-42-58/

Now just as before we can go through the steps of restoring a full backup.

Stop the service, remove the old data files, and copy the new data files:

$ sudo service mysql stop
$ sudo rm -rf /var/lib/mysql/*
$ innobackupex --copy-back /opt/backups/YourBackupDirectory

Now we fix any ownership issues and restart the service:

$ sudo chown -R mysql:mysql /var/lib/mysql
$ sudo service mysql start

Compact Backups

The compact backup isn’t so much a type of backup as an option when you are taking backups. A compact backup does not include any secondary index pages, a secondary index in MariaDB is analogous to a non-clustered index in SQL Server.

It’s important to understand that a compact backup does include the metadata needed to recreate the indexes, but it does not include the actual index pages themselves. What that means is that you can reduce your backup size while still retaining the ability to rebuild your secondary indexes at a later time.

The compact backup can potentially be much smaller than a standard backup depending on the number of secondary indexes you have. This has two obvious benefits: less space required per backup, and less time to actually perform the backup. Like all things in life and computers though, you are trading space and time NOW for space and time LATER. When you need to restore a compact backup, you will need to rebuild the indexes, which will take time and consume additional space.

You can take a compact backup by adding the --compact option to either a full or incremental backup. When you create a compact backup the compact flag will be set to 1 in the xtrabackup_checkpoints file in the backup directory. If you want to find you compact backups you can run this command:

$ find /opt/backups -type f | xargs grep -B 4 'compact = 1'

You should see output similar to this:

/opt/backups/2015-06-29_16-10-04/xtrabackup_checkpoints-backup_type = full-backuped
/opt/backups/2015-06-29_16-10-04/xtrabackup_checkpoints-from_lsn = 0
/opt/backups/2015-06-29_16-10-04/xtrabackup_checkpoints-to_lsn = 1644288
/opt/backups/2015-06-29_16-10-04/xtrabackup_checkpoints-last_lsn = 1644288
/opt/backups/2015-06-29_16-10-04/xtrabackup_checkpoints:compact = 1
--
/opt/backups/2015-06-29_16-11-11/xtrabackup_checkpoints-backup_type = incremental
/opt/backups/2015-06-29_16-11-11/xtrabackup_checkpoints-from_lsn = 1644288
/opt/backups/2015-06-29_16-11-11/xtrabackup_checkpoints-to_lsn = 1644288
/opt/backups/2015-06-29_16-11-11/xtrabackup_checkpoints-last_lsn = 1644288
/opt/backups/2015-06-29_16-11-11/xtrabackup_checkpoints:compact = 1

Preparing and Restoring a Compact Backup

Preparing and restoring a compact backup is simple. To prepare the backup you would prepare the backup as usual but include the --rebuild-indexes option.

In the case of a full backup there is only a single step involved in preparing the backup, so the --rebuild-indexes option would be included in that step. When preparing incremental backups the --rebuild-indexes option should only be included in the final step of the prepare, once you are applying the final incremental backup to the full, base, backup.

Once the index rebuilds are complete, and the logs are applied, your backup will be full-size, as if you have taken a normal (non-compact) backup, so make sure you have enough space before you start the prepare process. At this point you can restore the backup as usual via the --copy-back option discussed earlier.

Compact can be a great way to save space and reduce the amount of time it takes to create a backup. While compact is great,

Table or Database-Specific Restores

Now what if you only wanted to restore a single database? Well, that gets a bit tougher. If you want to do this level of restore you need to make sure the innodb_file_per_table option is enabled. If you are using MariaDB 5.5 or greater this is on by default.

Technically you cannot restore a single database from a prepared backup, but what you can do is restore each of the tables in the database, effectively restoring the entire database.

The Restore Process

For the examples below lets assume we want to restore a database creatively named mydatabase. We will walk through the steps to restore a single table named OneofMyTables. To restore a specific database, you would need to go through this process for each table in that database.

The first step in this process is to export all (yes all) of the tables from your prepared backup:

$ innobackupex --apply-log --export /opt/backups/YourBackupDirectory

Now, in the database you want to restore the tables to you will have to drop the old table and recreate it:

USE mydatabase;
DROP TABLE OneOfMyTables;
CREATE TABLE OneOfMyTables ( MyColumn INT );

Then we need to discard the tablespace for each table:

ALTER TABLE OneOfMyTables DISCARD TABLESPACE;

Now we need to remove the existing data files and copy the OneOfMyTables.ibd and OneOfMyTables.exp files from our backup to our data directory:

$ cd /var/lib/mysql/mydatabase/
$ sudo rm -rf OneOfMyTables.*
$ sudo cp /opt/backups/YourBackupDirectory/mydatabase/OneOfMyTables.exp .
$ sudo cp /opt/backups/YourBackupDirectory/mydatabase/OneOfMyTables.ibd .

Since you are executing these commands as a user, we’ll need to change the file ownership back so MariaDB can read the files. This step can wait until all of your table export files have been copied over:

$ sudo chown -R mysql:mysql /var/lib/mysql/mydatabase/

Executing this statement from within the mydatabase database:

ALTER TABLE mydatabase.OneOfMyTables IMPORT TABLESPACE;

Now you should be all set. The table should be available to query and it should contain the data from the backup.

Foreign Key Considerations

If you are dealing with tables that have foreign key constraints, it is up to you to make sure all related tables are imported in a consistent state. If you are importing a table that is used by other tables for foreign key look-ups, you will need to disable foreign key checks before you drop the original table:

SET FOREIGN_KEY_CHECKS=0;

Once the table has been completely imported you can turn your foreign key checks back on:

SET FOREIGN_KEY_CHECKS=1;

From this point forward your foreign key constraints will function as usual, but there is a caveat here. If the data you imported violates a foreign key constraint you will not know until you try to manipulate the record that violates the constraint. MariaDB has no built-in mechanism to recheck your foreign key constraints. Interestingly, some adventurous users have written scripts and tools to take care of this, I am not going to cover those scripts here but I will include a link the Resources section.

Thoughts on Table Imports

Table imports are kind of a clunky process, but it’s still a pretty cool feature. It’s worth going over this a few times with some test data to get a good understanding of the steps involved. There are some good scripts people have written to attempt to automate this process, but it’s always useful learning how to do it by hand.

Partial Backup

Partial backups are a special type of backup that allow you to backup specific tables or databases instead of backing up everything. While this sounds great, it has it’s drawbacks.

Taking a Partial Backup

There are three ways you can take a partial backup. Each method is defined by a different option:

  • - -include: This is the most flexible and the most complex to use. The --include option identifies which tables to backup via regular expression (regex). In our mydatabase.OneOfMyTables example from above we might specify a regex of ^mydatabase[.]OneOfMyTables$. If we wanted to grab the whole database we might use ^mydatabase[.]*. If you are familiar with regex you can see the flexibility here, if you are not I suggest reading up on it.

  • - -tables-file: This option takes a text file as an argument, this text file contains a fully qualified table per line (database.table).

  • - -databases: This option takes a space separated list of database names and/or fully qualified table names OR a text file containing a list with one database and/or fully qualified table per line.

All of these options will result in a backup that looks pretty much like all the other backups you have seen so far, the only difference is that instead of seeing one directory per database, you will see one directory per database that was specified and one database directory per table that was specified. For example, if you used the --databases option and passed it TestDB TestDB01.SomeTable you would see a directory for TestDB and a directory for TestDB01.

Preparing and Restoring a Partial Backup

Unfortunately restoring a partial backup can really only be done via the export and import table process we discussed above in the Table or Database-Specific Restores section. The only other option is to restore ONLY the partial backup and wipe out the rest of your data. This could be useful in some limited situations.

As mentioned, you can restore a partial backup using the same methods you would use to restore a full backup, but the following must be true:

  • The mysql database was included in the partial backup
  • Your data directory must be empty

If those conditions are met, you can restore a partial backup by following the steps for preparing and restoring (via --copy-back) a full backup.

Final Thoughts

Overall you have a lot of options when using Perconas Xtrabackup software, but as I was told by a user on the #mysql IRC channel “it’s not exactly polished”. This article really just scratches the surface when it comes to your backup and restore options, Xtrabackup is just one piece of software to accomplish this.

If you are interested in diving in a little deeper you’ll find a lot people doing things like maintaining a replica slave specifically for executing backups, other people are using LVM or SAN snapshots to take backups, while some are still able to use the mysqldump utility to literally dump the sql scripts needed to rebuild the schema of the instance and populate the tables with data.

Whatever your needs are you should be able to find a backup process that works for you in the MariaDB/MySQL world. Compared to SQL Server you might find that some of it takes a little more work, or a little more time to get used to, but in the end you should still be more than able to effectively backup the data you have been charged with protecting.

Coming Soon

We covered a lot in this post, but there is still more to cover! Keep an eye out for the the second post in this series where we will dive into some more advanced backup and restore options, including streaming compressed backups and encrypted backups.

Thanks

Special thanks to Anthony E. Nocentino @ centinosystems.com for proof-reading this post and offering some suggestions! Anthony is the Enterprise Architect at Centino Systems, you can find him on twitter: @nocentino

Resources

Below is a list of websites and other places you can find more information on MariaDB, MySQL, Percona, and Linux: