Setting up replication to an RDS instance

This is basically the official RDS documentation rephrased in a way that makes sense to my brain. These will take data from a “normal” MySQL server (e.g., installed by you into an EC2 instance) and import it into an RDS instance, then enable replication. Their instructions are correct, but caused me a good bit of confusion and didn’t prepare me for some gotchas.

You’ll have two instances, which I’ll refer to as such:

  • Master, the non-RDS instance (Amazon calls this the “Replication Source”)
  • Slave, the RDS instance which will pull data from the master

Launch a slave RDS instance

This one is normal. Log into AWS, and start up an RDS instance. Amazon says that you should not enable multi-AZ support until the import is complete. I missed that detail, and importing my trivial (one row in one table, for testing) database went fine. They’re probably right, though. Don’t forget the credentials you create! For this post, I used ‘dbuser’ as a username, and ‘dbpassword’ as a password. (Obviously, use something better in the real world.)

Make sure to get security groups / VPC ACLs right. I put them in the same VPC, and just enabled 3306 all around and it was good. They have more detailed instructions in the docs.

Configure the master

You’ll need to do several things on the master:

Enable binlogs and set a server-id

MySQL requires that a binary log (binlog) be used before replication is possible. You also need to set a server-id parameter, with a unique ID.

I just dropped this in the [mysqld] section of /etc/mysql.conf:

log-bin=mysql-bin
server-id=101

If this is the only server, server-id doesn’t really matter.

You need to service mysqld restart for this to apply.

Add a replication user

This one wasn’t abundantly clear to me. You need to add a replication user to the master, which the slave will use.

You’ll want the following two statements (with this example taken direclty from the MySQL docs): CREATE USER 'repl'@'%.mydomain.com' IDENTIFIED BY 'slavepass';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%.mydomain.com';

Obviously, customize the hostname part. I just used ‘%’ because I was doing a POC test in a VPC, but that should be locked down for anything real.

Export a DB dump

Use mysqldump to create a snapshot.

I just wanted to copy one database, so I ran something like this:

mysqldump -u root -p --database test_db1 --master-data > test_db1.dump

That will prompt for a password, and then write a dump of the database to test_db1.dump. Next, we’ll import this.

Import the dump to RDS

Hopefully by now the RDS instance has come online. Test that you can connect to it over MySQL. (Note: you cannot ssh into the RDS node. It only exposes MySQL as a service.)

We now want to import that database dump, and then we can start replication. But first, we need to tweak one thing in the dump we just created!

With --master-data, a line like this is written near the top of the dump file: CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=107;

I had to remove that line, or else I got this error:

Access denied; you need (at least one of) the SUPER privilege(s) for this operation

With that fixed, it’s time to import the data. The thing that’s not necessarily intuitive is that you want to run the MySQL client from your existing database server, and use -h to specify a remote hostname. You can’t ssh to the RDS instance and run it locally, because they don’t have ssh enabled. Here’s the command I used:

mysql -u dbuser -p -h REDACTED.us-east-1.rds.amazonaws.com < test_db1.dump

Enable replication

With the old database imported on RDS, it’s time to enable replication to get it to sync up with anything since the dump was taken, and then stay current. Since we don’t have ssh access, Amazon gives us a few custom procedures in MySQL we can run.

Connect to MySQL on your RDS slave (e.g., mysql -u dbuser -p -h REDACTED.us-east-1.rds.amazonaws.com or whatever).

In that MySQL shell, use their mysql.rds_set_external_master procedure, by running something like this (read the docs for more details):

CALL mysql.rds_set_external_master (
'REDACTED.us-east-1.rds.amazonaws.com',
'3306',
'repl',
'repl-password',
'mysql-bin',
'00001',
'0'
);

It’s important to note that you need to use the credentials for the replication user you created, not the normal admin credentials.

Once that’s configured, start replication, with mysql.rds_start_replication. That one is much simpler, as it doesn’t take any arguments:

CALL mysql.rds_start_replication;

Then, you can run SHOW SLAVE STATUS\G to view the replication status. If all went well, there will be no errors. Yay! You can skip replication errors with another procedure they implement, mysql.rds_skip_repl_error, though ideally that won’t be necessary.

At this point, data inserted to the master should show up on the slave automatically. (Don’t insert rows into the slave yet, or you’ll end up with a real mess!)

Promote the slave

Amazon provides those instructions for the purposes of importing a database, then cutting over to use the RDS node as a master. When the RDS slave is cut over and your application is ready, you can stop replication, decommission the master, and start using the RDS slave as your master.

There are two procedures you’ll be interested in here; mysql.rds_stop_replication and mysql.rds_reset_external_master to unset the master information. Remember to clean up security groups, the old master, etc.

Leave a Reply

Your email address will not be published. Required fields are marked *

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax