This chapter describes the various replication features in MySQL. It serves as a reference to the options available with replication. You will be introduced to replication and learn how to implement it. Towards the end, there are some frequently asked questions and descriptions of problems and how to solve them.
One way replication can be used is to increase both robustness and speed. For robustness you can have two systems and can switch to the backup if you have problems with the master. The extra speed is achieved by sending a part of the non-updating queries to the replica server. Of course this only works if non-updating queries dominate, but that is the normal case.
Starting in Version 3.23.15, MySQL supports one-way replication internally. One server acts as the master, while the other acts as the slave. Note that one server could play the roles of master in one pair and slave in the other. The master server keeps a binary log of updates (See section 23.4 The Binary Log.) and an index file to binary logs to keep track of log rotation. The slave, upon connecting, informs the master where it left off since the last successfully propagated update, catches up on the updates, and then blocks and waits for the master to notify it of the new updates.
Note that if you are replicating a database, all updates to this database should be done through the master!
On older servers one can use the update log to do simple replication. See section 22.1 Database Replication with Update Log.
Another benefit of using replication is that one can get live backups of the system by doing a backup on a slave instead of doing it on the master. See section 22.2 Database Backups.
MySQL replication is based on the server keeping track of all changes to your database (updates, deletes, etc) in the binary log. (See section 23.4 The Binary Log.) and the slave server(s) reading the saved queries from the master server's binary log so that the slave can execute the same queries on its copy of the data.
It is very important to realize that the binary log is simply a record starting from a fixed point in time (the moment you enable binary logging). Any slaves which you set up will need copies of all the data from your master as it existed the moment that you enabled binary logging on the master. If you start your slaves with data that doesn't agree with what was on the master when the binary log was started, your slaves may fail.
A future version (4.0) of MySQL will remove the need to keep a (possibly large) snapshot of data for new slaves that you might wish to set up through the live backup functionality with no locking required. However, at this time, it is necessary to block all writes either with a global read lock or by shutting down the master while taking a snapshot.
Once a slave is properly configured and running, it will simply connect
to the master and wait for updates to process. If the master goes away
or the slave loses connectivity with your master, it will keep trying to
master-connect-retry seconds until it is able to
reconnect and resume listening for updates.
Each slave keeps track of where it left off. The master server has no knowledge of how many slaves there are or which ones are up-to-date at any given time.
The next section explains the master/slave setup process in more detail.
Below is a quick description of how to set up complete replication on your current MySQL server. It assumes you want to replicate all your databases and have not configured replication before. You will need to shutdown your master server briefly to complete the steps outlined below.
FILEprivilege and permission to connect from all the slaves. If the user is only doing replication (which is recommended), you don't need to grant any additional privileges. For example, to create a user named
replwhich can access your master from any host, you might use this command:
GRANT FILE ON *.* TO repl@"%" IDENTIFIED BY '<password>';
mysqladmin -u root -p<password> shutdown
tar -cvf /tmp/mysql-snapshot.tar /path/to/data-dirWindows users can use WinZip or similar software to create an archive of the data directory.
my.cnfon the master add
server-id=unique numberto the
[mysqld]section and restart it. It is very important that the id of the slave is different from the id of the master. Think of
server-idas something similar to the IP address - it uniquely identifies the server instance in the community of replication partners.
[mysqld] log-bin server-id=1
my.cnfon the slave(s):
master-host=<hostname of the master> master-user=<replication user name> master-password=<replication user password> master-port=<TCP/IP port for master> server-id=<some unique number between 2 and 2^32-1>replacing the values in <> with what is relevant to your system.
server-idmust be different for each server participating in replication. If you don't specify a server-id, it will be set to 1 if you have not defined
master-host, else it will be set to 2. Note that in the case of
server-idomission the master will refuse connections from all slaves, and the slave will refuse to connect to a master. Thus, omitting
server-idis only good for backup with a binary log.
After you have done the above, the slave(s) should connect to the master and catch up on any updates which happened since the snapshot was taken.
If you have forgotten to set
server-id for the slave you will get
the following error in the error log file:
Warning: one should set server_id to a non-0 value if master_host is set. The server will not act as a slave.
If you have forgot to do this for the master, the slaves will not be able to connect to the master.
If a slave is not able to replicate for any reason, you will find error messages in the error log on the slave.
Once a slave is replicating, you will find a file called
master.info in the same directory as your error log. The
master.info file is used by the slave to keep track of how much
of the master's binary log is has processed. Do not remove or
edit the file, unless you really know what you are doing. Even in that case,
it is preferred that you use
CHANGE MASTER TO command.
Below is an explanation of what is supported and what is not:
RAND()in updates does not replicate properly. Use
RAND(some_non_rand_expr)if you are replicating updates with
RAND(). You can, for example, use
UNIX_TIMESTAMP()for the argument to
--default-character-set) on the master and the slave. If not, you may get duplicate key errors on the slave, because a key that is regarded as unique on the master may not be that in the other character set.
LOAD DATA INFILEwill be handled properly as long as the file still resides on the master server at the time of update propagation.
LOAD LOCAL DATA INFILEwill be skipped.
FLUSHcommands are not stored in the binary log and are because of this not replicated to the slaves. This is not normally a problem as
FLUSHdoesn't change anything. This does however mean that if you update the
MySQLprivilege tables directly without using
GRANTstatement and you replicate the
MySQLprivilege database, you must do a
FLUSH PRIVILEGESon your slaves to put the new privileges into effect.
SLAVE STOP, then check
Slave_open_temp_tablesvariable to see if it is 0, then issue
mysqladmin shutdown. If the number is not 0, restart the slave thread with
SLAVE STARTand see if you have better luck next time. There will be a cleaner solution, but it has to wait until version 4.0. In earlier versions temporary tables are not being replicated properly - we recommend that you either upgrade, or execute
SET SQL_LOG_BIN=0on your clients before all queries with temp tables.
log-slave-updatesenabled. Note, however, that many queries will not work right in this kind of setup unless your client code is written to take care of the potential problems that can happen from updates that occur in different sequence on different servers. This means that you can do a setup like the following:
A -> B -> C -> AThis setup will only works if you only do non conflicting updates between the tables. In other words, if you insert data in A and C, you should never insert a row in A that may have a conflicting key with a row insert in C. You should also not update the sam rows on two servers if the order in which the updates are applied matters. Note that the log format has changed in Version 3.23.26 so that pre-3.23.26 slaves will not be able to read it.
.errfile. You should then connect to the slave manually, fix the cause of the error (for example, non-existent table), and then run
SLAVE STARTsql command (available starting in Version 3.23.16). In Version 3.23.15, you will have to restart the server.
master-connect-retry(default 60) seconds. Because of this, it is safe to shut down the master, and then restart it after a while. The slave will also be able to deal with network connectivity outages.
my.cnfor just exclude a set of databases with
replicate-ignore-db. Note that up until Version 3.23.23, there was a bug that did not properly deal with
LOAD DATA INFILEif you did it in a database that was excluded from replication.
SET SQL_LOG_BIN = 0will turn off replication (binary) logging on the master, and
SET SQL_LOG_BIN = 1will turn in back on - you must have the process privilege to do this.
FLUSH SLAVEcommands. In Version 3.23.26 we have renamed them to
RESET SLAVErespectively to clarify what they do. The old
FLUSHvariants still work, though, for compatibility.
LOAD TABLE FROM MASTERfor network backup and to set up replication initially. We have recently received a number of bug reports concerning it that we are investigating, so we recommend that you use it only in testing until we make it more stable.
CHANGE MASTER TO.
replicate-rewrite-dbto tell the slave to apply updates from one database on the master to the one with a different name on the slave.
PURGE MASTER LOGS TO 'log-name'to get rid of old logs while the slave is running.
If you are using replication, we recommend you to use MySQL Version 3.23.30 or later. Older versions work, but they do have some bugs and are missing some features.
On both master and slave you need to use the
This sets an unique replication id. You should pick a unique value in the
range between 1 to 2^32-1 for each master and slave.
The following table has the options you can use for the MASTER:
Write to a binary update log to the specified location. Note that if you
give it a parameter with an extension
(for example, |
Because the user could issue the |
If set, setting |
Tells the master it should log updates for the specified database, and
exclude all others not explicitly mentioned.
Tells the master that updates to the given database should not be logged
to the binary log (Example: |
The following table has the options you can use for the SLAVE:
Master hostname or IP address for replication. If not set, the slave
thread will not be started.
The user the slave thread will us for authentication when connecting to
the master. The user must have |
The password the slave thread will authenticate with when connecting to
the master. If not set, an empty password is assumed. (Example:
The port the master is listening on. If not set, the compiled setting of
The number of seconds the slave thread will sleep before retrying to
connect to the master in case the master goes down or the connection is
lost. Default is 60. (Example: |
The location of the file that remembers where we left off on the master
during the replication process. The default is master.info in the data
directory. Sasha: The only reason I see for ever changing the default
is the desire to be rebelious. (Example:
Tells the slave thread to restrict replication to the specified database.
To specify more than one table, use the directive multiple times,
once for each table. .
Tells the slave thread to not replicate to the specified table. To
specify more than one table to ignore, use the directive multiple
times, once for each table.(Example:
Tells the slave thread to restrict replication to the tables that match the
specified wildcard pattern. .
To specify more than one table, use the directive multiple times,
once for each table. .
Tells the slave thread to not replicate to the tables that match the given
wild card pattern. To
specify more than one table to ignore, use the directive multiple
times, once for each table.(Example:
Tells the slave thread to not replicate to the specified database. To
specify more than one database to ignore, use the directive multiple
times, once for each database. This option will not work if you use cross
database updates. If you need cross database updates to work, make sure
you have 3.23.28 or later, and use
Tells the slave thread to restrict replication to the specified database.
To specify more than one database, use the directive multiple times,
once for each database. Note that this will only work if you do not use
cross-database queries such as |
|Tells the slave to log the updates from the slave thread to the binary log. Off by default. You will need to turn it on if you plan to daisy-chain the slaves.|
Updates to a database with a different name than the original (Example:
Tells the slave server not to start the slave on the startup. The user
can start it later with |
Replication can be controlled through the SQL interface. Below is the summary of commands:
|Starts the slave thread. (Slave)|
|Stops the slave thread. (Slave)|
|Disables update logging if the user has process privilege. Ignored otherwise. (Master)|
|Re-enables update logging if the user has process privilege. Ignored otherwise. (Master)|
| Skip the next |
| Deletes all binary logs listed in the index file, resetting the binlog
index file to be empty. In pre-3.23.26 versions, |
| Makes the slave forget its replication position in the master
logs. In pre 3.23.26 versions the command was called
|Downloads a copy of the table from master to the slave. (Slave)|
| Changes the master parameters to the values specified in
CHANGE MASTER TO MASTER_HOST='master2.mycompany.com', MASTER_USER='replication', MASTER_PASSWORD='bigs3cret', MASTER_PORT=3306, MASTER_LOG_FILE='master2-bin.001', MASTER_LOG_POS=4;You only need to specify the values that need to be changed. The values that you omit will stay the same with the exception of when you change the host or the port. In that case, the slave will assume that since you are connecting to a different host or a different port, the master is different. Therefore, the old values of log and position are not applicable anymore, and will automatically be reset to an empty string and 0, respectively (the start values). Note that if you restart the slave, it will remember its last master. If this is not desirable, you should delete the `master.info' file before restarting, and the slave will read its master from
|Provides status information on the binlog of the master. (Master)|
|Provides status information on essential parameters of the slave thread. (Slave)|
| Only available starting in Version 3.23.28. Lists the binary logs on the master. You should use this command prior to |
| Available starting in Version 3.23.28. Deletes all the
replication logs that are listed in the log
index as being prior to the specified log, and removed them from the
log index, so that the given log now becomes first. Example:
PURGE MASTER LOGS TO 'mysql-bin.010'This command will do nothing and fail with an error if you have an active slave that is currently reading one of the logs you are trying to delete. However, if you have a dormant slave, and happen to purge one of the logs it wants to read, the slave will be unable to replicate once it comes up. The command is safe to run while slaves are replicating - you do not need to stop them. You must first check all the slaves with
Q: Why do I sometimes see more than one
Binlog_Dump thread on
the master after I have restarted the slave?
Binlog_Dump is a continuous process that is handled by the
server in the following way:
pthread_cond_wait(), from which we can be awakened either by an update or a kill.
So if the slave thread stops on the slave, the corresponding
Binlog_Dump thread on the master will not notice it until after
at least one update to the master (or a kill), which is needed to wake
it up from
pthread_cond_wait(). In the meantime, the slave
could have opened another connection, which resulted in another
The above problem should not be present in Version 3.23.26 and later
versions. In Version 3.23.26 we added
server-id to each
replication server, and now all the old zombie threads are killed on the
master when a new replication thread connects from the same slave
Q: How do I rotate replication logs?
A: In Version 3.23.28 you should use
PURGE MASTER LOGS
TO command after determining which logs can be deleted, and optionally
backing them up first. In earlier versions the process is much more
painful, and cannot be safely done without stopping all the slaves in
the case that you plan to re-use log names. You will need to stop the
slave threads, edit the binary log index file, delete all the old logs,
restart the master, start slave threads, and then remove the old log files.
Q: How do I upgrade on a hot replication setup?
A: If you are upgrading pre-3.23.26 versions, you should just
lock the master tables, let the slave catch up, then run
MASTER on the master, and
FLUSH SLAVE on the slave to reset the
logs, then restart new versions of the master and the slave. Note that
the slave can stay down for some time - since the master is logging
all the updates, the slave will be able to catch up once it is up and
After 3.23.26, we have locked the replication protocol for modifications, so you can upgrade masters and slave on the fly to a newer 3.23 version and you can have different versions of MySQL running on the slave and the master, as long as they are both newer than 3.23.26.
Q: What issues should I be aware of when setting up two-way replication?
A: MySQL replication currently does not support any locking protocol between master and slave to guarantee the atomicity of a distributed (cross-server) update. In in other words, it is possible for client A to make an update to co-master 1, and in the meantime, before it propagates to co-master 2, client B could make an update to co-master 2 that will make the update of client A work differently than it did on co-master 1. Thus when the update of client A will make it to co-master 2, it will produce tables that will be different than what you have on co-master 1, even after all the updates from co-master 2 have also propagated. So you should not co-chain two servers in a two-way replication relationship, unless you are sure that you updates can safely happen in any order, or unless you take care of mis-ordered updates somehow in the client code.
You must also realize that two-way replication actually does not improve performance very much, if at all, as far as updates are concerned. Both servers need to do the same amount of updates each, as you would have one server do. The only difference is that there will be a little less lock contention, because the updates originating on another server will be serialized in one slave thread. This benefit, though, might be offset by network delays.
Q: How can I use replication to improve performance of my system?
A: You should set up one server as the master, and direct all
writes to it, and configure as many slaves as you have the money and
rackspace for, distributing the reads among the master and the slaves.
You can also start the slaves with
to get speed improvements for the slave. In this case the slave will
MyISAM tables instead of
to get more speed.
Q: What should I do to prepare my client code to use performance-enhancing replication?
A: If the part of your code that is responsible for database access has been properly abstracted/modularized, converting it to run with the replicated setup should be very smooth and easy - just change the implementation of your database access to read from some slave or the master, and to always write to the master. If your code does not have this level of abstraction, setting up a replicated system will give you an opportunity/motivation to it clean up. You should start by creating a wrapper library /module with the following functions:
safe_ means that the function will take care of handling all
the error conditions.
You should then convert your client code to use the wrapper library.
It may be a painful and scary process at first, but it will pay off in
the long run. All applications that follow the above pattern will be
able to take advantage of one-master/many slaves solution. The
code will be a lot easier to maintain, and adding troubleshooting
options will be trivial. You will just need to modify one or two
functions, for example, to log how long each query took, or which
query, among your many thousands, gave you an error. If you have written a lot of code already,
you may want to automate the conversion task by using Monty's
replace utility, which comes with the standard distribution of
MySQL, or just write your own Perl script. Hopefully, your
code follows some recognizable pattern. If not, then you are probably
better off re-writing it anyway, or at least going through and manually
beating it into a pattern.
Note that, of course, you can use different names for the functions. What is important is having unified interface for connecting for reads, connecting for writes, doing a read, and doing a write.
Q: When and how much can MySQL replication improve the performance of my system?
A: MySQL replication is most beneficial for a system with frequent reads and not so frequent writes. In theory, by using a one master/many slaves setup you can scale by adding more slaves until you either run out of network bandwidth, or your update load grows to the point that the master cannot handle it.
In order to determine how many slaves you can get before the added
benefits begin to level out, and how much you can improve performance
of your site, you need to know your query patterns, and empirically
(by benchmarking) determine the relationship between the throughput
on reads (reads per second, or
max_reads) and on writes
max_writes) on a typical master and a typical slave. The
example below will show you a rather simplified calculation of what you
can get with replication for our imagined system.
Let's say our system load consists of 10% writes and 90% reads, and we
have determined that
max_reads = 1200 - 2 *
or in other words, our system can do 1200 reads per second with no
writes, our average write is twice as slow as average read,
and the relationship is
linear. Let us suppose that our master and slave are of the same
capacity, and we have N slaves and 1 master. Then we have for each
server (master or slave):
reads = 1200 - 2 * writes (from bencmarks)
reads = 9* writes / (N + 1) (reads split, but writes go
to all servers)
9*writes/(N+1) + 2 * writes = 1200
writes = 1200/(2 + 9/(N+1)
So if N = 0, which means we have no replication, our system can handle 1200/11, about 109 writes per second (which means we will have 9 times as many reads due to the nature of our application).
If N = 1, we can get up to 184 writes per second.
If N = 8, we get up to 400.
If N = 17, 480 writes.
Eventually as N approaches infinity (and our budget negative infinity), we can get very close to 600 writes per second, increasing system throughput about 5.5 times. However, with only 8 servers, we increased it almost 4 times already.
Note that our computations assumed infinite network bandwidth, and neglected several other factors that could turn out to be significant on your system. In many cases, you may not be able to make a computation similar to the one above that will accurately predict what will happen on your system if you add N replication slaves. However, answering the following questions should help you decided whether and how much, if at all, the replication will improve the performance of your system:
Q: How can I use replication to provide redundancy/high availability?
A: With the currently available features, you would have to set up a master and a slave (or several slaves), and write a script that will monitor the master to see if it is up, and instruct your applications and the slaves of the master change in case of failure. Some suggestions:
CHANGE MASTER TOcommand.
nsupdateto dynamically update your DNS.
log-binoption and without
log-slave-updates. This way the slave will be ready to become a master as soon as you issue
RESET MASTER, and
CHANGE MASTER TOon the other slaves. It will also help you catch spurious updates that may happen because of misconfiguration of the slave (ideally, you want to configure access rights so that no client can update the slave, except for the slave thread) combined with the bugs in your client programs (they should never update the slave directly).
We are currently working on integrating an automatic master election system into MySQL, but until it is ready, you will have to create your own monitoring tools.
If you have followed the instructions, and your replication setup is not working, first eliminate the user error factor by checking the following:
SHOW MASTER STATUS. If it is,
Positionwill be non-zero. If not, verify that you have given the master
log-binoption and have set
SHOW SLAVE STATUS. The answer is found in
Slave_runningcolumn. If not, verify slave options and check the error log for messages.
SHOW PROCESSLIST, find the thread with
system uservalue in
Hostcolumn, and check the
Statecolumn. If it says
connecting to master, verify the privileges for the replication user on the master, master host name, your DNS setup, whether the master is actually running, whether it is reachable from the slave, and if all that seems ok, read the error logs.
SET SQL_SLAVE_SKIP_COUNTER=1; SLAVE START;to skip a query that does not use auto_increment, last_insert_id or timestamp, or
SET SQL_SLAVE_SKIP_COUNTER=2; SLAVE START;otherwise
grep -i slave /path/to/your-log.erron the slave. There is no generic pattern to search for on the master, as the only errors it logs are general system errors - if it can, it will send the error to the slave when things go wrong.
When you have determined that there is no user error involved, and replication
still either does not work at all or is unstable, it is time to start working
on a bug report. We need to get as much info as possible from you to be able
to track down the bug. Please do spend some time and effort preparing a good
bug report. Ideally, we would like to have a test case in the format found in
mysql-test/t/rpl* directory of the source tree. If you submit a test
case like that, you can expect a patch within a day or two in most cases,
although, of course, you mileage may vary depending on a number of factors.
Second best option is a just program with easily configurable connection arguments for the master and the slave that will demonstrate the problem on our systems. You can write one in Perl or in C, depending on which language you know better.
If you have one of the above ways to demonstrate the bug, use
mysqlbug to prepare a bug report and send it to
email@example.com. If you have a phantom - a problem that
does occur but you cannot duplicate "at will":
log-bin- this will keep a log of all updates on the slave.
SHOW MASTER STATUSon the master at the time you have discovered the problem
SHOW SLAVE STATUSon the master at the time you have discovered the problem
mysqlbinlogto examine the binary logs. The following should be helpful to find the trouble query, for example:
mysqlbinlog -j pos_from_slave_status /path/to/log_from_slave_status | head
Once you have collected the evidence on the phantom problem, try hard to isolate it into a separate test case first. Then report the problem to firstname.lastname@example.org with as much info as possible.
Go to the first, previous, next, last section, table of contents.