Conceptually, the binary log is a sequence of binary log events (also called
binlog events, or just events when there is no risk of confusion).
Physically, the binary log consists of several files, as shown in Figure 1.

The actual events are stored in a series of files called binlog files with names in the form
host-bin.000001, accompanied by a binlog index file that is usually
named host-bin.index and keeps track of the existing
binlog files. The binlog file that is currently being written to by the
server is called the active binlog file. If all the slaves
have caught up with the master, this is also the file that is being read
by the slaves. The names of the binlog files and the binlog index file can
be controlled using the options log-bin and log-bin-index.
The index file keeps track of all the binlog files used by the
server so that the server can correctly create new binlog files when
necessary, even after server restarts. Each line in the index file
contains the full name of a binlog file that is part of the binary log.
Commands that affect the binlog files, such as PURGE BINARY LOGS,
RESET MASTER, and FLUSH LOGS, also affect the index file by adding
or removing lines to match the files that were added or removed by the
command.
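As a rough illustration of how the index file can be used, the following Python sketch reads index contents, picks out the active binlog file, and derives the name of the next file. It assumes that the last line of the index names the active file and that file names end in a zero-padded sequence number, as in host-bin.000001; the helper names parse_index and next_binlog_name are made up for this example and are not server functions.

```python
import re


def parse_index(index_text):
    """Return the binlog file names listed in an index file, one per line."""
    return [line.strip() for line in index_text.splitlines() if line.strip()]


def next_binlog_name(files):
    """Derive the next binlog file name from the last entry in the index.

    Assumes names of the form <base>.<zero-padded number>.
    """
    match = re.fullmatch(r"(.*)\.(\d+)", files[-1])
    if match is None:
        raise ValueError(f"unexpected binlog file name: {files[-1]!r}")
    base, number = match.groups()
    return f"{base}.{int(number) + 1:0{len(number)}d}"


# Sample index contents; a real index file would be read from disk.
index_text = "host-bin.000001\nhost-bin.000002\n"
files = parse_index(index_text)
active = files[-1]                  # assumed: the last entry is the active file
upcoming = next_binlog_name(files)
print(active, upcoming)
```

Because the sequence number is recovered from the index rather than kept only in memory, the same calculation works after a restart, which is the point the text makes about the index file.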
As shown in Figure 2, each binlog file is made up of binlog events, with
the Format_description event serving as the file’s header and the Rotate
event as its footer. Note that a binlog file might not end with a
Rotate event if the server was stopped or crashed.

The Format_description event
contains information about the server that wrote the binlog file as well
as some critical information about the file’s status. If the server is
stopped and restarted, a new binlog file is created and a new Format_description event is written to it. This
is necessary since changes can potentially occur between bringing a server
down and bringing it up again. For example, the server could be upgraded,
in which case a new Format_description
event would have to be written.
When the server has finished writing a binlog file, a Rotate event is added to end the file. The event
points to the next binlog file in sequence by giving the name of the file
as well as the position to start reading from.
The Format_description event and
the Rotate event will be described in
detail in the next section.
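The way Rotate events chain binlog files together can be sketched with a small in-memory model. Here each event is a (type, payload) pair and each file is a list of such pairs; this layout, the next_file key, and the function walk_binlog are illustrative assumptions and say nothing about the on-disk encoding.

```python
def walk_binlog(files, first):
    """Yield (file_name, event_type) across files by following Rotate events."""
    name = first
    while name is not None:
        events = files[name]
        if not events or events[0][0] != "Format_description":
            raise ValueError(f"{name} does not start with Format_description")
        next_name = None
        for event_type, payload in events:
            yield name, event_type
            if event_type == "Rotate":
                # The Rotate event names the next file in the sequence.
                next_name = payload["next_file"]
        # A file without a Rotate event ends the walk: it is either the
        # active file or the server stopped before writing the footer.
        name = next_name


files = {
    "host-bin.000001": [
        ("Format_description", {}),
        ("Query", {}),
        ("Rotate", {"next_file": "host-bin.000002", "position": 4}),
    ],
    "host-bin.000002": [
        ("Format_description", {}),
        ("Query", {}),
    ],
}
visited = list(walk_binlog(files, "host-bin.000001"))
```

The position value in the sample Rotate payload is only a placeholder; the sketch does not use it.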
With the exception of the Format_description and Rotate events, the events of a binlog file are
organized into units called groups. For transactional storage
engines, each group is roughly equivalent to a transaction, but for
nontransactional storage engines or statements that cannot be part of a
transaction, such as CREATE or ALTER statements, each statement is a group by
itself. In short, each group of events in the binlog file contains either
a single statement not in a transaction or a transaction consisting of
several statements.
Normally, each group is executed entirely or not at all. If, for
some reason, the slave stops in the middle of a group, replication will
start from the beginning of the group and not from the last statement
executed.
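The restart rule can be illustrated with a toy model in which a binlog file is a list of groups and each group is a list of statements. The function group_start and the sample statements are hypothetical; the sketch only shows how a stop position inside a group maps back to the first event of that group.

```python
def group_start(groups, stopped_at):
    """Return the index of the first event in the group containing stopped_at."""
    start = 0
    for group in groups:
        end = start + len(group)
        if start <= stopped_at < end:
            return start
        start = end
    raise IndexError("event index is past the end of the binlog file")


groups = [
    ["CREATE TABLE t (a INT)"],                      # a statement is its own group
    ["BEGIN",
     "INSERT INTO t VALUES (1)",
     "INSERT INTO t VALUES (2)",
     "COMMIT"],                                      # a transaction is one group
    ["ALTER TABLE t ADD b INT"],
]

# The slave stopped at event 3 (the second INSERT), so it resumes at the
# BEGIN that opens the same group rather than at the second INSERT.
resume_at = group_start(groups, 3)
```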
1. Binlog Event Structure
In MySQL 5.0, a new binlog format (binlog format 4) was introduced. The
preceding formats were not easy to extend with additional fields if the
need should arise, so binlog format 4 was designed specifically to be
extensible. This is still the event format used in every server version
since 5.0, even though each version of the server has extended the
binlog format by adding new events and by adding new fields to some
existing events.
Each binlog event consists of three parts:
Common header
The common header is—as the name suggests—common to all
events in the binlog file.
The common header contains basic information about the
event, the most important fields being the event type and the size
of the event.
Post header
The post header is specific to each event type; in other
words, each event type stores different information in this field.
However, the size of this header for a given event type, like the size
of the common header, is the same throughout a given binlog file. The
size of the post header for each event type is given by the
Format_description event.
Event body
Last in each event comes the event body, which is the
variable-sized part of the event. The size is listed in the common
header for the event. The event body stores the main data of the
event, which is different for different event types. For
the Query event, for
instance, the body stores the query, and for the User_var event, the body stores the name
and value of a user variable that was just set by a
statement.
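Because the common header carries the event type and the event size, a reader can step through a run of events without understanding their contents. The sketch below assumes the 19-byte little-endian common header layout commonly documented for binlog format 4 (timestamp, type code, server ID, event size, next position, flags); that layout is not stated in the text above, and the type codes and payloads used in the sample are placeholders.

```python
import struct
from collections import namedtuple

# Assumed layout: 4-byte timestamp, 1-byte type code, 4-byte server ID,
# 4-byte event size, 4-byte next position, 2-byte flags (19 bytes in total).
COMMON_HEADER = struct.Struct("<IBIIIH")
CommonHeader = namedtuple(
    "CommonHeader",
    "timestamp type_code server_id event_size next_position flags",
)


def iter_events(buffer):
    """Yield (header, rest_of_event) for each event stored back to back."""
    offset = 0
    while offset < len(buffer):
        header = CommonHeader._make(COMMON_HEADER.unpack_from(buffer, offset))
        if header.event_size < COMMON_HEADER.size:
            raise ValueError("event size is smaller than the common header")
        rest = buffer[offset + COMMON_HEADER.size:offset + header.event_size]
        yield header, rest
        # The event size covers the whole event, so it locates the next one.
        offset += header.event_size


def make_event(type_code, payload):
    """Build a synthetic event for the demonstration (not a real encoder)."""
    size = COMMON_HEADER.size + len(payload)
    return COMMON_HEADER.pack(0, type_code, 1, size, 0, 0) + payload


buffer = make_event(2, b"SELECT 1") + make_event(14, b"@x=1")
events = list(iter_events(buffer))
```

The bytes returned after the common header still contain both the post header and the event body; separating those two requires the post-header length for the event type.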
As already noted, the Format_description event starts every binlog
file and contains common information about the events in the file.
Because a new one is written for each file, the Format_description event
can differ between files; this typically occurs when a server
is upgraded and restarted. The event contains the following fields:
Binlog file format version
This is the version of the binlog file, which should not be
confused with the version of the server. MySQL versions 3.23, 4.0,
and 4.1 use version 3 of the binary log, while MySQL versions 5.0
and later use version 4 of the binary log.
The binlog file format version changes when developers make
significant changes in the overall structure of the file or the
events. In version 5.0, the start event for a binlog file was
changed to use a different format and the common headers for all
events were also changed, which prompted the change in the binlog
file format version.
Server version
This is a version string denoting the server that created
the file. This includes the version of the server as well as
additional information if special builds are made. The format is
normally the three-position version number, followed by a hyphen
and any additional build options. For example, “5.1.40-debug-log”
means debug build version 5.1.40 of the server.
Common header length
This field stores the length of the common header. Since
it’s here in the Format_description event, this length can
be different for different binlog files. This holds for all events
except the Format_description and Rotate events, whose common header
length cannot vary. The common header length of the Format_description
event is fixed because a server has to read the event regardless of
which version of the server produced it. The Rotate event has a fixed
common header because the event is used when the slave connects to the
master, before any events from the binlog file have been seen. So for
these two events, the size of the common header is fixed and will
never change between server versions.
Post-header lengths
The post-header length for each event type is fixed within a
binlog file, and this field stores an array of the post-header
lengths for each event type that can occur in the binlog file. Since the
number of event types can vary between servers, the number of event types
that the server can produce is stored before this field.
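The server version string described in the list above can be taken apart with a few lines of Python. This is a minimal sketch that assumes the string has exactly the shape given in the text (a three-position numeric version, optionally followed by hyphen-separated build options); the function name parse_server_version is made up for this example, and version strings with a non-numeric part in the version number would need extra handling.

```python
def parse_server_version(version):
    """Split a server version string into a numeric triple and build options."""
    number, _, build = version.partition("-")
    major, minor, patch = (int(part) for part in number.split("."))
    options = build.split("-") if build else []
    return (major, minor, patch), options


numbers, options = parse_server_version("5.1.40-debug-log")
print(numbers, options)
```

Keeping the numeric part as a tuple makes comparisons such as numbers >= (5, 0, 0) straightforward.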
Since both the size of the common header and the size of the post
header for each event type are given in the Format_description event, extending the format
with new events or even increasing the size of the post headers by
adding new fields will not affect the high-level format of the binlog
file.
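To see why this keeps the high-level format stable, consider a reader that takes both lengths from the Format_description event instead of hard-coding them. In the sketch below, the class FormatDescription, the function split_event, and all lengths are hypothetical values chosen for the illustration, and the post-header lengths are keyed by event name for readability, whereas the text describes them as an array.

```python
from dataclasses import dataclass


@dataclass
class FormatDescription:
    """Only the two length fields that matter for splitting an event."""
    common_header_length: int
    post_header_lengths: dict  # event type name -> post-header length in bytes


def split_event(raw, event_type, fmt):
    """Split one raw event into (common header, post header, body)."""
    post_start = fmt.common_header_length
    body_start = post_start + fmt.post_header_lengths[event_type]
    return raw[:post_start], raw[post_start:body_start], raw[body_start:]


# Two files written with different (made-up) header sizes.
old_format = FormatDescription(19, {"Query": 13})
new_format = FormatDescription(21, {"Query": 15})

old_event = b"H" * 19 + b"P" * 13 + b"SELECT 1"
new_event = b"H" * 21 + b"P" * 15 + b"SELECT 1"

old_parts = split_event(old_event, "Query", old_format)
new_parts = split_event(new_event, "Query", new_format)
```

The same split_event function recovers the body from both files, even though the second file uses larger headers, because every length it needs comes from that file's own Format_description data.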
With each extension, particular care is taken to ensure that the
extension does not affect interpretation of earlier-version events. For
example, the common header can be extended with an additional field to
indicate that the event is compressed and the type of compression used,
but if this field is missing—which would be the case if a slave is
reading events from an old master—the server should still be able to
fall back on its old behavior.