Optimizing Replication: Selective Data Transfer with binlog-do-db and replicate-do-db in MySQL and MariaDB

2024-04-02

Replication in MySQL and MariaDB

In MySQL and MariaDB, master-slave replication allows you to keep a secondary server (slave) in sync with a primary server (master). This ensures data redundancy and scalability for read-heavy workloads or disaster recovery scenarios.

Binary Logging

A crucial aspect of replication is binary logging. The master server records all data-modification statements (like INSERT, UPDATE, and DELETE) in a binary log file. This log becomes the source of truth for keeping the slave server updated.

Filtering Replication with binlog-do-db and replicate-do-db

These options enable you to selectively replicate data based on specific databases. However, they have distinct functionalities:

  • binlog-do-db (on the Master):

    • Configured on the master server.
    • Instructs the master to only write statements that modify tables within the specified databases to its binary log.
    • Statements affecting excluded databases won't be logged, preventing replication for those databases.
  • replicate-do-db (on the Slave):

    • Configured on the slave server.
    • Instructs the slave to only apply statements from its relay log (a temporary log storing binary log events received from the master) that modify tables within the specified databases.
    • Statements targeting excluded databases on the master will still be replicated, but the slave will ignore them.

Key Differences:

  • Target: binlog-do-db controls what goes into the master's binary log, while replicate-do-db filters what gets applied on the slave.
  • Effect on Master: binlog-do-db affects the master's binary log content, potentially impacting other slaves replicating from the same master. replicate-do-db only affects the specific slave where it's configured.

Important Considerations:

  • Alternatives: Consider using whitelisting with binlog-ignore-db or replicate-ignore-db to explicitly include only the databases you want replicated. This can be more secure and avoid unintended exclusions.
  • Backup Implications: Filtering with these options can prevent capturing all data changes in the binary log. This might affect your ability to perform point-in-time recovery from backups.
  • Use Cases: Use these options cautiously for scenarios like:
    • Offloading read traffic to a slave for specific databases.
    • Migrating data from one database to another server selectively.

In summary:

  • binlog-do-db (master): Only log statements for specific databases on the master.
  • replicate-do-db (slave): Only apply statements from the relay log for specific databases on the slave.



Using binlog-do-db on the Master Server (MySQL and MariaDB):

Command Line:

mysqld_safe --binlog-do-db=mydatabase

Option File:

[mysqld]
binlog-do-db=mydatabase

Explanation:

In both methods, we're specifying mydatabase as the database to include in the binary log. The master server will only write statements that modify tables within mydatabase to its binary log.

Command Line (after stopping the slave thread):

STOP SLAVE;
SET GLOBAL replicate_do_db='mydatabase';
START SLAVE;
[mysqld]
replicate-do-db=mydatabase
  • We first stop the slave thread using STOP SLAVE.
  • Then, we set the replicate_do_db system variable to mydatabase. This instructs the slave to only apply statements from the relay log that modify tables within mydatabase.
  • Finally, we restart the slave thread using START SLAVE.

Remember:

  • Replace mydatabase with the actual database name you want to replicate or filter.
  • Consider using binlog-ignore-db or replicate-ignore-db for whitelisting to be more explicit about what gets replicated/ignored.
  • Be cautious when using filtering options, as they can impact backup and recovery strategies.



Logical Replication:

  • Introduced in MySQL 5.6 and MariaDB 10.0, logical replication allows replicating specific tables or even specific DML statements (INSERT, UPDATE, DELETE) within a table.
  • Pros:
    • More granular control over what gets replicated.
    • Doesn't affect the master's binary log content.
  • Cons:
    • Requires newer MySQL/MariaDB versions.
    • May have higher overhead compared to statement-based filtering.

User-Defined Functions (UDFs) in Triggers:

  • Create triggers on the master that use UDFs to determine if a statement should be replicated based on specific criteria (e.g., database name, table name, user permissions).
  • The UDF logic can be customized to your needs.
  • Pros:
    • Highly flexible filtering based on custom criteria.
  • Cons:
    • Can be complex to set up and maintain.
    • Might introduce performance overhead on the master.

Database-Level Permissions:

  • Grant appropriate permissions (SELECT, INSERT, UPDATE, DELETE) to users on specific databases on the slave. Users can then access/modify only the authorized databases.
  • Pros:
    • Simple to configure and manage.
  • Cons:
    • Less granular control than other methods.
    • Requires additional user management.

Filtering Based on Replication Channels:

  • MySQL 8.0 introduced replication channels, which allow defining logical replication streams with specific filtering criteria (e.g., databases, tables, DML statements).
  • Pros:
    • Granular control over replication using channels.
    • Doesn't affect the master's binary log.
  • Cons:
    • Requires MySQL 8.0 or later.
    • Might be a more advanced setup for some users.

Choosing the Right Method:

The best method depends on your specific requirements for filtering, complexity tolerance, and MySQL/MariaDB version. Here's a general guideline:

  • Logical replication: Ideal if you need very granular control and are using a compatible MySQL/MariaDB version.
  • UDFs in triggers: Consider this if you have specific filtering logic beyond database names and need flexibility.
  • Database-level permissions: Choose this for a simple setup where filtering by database is sufficient.
  • Replication channels: A good option for MySQL 8.0 users who want granular control and channel-based replication management.

Remember that binlog-do-db and replicate-do-db can still be viable options for simpler filtering needs, but consider these alternatives for more control or to avoid potential drawbacks like affecting the master's binary log or backup strategies.


mysql mariadb master-slave


Ensuring Clarity and Consistency: Best Practices for Using Backticks in MySQL

While MySQL can often interpret field names without backticks, there are specific situations where they become essential:...


Calculating Month Differences Between Dates in MySQL: Exploring TIMESTAMPDIFF() , PERIOD_DIFF() , and Alternative Approaches

Understanding the Concepts:MySQL: A powerful relational database management system used to store, organize, and manage data in a structured way...


MySQL: Mastering NULL Value Handling in ORDER BY for Numbers (NULLs Last)

Understanding NULL Values and Sorting BehaviorIn MySQL, NULL values are considered lower than any non-NULL values.This means by default:...


Understanding Table Structure in SQLite: Alternatives to MySQL's DESCRIBE

In MySQL:The DESCRIBE table_name command is a built-in function that retrieves information about the structure of a table in a MySQL database...


Understanding MariaDB Installation in WSL: Beyond the Code

However, it does involve using Linux commands to configure your WSL environment. Here's a breakdown of the terms:MariaDB: An open-source relational database management system similar to MySQL...


mysql mariadb master slave