Optimizing Replication: Selective Data Transfer with binlog-do-db and replicate-do-db in MySQL and MariaDB

2024-04-02

Replication in MySQL and MariaDB

In MySQL and MariaDB, master-slave replication allows you to keep a secondary server (slave) in sync with a primary server (master). This ensures data redundancy and scalability for read-heavy workloads or disaster recovery scenarios.

Binary Logging

A crucial aspect of replication is binary logging. The master server records all data-modification statements (like INSERT, UPDATE, and DELETE) in a binary log file. This log becomes the source of truth for keeping the slave server updated.

Filtering Replication with binlog-do-db and replicate-do-db

These options enable you to selectively replicate data based on specific databases. However, they have distinct functionalities:

  • binlog-do-db (on the Master):

    • Configured on the master server.
    • Instructs the master to only write statements that modify tables within the specified databases to its binary log.
    • Statements affecting excluded databases won't be logged, preventing replication for those databases.
  • replicate-do-db (on the Slave):

    • Instructs the slave to only apply statements from its relay log (a temporary log storing binary log events received from the master) that modify tables within the specified databases.
    • Statements targeting excluded databases on the master will still be replicated, but the slave will ignore them.

Key Differences:

  • Target: binlog-do-db controls what goes into the master's binary log, while replicate-do-db filters what gets applied on the slave.
  • Effect on Master: binlog-do-db affects the master's binary log content, potentially impacting other slaves replicating from the same master. replicate-do-db only affects the specific slave where it's configured.

Important Considerations:

  • Alternatives: Consider using whitelisting with binlog-ignore-db or replicate-ignore-db to explicitly include only the databases you want replicated. This can be more secure and avoid unintended exclusions.
  • Backup Implications: Filtering with these options can prevent capturing all data changes in the binary log. This might affect your ability to perform point-in-time recovery from backups.
  • Use Cases: Use these options cautiously for scenarios like:
    • Offloading read traffic to a slave for specific databases.
    • Migrating data from one database to another server selectively.

In summary:

  • binlog-do-db (master): Only log statements for specific databases on the master.
  • replicate-do-db (slave): Only apply statements from the relay log for specific databases on the slave.

Choose the approach that best suits your replication needs while considering the potential drawbacks.




Using binlog-do-db on the Master Server (MySQL and MariaDB):

Command Line:

mysqld_safe --binlog-do-db=mydatabase

Option File:

[mysqld]
binlog-do-db=mydatabase

Explanation:

In both methods, we're specifying mydatabase as the database to include in the binary log. The master server will only write statements that modify tables within mydatabase to its binary log.

Command Line (after stopping the slave thread):

STOP SLAVE;
SET GLOBAL replicate_do_db='mydatabase';
START SLAVE;
[mysqld]
replicate-do-db=mydatabase
  • We first stop the slave thread using STOP SLAVE.
  • Then, we set the replicate_do_db system variable to mydatabase. This instructs the slave to only apply statements from the relay log that modify tables within mydatabase.

Remember:

  • Replace mydatabase with the actual database name you want to replicate or filter.
  • Consider using binlog-ignore-db or replicate-ignore-db for whitelisting to be more explicit about what gets replicated/ignored.
  • Be cautious when using filtering options, as they can impact backup and recovery strategies.



Logical Replication:

  • Introduced in MySQL 5.6 and MariaDB 10.0, logical replication allows replicating specific tables or even specific DML statements (INSERT, UPDATE, DELETE) within a table.
  • Pros:
    • More granular control over what gets replicated.
    • Doesn't affect the master's binary log content.
  • Cons:
    • Requires newer MySQL/MariaDB versions.
    • May have higher overhead compared to statement-based filtering.

User-Defined Functions (UDFs) in Triggers:

  • Create triggers on the master that use UDFs to determine if a statement should be replicated based on specific criteria (e.g., database name, table name, user permissions).
  • The UDF logic can be customized to your needs.
  • Pros:
  • Cons:
    • Can be complex to set up and maintain.
    • Might introduce performance overhead on the master.

Database-Level Permissions:

  • Grant appropriate permissions (SELECT, INSERT, UPDATE, DELETE) to users on specific databases on the slave. Users can then access/modify only the authorized databases.
  • Pros:
  • Cons:
    • Less granular control than other methods.
    • Requires additional user management.

Filtering Based on Replication Channels:

  • MySQL 8.0 introduced replication channels, which allow defining logical replication streams with specific filtering criteria (e.g., databases, tables, DML statements).
  • Cons:
    • Requires MySQL 8.0 or later.
    • Might be a more advanced setup for some users.

Choosing the Right Method:

The best method depends on your specific requirements for filtering, complexity tolerance, and MySQL/MariaDB version. Here's a general guideline:

  • Logical replication: Ideal if you need very granular control and are using a compatible MySQL/MariaDB version.
  • UDFs in triggers: Consider this if you have specific filtering logic beyond database names and need flexibility.
  • Database-level permissions: Choose this for a simple setup where filtering by database is sufficient.
  • Replication channels: A good option for MySQL 8.0 users who want granular control and channel-based replication management.

Remember that binlog-do-db and replicate-do-db can still be viable options for simpler filtering needs, but consider these alternatives for more control or to avoid potential drawbacks like affecting the master's binary log or backup strategies.


mysql mariadb master-slave


Understanding MySQL Indexes: INDEX vs. PRIMARY KEY vs. UNIQUE vs. FULLTEXT

Here's a table summarizing the key differences:By understanding these different types of indexes, you can optimize your MySQL queries and improve database performance...


Granting All Privileges on a Database in MySQL and MariaDB

Granting All Privileges in MySQL and MariaDBIn MySQL and MariaDB, you can grant a user complete control over a specific database using the GRANT statement...


Demystifying Regular Expressions: Leveraging REGEXP_REPLACE in MariaDB for Powerful Data Processing

Purpose:The REGEXP_REPLACE function in MariaDB is used to search for a specific pattern (regular expression) within a string and replace all occurrences of that pattern with another string...


Resolving "Authentication plugin 'caching_sha2_password' cannot be loaded" Error in MySQL 8.0

Understanding the Error:MySQL 8.0 Change: Starting with version 8.0, MySQL switched its default authentication plugin from mysql_native_password (used in previous versions) to the more secure caching_sha2_password...


Troubleshooting "ERROR 2002 (HY000)": Connecting to MySQL/MariaDB Server

Here's a breakdown of the message:Possible Causes:There are several reasons why you might encounter this error:Troubleshooting Steps:...


mysql mariadb master slave