Conquering Mammoth Database Backups: Splitting mysqldump Files with Ease
Splitting Large mysqldump Output into Smaller Files
Using the split Command:
split is a standard utility available on most Unix-like systems. It divides a file into pieces based on a line count or a byte size. Here's an example:
mysqldump your_database > entire_dump.sql && split -d -l 10000 entire_dump.sql split_
This command dumps your database into entire_dump.sql and then splits it into files named split_00, split_01, and so on. Each file holds up to 10,000 lines (adjust the -l option for the desired size), and the -d flag selects numeric suffixes instead of the default alphabetic ones.
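Because split cuts purely mechanically, the pieces concatenate back into the original file byte for byte, which also gives a quick integrity check. A minimal sketch, assuming the split_ prefix and database name from the example above:

cat split_* > reassembled.sql
cmp entire_dump.sql reassembled.sql && echo "pieces are intact"

The same property means a restore can stream the concatenation straight into the client, e.g. cat split_* | mysql your_database.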
Piping to head and tail:
These commands can be used to extract specific sections of a file. Here's an example:
mysqldump your_database > entire_dump.sql
head -n 10000 entire_dump.sql > split_01.sql
tail -n +10001 entire_dump.sql > entire_dump_remaining.sql
This approach first writes the mysqldump output to a file. head then extracts the first 10,000 lines into split_01.sql, and tail -n +10001 writes everything from line 10,001 onward into entire_dump_remaining.sql. Repeating the head/tail pair on the remaining file carves off one chunk at a time, as the loop below shows.
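A sketch of that repetition as a loop; the chunk size and split_ naming are assumptions carried over from the examples above:

lines_per_file=10000
total=$(wc -l < entire_dump.sql)
start=1
part=1
while [ "$start" -le "$total" ]; do
  # Skip to the current offset, then take one chunk.
  tail -n +"$start" entire_dump.sql | head -n "$lines_per_file" > "split_$(printf '%02d' "$part").sql"
  start=$((start + lines_per_file))
  part=$((part + 1))
done

Each pass rereads the file from the beginning, so this stays simple but is slower than split on very large dumps.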
Python Script:
For more flexibility, you can write a Python script to handle the splitting:
def split_dump(filename, lines_per_file):
    """Split a SQL dump into sequentially numbered chunk files."""
    with open(filename, 'r') as f:
        lines = f.readlines()
    # Walk the lines in chunks of lines_per_file, numbering each chunk.
    for part, start in enumerate(range(0, len(lines), lines_per_file)):
        with open(f"split_{part:02d}.sql", 'w') as out:
            out.writelines(lines[start:start + lines_per_file])

# Example usage
split_dump("entire_dump.sql", 5000)
This script reads the entire dump into memory, walks the lines in chunks of the requested size, and writes each chunk to a sequentially numbered file. For dumps too large to hold in memory, the same logic can be rewritten to stream the file line by line instead of calling readlines().
Related Issues and Solutions:
- Maintaining data integrity: When splitting on a fixed line count, ensure each split file ends on a complete statement (e.g., a full CREATE TABLE or INSERT statement); see the awk sketch after this list.
- Splitting by table size: If your goal is to split per table rather than by line count, consider tools like mysqldumpsplitter.jar, which can split the dump along table boundaries; the csplit sketch after this list shows the same idea with standard tools.
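On the first point, mysqldump's default format puts each statement on its own line, ending in a semicolon, so a splitter can buffer lines and only start a new file once a statement boundary has passed the size threshold. A heuristic sketch with awk; the 10,000-line target and split_ naming are assumptions carried over from the earlier examples:

awk -v max=10000 '
  { buf = buf $0 "\n"; n++ }
  # Flush only when the current line ends a statement and the chunk is full.
  /;[[:space:]]*$/ && n >= max {
    out = sprintf("split_%02d.sql", part++)
    printf "%s", buf > out
    close(out)
    buf = ""; n = 0
  }
  END {
    if (n > 0) {
      out = sprintf("split_%02d.sql", part)
      printf "%s", buf > out
    }
  }
' entire_dump.sql

On the second point, GNU csplit can cut the dump at each table header comment that mysqldump writes (-- Table structure for table ...), producing one piece per table, named xx00, xx01, and so on:

csplit -z entire_dump.sql '/^-- Table structure for table/' '{*}'

Note that the first piece holds the session SET statements from the dump header, which the per-table pieces rely on when restored individually.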