MariaDB 10 Indexing Strategies: When "Large Indexes" Aren't a Simple Setting

2024-07-27

Use the DYNAMIC Row Format:

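When creating a table, specify ROW_FORMAT=DYNAMIC as a table option in the CREATE TABLE statement, after the closing parenthesis of the column list. With the DYNAMIC (or COMPRESSED) row format, InnoDB allows index key prefixes of up to 3072 bytes on the default 16 KB page size, compared with 767 bytes for the older COMPACT and REDUNDANT formats. On MariaDB 10.0 and 10.1 this also requires innodb_large_prefix=ON, innodb_file_format=Barracuda, and innodb_file_per_table=ON.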

CREATE TABLE my_table (
  id INT PRIMARY KEY,
  name VARCHAR(255),
  data BLOB
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;  -- ROW_FORMAT is a table option, not a column

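Alternatively, you can set innodb_default_row_format=DYNAMIC globally before creating tables. This makes DYNAMIC the default for all new InnoDB tables that don't specify a row format themselves. The variable was introduced in MariaDB 10.2.2 (where DYNAMIC is already the default) and backported to later 10.1 releases. A value set with SET GLOBAL lasts only until the server restarts unless it is also added to the server configuration file.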

SET GLOBAL innodb_default_row_format=DYNAMIC;
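
Changing the default affects only tables created afterwards. A minimal sketch for checking the setting and converting an existing table might look like this; it assumes the my_table table from above already exists and that your account is allowed to alter it.

```sql
-- Show the current default row format for new InnoDB tables
-- (returns no rows on versions that lack this variable)
SHOW VARIABLES LIKE 'innodb_default_row_format';

-- Existing tables keep their old format until they are rebuilt explicitly
ALTER TABLE my_table ROW_FORMAT=DYNAMIC;

-- List InnoDB tables in the current schema that still use another row format
SELECT TABLE_NAME, ROW_FORMAT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND ENGINE = 'InnoDB'
  AND ROW_FORMAT <> 'Dynamic';
```

Note that ALTER TABLE ... ROW_FORMAT rebuilds the table, which can take a while on large tables.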

Consider Upgrading MariaDB (if applicable):

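Starting with MariaDB 10.2.2, DYNAMIC is the default InnoDB row format and large index prefixes are enabled by default, so tables created on those versions get the 3072-byte index key prefix limit without extra configuration. This matters most for multi-byte character sets such as utf8mb4, where a single character can occupy up to 4 bytes. If you're on MariaDB 10.0 or 10.1, upgrading removes the need to set the row format and related variables manually.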
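
Before deciding whether an upgrade is worthwhile, it can help to look at what the running server reports. The following is a sketch, not a definitive procedure: the SET GLOBAL statements apply to MariaDB 10.0 and 10.1 only, and they assume these variables still exist on your server (they were deprecated and later removed in newer releases).

```sql
-- Check the server version and the settings that control the prefix limit
SELECT VERSION();
SHOW VARIABLES LIKE 'innodb_large_prefix';
SHOW VARIABLES LIKE 'innodb_file_format';
SHOW VARIABLES LIKE 'innodb_file_per_table';

-- MariaDB 10.0 / 10.1 only: enable the prerequisites for 3072-byte prefixes
SET GLOBAL innodb_file_format = Barracuda;
SET GLOBAL innodb_file_per_table = ON;
SET GLOBAL innodb_large_prefix = ON;
```

On a server where a variable has been removed, the corresponding SHOW VARIABLES statement returns an empty result rather than an error.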

Indexing and utf-8:

Indexing is a database optimization technique that creates data structures to speed up searches on specific columns.
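utf-8 is a character encoding that can represent a wide range of characters from various languages. It's a common encoding for storing text data in databases. Index key limits are measured in bytes, not characters: MariaDB's utf8 (utf8mb3) uses up to 3 bytes per character and utf8mb4 up to 4, so an index on a VARCHAR(255) utf8mb4 column can need up to 1020 bytes, which exceeds the 767-byte limit of the older row formats.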
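
The byte arithmetic becomes concrete with a pair of table definitions. This is an illustrative sketch with hypothetical table and index names, assuming the default 16 KB page size. With utf8mb4 and the COMPACT format, only a 191-character prefix fits in an index (191 × 4 = 764 bytes), while DYNAMIC can index the full column (255 × 4 = 1020 bytes). On MariaDB 10.0 and 10.1 the second statement additionally needs large prefixes enabled.

```sql
-- Hypothetical table: COMPACT limits index keys to 767 bytes
CREATE TABLE articles_compact (
  id INT PRIMARY KEY,
  title VARCHAR(255),
  -- 191 characters * 4 bytes = 764 bytes, which fits under 767
  INDEX idx_title_prefix (title(191))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPACT;

-- Hypothetical table: DYNAMIC allows index keys up to 3072 bytes
CREATE TABLE articles_dynamic (
  id INT PRIMARY KEY,
  title VARCHAR(255),
  -- 255 characters * 4 bytes = 1020 bytes, which fits under 3072
  INDEX idx_title (title)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=DYNAMIC;
```

A prefix index still speeds up lookups, but it cannot enforce uniqueness on the full value or act as a covering index for the column.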

Key Points:

Using ROW_FORMAT=DYNAMIC raises InnoDB's index key prefix limit from 767 bytes to 3072 bytes (with the default 16 KB page size).
From MariaDB 10.2.2, DYNAMIC is the default row format and large index prefixes are enabled by default, which helps most with multi-byte utf-8 data.
No single "large index" setting does the job by itself in MariaDB 10. The limit is determined by the row format, and on 10.0 and 10.1 innodb_large_prefix must also be enabled.

Example 1: Specifying DYNAMIC Row Format in CREATE TABLE

CREATE TABLE products (
  product_id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  description TEXT,  -- Example of a larger data type
  price DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;  -- Table option: enable DYNAMIC row format for larger indexes

This code creates a table named products with several columns. The ROW_FORMAT=DYNAMIC table option, placed after the column list, tells InnoDB to use the DYNAMIC row format for this table. That format stores long variable-length values off-page and raises the index key prefix limit, so larger indexes become possible.

Example 2: Setting DYNAMIC as Default Row Format (Global)

SET GLOBAL innodb_default_row_format=DYNAMIC;

CREATE TABLE customers (
  customer_id INT PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  email VARCHAR(100) UNIQUE,
  address TEXT
) ENGINE=InnoDB;  -- ENGINE is a table option and goes after the column list

In this example, we first set the global innodb_default_row_format variable to DYNAMIC. This means that any new InnoDB tables created afterwards will automatically use the DYNAMIC row format unless explicitly specified otherwise.

Then, we create a table named customers without explicitly specifying ROW_FORMAT=DYNAMIC. Since the global default is now DYNAMIC, the table will be created with this format.
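
To check that the default was actually picked up, the table's metadata can be inspected. The sketch below assumes the customers table from this example exists in the current schema; customers_legacy is a hypothetical table added only to show that an explicit table option still overrides the global default.

```sql
-- The Row_format column should report Dynamic for the new table
SHOW TABLE STATUS LIKE 'customers';

-- Hypothetical table: an explicit ROW_FORMAT wins over the global default
CREATE TABLE customers_legacy (
  customer_id INT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
) ENGINE=InnoDB ROW_FORMAT=COMPACT;

SHOW TABLE STATUS LIKE 'customers_legacy';
```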

Partitioning:

Divide your table into smaller, more manageable partitions based on a specific column value (e.g., date range, customer ID). Queries that filter on the partitioning column can then skip irrelevant partitions, which improves performance for searches that target specific partitions. However, partitioning adds some complexity to table management, and every unique key, including the primary key, must contain all columns used in the partitioning expression.

Example: Partitioning a table by year:

```
CREATE TABLE orders (
  order_id INT NOT NULL,
  customer_id INT NOT NULL,
  order_date DATE NOT NULL,
  amount DECIMAL(10,2) NOT NULL,
  -- The primary key must include the partitioning column
  PRIMARY KEY (order_id, order_date)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(order_date)) (
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION p2024 VALUES LESS THAN (2025)
);
```
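
Whether a query benefits from partitioning can be checked with EXPLAIN PARTITIONS, and new ranges have to be added before rows for them arrive. The following sketch assumes an orders table partitioned by year as in the example; the p2025 partition name is an assumption that follows the same naming pattern.

```sql
-- The partitions column of the plan shows which partitions are read;
-- for this date range it should ideally list only p2024
EXPLAIN PARTITIONS
SELECT order_id, amount
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31';

-- Rows dated 2025 are rejected until a matching partition exists
ALTER TABLE orders ADD PARTITION (PARTITION p2025 VALUES LESS THAN (2026));
```

Filtering on a column other than order_date, such as customer_id alone, gives the optimizer nothing to prune on, so every partition is scanned.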

Multiple Smaller Indexes:

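Instead of a single large index, create multiple smaller indexes on relevant columns. Each index can speed up queries that filter on its own column. The trade-off is that the optimizer usually picks one index per table for a given query (or combines several through an index merge), so a query that filters on several columns at once may not benefit as much as it would from one combined index.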

Example: Indexing name and category separately:

CREATE TABLE products (
  product_id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  category VARCHAR(50) NOT NULL,
  description TEXT,
  price DECIMAL(10,2) NOT NULL,
  INDEX(name),  -- Index on name column
  INDEX(category)  -- Index on category column
) ENGINE=InnoDB;
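
To see which index the optimizer picks, EXPLAIN can be run against typical queries. This is a sketch with made-up filter values; idx_category_name is a hypothetical composite index added only for comparison. Even with utf8mb4, 50 + 255 characters need at most 1220 bytes, which stays within the 3072-byte limit of the DYNAMIC row format.

```sql
-- Each single-column index serves filters on its own column
EXPLAIN SELECT product_id, name FROM products WHERE category = 'books';
EXPLAIN SELECT product_id, price FROM products WHERE name = 'Widget';

-- A query filtering on both columns: compare the plan before and after
-- adding a composite index
EXPLAIN SELECT product_id FROM products
WHERE category = 'books' AND name = 'Widget';

ALTER TABLE products ADD INDEX idx_category_name (category, name);

EXPLAIN SELECT product_id FROM products
WHERE category = 'books' AND name = 'Widget';
```

The key column of the EXPLAIN output names the index that was chosen, and the rows column gives the optimizer's estimate of how many rows it expects to examine.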

Denormalization (with Caution):

Choosing the Right Approach:

The best method depends on your specific data model, query patterns, and performance goals. Consider these factors when deciding:

Query workload: Analyze your queries to identify frequently used filters and joins. This will guide your partitioning or indexing strategy.
Data growth: If your data grows significantly, partitioning can help manage large tables more efficiently.
Schema complexity: Denormalization can simplify queries but might introduce maintenance challenges.
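
For the data growth factor, a rough picture of table and index sizes is available from information_schema. This is a sketch for the current schema; TABLE_ROWS is only an estimate for InnoDB tables.

```sql
-- Approximate size of each table and its indexes, largest first
SELECT TABLE_NAME,
       TABLE_ROWS,
       ROUND(DATA_LENGTH / 1024 / 1024, 2) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 2) AS index_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
ORDER BY DATA_LENGTH + INDEX_LENGTH DESC;
```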
