Understanding SQLite UPSERT (INSERT ... ON CONFLICT DO UPDATE)

2024-07-27

Upsert in a Nutshell:

Upsert (a blend of "update" and "insert") inserts a row when it does not exist yet and updates the existing row when it does.
It combines the actions of insert and update into a single statement.

SQLite's UPSERT:

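SQLite supports upsert through an ON CONFLICT clause on the INSERT statement, available since version 3.24.0.
The syntax is not part of standard SQL; it follows the form used by PostgreSQL: INSERT ... ON CONFLICT (columns) DO UPDATE SET ... (or DO NOTHING). SQLite does not accept MySQL's ON DUPLICATE KEY UPDATE clause.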

How it Works:

Insert Attempt: You try to insert a new row into a table.
Uniqueness Check: SQLite checks if the inserted data violates a unique constraint (unique index, primary key).
Duplicate Found: If a duplicate is found based on the unique constraint, the update part kicks in.
Update Execution: The ON CONFLICT ... DO UPDATE clause defines which columns to update in the existing row. The values you were trying to insert are available through the special excluded alias (for example, excluded.age).
Insert Completion: If no duplicate is found, the new row is inserted as usual.
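The steps above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a definitive implementation: the settings table and its columns are made up for the example, and it assumes the SQLite library bundled with your Python is version 3.24.0 or later.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
# Hypothetical table: `name` carries the uniqueness constraint.
conn.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT)")

UPSERT = """
    INSERT INTO settings (name, value) VALUES (?, ?)
    ON CONFLICT(name) DO UPDATE SET value = excluded.value
"""

# First call: no row named 'theme' exists, so a plain insert happens.
conn.execute(UPSERT, ("theme", "light"))
# Second call: the name collides, so the existing row is updated instead.
conn.execute(UPSERT, ("theme", "dark"))

rows = conn.execute("SELECT name, value FROM settings").fetchall()
print(rows)  # [('theme', 'dark')]
```

The table still holds a single row after both calls, which is the whole point of an upsert.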

Key Points:

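Upsert in SQLite reacts only to uniqueness constraints (a UNIQUE constraint, unique index, or PRIMARY KEY), not general duplicates. Other violations, such as NOT NULL or CHECK, still raise an error.
It's an extension, so it's not part of standard SQL. PostgreSQL accepts a very similar ON CONFLICT syntax, but MySQL and SQL Server do not.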
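To make the "uniqueness constraints only" point concrete, here is a small sketch using Python's sqlite3 module. The members table is hypothetical, and SQLite 3.24.0 or later is assumed. It shows two cases in which the DO UPDATE branch is not taken: a NOT NULL violation, and a conflict target that is not backed by any unique constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: `email` is unique, `age` is merely NOT NULL.
conn.execute(
    "CREATE TABLE members (email TEXT PRIMARY KEY, age INTEGER NOT NULL)"
)

errors = []

# A NOT NULL violation is not a uniqueness conflict, so DO UPDATE never
# runs and the statement fails with a constraint error.
try:
    conn.execute(
        "INSERT INTO members (email, age) VALUES ('a@example.com', NULL) "
        "ON CONFLICT(email) DO UPDATE SET age = excluded.age"
    )
except sqlite3.IntegrityError as exc:
    errors.append(type(exc).__name__)

# The conflict target must match a PRIMARY KEY or UNIQUE constraint;
# `age` has neither, so SQLite rejects the statement outright.
try:
    conn.execute(
        "INSERT INTO members (email, age) VALUES ('a@example.com', 30) "
        "ON CONFLICT(age) DO UPDATE SET age = excluded.age"
    )
except sqlite3.OperationalError as exc:
    errors.append(type(exc).__name__)

print(errors)  # ['IntegrityError', 'OperationalError']
```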

MySQL vs. SQLite:

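MySQL offers similar functionality with INSERT ... ON DUPLICATE KEY UPDATE. That clause is a MySQL-specific extension, and SQLite does not accept it; SQLite uses ON CONFLICT ... DO UPDATE instead. The two also differ in how they refer to the incoming values: SQLite uses the excluded alias, while MySQL traditionally uses VALUES(column).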

Alternative for Standard SQL:

If you need upsert functionality in databases that don't support it natively, you can achieve a similar outcome using a two-step approach:
1. Try inserting the data.
2. If the insert fails due to a duplicate key constraint, perform an update using a separate statement.

Example Code for SQLite UPSERT

Here are some examples of SQLite code using INSERT ... ON CONFLICT DO UPDATE:

Simple Upsert:

This example inserts a new row ('Alice', 25) into a table named users with columns name (unique) and age. If a user named 'Alice' already exists, it updates the age to 25.

CREATE TABLE users (name TEXT PRIMARY KEY, age INTEGER);

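-- ON CONFLICT(name) names the unique column; excluded.age is the value the INSERT tried to store.
INSERT INTO users (name, age) VALUES ('Alice', 25)
ON CONFLICT(name) DO UPDATE SET age = excluded.age;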
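INSERT OR REPLACE is often used as a shortcut for this kind of statement, but it is not quite the same thing as ON CONFLICT ... DO UPDATE: it deletes the conflicting row and inserts a new one, so any column you do not list is reset to its default. The sketch below illustrates the difference with Python's sqlite3 module; the extra email column is an assumption added for the demonstration, and SQLite 3.24.0 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical variant of the users table with an extra `email` column.
conn.execute(
    "CREATE TABLE users (name TEXT PRIMARY KEY, age INTEGER, email TEXT)"
)
conn.execute("INSERT INTO users VALUES ('Alice', 24, 'alice@example.com')")
conn.execute("INSERT INTO users VALUES ('Bob', 30, 'bob@example.com')")

# REPLACE deletes the conflicting row and inserts a fresh one, so the
# unlisted email column falls back to its default (NULL).
conn.execute("INSERT OR REPLACE INTO users (name, age) VALUES ('Alice', 25)")

# DO UPDATE modifies the existing row in place and leaves email untouched.
conn.execute(
    "INSERT INTO users (name, age) VALUES ('Bob', 31) "
    "ON CONFLICT(name) DO UPDATE SET age = excluded.age"
)

result = conn.execute(
    "SELECT name, age, email FROM users ORDER BY name"
).fetchall()
print(result)
# [('Alice', 25, None), ('Bob', 31, 'bob@example.com')]
```

For a table with only the key and the updated column the two forms end with the same data, but on wider tables the in-place update is usually what you want.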

Upsert with Specific Updates:

This example inserts a new row (10, 'Bob', 'New York') into a table named customers with columns id (primary key), name, and city. If a customer with the same id already exists, it only updates the city to 'New York' and leaves the stored name unchanged.

CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);

INSERT INTO customers (id, name, city) VALUES (10, 'Bob', 'New York')
ON CONFLICT(id) DO UPDATE SET city = excluded.city;
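The same statement can be applied to a batch of rows from application code. The following is a sketch using Python's sqlite3 module with placeholder parameters; the sample rows are invented for illustration, and SQLite 3.24.0 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("INSERT INTO customers VALUES (10, 'Bob', 'Boston')")

# Invented sample data for the demonstration.
incoming = [
    (10, "Robert", "New York"),  # id 10 exists: only city should change
    (11, "Dana", "Chicago"),     # id 11 is new: the full row is inserted
]

# executemany runs the upsert once per tuple in `incoming`.
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?) "
    "ON CONFLICT(id) DO UPDATE SET city = excluded.city",
    incoming,
)

customers = conn.execute(
    "SELECT id, name, city FROM customers ORDER BY id"
).fetchall()
print(customers)
# [(10, 'Bob', 'New York'), (11, 'Dana', 'Chicago')]
```

Note that the name for id 10 stays 'Bob': the DO UPDATE clause only lists city, so the incoming 'Robert' is ignored for the existing row.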

Conditional Update:

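This example inserts a new row (15, 'Charlie', 'CA') into a table named employees with columns id (primary key), name, and state. If an employee with the same id already exists and its stored state differs from the incoming value (excluded.state), it updates the state to 'CA'. If the stored state is already 'CA', the existing row is left untouched.

CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, state TEXT);

-- employees.state is the stored value; excluded.state is the value being inserted.
-- Note: != never matches a NULL state; use IS NOT if the column may contain NULL.
INSERT INTO employees (id, name, state) VALUES (15, 'Charlie', 'CA')
ON CONFLICT(id) DO UPDATE SET state = excluded.state
WHERE employees.state != excluded.state;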
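A conditional upsert of this kind should compare the stored value (employees.state) with the incoming one (excluded.state). The sketch below, using Python's sqlite3 module, makes the skipped update visible through a revision counter. The revision column and the upsert_state helper are hypothetical additions for the demonstration, and SQLite 3.24.0 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `revision` is a hypothetical column that counts how often a row was updated.
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, state TEXT, "
    "revision INTEGER DEFAULT 1)"
)


def upsert_state(emp_id, name, state):
    """Upsert one employee and return the stored (state, revision)."""
    conn.execute(
        "INSERT INTO employees (id, name, state) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE "
        "SET state = excluded.state, revision = revision + 1 "
        "WHERE employees.state != excluded.state",
        (emp_id, name, state),
    )
    return conn.execute(
        "SELECT state, revision FROM employees WHERE id = ?", (emp_id,)
    ).fetchone()


first = upsert_state(15, "Charlie", "NY")   # no row yet: plain insert
second = upsert_state(15, "Charlie", "NY")  # same state: WHERE is false, no-op
third = upsert_state(15, "Charlie", "CA")   # different state: row is updated

print(first, second, third)  # ('NY', 1) ('NY', 1) ('CA', 2)
```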

Remember:

Replace users, customers, and employees with your actual table names.
Modify the column names and data types according to your schema.

Alternative Methods for Upsert in Standard SQL

Since SQLite's INSERT ... ON CONFLICT DO UPDATE is non-standard, here are alternative methods for achieving upsert functionality in databases that do not support it:

Two-Step Approach:

This method involves two separate statements:

Try Insert: First, you attempt to insert the new data using a regular INSERT statement.
Update on Failure: If the insert fails due to a duplicate key constraint violation, you execute an UPDATE statement to modify the existing row.

Here's an example:

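CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL);

-- Step 1: attempt to insert the new product.
-- If id 123 already exists, the database raises a duplicate-key error;
-- it does not silently insert zero rows.
INSERT INTO products (id, name, price) VALUES (123, 'Headphones', 79.99);

-- Step 2: run this only when step 1 failed with a duplicate-key error.
-- The branch lives in application code or in the database's procedural
-- dialect; these two statements cannot express it on their own.
UPDATE products
SET name = 'Headphones', price = 79.99
WHERE id = 123;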
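The branching between the two statements usually lives in application code. Here is a minimal sketch with Python's sqlite3 module; the two_step_upsert helper is hypothetical, and SQLite is used only because it is easy to run. The same pattern applies to other drivers, which raise their own duplicate-key exceptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)


def two_step_upsert(product_id, name, price):
    """Try INSERT first, fall back to UPDATE; return which one happened."""
    # `with conn` commits on success and rolls back on an unhandled error.
    with conn:
        try:
            conn.execute(
                "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
                (product_id, name, price),
            )
            return "inserted"
        except sqlite3.IntegrityError:
            # Caveat: IntegrityError covers every constraint violation, not
            # only duplicate keys; real code may need to inspect the message.
            conn.execute(
                "UPDATE products SET name = ?, price = ? WHERE id = ?",
                (name, price, product_id),
            )
            return "updated"


outcomes = [
    two_step_upsert(123, "Headphones", 79.99),
    two_step_upsert(123, "Headphones", 69.99),
]
stored = conn.execute(
    "SELECT name, price FROM products WHERE id = 123"
).fetchone()
print(outcomes, stored)  # ['inserted', 'updated'] ('Headphones', 69.99)
```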

MERGE Statement (if supported):

Some database systems, such as SQL Server and PostgreSQL (version 15 and later), implement the MERGE statement from the SQL standard, which can express upsert logic. It combines insert and update logic into a single statement. SQLite does not support MERGE.

Here's an example using SQL Server (assuming the table and data types are defined):

MERGE INTO products
USING (SELECT 123 AS id, 'Headphones' AS name, 79.99 AS price) AS new_data
ON (products.id = new_data.id)
WHEN MATCHED THEN
  UPDATE SET name = new_data.name, price = new_data.price
WHEN NOT MATCHED THEN
  INSERT (id, name, price) VALUES (new_data.id, new_data.name, new_data.price);

Choosing the Right Method:

If portability across different database systems is crucial, the two-step approach with INSERT and conditional UPDATE is a reliable option.
If you're working with a database that supports MERGE, it can offer a more concise and efficient approach.

Additional Considerations:

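The two-step approach is not atomic on its own: wrap both statements in a transaction and handle the duplicate-key error explicitly, otherwise concurrent writers can interleave between the two steps.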
MERGE can be more complex to understand and write compared to the two-step approach.
