Understanding PostgreSQL's Upsert Magic: INSERT ON CONFLICT DO UPDATE Explained
Upsert in PostgreSQL
- PostgreSQL doesn't have a built-in
upsert
command, but you can achieve upsert-like behavior usingINSERT ON CONFLICT DO UPDATE
. - This construct allows you to insert rows into a table and, if a conflict occurs (based on a unique constraint or index), update the existing row with the new values instead.
Breakdown of the Statement
INSERT INTO table_name (column1, column2, ...)
: This specifies the table you want to insert data into and the columns you're providing values for.VALUES (value1, value2, ...)
: This defines the actual data you're trying to insert for each column.ON CONFLICT (conflict_constraint)
: This clause tells PostgreSQL what condition should trigger the update behavior. Usually, it's a unique constraint on one or more columns. When a row you're trying to insert would violate this constraint (because a matching row already exists), the update part is executed.DO UPDATE SET
: This part defines how you want to update the existing row when a conflict happens.
Using EXCLUDED for All Values
-
EXCLUDED
is a virtual table available within theDO UPDATE
clause. It contains the values that were attempted to be inserted but caused the conflict. -
To update all columns of the existing row with the attempted insert values, you can use:
SET (column1, column2, ...) = (EXCLUDED.column1, EXCLUDED.column2, ...)
- This assigns the values from the
EXCLUDED
table's corresponding columns to the columns in your table being updated.
- This assigns the values from the
Example
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE,
email VARCHAR(100)
);
INSERT INTO users (username, email)
VALUES ('john_doe', '[email protected]')
ON CONFLICT (username) DO UPDATE SET
email = EXCLUDED.email; -- Update only the email if username conflict
- In this example, if you try to insert a new user with the same username as an existing one, only the email will be updated to the new value, while the
id
andusername
remain unchanged.
Additional Considerations
- You can update specific columns instead of all by listing them individually in the
SET
clause. - PostgreSQL 9.5 and later versions are required for
ON CONFLICT DO UPDATE
. - Consider using
RETURNING
to get information about the inserted or updated row after the operation.
By combining INSERT ON CONFLICT DO UPDATE
with EXCLUDED
, you can achieve upsert-like functionality in PostgreSQL, ensuring data integrity while keeping your code concise and efficient.
Example 1: Update All Columns on Conflict
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(50) UNIQUE,
price DECIMAL(10,2),
stock INTEGER
);
INSERT INTO products (name, price, stock)
VALUES ('T-Shirt', 19.99, 100)
ON CONFLICT (name) DO UPDATE SET
(name, price, stock) = (EXCLUDED.name, EXCLUDED.price, EXCLUDED.stock);
This code creates a products
table with a unique constraint on the name
column. It then attempts to insert a product with the name "T-Shirt", price 19.99, and stock 100. However, if a product with the same name already exists, the ON CONFLICT
clause kicks in:
- It checks if the
name
violates the unique constraint (causing a conflict). - If there's a conflict, the
DO UPDATE
part is executed. - The
SET
clause updates all columns (name
,price
,stock
) of the existing row with the values from the attempted insert (EXCLUDED
).
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
email VARCHAR(100) UNIQUE,
phone_number VARCHAR(20),
address TEXT
);
INSERT INTO customers (email, phone_number, address)
VALUES ('[email protected]', '123-456-7890', '10 Main St')
ON CONFLICT (email) DO UPDATE SET
phone_number = EXCLUDED.phone_number,
address = EXCLUDED.address;
This example creates a customers
table with a unique constraint on the email
column. It attempts to insert a customer with the email "[email protected]", phone number "123-456-7890", and address "10 Main St". If a customer with that email already exists:
- The
ON CONFLICT
clause checks for a conflict based on the uniqueemail
constraint. - In case of a conflict, the
DO UPDATE
part is triggered. - The
SET
clause updates only thephone_number
andaddress
columns using the corresponding values fromEXCLUDED
. Theemail
remains unchanged.
These examples showcase the flexibility of INSERT ON CONFLICT DO UPDATE
for upsert-like behavior in PostgreSQL. You can customize the update logic based on your specific requirements.
Separate INSERT and UPDATE in a Loop (Pre-9.5):
- This method works in any PostgreSQL version but can be less performant and susceptible to race conditions.
- It involves:
- Attempting an
INSERT
statement. - If the
INSERT
fails due to a unique constraint violation, catch the error. - Within the error handling block, perform an
UPDATE
statement with the same data. - This loop continues until the insert succeeds or a non-constraint-related error occurs.
- Attempting an
Example:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE,
email VARCHAR(100)
);
BEGIN; -- Wrap in a transaction for atomicity
LOOP
INSERT INTO users (username, email)
VALUES ('john_doe', '[email protected]');
EXIT WHEN pg_exception_pgcode() IS NULL; -- Exit if no error
IF pg_exception_pgcode() = '23505' THEN -- Unique constraint violation
UPDATE users
SET email = '[email protected]' -- Update email only
WHERE username = 'john_doe';
EXIT; -- Exit after update
END IF;
RAISE EXCEPTION 'Unknown error'; -- Re-raise unexpected errors
END LOOP;
COMMIT;
PL/pgSQL Function (Any Version):
- You can create a PL/pgSQL function that encapsulates the upsert logic.
- The function can attempt an
INSERT
and handle the conflict with anUPDATE
using exception handling within the function. - This approach offers better code organization and potentially improved error handling compared to the loop method.
CREATE OR REPLACE FUNCTION upsert_user(username VARCHAR(50), email VARCHAR(100))
RETURNS VOID AS $$
BEGIN
INSERT INTO users (username, email)
VALUES (username, email);
EXCEPTION WHEN unique_violation THEN
UPDATE users
SET email = EXCLUDED.email
WHERE username = EXCLUDED.username;
END;
$$ LANGUAGE plpgsql;
SELECT upsert_user('john_doe', '[email protected]');
External Tools (Any Version):
- In some cases, you might consider using an ORM (Object-Relational Mapper) or a database administration tool that might provide built-in upsert functionality.
- These tools can simplify the logic and potentially offer additional features.
Choosing the Right Method:
- For PostgreSQL 9.5 and later,
INSERT ON CONFLICT DO UPDATE
is generally the recommended approach due to its efficiency and clarity. - If you're using an older version, the PL/pgSQL function method offers a well-structured and maintainable approach.
- The separate
INSERT
andUPDATE
loop might be suitable for simple cases but can be less performant and prone to race conditions. - External tools can be a good option if they provide convenient upsert functionality within your framework.
postgresql upsert postgresql-9.5