Understanding PostgreSQL's Upsert Magic: INSERT ON CONFLICT DO UPDATE Explained

2024-07-27

  • PostgreSQL doesn't have a built-in upsert command, but you can achieve upsert-like behavior using INSERT ON CONFLICT DO UPDATE.
  • This construct allows you to insert rows into a table and, if a conflict occurs (based on a unique constraint or index), update the existing row with the new values instead.

Breakdown of the Statement

  1. INSERT INTO table_name (column1, column2, ...): This specifies the table you want to insert data into and the columns you're providing values for.
  2. VALUES (value1, value2, ...): This defines the actual data you're trying to insert for each column.
  3. ON CONFLICT (conflict_constraint): This clause tells PostgreSQL what condition should trigger the update behavior. Usually, it's a unique constraint on one or more columns. When a row you're trying to insert would violate this constraint (because a matching row already exists), the update part is executed.
  4. DO UPDATE SET: This part defines how you want to update the existing row when a conflict happens.

Using EXCLUDED for All Values

  • To update all columns of the existing row with the attempted insert values, you can use:

    SET (column1, column2, ...) = (EXCLUDED.column1, EXCLUDED.column2, ...)
    
    • This assigns the values from the EXCLUDED table's corresponding columns to the columns in your table being updated.

Example

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  username VARCHAR(50) UNIQUE,
  email VARCHAR(100)
);

INSERT INTO users (username, email)
VALUES ('john_doe', '[email protected]')
ON CONFLICT (username) DO UPDATE SET
  email = EXCLUDED.email;  -- Update only the email if username conflict
  • In this example, if you try to insert a new user with the same username as an existing one, only the email will be updated to the new value, while the id and username remain unchanged.

Additional Considerations

  • You can update specific columns instead of all by listing them individually in the SET clause.
  • PostgreSQL 9.5 and later versions are required for ON CONFLICT DO UPDATE.
  • Consider using RETURNING to get information about the inserted or updated row after the operation.



CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(50) UNIQUE,
  price DECIMAL(10,2),
  stock INTEGER
);

INSERT INTO products (name, price, stock)
VALUES ('T-Shirt', 19.99, 100)
ON CONFLICT (name) DO UPDATE SET
  (name, price, stock) = (EXCLUDED.name, EXCLUDED.price, EXCLUDED.stock);

This code creates a products table with a unique constraint on the name column. It then attempts to insert a product with the name "T-Shirt", price 19.99, and stock 100. However, if a product with the same name already exists, the ON CONFLICT clause kicks in:

  • It checks if the name violates the unique constraint (causing a conflict).
  • If there's a conflict, the DO UPDATE part is executed.
  • The SET clause updates all columns (name, price, stock) of the existing row with the values from the attempted insert (EXCLUDED).
CREATE TABLE customers (
  id SERIAL PRIMARY KEY,
  email VARCHAR(100) UNIQUE,
  phone_number VARCHAR(20),
  address TEXT
);

INSERT INTO customers (email, phone_number, address)
VALUES ('[email protected]', '123-456-7890', '10 Main St')
ON CONFLICT (email) DO UPDATE SET
  phone_number = EXCLUDED.phone_number,
  address = EXCLUDED.address;

This example creates a customers table with a unique constraint on the email column. It attempts to insert a customer with the email "[email protected]", phone number "123-456-7890", and address "10 Main St". If a customer with that email already exists:

  • The ON CONFLICT clause checks for a conflict based on the unique email constraint.
  • The SET clause updates only the phone_number and address columns using the corresponding values from EXCLUDED. The email remains unchanged.



  • This method works in any PostgreSQL version but can be less performant and susceptible to race conditions.
  • It involves:
    • Attempting an INSERT statement.
    • If the INSERT fails due to a unique constraint violation, catch the error.
    • Within the error handling block, perform an UPDATE statement with the same data.
    • This loop continues until the insert succeeds or a non-constraint-related error occurs.
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  username VARCHAR(50) UNIQUE,
  email VARCHAR(100)
);

BEGIN;  -- Wrap in a transaction for atomicity

LOOP
  INSERT INTO users (username, email)
  VALUES ('john_doe', '[email protected]');

  EXIT WHEN pg_exception_pgcode() IS NULL;  -- Exit if no error

  IF pg_exception_pgcode() = '23505' THEN  -- Unique constraint violation
    UPDATE users
    SET email = '[email protected]'  -- Update email only
    WHERE username = 'john_doe';

    EXIT;  -- Exit after update
  END IF;

  RAISE EXCEPTION 'Unknown error';  -- Re-raise unexpected errors

END LOOP;

COMMIT;

PL/pgSQL Function (Any Version):

  • You can create a PL/pgSQL function that encapsulates the upsert logic.
  • The function can attempt an INSERT and handle the conflict with an UPDATE using exception handling within the function.
  • This approach offers better code organization and potentially improved error handling compared to the loop method.
CREATE OR REPLACE FUNCTION upsert_user(username VARCHAR(50), email VARCHAR(100))
RETURNS VOID AS $$
BEGIN
  INSERT INTO users (username, email)
  VALUES (username, email);

EXCEPTION WHEN unique_violation THEN
  UPDATE users
  SET email = EXCLUDED.email
  WHERE username = EXCLUDED.username;
END;
$$ LANGUAGE plpgsql;

SELECT upsert_user('john_doe', '[email protected]');

External Tools (Any Version):

  • In some cases, you might consider using an ORM (Object-Relational Mapper) or a database administration tool that might provide built-in upsert functionality.
  • These tools can simplify the logic and potentially offer additional features.

Choosing the Right Method:

  • For PostgreSQL 9.5 and later, INSERT ON CONFLICT DO UPDATE is generally the recommended approach due to its efficiency and clarity.
  • If you're using an older version, the PL/pgSQL function method offers a well-structured and maintainable approach.
  • The separate INSERT and UPDATE loop might be suitable for simple cases but can be less performant and prone to race conditions.
  • External tools can be a good option if they provide convenient upsert functionality within your framework.

postgresql upsert postgresql-9.5



Using Script Variables in psql for PostgreSQL Queries

psql, the command-line interface for PostgreSQL, allows you to define variables within your scripts to make your SQL code more flexible and reusable...


The Truth About Disabling WAL: Alternatives for Optimizing PostgreSQL Performance

Granularity: WAL operates at the page level, not the table level. It doesn't distinguish data belonging to individual tables within a page...


Taming Text in Groups: A Guide to String Concatenation in PostgreSQL GROUP BY

When you're working with relational databases like PostgreSQL, you might often encounter situations where you need to combine string values from multiple rows that share a common value in another column...


Foreign Data Wrappers and DBLink: Bridges for PostgreSQL Cross-Database Communication

Here's a general overview of the steps involved in setting up FDW:Install postgres_fdw: This extension usually comes bundled with PostgreSQL...


Building Applications with C# .NET and PostgreSQL

C#: A modern, object-oriented programming language known for its versatility and performance..NET: A powerful framework that provides a platform for building various applications using C# and other languages...



postgresql upsert 9.5

Unlocking the Secrets of Strings: A Guide to Escape Characters in PostgreSQL

Imagine you want to store a person's name like "O'Malley" in a PostgreSQL database. If you were to simply type 'O'Malley' into your query


Beyond the Basics: Exploring Alternative Methods for MySQL to PostgreSQL Migration

Database: A database is a structured collection of data organized for easy access, retrieval, and management. In this context


Choosing the Right Index: GIN vs. GiST for PostgreSQL Performance

Here's a breakdown of GIN vs GiST:GIN Indexes:Faster lookups: GIN indexes are generally about 3 times faster for searching data compared to GiST


Effective Strategy for Leaving an Audit Trail/Change History in DB Applications

Compliance: Many industries have regulations requiring audit trails for security, financial, or legal purposes.Debugging: When errors occur


MySQL vs PostgreSQL for Web Applications: Choosing the Right Database

MySQL: Known for its ease of use, speed, and reliability. It's a good choice for simpler applications with mostly read operations or those on a budget