Concatenating Multiple Rows in SQL: PostgreSQL's string_agg and GROUP BY
Imagine a table called movies
with columns for movie_title
and actor_name
. You want to create a list of actors for each movie, combining all actors who starred in that movie.
Solution:
SELECT
Clause:string_agg
Function:GROUP BY
Clause:AS
Alias:FROM
Clause:
Complete Query:
SELECT movie_title, string_agg(actor_name, ', ') AS actor_list
FROM movies
GROUP BY movie_title;
Explanation:
- This query selects
movie_title
. - The
string_agg
function aggregatesactor_name
values, separated by commas. - The
GROUP BY
clause ensures actors are grouped by their corresponding movie. - The results are presented with
movie_title
and the concatenatedactor_list
.
Additional Notes:
- You can customize the delimiter in
string_agg
based on your preference (e.g.,;
or newline characters). - To order the concatenated list, use
ORDER BY
withinstring_agg
:
SELECT movie_title, string_agg(actor_name, ', ' ORDER BY actor_name) AS actor_list
FROM movies
GROUP BY movie_title;
This will order the actors alphabetically in each movie's list.
CREATE TABLE IF NOT EXISTS movies (
movie_title TEXT PRIMARY KEY,
actor_name TEXT NOT NULL
);
-- Insert some sample data
INSERT INTO movies (movie_title, actor_name) VALUES
('The Shawshank Redemption', 'Tim Robbins'),
('The Shawshank Redemption', 'Morgan Freeman'),
('The Godfather', 'Marlon Brando'),
('The Godfather', 'Al Pacino'),
('The Dark Knight', 'Christian Bale'),
('The Dark Knight', 'Heath Ledger');
-- Select movie titles and concatenated actor lists
SELECT movie_title, string_agg(actor_name, ', ') AS actor_list
FROM movies
GROUP BY movie_title;
This code first creates the movies
table if it doesn't exist, then inserts some sample data with movie titles and actor names. The main query then retrieves the movie_title
and uses string_agg
to concatenate the actor_name
values for each movie, separated by commas. Finally, the results are grouped by movie_title
and displayed.
Running this code will produce output similar to:
movie_title | actor_list
------------------------|----------------
The Dark Knight | Christian Bale, Heath Ledger
The Godfather | Al Pacino, Marlon Brando
The Shawshank Redemption | Morgan Freeman, Tim Robbins
-
array_to_string
Function:
Query:
SELECT movie_title, array_to_string(array_agg(actor_name), ', ') AS actor_list
FROM movies
GROUP BY movie_title;
- Similar to the previous approach, we select
movie_title
. array_agg
groupsactor_name
values into an array for each movie title.array_to_string
converts the array into a comma-separated string (actor_list
).GROUP BY
ensures actors are grouped by their movie.
Method 2: Using a Custom Aggregate Function (For Older PostgreSQL Versions)
Note: This method is suitable for older PostgreSQL versions (prior to 9.0) that don't have string_agg
. Modern versions are recommended for their simplicity.
-
Custom Aggregate Function:
-
Function Implementation:
Here's a simplified example (consult PostgreSQL documentation for detailed function creation):
CREATE OR REPLACE FUNCTION commacat(acc text, instr text) RETURNS text AS $$
BEGIN
IF acc IS NULL OR acc = '' THEN RETURN instr;
ELSE RETURN acc || ', ' || instr;
END;
$$ LANGUAGE plpgsql;
CREATE AGGREGATE textcat_all( basetype = text, sfunc = commacat, stype = text, initcond = '' );
SELECT movie_title, textcat_all(actor_name) AS actor_list
FROM movies
GROUP BY movie_title;
- The custom function
commacate
appends values with a comma. - The aggregate function
textcat_all
usescommacate
for concatenation. - The query selects and groups as before.
Choosing the Right Method:
string_agg
(PostgreSQL 9.0 or later) is the recommended and most concise method for modern versions.array_agg
andarray_to_string
offer a similar approach, potentially useful if you need to manipulate the array further.- Custom Aggregate Functions are only necessary for older PostgreSQL versions. They require more setup but provide flexibility.
sql postgresql aggregate-functions