Example SQL Queries to Count Distinct Values
Understanding the DISTINCT Keyword:
- The DISTINCT keyword in SQL is used to eliminate duplicate rows from a result set. It ensures that each row in the result is unique.
- When applied to a specific column, DISTINCT will return only the unique values from that column, ignoring any duplicates.
Finding the Count of Distinct Values:
To count the number of distinct values in a column, you combine the DISTINCT keyword with the COUNT function:
SELECT COUNT(DISTINCT column_name)
FROM table_name;
- column_name: Replace this with the actual name of the column whose distinct values you want to count.
- table_name: Replace this with the name of the table containing the column.
Example:
Consider a table named "products" with a column "color". To find the number of different colors available:
SELECT COUNT(DISTINCT color)
FROM products;
This query counts the unique values in the "color" column and returns a single number: the total of distinct colors. NULL values are not counted.
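To try the query without a database server, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up for illustration, and the NULL color is included only to show that COUNT(DISTINCT ...) skips it.

```python
import sqlite3

# In-memory database with a hypothetical products table (sample rows are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, color TEXT)"
)
conn.executemany(
    "INSERT INTO products (name, color) VALUES (?, ?)",
    [("mug", "red"), ("cup", "red"), ("pen", "blue"), ("cap", None)],
)

# COUNT(DISTINCT color) skips NULLs and counts each remaining value once.
distinct_colors = conn.execute(
    "SELECT COUNT(DISTINCT color) FROM products"
).fetchone()[0]

# COUNT(*) counts every row, shown here for comparison.
total_rows = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]

print(distinct_colors, total_rows)
conn.close()
```

With these four rows, the distinct count covers only "red" and "blue", while COUNT(*) reports all four rows.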
Additional Notes:
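- You can also count the number of distinct combinations of values across multiple columns. Some databases, such as MySQL, accept COUNT(DISTINCT col1, col2); a more portable form counts the rows of a SELECT DISTINCT col1, col2 subquery.
- For more complex scenarios, you might need to use subqueries or joins to achieve the desired result.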
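As a rough illustration of counting distinct combinations, the sketch below counts distinct (color, size) pairs by counting the rows of a SELECT DISTINCT subquery. The size column and the sample rows are assumptions introduced for this example, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the size column is an assumption for this example.
conn.execute("CREATE TABLE products (name TEXT, color TEXT, size TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("mug", "red", "S"),
        ("cup", "red", "S"),   # duplicate (red, S) pair
        ("pen", "red", "M"),
        ("cap", "blue", "M"),
    ],
)

# Count distinct (color, size) pairs via a SELECT DISTINCT subquery.
pair_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT color, size FROM products) AS pairs
    """
).fetchone()[0]

print(pair_count)
conn.close()
```

The subquery form is used here because SQLite's COUNT(DISTINCT ...) accepts only one argument.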
Example SQL Queries to Count Distinct Values
Basic Example: Counting Distinct Colors
Scenario: We have a table named products with columns product_id, name, and color. We want to find out how many different colors are available.
Query:
SELECT COUNT(DISTINCT color) AS total_colors
FROM products;
Explanation:
- COUNT(DISTINCT color): This counts the number of unique values in the color column.
- AS total_colors: This assigns the alias "total_colors" to the result.
Counting Distinct Values Within a Group
Scenario: We want to find the number of distinct colors used for each product category.
Table: products with an additional column category.
SELECT category, COUNT(DISTINCT color) AS colors_per_category
FROM products
GROUP BY category;
- GROUP BY category: Groups the results by the category column.
- COUNT(DISTINCT color): Counts the distinct colors within each group.
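The grouped query can be checked on a small data set. This is a sketch under assumed sample rows, run on in-memory SQLite; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, color TEXT)")
# Sample rows are made up for illustration.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("mug", "kitchen", "red"),
        ("cup", "kitchen", "red"),
        ("bowl", "kitchen", "blue"),
        ("pen", "office", "blue"),
        ("pad", "office", "blue"),
    ],
)

# One result row per category, each with its own distinct-color count.
colors_per_category = conn.execute(
    """
    SELECT category, COUNT(DISTINCT color) AS colors_per_category
    FROM products
    GROUP BY category
    ORDER BY category
    """
).fetchall()

print(colors_per_category)
conn.close()
```

Here "kitchen" has two distinct colors and "office" has one, even though both categories contain repeated colors.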
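Counting Distinct Values with a Filter
Scenario: We want to find the number of distinct colors among expensive products. This assumes the products table also has a price column.
SELECT COUNT(DISTINCT color) AS colors_for_expensive_products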
FROM products
WHERE price > 100;
- WHERE price > 100: Filters the rows to include only products with a price greater than $100, before the distinct colors are counted.
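A short sketch of the filtered count, assuming a products table with a numeric price column and invented sample prices. Rows that fail the WHERE condition never reach COUNT(DISTINCT ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, color TEXT, price REAL)")
# Sample rows are assumptions for illustration.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("lamp", "red", 150),
        ("desk", "red", 200),
        ("pen", "blue", 50),    # filtered out by the WHERE clause
        ("chair", "green", 120),
    ],
)

# Only rows with price > 100 are considered when counting distinct colors.
expensive_colors = conn.execute(
    """
    SELECT COUNT(DISTINCT color) AS colors_for_expensive_products
    FROM products
    WHERE price > 100
    """
).fetchone()[0]

print(expensive_colors)
conn.close()
```

"blue" appears only on a product priced at 50, so it does not contribute to the count.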
Counting Distinct Values Across Joined Tables
Scenario: We want to find the number of distinct colors among products that customers have ordered.
Tables: products and orders, related through a product_id foreign key.
SELECT COUNT(DISTINCT p.color) AS colors_used_by_customers
FROM products p
JOIN orders o ON p.product_id = o.product_id;
- The JOIN orders o ON p.product_id = o.product_id clause is an inner join, not a subquery. It combines the products and orders tables so that only products appearing in at least one order are counted.
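The join can be sketched the same way. The table definitions and rows below are assumptions: one product is ordered twice and one product is never ordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, color TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "product_id INTEGER REFERENCES products(product_id))"
)
# Sample data (hypothetical): product 3 (green) has no orders.
conn.executemany(
    "INSERT INTO products VALUES (?, ?)", [(1, "red"), (2, "blue"), (3, "green")]
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

# The inner join keeps only products with at least one order;
# DISTINCT stops the twice-ordered red product from being counted twice.
ordered_colors = conn.execute(
    """
    SELECT COUNT(DISTINCT p.color) AS colors_used_by_customers
    FROM products p
    JOIN orders o ON p.product_id = o.product_id
    """
).fetchone()[0]

print(ordered_colors)
conn.close()
```

Without DISTINCT, the join would yield three rows, one per order, which is why the keyword matters once tables are joined.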
Alternative Methods for Counting Distinct Values in SQL
While the COUNT(DISTINCT column_name) method is commonly used, other approaches can be considered depending on the specific requirements and performance considerations:
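Using a Subquery with SELECT DISTINCT
This method selects the distinct values in a derived table and then counts its rows. Whether it is faster than COUNT(DISTINCT) depends on the database's query optimizer, so compare execution plans when working with large datasets. Note that SELECT DISTINCT keeps NULL as one of the values, so the result is one higher than COUNT(DISTINCT column_name) when the column contains NULLs.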
SELECT COUNT(*)
FROM (
SELECT DISTINCT column_name
FROM table_name
) AS distinct_values;
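One difference between the two forms is worth checking on real data: SELECT DISTINCT treats NULL as a value, while COUNT(DISTINCT ...) ignores it. The sketch below, with assumed sample rows on in-memory SQLite, shows the two counts diverging and a WHERE clause that brings them back in line.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, color TEXT)")
# Sample rows are assumptions; one color is deliberately NULL.
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("mug", "red"), ("cup", "red"), ("pen", "blue"), ("cap", None)],
)

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# COUNT(DISTINCT ...) ignores the NULL color.
count_distinct = scalar("SELECT COUNT(DISTINCT color) FROM products")

# The subquery form counts NULL as one more distinct value.
subquery_count = scalar(
    "SELECT COUNT(*) FROM (SELECT DISTINCT color FROM products) AS d"
)

# Excluding NULLs inside the subquery makes the two forms agree.
subquery_non_null = scalar(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT color FROM products WHERE color IS NOT NULL) AS d"
)

print(count_distinct, subquery_count, subquery_non_null)
conn.close()
```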
Using a Temporary Table
If you need to reuse the distinct values in several queries during a session, materializing them once in a temporary table can avoid recomputing them. The exact syntax for creating temporary tables varies between databases.
CREATE TEMPORARY TABLE distinct_values AS
SELECT DISTINCT column_name
FROM table_name;
SELECT COUNT(*)
FROM distinct_values;
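A minimal sketch of the temporary-table approach on SQLite, using CREATE TEMPORARY TABLE ... AS SELECT. The sample rows are assumptions, and other databases may need different syntax for temporary tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("mug", "red"), ("cup", "red"), ("pen", "blue")],
)

# Materialize the distinct values once; the table lives until the connection closes.
conn.execute(
    "CREATE TEMPORARY TABLE distinct_values AS SELECT DISTINCT color FROM products"
)

# First use: count the distinct values.
distinct_count = conn.execute("SELECT COUNT(*) FROM distinct_values").fetchone()[0]

# Second use: list them, without rescanning the products table.
distinct_list = [
    row[0] for row in conn.execute("SELECT color FROM distinct_values ORDER BY color")
]

print(distinct_count, distinct_list)
conn.close()
```

The temporary table is a snapshot: rows added to products afterwards are not reflected until it is rebuilt.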
Using Window Functions (If supported by your database)
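Window functions can compute a count alongside each row. The query below returns one row per distinct value together with the number of rows that hold that value; the number of rows it returns is the count of distinct values.
SELECT DISTINCT column_name,
COUNT(*) OVER (PARTITION BY column_name) AS rows_per_value
FROM table_name;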
Some databases, such as Oracle, refer to window functions as analytic functions; the query above applies unchanged.
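To make the behavior of the window-function query concrete, the sketch below runs it on assumed sample rows and then shows one possible way to get a single distinct count from a window function: taking the maximum DENSE_RANK() over the column. The DENSE_RANK variant is not from the original text, and it assumes the column has no NULLs. Window functions require SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (color TEXT NOT NULL)")
# Sample rows are assumptions for illustration.
conn.executemany("INSERT INTO products VALUES (?)", [("red",), ("red",), ("blue",)])

# One row per distinct color, with the number of rows holding that color.
per_value = conn.execute(
    """
    SELECT DISTINCT color,
           COUNT(*) OVER (PARTITION BY color) AS rows_per_color
    FROM products
    ORDER BY color
    """
).fetchall()

# DENSE_RANK numbers the distinct values 1..n, so its maximum is the distinct count.
distinct_count = conn.execute(
    """
    SELECT MAX(rnk)
    FROM (SELECT DENSE_RANK() OVER (ORDER BY color) AS rnk FROM products) AS ranked
    """
).fetchone()[0]

print(per_value, distinct_count)
conn.close()
```

The first query reports frequencies per value rather than a single total, which is the main reason to choose it over COUNT(DISTINCT).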
Using a Common Table Expression (CTE)
CTEs can be used to create temporary result sets that can be referenced multiple times within a query.
WITH distinct_values AS (
SELECT DISTINCT column_name
FROM table_name
)
SELECT COUNT(*)
FROM distinct_values;
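The point about referencing a CTE more than once can be sketched as follows: the same distinct_values CTE feeds both a count and a MIN() lookup in one statement. The sample rows and the choice of MIN() are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("mug", "red"), ("cup", "red"), ("pen", "blue")],
)

# The CTE is defined once and referenced by two scalar subqueries.
total, first_color = conn.execute(
    """
    WITH distinct_values AS (
        SELECT DISTINCT color
        FROM products
    )
    SELECT (SELECT COUNT(*) FROM distinct_values) AS total,
           (SELECT MIN(color) FROM distinct_values) AS first_color
    """
).fetchone()

print(total, first_color)
conn.close()
```

Unlike a temporary table, the CTE exists only for the duration of this one statement.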
Choosing the Right Method:
The best method depends on factors such as:
- Query complexity: For simple queries, COUNT(DISTINCT) might be sufficient.
- Performance requirements: If performance is critical, compare the execution plans of the subquery, temporary table, and window function variants on your own data; none of them is faster in every database.
- Database compatibility: Ensure that the chosen method is supported by your database system.
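Before comparing methods on performance, it can help to confirm that they return the same number on your data. The sketch below runs three of the variants against the same assumed sample table on SQLite; the column is declared NOT NULL so the NULL-handling difference between the methods does not come into play.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (color TEXT NOT NULL)")
# Sample rows are assumptions for illustration.
conn.executemany(
    "INSERT INTO products VALUES (?)", [("red",), ("red",), ("blue",), ("green",)]
)

# Three ways of counting distinct colors, keyed by a descriptive label.
queries = {
    "count_distinct": "SELECT COUNT(DISTINCT color) FROM products",
    "subquery": "SELECT COUNT(*) FROM (SELECT DISTINCT color FROM products) AS d",
    "cte": "WITH d AS (SELECT DISTINCT color FROM products) SELECT COUNT(*) FROM d",
}

# Run each query and collect its single-value result.
results = {name: conn.execute(sql).fetchone()[0] for name, sql in queries.items()}

print(results)
conn.close()
```

Once the results agree, timing each query or inspecting its execution plan in your own database gives a fairer comparison than general rules.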