Tables in a database often contain duplicate values across multiple rows. Sometimes you only want to retrieve a list of the distinct values.
The SELECT DISTINCT statement is designed for exactly this case.
It filters out the duplicates and returns only individual occurrences.
When you run a standard SELECT query, it grabs every matching row.
If ten users live in "London", "London" appears ten times in the result.
When you add the DISTINCT keyword, PostgreSQL compares the rows of the result set and treats rows with identical values as duplicates.
It removes the redundant rows, leaving only one instance of "London".
SELECT DISTINCT column_name FROM table_name;
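As a minimal sketch of the "London" scenario described above, the following script builds a small temporary table and runs both kinds of query. The table name demo_users, its columns, and the sample rows are hypothetical and exist only for illustration.

```sql
-- Hypothetical sample data, used only for illustration.
CREATE TEMP TABLE demo_users (
    name TEXT,
    city TEXT
);

INSERT INTO demo_users (name, city) VALUES
    ('Ana',   'London'),
    ('Ben',   'London'),
    ('Chloe', 'Paris'),
    ('Dev',   'London');

-- Plain SELECT: one output row per table row, so 'London' appears three times.
SELECT city FROM demo_users;

-- With DISTINCT: each city appears once (two rows: London and Paris).
-- ORDER BY is added only to make the row order predictable.
SELECT DISTINCT city FROM demo_users ORDER BY city;
```

DISTINCT only affects the rows returned by the query. The rows stored in the table are left unchanged.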
A very common use case is finding all unique locations in a customer table. You might want to know which countries your user base spans.
Using DISTINCT on the country column gives you a clean, simplified list.
This is far easier to read than a result with thousands of repeated country names.
-- Retrieve all unique countries from the customers table
SELECT DISTINCT Country FROM Customers;
You are not limited to using DISTINCT on just a single column.
You can apply it to a combination of multiple columns simultaneously.
When applied to multiple columns, DISTINCT compares the combination of values across all listed columns. Each unique combination is returned once, so the same city can appear more than once if it is paired with different countries.
-- Find unique combinations of city and country
SELECT DISTINCT City, Country FROM Customers;
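To see how the combination rule plays out, here is a small sketch with made-up data. The table demo_customers and its rows are assumptions for illustration, not part of any real schema.

```sql
-- Hypothetical sample data, used only for illustration.
CREATE TEMP TABLE demo_customers (
    customer_name TEXT,
    city          TEXT,
    country       TEXT
);

INSERT INTO demo_customers (customer_name, city, country) VALUES
    ('Ana',   'London', 'UK'),
    ('Ben',   'London', 'UK'),      -- same city and country as Ana: a duplicate pair
    ('Chloe', 'London', 'Canada'),  -- same city, different country: a separate pair
    ('Dev',   'Paris',  'France');

-- Returns three rows: (London, Canada), (London, UK), (Paris, France).
-- ORDER BY is added only to make the row order predictable.
SELECT DISTINCT city, country
FROM demo_customers
ORDER BY city, country;
```

Here "London" shows up twice in the output because it belongs to two different (city, country) pairs. Only the exact repeat of (London, UK) is collapsed.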
You can pair DISTINCT with aggregate functions to gather statistics.
For example, you can count the total number of distinct countries you have.
Using COUNT(DISTINCT column_name) provides this exact metric easily.
This will be covered more deeply when we explore aggregate functions.
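As a brief preview, the sketch below counts distinct countries in a small temporary table. The table demo_signups and its contents are hypothetical and serve only as an example.

```sql
-- Hypothetical sample data, used only for illustration.
CREATE TEMP TABLE demo_signups (
    customer_name TEXT,
    country       TEXT
);

INSERT INTO demo_signups (customer_name, country) VALUES
    ('Ana',   'UK'),
    ('Ben',   'UK'),
    ('Chloe', 'Canada'),
    ('Dev',   'France');

-- total_rows counts every row (4);
-- distinct_countries counts each country once (3: Canada, France, UK).
SELECT
    COUNT(*)                AS total_rows,
    COUNT(DISTINCT country) AS distinct_countries
FROM demo_signups;
```

Note that DISTINCT goes inside the parentheses here, so it applies to the values being counted and not to the rows of the final result.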
The SELECT DISTINCT statement is crucial for data cleanup and analytics.
It keeps your output concise by removing duplicate rows from the result set.
Use it whenever you need a categorical list of unique data entries.
What does the DISTINCT keyword do when added to a SELECT query?