PostgreSQL EXISTS

PostgreSQL EXISTS Operator

The EXISTS operator is a logical operator used in a WHERE clause. It is used to test for the existence of any record in a subquery.

A subquery is a SELECT statement nested inside another statement. EXISTS checks if the subquery returns one or more rows.

If the subquery returns at least one row, EXISTS evaluates to TRUE. If the subquery returns zero rows, EXISTS evaluates to FALSE.

Basic Syntax

The EXISTS operator is followed by a subquery in parentheses. The subquery is typically a "correlated subquery".

A correlated subquery is one that depends on the outer query for its values. It links the inner query with the outer query.

EXISTS Syntax:

SELECT column_name(s)
FROM table_name
WHERE EXISTS (SELECT column_name FROM table_name WHERE condition);

A Practical Example

Let's find all suppliers who supply products with a price less than 20. We can write a query on the Suppliers table.

In the WHERE clause, we use EXISTS with a subquery on the Products table. The subquery checks if any product from that supplier meets the price condition.

EXISTS Example:

-- Find all suppliers who have at least one product cheaper than 20
SELECT SupplierName
FROM Suppliers s
WHERE EXISTS (
  SELECT 1 FROM Products p
  WHERE p.SupplierID = s.SupplierID AND p.Price < 20
);

The SELECT 1 in the subquery is a convention. EXISTS only cares if rows are returned, not what data is in them. Using SELECT 1 is slightly more efficient than SELECT *.

The NOT EXISTS Operator

You can use NOT EXISTS to find rows that do not have a match in the subquery. This is very useful for finding orphaned records or unlinked data.

For example, let's find all customers who have never placed an order.

NOT EXISTS Example:

-- Find all customers who do not have any corresponding orders
SELECT CustomerName
FROM Customers c
WHERE NOT EXISTS (
  SELECT 1 FROM Orders o
  WHERE o.CustomerID = c.CustomerID
);

This is an alternative way to do what we did with a LEFT JOIN and WHERE IS NULL.

EXISTS vs. IN

EXISTS is often compared to the IN operator. IN is used when you have a list of values to check against.

EXISTS is generally much more performant when the subquery returns a large dataset. This is because EXISTS stops processing as soon as it finds the first match.

IN must scan all values from the subquery.

Summary

The EXISTS operator is a powerful tool for checking for related data. It evaluates whether a subquery returns any rows. It is often more efficient than IN or JOINs for certain types of checks.

Exercise

When does the EXISTS operator return TRUE?