PostgreSQL HAVING

PostgreSQL HAVING Clause

The HAVING clause was added to SQL for a very specific reason. The WHERE keyword cannot be used to filter data based on aggregate functions.

HAVING allows you to filter the results of a GROUP BY query. It applies a condition to the groups themselves, not the individual rows.

WHERE vs. HAVING

This is a critical concept to understand in SQL. The WHERE clause filters rows before they are grouped. The HAVING clause filters groups after they have been created.

WHERE works on individual row data. HAVING works on the output of aggregate functions.

Basic Syntax

The HAVING clause is placed after the GROUP BY clause. It comes before the ORDER BY clause.

The syntax is similar to the WHERE clause.

HAVING Syntax:

SELECT column_name(s), aggregate_function(column_name)
FROM table_name
GROUP BY column_name(s)
HAVING condition;

A Practical Example

Let's continue our GROUP BY example from the previous chapter. We counted the number of customers in each country.

Now, what if we only want to see countries with more than 5 customers? We cannot use WHERE COUNT(CustomerID) > 5. This is illegal.

Instead, we must use the HAVING clause.

HAVING Example:

-- List countries that have more than 1 customer
SELECT Country, COUNT(CustomerID) AS NumberOfCustomers
FROM Customers
GROUP BY Country
HAVING COUNT(CustomerID) > 1;

This query first groups all customers by country and counts them. Then, it filters out any of those groups that do not meet the HAVING condition.

Another Example

Let's find the total price of products for suppliers. But we only want to see suppliers whose total product price is over 50.

We would join Products and Suppliers, group by supplier, and sum the prices. Then, we use HAVING to filter the final summed values.

Complex HAVING Example:

-- This example assumes a joinable SupplierID column exists on Products
SELECT s.SupplierName, SUM(p.Price) AS TotalValue
FROM Products p
JOIN Suppliers s ON p.SupplierID = s.SupplierID
GROUP BY s.SupplierName
HAVING SUM(p.Price) > 30;

Order of Operations

Remember the logical order of execution in a SQL query:

  1. FROM and JOINs determine the working set of data.
  2. WHERE filters individual rows.
  3. GROUP BY aggregates the remaining rows into groups.
  4. HAVING filters the groups themselves.
  5. SELECT picks the final columns.
  6. ORDER BY sorts the final output.

Summary

The HAVING clause is the key to filtering aggregated data. Use WHERE to filter before grouping, and HAVING to filter after.

Exercise

Why was the HAVING clause introduced to SQL?