HAVING 子句是 SQL 中经常被初学者轻视但功能强大的子句,它专门用于对 GROUP BY 分组后的结果进行条件过滤。很多人容易把它和 WHERE 混淆,但它们有本质区别。
| 特性 | WHERE | HAVING |
|---|---|---|
| 执行时机 | 在分组前过滤行 | 在分组后过滤分组 |
| 可用函数 | 不能使用聚合函数 | 可以使用聚合函数 |
| 使用位置 | GROUP BY 之前 | GROUP BY 之后 |
| 索引利用 | 可以使用索引 | 通常无法使用索引(因为已分组) |
-- 找出平均成绩大于85分的班级
SELECT class_id, AVG(score) as avg_score
FROM students
GROUP BY class_id
HAVING AVG(score) > 85;
-- 找出订单总数超过100的客户
SELECT customer_id, COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 100;
-- 找出总销售额超过10000且平均订单额大于500的销售员
SELECT salesperson_id,
SUM(amount) as total_sales,
AVG(amount) as avg_order
FROM sales
GROUP BY salesperson_id
HAVING SUM(amount) > 10000
AND AVG(amount) > 500;
-- 先过滤无效数据,再分组统计
SELECT department_id,
COUNT(*) as employee_count,
AVG(salary) as avg_salary
FROM employees
WHERE status = 'active' -- 先过滤活跃员工
GROUP BY department_id
HAVING COUNT(*) >= 10 -- 再筛选员工数>=10的部门
AND AVG(salary) > 50000;
-- 使用CASE语句在HAVING中
SELECT product_category,
SUM(CASE WHEN status = 'completed' THEN amount ELSE 0 END) as completed_sales
FROM orders
GROUP BY product_category
HAVING SUM(CASE WHEN status = 'completed' THEN amount ELSE 0 END) > 10000;
-- 错误:在WHERE中使用聚合函数
SELECT department_id, AVG(salary)
FROM employees
WHERE AVG(salary) > 50000 -- 错误!
GROUP BY department_id;
-- 正确:使用HAVING
SELECT department_id, AVG(salary)
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 50000;
尽量在WHERE中过滤:先减少数据集大小
-- 优化前
SELECT user_id, COUNT(*)
FROM logs
GROUP BY user_id
HAVING COUNT(*) > 1000;
-- 优化后(如果数据库支持)
SELECT user_id, COUNT(*)
FROM logs
WHERE log_date >= '2024-01-01' -- 先按时间过滤
GROUP BY user_id
HAVING COUNT(*) > 1000;
避免在HAVING中重复计算
-- 不推荐
SELECT department_id,
AVG(salary) as avg_sal,
COUNT(*) as emp_count
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 50000
AND COUNT(*) > 5
AND AVG(salary)/COUNT(*) > 1000; -- 重复计算
-- 推荐:使用子查询或CTE
WITH dept_stats AS (
SELECT department_id,
AVG(salary) as avg_sal,
COUNT(*) as emp_count
FROM employees
GROUP BY department_id
)
SELECT *
FROM dept_stats
WHERE avg_sal > 50000
AND emp_count > 5
AND avg_sal/emp_count > 1000;
-- 找出有重复邮箱的用户
SELECT email, COUNT(*)
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
-- 找出每月总销售额都超过10000的年份
SELECT EXTRACT(YEAR FROM order_date) as year,
COUNT(DISTINCT EXTRACT(MONTH FROM order_date)) as months_active
FROM orders
GROUP BY EXTRACT(YEAR FROM order_date)
HAVING COUNT(DISTINCT EXTRACT(MONTH FROM order_date)) = 12
AND MIN(monthly_total) > 10000;
-- 找出销售额高于部门平均值的员工
WITH employee_sales AS (
SELECT employee_id,
department_id,
SUM(amount) as total_sales,
AVG(SUM(amount)) OVER (PARTITION BY department_id) as dept_avg
FROM sales
GROUP BY employee_id, department_id
)
SELECT *
FROM employee_sales
WHERE total_sales > dept_avg;
HAVING 子句的关键要点:
GROUP BY 设计
可使用聚合函数:COUNT(), SUM(), AVG(), MAX(), MIN() 等
执行顺序:在 GROUP BY 之后,ORDER BY 之前
性能考量:尽量在 WHERE 中先过滤,减少分组数据量
掌握 HAVING 子句能让你写出更高效、更精确的分组查询,是 SQL 数据分析中不可或缺的工具。