SQL 返回非分组列,你正在执行 GROUP BY
查询,并希望通过 SELECT
列表返回一些列,但这些列却不会出现在 GROUP BY
子句里。这通常无法办到,因为不能保证这些列在每个分组里都有唯一的值。
SQL 返回非分组列 问题描述
你正在执行 GROUP BY
查询,并希望通过 SELECT
列表返回一些列,但这些列却不会出现在 GROUP BY
子句里。这通常无法办到,因为不能保证这些列在每个分组里都有唯一的值。
假设你希望找出每个部门工资最高和最低的员工,同时也希望找出每个职位对应的工资最高和最低的员工。你想查看每个员工的名字、部门、职位以及工资。你希望返回如下所示的结果集。
DEPTNO ENAME JOB SAL DEPT_STATUS JOB_STATUS
------ ------ --------- ----- --------------- --------------
10 MILLER CLERK 1300 LOW SAL IN DEPT TOP SAL IN JOB
10 CLARK MANAGER 2450 LOW SAL IN JOB
10 KING PRESIDENT 5000 TOP SAL IN DEPT TOP SAL IN JOB
20 SCOTT ANALYST 3000 TOP SAL IN DEPT TOP SAL IN JOB
20 FORD ANALYST 3000 TOP SAL IN DEPT TOP SAL IN JOB
20 SMITH CLERK 800 LOW SAL IN DEPT LOW SAL IN JOB
20 JONES MANAGER 2975 TOP SAL IN JOB
30 JAMES CLERK 950 LOW SAL IN DEPT
30 SALESMAN 1250 LOW SAL IN JOB
30 WARD SALESMAN 1250 LOW SAL IN JOB
30 ALLEN SALESMAN 1600 TOP SAL IN JOB
30 BLAKE MANAGER 2850 TOP SAL IN DEPT
不幸的是,如果把上述所有列都放入 SELECT
子句的话,将会破坏分组操作。考虑如下的例子。员工 KING 的工资最高,你想用下列查询验证这一点。
select ename,max(sal)
from emp
group by ename
以上查询将返回 EMP
表的全部 14 行数据,而不只是 KING 及其工资。之所以会这样,正是因为那个分组操作的存在:它会针对每个 ENAME
调用 MAX(SAL)
函数。因此,上述 SQL 语句看起来好像能够“找出工资最高的员工”,但实际的执行结果却是“找出了 EMP
表里每个 ENAME
对应的最高工资”。因此,本实例将介绍一种能够查询到 ENAME
却不必把 ENAME
放入 GROUP BY
子句的方法。
SQL 返回非分组列 解决方案
使用内嵌视图找出每个 DEPTNO
和 JOB
对应的最高和最低的工资。然后,筛选出工资等于这些值的员工。
DB2、Oracle 和 SQL Server
使用窗口函数 MAX OVER
和 MIN OVER
找出每个 DEPTNO
和 JOB
对应的最高和最低的工资。然后,筛选出工资与之匹配的行。
1 select deptno,ename,job,sal,
2 case when sal = max_by_dept
3 then 'TOP SAL IN DEPT'
4 when sal = min_by_dept
5 then 'LOW SAL IN DEPT'
6 end dept_status,
7 case when sal = max_by_job
8 then 'TOP SAL IN JOB'
9 when sal = min_by_job
10 then 'LOW SAL IN JOB'
11 end job_status
12 from (
13 select deptno,ename,job,sal,
14 max(sal)over(partition by deptno) max_by_dept,
15 max(sal)over(partition by job) max_by_job,
16 min(sal)over(partition by deptno) min_by_dept,
17 min(sal)over(partition by job) min_by_job
18 from emp
19 ) emp_sals
20 where sal in (max_by_dept,max_by_job,
21 min_by_dept,min_by_job)
PostgreSQL 和 MySQL
使用标量子查询找出每个 DEPTNO
和 JOB
对应的最高和最低的工资。然后,只保留与之匹配的员工。
1 select deptno,ename,job,sal,
2 case when sal = max_by_dept
3 then 'TOP SAL IN DEPT'
4 when sal = min_by_dept
5 then 'LOW SAL IN DEPT'
6 end as dept_status,
7 case when sal = max_by_job
8 then 'TOP SAL IN JOB'
9 when sal = min_by_job
10 then 'LOW SAL IN JOB'
11 end as job_status
12 from (
13 select e.deptno,e.ename,e.job,e.sal,
14 (select max(sal) from emp d
15 where d.deptno = e.deptno) as max_by_dept,
16 (select max(sal) from emp d
17 where d.job = e.job) as max_by_job,
18 (select min(sal) from emp d
19 where d.deptno = e.deptno) as min_by_dept,
20 (select min(sal) from emp d
21 where d.job = e.job) as min_by_job
22 from emp e
23 ) x
24 where sal in (max_by_dept,max_by_job,
25 min_by_dept,min_by_job)
SQL 返回非分组列 扩展知识
DB2、Oracle 和 SQL Server
首先使用窗口函数 MAX OVER
和 MIN OVER
找出每个 DEPTNO
和 JOB
对应的最高和最低的工资。
select deptno,ename,job,sal,
max(sal)over(partition by deptno) maxDEPT,
max(sal)over(partition by job) maxJOB,
min(sal)over(partition by deptno) minDEPT,
min(sal)over(partition by job) minJOB
from emp
DEPTNO ENAME JOB SAL MAXDEPT MAXJOB MINDEPT MINJOB
------ ------ --------- ----- ------- ------ ------- ------
10 MILLER CLERK 1300 5000 1300 1300 800
10 CLARK MANAGER 2450 5000 2975 1300 2450
10 KING PRESIDENT 5000 5000 5000 1300 5000
20 SCOTT ANALYST 3000 3000 3000 800 3000
20 FORD ANALYST 3000 3000 3000 800 3000
20 SMITH CLERK 800 3000 1300 800 800
20 JONES MANAGER 2975 3000 2975 800 2450
20 ADAMS CLERK 1100 3000 1300 800 800
30 JAMES CLERK 950 2850 1300 950 800
30 MARTIN SALESMAN 1250 2850 1600 950 1250
30 TURNER SALESMAN 1500 2850 1600 950 1250
30 WARD SALESMAN 1250 2850 1600 950 1250
30 ALLEN SALESMAN 1600 2850 1600 950 1250
30 BLAKE MANAGER 2850 2850 2975 950 2450
现在,每个人的工资都可以和当前 DEPTNO
和 JOB
对应的最高和最低的工资进行比较了。需要注意的是,上述窗口函数背后的分组操作(上述 SELECT
子句里的那 4 列)并不会影响 MIN OVER
和 MAX OVER
函数的返回值。这充分展现了窗口函数的优雅之处:聚合运算是基于指定的“分组”或分区展开的,并且会为每个分组返回多行数据。最后只要把上述窗口函数调用放入一个内嵌视图,并且只保留那些与窗口函数返回值相匹配的行即可。在最终的结果集中,我们将使用 CASE
表达式显示每个员工的“状态”。
select deptno,ename,job,sal,
case when sal = max_by_dept
then 'TOP SAL IN DEPT'
when sal = min_by_dept
then 'LOW SAL IN DEPT'
end dept_status,
case when sal = max_by_job
then 'TOP SAL IN JOB'
when sal = min_by_job
then 'LOW SAL IN JOB'
end job_status
from (
select deptno,ename,job,sal,
max(sal)over(partition by deptno) max_by_dept,
max(sal)over(partition by job) max_by_job,
min(sal)over(partition by deptno) min_by_dept,
min(sal)over(partition by job) min_by_job
from emp
) x
where sal in (max_by_dept,max_by_job,
min_by_dept,min_by_job)
DEPTNO ENAME JOB SAL DEPT_STATUS JOB_STATUS
------ ------ --------- ----- --------------- --------------
10 MILLER CLERK 1300 LOW SAL IN DEPT TOP SAL IN JOB
10 CLARK MANAGER 2450 LOW SAL IN JOB
10 KING PRESIDENT 5000 TOP SAL IN DEPT TOP SAL IN JOB
20 SCOTT ANALYST 3000 TOP SAL IN DEPT TOP SAL IN JOB
20 FORD ANALYST 3000 TOP SAL IN DEPT TOP SAL IN JOB
20 SMITH CLERK 800 LOW SAL IN DEPT LOW SAL IN JOB
20 JONES MANAGER 2975 TOP SAL IN JOB
30 JAMES CLERK 950 LOW SAL IN DEPT
30 MARTIN SALESMAN 1250 LOW SAL IN JOB
30 WARD SALESMAN 1250 LOW SAL IN JOB
30 ALLEN SALESMAN 1600 TOP SAL IN JOB
30 BLAKE MANAGER 2850 TOP SAL IN DEPT
PostgreSQL 和 MySQL
首先使用标量子查询找出每个 DEPTNO
和 JOB
对应的最高和最低的工资。
select e.deptno,e.ename,e.job,e.sal,
(select max(sal) from emp d
where d.deptno = e.deptno) as maxDEPT,
(select max(sal) from emp d
where d.job = e.job) as maxJOB,
(select min(sal) from emp d
where d.deptno = e.deptno) as minDEPT,
(select min(sal) from emp d
where d.job = e.job) as minJOB
from emp e
DEPTNO ENAME JOB SAL MAXDEPT MAXJOB MINDEPT MINJOB
------ ------ --------- ----- ------- ------ ------- ------
20 SMITH CLERK 800 3000 1300 800 800
30 ALLEN SALESMAN 1600 2850 1600 950 1250
30 WARD SALESMAN 1250 2850 1600 950 1250
20 JONES MANAGER 2975 3000 2975 800 2450
30 MARTIN SALESMAN 1250 2850 1600 950 1250
30 BLAKE MANAGER 2850 2850 2975 950 2450
10 CLARK MANAGER 2450 5000 2975 1300 2450
20 SCOTT ANALYST 3000 3000 3000 800 3000
10 KING PRESIDENT 5000 5000 5000 1300 5000
30 TURNER SALESMAN 1500 2850 1600 950 1250
20 ADAMS CLERK 1100 3000 1300 800 800
30 JAMES CLERK 950 2850 1300 950 800
20 FORD ANALYST 3000 3000 3000 800 3000
10 MILLER CLERK 1300 5000 1300 1300 800
现在,每个 DEPTNO
和 JOB
对应的最高和最低的工资可以和 EMP
表里其他的工资比较了。最后,把上述标量子查询放入到一个内嵌视图里,并且只保留那些工资与之相匹配的员工即可。在最终的结果集中,我们将使用 CASE
表达式显示每个员工的“状态”。
select deptno,ename,job,sal,
case when sal = max_by_dept
then 'TOP SAL IN DEPT'
when sal = min_by_dept
then 'LOW SAL IN DEPT'
end as dept_status,
case when sal = max_by_job
then 'TOP SAL IN JOB'
when sal = min_by_job
then 'LOW SAL IN JOB'
end as job_status
from (
select e.deptno,e.ename,e.job,e.sal,
(select max(sal) from emp d
where d.deptno = e.deptno) as max_by_dept,
(select max(sal) from emp d
where d.job = e.job) as max_by_job,
(select min(sal) from emp d
where d.deptno = e.deptno) as min_by_dept,
(select min(sal) from emp d
where d.job = e.job) as min_by_job
from emp e
) x
where sal in (max_by_dept,max_by_job,
min_by_dept,min_by_job)
DEPTNO ENAME JOB SAL DEPT_STATUS JOB_STATUS
------ ------ --------- ----- --------------- --------------
10 CLARK MANAGER 2450 LOW SAL IN JOB
10 KING PRESIDENT 5000 TOP SAL IN DEPT TOP SAL IN JOB
10 MILLER CLERK 1300 LOW SAL IN DEPT TOP SAL IN JOB
20 SMITH CLERK 800 LOW SAL IN DEPT LOW SAL IN JOB
20 FORD ANALYST 3000 TOP SAL IN DEPT TOP SAL IN JOB
20 SCOTT ANALYST 3000 TOP SAL IN DEPT TOP SAL IN JOB
20 JONES MANAGER 2975 TOP SAL IN JOB
30 ALLEN SALESMAN 1600 TOP SAL IN JOB
30 BLAKE MANAGER 2850 TOP SAL IN DEPT
30 MARTIN SALESMAN 1250 LOW SAL IN JOB
30 JAMES CLERK 950 LOW SAL IN DEPT
30 WARD SALESMAN 1250 LOW SAL IN JOB