SQL 识别重叠的日期区间

SQL 识别重叠的日期区间,如果一个员工手头现有的项目尚未结束,却又开始了另一个新项目,那么你希望筛选出相关的数据。我们先看一下 EMP_PROJECT 表的数据。

SQL 识别重叠的日期区间 问题描述

如果一个员工手头现有的项目尚未结束,却又开始了另一个新项目,那么你希望筛选出相关的数据。我们先看一下 EMP_PROJECT 表的数据。

select *
  from emp_project
 
EMPNO ENAME      PROJ_ID PROJ_START  PROJ_END
----- ---------- ------- ----------- -----------
7782  CLARK            1 16-JUN-2005 18-JUN-2005
7782  CLARK            4 19-JUN-2005 24-JUN-2005
7782  CLARK            7 22-JUN-2005 25-JUN-2005
7782  CLARK           10 25-JUN-2005 28-JUN-2005
7782  CLARK           13 28-JUN-2005 02-JUL-2005
7839  KING             2 17-JUN-2005 21-JUN-2005
7839  KING             8 23-JUN-2005 25-JUN-2005
7839  KING            14 29-JUN-2005 30-JUN-2005
7839  KING            11 26-JUN-2005 27-JUN-2005
7839  KING             5 20-JUN-2005 24-JUN-2005
7934  MILLER           3 18-JUN-2005 22-JUN-2005
7934  MILLER          12 27-JUN-2005 28-JUN-2005
7934  MILLER          15 30-JUN-2005 03-JUL-2005
7934  MILLER           9 24-JUN-2005 27-JUN-2005
7934  MILLER           6 21-JUN-2005 23-JUN-2005

看一下上述查询结果,员工 KING 在 PROJ_ID 5 结束之前就开始了 PROJ_ID 8,并且它开始 PROJ_ID 5 的时候,PROJ_ID 2 还没有完结。因此,你希望得到的结果集如下所示。

EMPNO ENAME      MSG
----- ---------- --------------------------------
 7782 CLARK      project 7 overlaps project 4
 7782 CLARK      project 10 overlaps project 7
 7782 CLARK      project 13 overlaps project 10
 7839 KING       project 8 overlaps project 5
 7839 KING       project 5 overlaps project 2
 7934 MILLER     project 12 overlaps project 9
 7934 MILLER     project 6 overlaps project 3

SQL 识别重叠的日期区间 解决方案

这里的关键之处在于要找出那些 PROJ_START(新项目开始的日期)等于或者大于另一个项目 PROJ_START 的行,以及等于或者小于其他项目 PROJ_END 的行。因此,首先要逐个地比较(同一个员工的)每一个项目和其他项目。通过自连接 EMP_PROJECT 表,我们为每一个员工生成全部可能的项目组合。为找到日期重叠的项目,只要在同一个员工的项目里找出 PROJ_START 介于另一个 PROJ_STARTPROJ_END 之间的那些行即可。
DB2、PostgreSQL 和 Oracle
EMP_PROJECT 表自连接,然后使用连接运算符||为时间上发生了重叠的项目构造出符合要求的输出结果。

1  select a.empno,a.ename,
2         'project '||b.proj_id||
3          ' overlaps project '||a.proj_id as msg
4    from emp_project a,
5         emp_project b
6   where a.empno = b.empno
7     and b.proj_start >= a.proj_start
8     and b.proj_start <= a.proj_end
9     and a.proj_id != b.proj_id

MySQL
EMP_PROJECT 表自连接,然后使用 CONCAT 函数为时间上发生了重叠的项目构造出符合要求的输出结果。

1  select a.empno,a.ename,
2         concat('project ',b.proj_id,
3          ' overlaps project ',a.proj_id) as msg
4    from emp_project a,
5         emp_project b
6   where a.empno = b.empno
7     and b.proj_start >= a.proj_start
8     and b.proj_start <= a.proj_end
9     and a.proj_id != b.proj_id

SQL Server
EMP_PROJECT 表自连接,然后使用连接运算符 + 为时间上发生了重叠的项目构造出符合要求的输出结果。

1  select a.empno,a.ename,
2         'project '+b.proj_id+
3          ' overlaps project '+a.proj_id as msg
4    from emp_project a,
5         emp_project b
6   where a.empno = b.empno
7     and b.proj_start >= a.proj_start
8     and b.proj_start <= a.proj_end
9     and a.proj_id != b.proj_id

SQL 识别重叠的日期区间 扩展知识

上述各个解决方案之间的差别仅在于字符串连接方式不同,接下来的讨论里会使用 DB2 语法,但仍然能兼顾全部 3 种解决方案。首先自连接 EMP_PROJECT 表,这样 PROJ_START 日期值就能和其他项目逐一地做比较了。下面是针对员工 KING 的自连接查询结果。可以发现,每一个项目是怎样“看见”其他项目的。

select a.ename,
       a.proj_id as a_id,
       a.proj_start as a_start,
       a.proj_end as a_end,
       b.proj_id as b_id,
       b.proj_start as b_start
  from emp_project a,
       emp_project b
 where a.ename    = 'KING'
   and a.empno    = b.empno
   and a.proj_id != b.proj_id
  order by 2
 
ENAME   A_ID A_START     A_END        B_ID B_START
------ ----- ----------- ----------- ----- -----------
KING       2 17-JUN-2005 21-JUN-2005     8 23-JUN-2005
KING       2 17-JUN-2005 21-JUN-2005    14 29-JUN-2005
KING       2 17-JUN-2005 21-JUN-2005    11 26-JUN-2005
KING       2 17-JUN-2005 21-JUN-2005     5 20-JUN-2005
KING       5 20-JUN-2005 24-JUN-2005     2 17-JUN-2005
KING       5 20-JUN-2005 24-JUN-2005     8 23-JUN-2005
KING       5 20-JUN-2005 24-JUN-2005    11 26-JUN-2005
KING       5 20-JUN-2005 24-JUN-2005    14 29-JUN-2005
KING       8 23-JUN-2005 25-JUN-2005     2 17-JUN-2005
KING       8 23-JUN-2005 25-JUN-2005    14 29-JUN-2005
KING       8 23-JUN-2005 25-JUN-2005     5 20-JUN-2005
KING       8 23-JUN-2005 25-JUN-2005    11 26-JUN-2005
KING      11 26-JUN-2005 27-JUN-2005     2 17-JUN-2005
KING      11 26-JUN-2005 27-JUN-2005     8 23-JUN-2005
KING      11 26-JUN-2005 27-JUN-2005    14 29-JUN-2005
KING      11 26-JUN-2005 27-JUN-2005     5 20-JUN-2005
KING      14 29-JUN-2005 30-JUN-2005     2 17-JUN-2005
KING      14 29-JUN-2005 30-JUN-2005     8 23-JUN-2005
KING      14 29-JUN-2005 30-JUN-2005     5 20-JUN-2005
KING      14 29-JUN-2005 30-JUN-2005    11 26-JUN-2005

从上面的结果集里可以看到,自连接查询使得寻找重叠日期区间的工作变得容易多了。只要筛选出 B_START 介于 A_STARTA_END 之间的那些行即可。DB2 解决方案第 7 行和第 8 行的 WHERE 条件做的就是这件事。

and b.proj_start >= a.proj_start
and b.proj_start <= a.proj_end

找到了所需的行之后,接下来只要连接字符串构造出适当格式的输出结果即可。
如果每一个员工的最大项目个数是固定的,那么 Oracle 用户就可以利用窗口函数 LEAD OVER 来避免使用自连接查询。在某些特定情况下使用自连接查询可能显得代价过于沉重(如果自连接耗费的机器资源远大于 LEAD OVER),这个时候 LEAD OVER 函数就不失为一种简便的方法。例如,考虑下面针对员工 KING 的 LEAD OVER 替代方案。

select empno,
       ename,
       proj_id,
       proj_start,
       proj_end,
       case
       when lead(proj_start,1)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       when lead(proj_start,2)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       when lead(proj_start,3)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       when lead(proj_start,4)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       end is_overlap
 from emp_project
where ename = 'KING'
 
EMPNO ENAME  PROJ_ID PROJ_START  PROJ_END    IS_OVERLAP
----- ------ ------- ----------- ----------- ----------
 7839 KING         2 17-JUN-2005 21-JUN-2005          5
 7839 KING         5 20-JUN-2005 24-JUN-2005          8
 7839 KING         8 23-JUN-2005 25-JUN-2005
 7839 KING        11 26-JUN-2005 27-JUN-2005
 7839 KING        14 29-JUN-2005 30-JUN-2005

对于员工 KING,因为项目个数被限定为 5 个,我们就能使用 LEAD OVER 函数逐一检查全部项目的日期值,而无须自连接查询。现在,离最终的结果集只有一步之遥了。接下来只要把 IS_OVERLAP 不是 Null 的行筛选出来即可。

select empno,ename,
       'project '||is_overlap||
        ' overlaps project '||proj_id msg
  from (
select empno,
       ename,
       proj_id,
       proj_start,
       proj_end,
       case
       when lead(proj_start,1)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       when lead(proj_start,2)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       when lead(proj_start,3)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       when lead(proj_start,4)over(order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(order by proj_start)
       end is_overlap
  from emp_project
 where ename = 'KING'
       )
 where is_overlap is not null
 
EMPNO ENAME  MSG
----- ------ --------------------------------
 7839 KING   project 5 overlaps project 2
 7839 KING   project 8 overlaps project 5

为了让以上解决方案适用于全体员工(而不局限于 KING),要在 LEAD OVER 函数调用里加上针对 ENAME 的分区。

select empno,ename,
       'project '||is_overlap||
        ' overlaps project '||proj_id msg
  from (
select empno,
       ename,
       proj_id,
       proj_start,
       proj_end,
       case
       when lead(proj_start,1)over(partition by ename
                                       order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(partition by ename
                                  order by proj_start)
       when lead(proj_start,2)over(partition by ename
                                       order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(partition by ename
                                  order by proj_start)
       when lead(proj_start,3)over(partition by ename
                                       order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(partition by ename
                                  order by proj_start)
       when lead(proj_start,4)over(partition by ename
                                       order by proj_start)
            between proj_start and proj_end
       then lead(proj_id)over(partition by ename
                                  order by proj_start)
       end is_overlap
  from emp_project
       )
 where is_overlap is not null
 
EMPNO ENAME  MSG
----- ------ -------------------------------
 7782 CLARK  project 7 overlaps project 4
 7782 CLARK  project 10 overlaps project 7
 7782 CLARK  project 13 overlaps project 10
 7839 KING   project 5 overlaps project 2
 7839 KING   project 8 overlaps project 5
 7934 MILLER project 6 overlaps project 3
 7934 MILLER project 12 overlaps project 9

Python教程

Java教程

Web教程

数据库教程

图形图像教程

大数据教程

开发工具教程

计算机教程