教你使用常用的逻辑公式和恒等式等价改写SQL

发布时间 2023-08-17 16:58:01作者: 小至尖尖

今天同事给我一条2秒的SQL看看能不能优化。

原始SQL:

SELECT pk_dept
FROM aaaa
WHERE 1 = 1
  AND ((pk_group = '0001A110000000000JQ6' AND pk_org IN ('0001A110000000001M09')))
  AND (PK_DEPT IN (SELECT t1.ORGID
                   FROM xxxxx t1
                            INNER JOIN (SELECT (CASE WHEN ORGID3 IS NULL THEN ORGID2 ELSE ORGID3 END) orgid
                                        FROM xxxxx
                                        WHERE ORGID = '1001A110000000001U8S') t2
                                       ON t1.ORGID2 = t2.orgid OR t1.ORGID3 = t2.orgid))
  AND (enablestate IN (2))
ORDER BY code;
4 loops=1)
                                                                                                               Heap Fetches: 0
 Planning Time: 258.024 ms
 Execution Time: 2493.882 ms
(1583 rows)

xxxxx   表是一张非常复杂的视图,上面SQL执行计划大概2千行左右,所以本案例就不放整体的计划了。

如果经常做优化的同学对于简单的SQL,相信可以使用瞪眼大法基本定位到语句慢的位置?

子查询慢SQL:

  SELECT t1.ORGID
                   FROM xxxxx t1
                            INNER JOIN (SELECT (CASE WHEN ORGID3 IS NULL THEN ORGID2 ELSE ORGID3 END) orgid
                                        FROM xxxxx
                                        WHERE ORGID = '1001A110000000001U8S') t2
                                       ON t1.ORGID2 = t2.orgid OR t1.ORGID3 = t2.orgid

主要是慢在连接列的 OR 关系运算符上。

我们可以仔细看看这条SQL,xxxxx 分别作为t1、t2 表内连接关联2次, 关联条件为:t1.ORGID2 = t2.orgid OR t1.ORGID3 = t2.orgid  

这种关联条件的逻辑关系可以简化为: t1.ORGID2 = t1.ORGID3

SQL语句可以进行以下精简:

SELECT t1.ORGID FROM xxxxx t1 where  ORGID = '1001A110000000001U8S';

验证等价逻辑:

ncc=#  SELECT t1.ORGID FROM xxxxx t1 WHERE EXISTS (
ncc(# SELECT 1 FROM
ncc(# (SELECT (CASE WHEN ORGID3 IS NULL THEN ORGID2 ELSE ORGID3 END) orgid FROM xxxxx WHERE ORGID = '1001A110000000001U8S') t2
ncc(# WHERE t1.ORGID2 = t2.orgid  OR   t1.ORGID3 = t2.orgid
ncc(#  );
        orgid
----------------------
 1001A110000000001U8S
(1 row)

ncc=#   SELECT t1.ORGID FROM xxxxx t1 where  ORGID = '1001A110000000001U8S';
        orgid
----------------------
 1001A110000000001U8S
(1 row)

ncc=#

可以看到子查询SQL进行改写后是结果是等价的,使用逻辑公式进行化繁为简。

改写优化SQL后执行计划:

SELECT pk_dept
FROM org_dept
WHERE 1 = 1
  AND ((pk_group = '0001A110000000000JQ6' AND pk_org IN ('0001A110000000001M09')))
  AND (PK_DEPT IN (SELECT t1.ORGID
                   FROM EHR_ORG_DEPT t1
                   where ORGID = '1001A110000000001U8S')
    )
  AND (enablestate IN (2))
ORDER BY code;
                                                                                 ->  Seq Scan on org_dept d1_35  (cost=0.00..639.33 rows=7133 width=116) (actual time=0.047..1.480 rows=7133 loops=1)
                                                                                 ->  Hash  (cost=120.64..120.64 rows=3224 width=21) (actual time=1.315..1.315 rows=3224 loops=1)
                                                                                       Buckets: 4096  Batches: 1  Memory Usage: 199kB
                                                                                       ->  Index Only Scan using pk_bd_region on bd_region bg_35  (cost=0.28..120.64 rows=3224 width=21) (actual time=0.024..0.651 rows=3224 loops=1)
                                                                                             Heap Fetches: 0
 Planning Time: 86.684 ms
 Execution Time: 625.439 ms

2.4秒的SQL最后通过化繁为简最终优化到625毫秒就能跑出结果。

总结:逻辑思维课程还是挺重要的,建议大家有空的时候去看看。