今天同事给我一条2秒的SQL看看能不能优化。
原始SQL:
SELECT pk_dept FROM aaaa WHERE 1 = 1 AND ((pk_group = '0001A110000000000JQ6' AND pk_org IN ('0001A110000000001M09'))) AND (PK_DEPT IN (SELECT t1.ORGID FROM xxxxx t1 INNER JOIN (SELECT (CASE WHEN ORGID3 IS NULL THEN ORGID2 ELSE ORGID3 END) orgid FROM xxxxx WHERE ORGID = '1001A110000000001U8S') t2 ON t1.ORGID2 = t2.orgid OR t1.ORGID3 = t2.orgid)) AND (enablestate IN (2)) ORDER BY code;
4 loops=1) Heap Fetches: 0 Planning Time: 258.024 ms Execution Time: 2493.882 ms (1583 rows)
xxxxx 表是一张非常复杂的视图,上面SQL执行计划大概2千行左右,所以本案例就不放整体的计划了。
如果经常做优化的同学对于简单的SQL,相信可以使用瞪眼大法基本定位到语句慢的位置?
子查询慢SQL:
SELECT t1.ORGID FROM xxxxx t1 INNER JOIN (SELECT (CASE WHEN ORGID3 IS NULL THEN ORGID2 ELSE ORGID3 END) orgid FROM xxxxx WHERE ORGID = '1001A110000000001U8S') t2 ON t1.ORGID2 = t2.orgid OR t1.ORGID3 = t2.orgid
主要是慢在连接列的 OR 关系运算符上。
我们可以仔细看看这条SQL,xxxxx 分别作为t1、t2 表内连接关联2次, 关联条件为:t1.ORGID2 = t2.orgid OR t1.ORGID3 = t2.orgid
这种关联条件的逻辑关系可以简化为: t1.ORGID2 = t1.ORGID3
SQL语句可以进行以下精简:
SELECT t1.ORGID FROM xxxxx t1 where ORGID = '1001A110000000001U8S';
验证等价逻辑:
ncc=# SELECT t1.ORGID FROM xxxxx t1 WHERE EXISTS ( ncc(# SELECT 1 FROM ncc(# (SELECT (CASE WHEN ORGID3 IS NULL THEN ORGID2 ELSE ORGID3 END) orgid FROM xxxxx WHERE ORGID = '1001A110000000001U8S') t2 ncc(# WHERE t1.ORGID2 = t2.orgid OR t1.ORGID3 = t2.orgid ncc(# ); orgid ---------------------- 1001A110000000001U8S (1 row) ncc=# SELECT t1.ORGID FROM xxxxx t1 where ORGID = '1001A110000000001U8S'; orgid ---------------------- 1001A110000000001U8S (1 row) ncc=#
可以看到子查询SQL进行改写后是结果是等价的,使用逻辑公式进行化繁为简。
改写优化SQL后执行计划:
SELECT pk_dept FROM org_dept WHERE 1 = 1 AND ((pk_group = '0001A110000000000JQ6' AND pk_org IN ('0001A110000000001M09'))) AND (PK_DEPT IN (SELECT t1.ORGID FROM EHR_ORG_DEPT t1 where ORGID = '1001A110000000001U8S') ) AND (enablestate IN (2)) ORDER BY code;
-> Seq Scan on org_dept d1_35 (cost=0.00..639.33 rows=7133 width=116) (actual time=0.047..1.480 rows=7133 loops=1) -> Hash (cost=120.64..120.64 rows=3224 width=21) (actual time=1.315..1.315 rows=3224 loops=1) Buckets: 4096 Batches: 1 Memory Usage: 199kB -> Index Only Scan using pk_bd_region on bd_region bg_35 (cost=0.28..120.64 rows=3224 width=21) (actual time=0.024..0.651 rows=3224 loops=1) Heap Fetches: 0 Planning Time: 86.684 ms Execution Time: 625.439 ms
2.4秒的SQL最后通过化繁为简最终优化到625毫秒就能跑出结果。
总结:逻辑思维课程还是挺重要的,建议大家有空的时候去看看。