Oracle集群RAC DG日常检查指令

发布时间 2023-11-10 11:17:52作者: Allen158

操作系统进程检查

Pmon检查

  • pmon(Process Monitor process)用于监控其他后台进程。负责在连接出现异常中止后进行清理工作。例如,一个专用服务器进程崩溃或者出于某种原因被结束掉,就要由PMON进程负责善后(恢复或者撤销工作),并释放资源。PMON会回滚未提交的工作,释放锁,并释放之前为失败进程分配的SGA资源。
    PMON还负责监视其他Oracle后台进程,并在必要时重启这些后台进程。
  • 因此在操作系统侧,可以首先检查pmon进程
092755.221: [oracle ~ ]$ ps -ef |grep -i pmon
092755.266: oracle   22589 22483  0 09:27 pts/1    00:00:00 grep --color=auto -i pmon
092755.271: grid     41673     1  0 Nov03 ?        00:00:19 asm_pmon_+ASM1
092755.271: oracle   43062     1  0 Nov03 ?        00:00:55 ora_pmon_oracle1

负载检查

[oracle@ ~ ]$ top 
094246.285: top - 09:42:46 up 3 days, 13:45,  2 users,  load average: 0.44, 0.49, 0.60
094246.290: Tasks: 1179 total,   2 running, 1177 sleeping,   0 stopped,   0 zombie
094246.290: %Cpu(s):  2.7 us,  0.6 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
094246.290: KiB Mem:  26347395+total, 24792364+used, 15550296 free,  1356312 buffers
094246.290: KiB Swap: 33554428 total,     8960 used, 33545468 free. 71150048 cached Mem
094246.295: 
094246.295:   PID USER      PR  NI    VIRT    RES    SHR S   %CPU  %MEM     TIME+ COMMAND                                                                                                                      
094246.295: 35858 oracle    20   0  0.148t  88408  83244 R 100.00 0.034   0:21.06 oracle_35858_or                                                                                                              
094246.295: 36340 oracle    20   0  0.148t  85412  80980 S 11.111 0.032   0:00.04 oracle_36340_or                                                                                                              
094246.295: 36341 oracle    20   0   16348   3500   2224 R 11.111 0.001   0:00.05 top                                                                                                                          
094246.295: 35766 root      20   0 5509468 156808  31652 S  5.556 0.060  23:24.58 java                                                                                                                         
094246.295: 36338 oracle    20   0  0.148t  85392  80960 S  5.556 0.032   0:00.04 oracle_36338_or                                                                                                              
094246.295:     1 root      20   0  181660   6184   4180 S  0.000 0.002   2:08.17 systemd                                                                                                                      
094246.295:     2 root      20   0       0      0      0 S  0.000 0.000   0:00.12 kthreadd                                                                                                                     
094246.295:     4 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/0:0H                                                                                                                 
094246.295:     6 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 mm_percpu_wq                                                                                                                 
094246.295:     7 root      20   0       0      0      0 S  0.000 0.000   0:13.67 ksoftirqd/0                                                                                                                  
094246.295:     8 root      20   0       0      0      0 S  0.000 0.000   6:47.36 rcu_sched                                                                                                                    
094246.295:     9 root      20   0       0      0      0 S  0.000 0.000   0:00.00 rcu_bh                                                                                                                       
094246.295:    10 root      rt   0       0      0      0 S  0.000 0.000   0:00.00 migration/0                                                                                                                  
094246.295:    11 root      rt   0       0      0      0 S  0.000 0.000   0:00.37 watchdog/0                                                                                                                   
094246.295:    12 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/0                                                                                                                      
094246.295:    13 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/1                                                                                                                      
094246.295:    14 root      rt   0       0      0      0 S  0.000 0.000   0:00.36 watchdog/1                                                                                                                   
094246.295:    15 root      rt   0       0      0      0 S  0.000 0.000   0:00.01 migration/1                                                                                                                  
094246.295:    16 root      20   0       0      0      0 S  0.000 0.000   0:03.20 ksoftirqd/1                                                                                                                  
094246.295:    18 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/1:0H                                                                                                                 
094246.295:    19 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/2                                                                                                                      
094246.295:    20 root      rt   0       0      0      0 S  0.000 0.000   0:00.37 watchdog/2                                                                                                                   
094246.295:    21 root      rt   0       0      0      0 S  0.000 0.000   0:00.01 migration/2                                                                                                                  
094246.300:    22 root      20   0       0      0      0 S  0.000 0.000   0:02.63 ksoftirqd/2                                                                                                                  
094246.300:    24 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/2:0H                                                                                                                 
094246.300:    25 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/3                                                                                                                      
094246.300:    26 root      rt   0       0      0      0 S  0.000 0.000   0:00.38 watchdog/3                                                                                                                   
094246.300:    27 root      rt   0       0      0      0 S  0.000 0.000   0:00.01 migration/3                                                                                                                  
094246.300:    28 root      20   0       0      0      0 S  0.000 0.000   0:01.19 ksoftirqd/3                                                                                                                  
094246.300:    30 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/3:0H                                                                                                                 
094246.320:    31 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/4                                                                                                                      
094246.320:    32 root      rt   0       0      0      0 S  0.000 0.000   0:00.39 watchdog/4                                                                                                                   
094246.320:    33 root      rt   0       0      0      0 S  0.000 0.000   0:00.00 migration/4                                                                                                                  
094246.320:    34 root      20   0       0      0      0 S  0.000 0.000   0:02.01 ksoftirqd/4                                                                                                                  
094246.320:    36 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/4:0H                                                                                                                 
094246.320:    37 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/5                                                                                                                      
094246.320:    38 root      rt   0       0      0      0 S  0.000 0.000   0:00.40 watchdog/5                                                                                                                   
094246.320:    39 root      rt   0       0      0      0 S  0.000 0.000   0:00.00 migration/5                                                                                                                  
094246.320:    40 root      20   0       0      0      0 S  0.000 0.000   0:03.99 ksoftirqd/5                                                                                                                  
094246.320:    42 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/5:0H                                                                                                                 
094246.320:    43 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/6                                                                                                                      
094246.325:    44 root      rt   0       0      0      0 S  0.000 0.000   0:00.39 watchdog/6                                                                                                                   
094246.325:    45 root      rt   0       0      0      0 S  0.000 0.000   0:00.00 migration/6                                                                                                                  
094246.325:    46 root      20   0       0      0      0 S  0.000 0.000   0:02.34 ksoftirqd/6                                                                                                                  
094246.325:    48 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/6:0H                                                                                                                 
094246.325:    49 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/7                                                                                                                      
094246.325:    50 root      rt   0       0      0      0 S  0.000 0.000   0:00.40 watchdog/7                                                                                                                   
094246.325:    51 root      rt   0       0      0      0 S  0.000 0.000   0:00.00 migration/7                                                                                                                  
094246.325:    52 root      20   0       0      0      0 S  0.000 0.000   0:01.66 ksoftirqd/7                                                                                                                  
094246.325:    54 root       0 -20       0      0      0 S  0.000 0.000   0:00.00 kworker/7:0H                                                                                                                 
094246.325:    55 root      20   0       0      0      0 S  0.000 0.000   0:00.00 cpuhp/8                                                                                                                      
094246.325:    56 root      rt   0       0      0      0 S  0.000 0.000   0:00.37 watchdog/8             

数据库检查

查看数据库打开状态和相关信息

  • 确定当前库为主库
093045.590: SQL>  select inst_id,open_mode,database_role from gv$database;
093045.610: 
093045.610:    INST_ID OPEN_MODE            DATABASE_ROLE
093045.610: ---------- -------------------- ----------------
093045.610:          2 READ WRITE           PRIMARY
093045.610:          1 READ WRITE           PRIMARY

查找主库

show parameter config 
show parameter unique (主库唯一名)
show parameter log_archive_config(主库 备库唯一名都能看到,由此确认备库唯一名)

判断集群正常与否

判断会话等待

检查以下会话等待,主要最多的会话等待没有以下几项即可

 SQL> 
094456.693:   1  select inst_id,event,count(*)
094456.693:   2  from gv$session
094456.693:   3  where WAIT_CLASS!='IDLE'
094456.693:   4* group by inst_id,event
094456.698: 
094456.698:    INST_ID EVENT                                                              COUNT(*)
094456.698: ---------- ---------------------------------------------------------------- ----------
094456.698:          2 Space Manager: slave idle wait                                           52
094456.698:          2 gcs yield cpu                                                             3
094456.698:          2 Data Guard: Timer                                                         1
094456.723:          2 GCR sleep                                                                 2
094456.723:          2 wait for unread message on broadcast channel                              3
094456.723:          2 pman timer                                                                1
094456.723:          2 gcs remote message                                                        3
094456.723:          2 REPL Capture/Apply: RAC AQ qmn coordinator                                1
094456.723:          2 lreg timer                                                                1
094456.723:          1 lreg timer                                                                1
094456.723:          2 PX Deq: Execution Msg                                                     1
094456.723:          1 Space Manager: slave idle wait                                           51
094456.723:          1 pmon timer                                                                7
094456.728:          1 VKTM Logical Idle Wait                                                    1
094456.728:          1 DIAG idle wait                                                            2
094456.728:          1 SCM slave idle                                                            1
094456.728:          1 gcs remote message                                                        3
094456.728:          1 heartbeat redo informer                                                   1
094456.728:          1 smon timer                                                                1
094456.728:          2 class slave wait                                                          6
094456.728:          2 VKTM Logical Idle Wait                                                    1
094456.728:          2 Data Guard: Gap Manager                                                   1
094456.728:          2 ges remote message                                                        2
094456.728:          1 OFS idle                                                                  1
094456.728:          1 PING                                                                      1
094456.728:          1 pman timer                                                                1
094456.728:          1 ges remote message                                                        2
094456.728:          1 Streams AQ: qmn coordinator idle wait                                     1
094456.728:          2 LMS CR slave timer                                                        3
094456.733:          2 DIAG idle wait                                                            2
094456.733:          2 PING                                                                      1
094456.733:          2 smon timer                                                                1
094456.733:          1 Streams AQ: waiting for time management or cleanup tasks                  1
094456.733:          1 Data Guard: Timer                                                         1
094456.733:          2 Streams AQ: qmn coordinator idle wait                                     1
094456.733:          1 watchdog main loop                                                        5
094456.733:          1 wait for unread message on broadcast channel                              3
094456.733:          2 ASM cluster membership changes                                            1
094456.733:          2 OFS idle                                                                  1
094456.733:          2 VKRM Idle                                                                 1
094456.733:          2 Streams AQ: waiting for time management or cleanup tasks                  1
094456.733:          1 rdbms ipc message                                                        32
094456.733:          1 ASM background timer                                                      1
094456.733:          1 GCR sleep                                                                 2
094456.733:          1 class slave wait                                                          6
094456.733:          1 Data Guard: Gap Manager                                                   1
094456.733:          2 rdbms ipc message                                                        32
094456.733:          2 pmon timer                                                                7
094456.733:          2 watchdog main loop                                                        5
094456.733:          2 heartbeat redo informer                                                   1
094456.733:          2 AQPC idle                                                                 1
094456.733:          2 SQL*Net message from client                                               6
094456.733:          1 ASM cluster membership changes                                            1
094456.733:          1 LMS CR slave timer                                                        3
094456.733:          1 gcs yield cpu                                                             3
094456.733:          1 VKRM Idle                                                                 1
094456.733:          2 ASM background timer                                                      1
094456.733:          2 SCM slave idle                                                            1
094456.733:          2 Streams AQ: qmn slave idle wait                                           1
094456.733:          1 SQL*Net message from client                                             187
094456.733:          1 Streams AQ: qmn slave idle wait                                           1
094456.733:          1 PX Deq: Execution Msg                                                     1
094456.738:          1 PX Deq: Execute Reply                                                     1
094456.738:          1 AQPC idle                                                                 1
094456.738:          1 REPL Capture/Apply: RAC AQ qmn coordinator                                1
094456.738: 

查看连接数 并与数据库配置对比

select inst_id,status,count(*) from gv$session
group by inst_id,status order by 1,3; 
show parameter process

判断集群和DG状态

RAC

su - grid
crsctl check crs 判断几个核心任务是否online
crsctl stat res -t 查看所有资源是否online
crsctl query css votedisk 看所有投票盘是否在线 
asmcmd lsdg  看freeMB 和 offline disk

DG

select * from gv$dataguard_stats order by inst_id 查看应用时间和传输时间
select process,status form v$managed_standby; 看是否有MRO0进程(该进程负责同步日志)
select database_role,to_char(current_scn) from gv$database; 查看两个库的scn