问题:
ceph 集群运行运行过程中出现警告1 daemons have recently crashed
排查
1、查询最新 crash 信息 ceph crash ls-new
[root@node1 ~]# ceph crash ls-new
ID ENTITY NEW
2023-xx-xx_01:35:33.350541Z_b929855b-f264-42df-89ef-bcdb31f6bc18 mgr.node1 *
2、根据 crash ID 查询具体详细 ceph crash info 2023-xx-xx_01:35:33.350541Z_b929855b-f264-42df-89ef-bcdb31f6bc18
3、根据报错解决问题
4、确认crash并关闭健康警告
#批量清除crash
ceph crash archive-all
#根据 crash-id 清除单个crash
ceph crash archive <crash-id>
补充:
查看内核日志dmesg -T