Replacing a failed journal disk in Ceph

Published: 2023-04-24 10:56:36, author: XU-NING

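In the customer's environment, the journal disk sdX failed and has been replaced with a new sdX disk. Six OSDs are currently down. Perform the following steps.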

Step 1:

The SSD is referred to as /dev/sd<X> and the partition number as <Y> (for the first partition created on the SSD, Y is 1; for the second, Y is 2). Command:

sgdisk -n <Y>:0:+5G -t <Y>:45B0969E-9B03-4F30-B4C6-B4B80CEFF106 -c <Y>:"ceph journal" /dev/sd<X>

Example:

# sgdisk -n 1:0:+5G -t 1:45B0969E-9B03-4F30-B4C6-B4B80CEFF106 -c 1:"ceph journal" /dev/sdX

# sgdisk -n 2:0:+5G -t 2:45B0969E-9B03-4F30-B4C6-B4B80CEFF106 -c 2:"ceph journal" /dev/sdX

# sgdisk -n 3:0:+5G -t 3:45B0969E-9B03-4F30-B4C6-B4B80CEFF106 -c 3:"ceph journal" /dev/sdX

# sgdisk -n 4:0:+5G -t 4:45B0969E-9B03-4F30-B4C6-B4B80CEFF106 -c 4:"ceph journal" /dev/sdX

# sgdisk -n 5:0:+5G -t 5:45B0969E-9B03-4F30-B4C6-B4B80CEFF106 -c 5:"ceph journal" /dev/sdX

# sgdisk -n 6:0:+5G -t 6:45B0969E-9B03-4F30-B4C6-B4B80CEFF106 -c 6:"ceph journal" /dev/sdX
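The six commands above differ only in the partition number, so they can be generated in a loop. The following is a minimal sketch, assuming a POSIX shell. The function name gen_journal_partitions and its parameters are hypothetical and not part of the original procedure. It only prints the commands so they can be reviewed before anything is run.

```shell
# Print (do not run) the sgdisk commands that create COUNT journal partitions.
# gen_journal_partitions is a hypothetical helper; arguments: disk, count, size.
gen_journal_partitions() (
    disk="$1"; count="$2"; size="${3:-5G}"
    # GPT type code for Ceph journal partitions, as used in the commands above.
    typecode="45B0969E-9B03-4F30-B4C6-B4B80CEFF106"
    y=1
    while [ "$y" -le "$count" ]; do
        printf 'sgdisk -n %d:0:+%s -t %d:%s -c %d:"ceph journal" %s\n' \
            "$y" "$size" "$y" "$typecode" "$y" "$disk"
        y=$((y + 1))
    done
)

# Dry run for six 5G partitions on the placeholder device /dev/sdX.
gen_journal_partitions /dev/sdX 6
```

Once the printed commands look right and /dev/sdX has been replaced with the real device, the output can be piped to sh to execute it.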

Step 2:

Find the partition UUID of each journal partition:

# sgdisk -i <Y> /dev/sd<X>

Example:

sgdisk -i 1 /dev/sdX

sgdisk -i 2 /dev/sdX

sgdisk -i 3 /dev/sdX

sgdisk -i 4 /dev/sdX

sgdisk -i 5 /dev/sdX

sgdisk -i 6 /dev/sdX

Alternatively: blkid | grep sdX
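The UUID needed in the later symlink step is the "Partition unique GUID" line of the sgdisk -i output. sgdisk prints it in uppercase, while the names under /dev/disk/by-partuuid/ are lowercase. The following is a minimal sketch of extracting it. The function name partuuid_from_sgdisk is hypothetical, and the sample input uses a made-up GUID.

```shell
# Read "sgdisk -i" output on stdin and print the partition unique GUID in
# lowercase, the form used for names under /dev/disk/by-partuuid/.
# partuuid_from_sgdisk is a hypothetical helper name.
partuuid_from_sgdisk() {
    awk -F': ' '/^Partition unique GUID:/ { print tolower($2) }'
}

# Demonstration on sample text; the GUID below is made up for illustration.
partuuid_from_sgdisk <<'EOF'
Partition GUID code: 45B0969E-9B03-4F30-B4C6-B4B80CEFF106
Partition unique GUID: 11111111-2222-3333-4444-AAAAAAAAAAAA
First sector: 2048
EOF
# prints: 11111111-2222-3333-4444-aaaaaaaaaaaa
```

On the real host, the same function would be fed by the actual command, for example: sgdisk -i 1 /dev/sdX | partuuid_from_sgdisk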

Step 3:

Relabel each journal partition with the OSD it will serve:
  sgdisk -c <Y>:"ceph journal (osd.<osd_id>)" /dev/sd<X>
  # sgdisk -c 1:"ceph journal (osd.2)" /dev/sdf

Example:

#sgdisk -c 1:"ceph journal (osd.20)" /dev/sdX

#sgdisk -c 2:"ceph journal (osd.21)" /dev/sdX

#sgdisk -c 3:"ceph journal (osd.22)" /dev/sdX

#sgdisk -c 4:"ceph journal (osd.23)" /dev/sdX

#sgdisk -c 5:"ceph journal (osd.24)" /dev/sdX

#sgdisk -c 6:"ceph journal (osd.25)" /dev/sdX
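In the example above, partition Y is assigned to OSD 20 + Y - 1. The following is a minimal sketch that makes this mapping explicit, assuming consecutive OSD ids. The function name gen_journal_labels and its parameters are hypothetical. It only prints the commands.

```shell
# Print the sgdisk relabel command for each partition, mapping partition Y to
# OSD (first_osd + Y - 1). gen_journal_labels is a hypothetical helper.
gen_journal_labels() (
    disk="$1"; first_osd="$2"; count="$3"
    y=1
    while [ "$y" -le "$count" ]; do
        osd=$((first_osd + y - 1))
        printf 'sgdisk -c %d:"ceph journal (osd.%d)" %s\n' "$y" "$osd" "$disk"
        y=$((y + 1))
    done
)

# Dry run: partitions 1..6 on /dev/sdX for osd.20 through osd.25.
gen_journal_labels /dev/sdX 20 6
```

If the OSD ids on the host are not consecutive, a hand-written list as in the example above is the safer choice.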

Step 4:

Stop the OSD service processes:

# systemctl stop ceph-osd@20

# systemctl stop ceph-osd@21

# systemctl stop ceph-osd@22

# systemctl stop ceph-osd@23

# systemctl stop ceph-osd@24

# systemctl stop ceph-osd@25

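Step 5: (skip this step if the journal disk has failed)

Write the contents of each OSD's old journal into the OSD data directory to guard against data corruption: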
#ceph-osd -i 20 --flush-journal

#ceph-osd -i 21 --flush-journal

#ceph-osd -i 22 --flush-journal

#ceph-osd -i 23 --flush-journal

#ceph-osd -i 24 --flush-journal

#ceph-osd -i 25 --flush-journal

Step 6:

Linux creates a symlink device file named after each disk partition's UUID, namely /dev/disk/by-partuuid/<journal_part_uuid>. Create a second symlink pointing at it to replace the OSD's old journal file:
 # ln -sf /dev/disk/by-partuuid/<journal_part_uuid> /var/lib/ceph/osd/ceph-2/journal

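Example (replace <journal_part_uuid> in each command with the UUID of that OSD's own partition, as found in step 2):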

#ln -sf /dev/disk/by-partuuid/<journal_part_uuid> /var/lib/ceph/osd/ceph-20/journal

#ln -sf /dev/disk/by-partuuid/<journal_part_uuid> /var/lib/ceph/osd/ceph-21/journal

#ln -sf /dev/disk/by-partuuid/<journal_part_uuid> /var/lib/ceph/osd/ceph-22/journal

#ln -sf /dev/disk/by-partuuid/<journal_part_uuid> /var/lib/ceph/osd/ceph-23/journal

#ln -sf /dev/disk/by-partuuid/<journal_part_uuid> /var/lib/ceph/osd/ceph-24/journal

#ln -sf /dev/disk/by-partuuid/<journal_part_uuid> /var/lib/ceph/osd/ceph-25/journal
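Each OSD must point at a different partition, so it helps to write down the OSD-to-UUID pairs first and derive the ln commands from them. The following is a minimal sketch. The function name gen_journal_links and the input format are hypothetical, and the UUIDs in the sample are made up.

```shell
# Read "<osd_id> <partuuid>" pairs on stdin and print one ln command per pair.
# gen_journal_links is a hypothetical helper; it does not touch the filesystem.
gen_journal_links() (
    while read -r osd_id partuuid; do
        # Skip blank lines.
        [ -n "$osd_id" ] || continue
        printf 'ln -sf /dev/disk/by-partuuid/%s /var/lib/ceph/osd/ceph-%s/journal\n' \
            "$partuuid" "$osd_id"
    done
)

# Dry run with made-up UUIDs; use the values found in step 2 on a real host.
gen_journal_links <<'EOF'
20 11111111-2222-3333-4444-aaaaaaaaaaa1
21 11111111-2222-3333-4444-aaaaaaaaaaa2
EOF
```

Printing the commands first makes it easy to check that no UUID is used twice before any symlink is replaced.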

Step 7:

Reinitialize the OSD's journal:
 # ceph-osd -i 2 --mkjournal

Example:

#ceph-osd -i 20 --mkjournal

#ceph-osd -i 21 --mkjournal

#ceph-osd -i 22 --mkjournal

#ceph-osd -i 23 --mkjournal

#ceph-osd -i 24 --mkjournal

#ceph-osd -i 25 --mkjournal

Step 8:

Start the OSD services:

systemctl start ceph-osd@20

systemctl start ceph-osd@21

systemctl start ceph-osd@22

systemctl start ceph-osd@23

systemctl start ceph-osd@24

systemctl start ceph-osd@25
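The stop, mkjournal and start steps repeat one command per OSD id, so they can be produced from a single template. The following is a minimal sketch, assuming the OSD ids form a contiguous range and using the ceph-osd@<id> unit name from step 8. The function name for_each_osd is hypothetical. It only prints the commands.

```shell
# Print TEMPLATE once for every OSD id in [first, last]; "%s" in the template
# is replaced by the id. for_each_osd is a hypothetical helper.
for_each_osd() (
    template="$1"; id="$2"; last="$3"
    while [ "$id" -le "$last" ]; do
        # The template is used as the printf format string on purpose.
        printf "$template\n" "$id"
        id=$((id + 1))
    done
)

# Dry run for osd.20 through osd.25.
for_each_osd 'systemctl stop ceph-osd@%s' 20 25
for_each_osd 'ceph-osd -i %s --mkjournal' 20 25
for_each_osd 'systemctl start ceph-osd@%s' 20 25
```

Keeping the three phases as separate calls preserves the order of the procedure: all OSDs are stopped before any journal is recreated.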