Ceph in practice - RBD block storage, radosgw object storage, and cephfs


Block storage (RBD)

Used for direct mounting in K8S, OpenStack, and Linux, much like consuming an iSCSI block device.

Block storage usage example

# 1. Create a storage pool
# Syntax: ceph osd pool create <pool-name> <PG> [<PGP>] [{replicated|erasure}]
# 	PG: number of placement groups for the pool
# 	PGP: number of placement groups for placement; usually equal to PG, and defaults to PG if omitted [optional]
# 	replicated: replicated pool (the default) [optional]
# 	erasure: erasure-coded pool (the same idea MinIO uses) [optional]

# 1.1 Create a pool named django-web
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool create django-web 16
pool 'django-web' created

# 1.2 List the storage pools
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd lspools
... # omitted here; these are pools I created earlier
3 django-web
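
# (Aside) A hedged sketch for checking, and if needed raising, the PG count you picked;
# on a pre-Nautilus cluster like this one pg_num can only be increased, and 32 below is
# just an example value, not a recommendation:
ceph osd pool get django-web pg_num
ceph osd pool get django-web pgp_num
# ceph osd pool set django-web pg_num 32
# ceph osd pool set django-web pgp_num 32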

# 2. Enable the RBD application on the pool
# Syntax: ceph osd pool application enable <poolname> <app> {--yes-i-really-mean-it}

# 2.1 Enable RBD on the django-web pool
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool application enable django-web rbd
enabled application 'rbd' on pool 'django-web'

# 2.2 Initialize the pool with the rbd command
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd pool init -p django-web

# 3. The RBD pool cannot be mapped directly yet; an image (img) object must be created first.
#   A pool can contain multiple images, and mapping is always done against a specific image.
# Syntax: rbd create [--pool <pool>] [--image <image>] [--image-feature <image-feature>] --size <size> [--no-progress] 
     # --image-feature selects the image features to enable. On distributions with older kernels (such as CentOS 7) some features are unsupported, and mapping such an image will fail with an error.
     # Features enabled by default:  --image-feature arg       image features
                       [layering(+), exclusive-lock(+*), object-map(+*),
                       fast-diff(+*), deep-flatten(+-), journaling(*)]
                       
                       Image Features:
                       (*) supports enabling/disabling on existing images
                       (-) supports disabling-only on existing images
                       (+) enabled by default for new images if features not specified
        
# 3.1 Create the images
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd create --pool django-web --image img001 --size 1G
# Image features can be changed later, but I deliberately create a second image, img002, with only the layering feature, because I will be mapping it from CentOS 7 in a moment.
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd create --pool django-web --image img002 --size 1G --image-feature layering

# 3.2 List the images in the django-web pool
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd ls -p django-web
img001
img002

# 3.3 Show the details of an image
[ceph@ceph-deploy ceph-cluster-deploy]$ rbd -p django-web --image img001 info
rbd image 'img001':
        size 1 GiB in 256 objects # 256 objects in total
        order 22 (4 MiB objects) # each object is 4 MiB; 4 MiB * 256 = 1024 MiB = 1 GiB
        id: 282f76b8b4567
        block_name_prefix: rbd_data.282f76b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten # the features enabled on this image
        op_features: 
        flags: 
        create_timestamp: Mon Dec 25 18:56:14 2023
        
# 4. Install the ceph-common package on the client (mapping will not work without it)
# Configure the EPEL and Ceph repositories first, then install ceph-common
yum install -y ceph-common

# 4.1 Copy the keyring with admin privileges to the client (using admin for now; after covering cephx authentication a regular user will be used instead)
# Production environments also use a regular, restricted user; never hand out admin credentials
scp ceph.client.admin.keyring root@xxxxx:/etc/ceph
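
# (Aside) A hedged sketch of what the "regular user" approach might look like once cephx is
# covered -- the client name client.django-web below is just an example I made up, not something
# created elsewhere in this walkthrough:
# ceph auth get-or-create client.django-web mon 'allow r' osd 'allow rwx pool=django-web' \
#     -o ceph.client.django-web.keyring
# scp ceph.client.django-web.keyring root@xxxxx:/etc/ceph
# The client would then pass --id django-web to rbd commands instead of relying on admin.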

# 4.2 Map the RBD image
usage: rbd map [--device-type <device-type>] [--pool <pool>] [--image <image>] 
               [--snap <snap>] [--read-only] [--exclusive] 
               [--options <options>] 
               <image-or-snap-spec> 
Map an image to a block device.

# 4.2.1 Map the image
# Because the kernel is too old, some image features are unsupported and the map fails
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo rbd map -p django-web --image img001
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable django-web/img001 object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
# The error message also tells you how to fix the problem by disabling the offending image features
# rbd feature disable django-web/img001 object-map fast-diff deep-flatten
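
# If you did want to keep using img001 on this old kernel, a hedged sketch of the fix the error
# suggests would be to disable the unsupported features and then map again:
# sudo rbd feature disable django-web/img001 object-map fast-diff deep-flatten
# sudo rbd map -p django-web --image img001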

# I will not disable the features here; instead I simply map img002, which was created earlier with only the layering feature
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo rbd map -p django-web --image img002
/dev/rbd0  # the command output shows the image is now mapped to the block device /dev/rbd0

# 4.2.2 Inspect the block devices with lsblk
[ceph@ceph-deploy ceph-cluster-deploy]$ lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   20G  0 disk 
├─sda1            8:1    0  512M  0 part /boot
└─sda2            8:2    0 19.5G  0 part 
  └─centos-root 253:0    0 19.5G  0 lvm  /
sr0              11:0    1 1024M  0 rom  
rbd0            252:0    0    1G  0 disk 

# 4.3 Format /dev/rbd0
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mkfs.ext4 /dev/rbd0 
mke2fs 1.42.9 (28-Dec-2013)
Discarding device blocks: done                            
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=1024 blocks, Stripe width=1024 blocks
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

# 4.4 Mount the device on a directory
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mkdir /mnt/ceph-rbd-dir
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mount /dev/rbd0 /mnt/ceph-rbd-dir/

# 4.5 Switch to root and write some test data into the mounted directory
[root@ceph-deploy ~]# echo hahaha >  /mnt/ceph-rbd-dir/1.txt
[root@ceph-deploy ~]# cat /mnt/ceph-rbd-dir/1.txt
hahaha

# Configure mapping and mounting at boot
systemctl enable rc-local
vim /etc/rc.local
# Append the following two commands
rbd map -p django-web --image img002 [--id <ceph user name, without the client. prefix>]
mount /dev/rbd0 /mnt/ceph-rbd-dir/
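
An alternative to rc.local worth considering is the rbdmap facility shipped with ceph-common: images listed in /etc/ceph/rbdmap are mapped by the rbdmap systemd service at boot. A minimal hedged sketch, reusing this lab's admin keyring (adapt the id/keyring to your own user):

echo "django-web/img002 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring" | sudo tee -a /etc/ceph/rbdmap
sudo systemctl enable rbdmap.service
# the mapped image then also appears as /dev/rbd/django-web/img002, which an /etc/fstab entry
# (with the _netdev option) can mount instead of doing it from rc.local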

Object storage - ceph radosgw (RGW)

RGW exposes a RESTful interface; clients interact with it by calling the API to create, read, update, and delete data.
This is typically done by applications written against the API.

Prerequisites

A node must first be added to the cluster as an RGW node.

The object store is then accessed on port 7480, with keys supplied to authenticate and authorize requests.

Example - setting up object storage

# Example: add ceph-node2 as an RGW node

# 1. Install the ceph-radosgw package on the server that will become the RGW node
yum install -y ceph-radosgw

# 2. From the ceph-deploy node, add the chosen node as an RGW node
# This creates a service named ceph-radosgw@rgw.<node-name> on the RGW node and enables it at boot
# The RGW node will then listen on port 7480
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy rgw create ceph-node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy rgw create ceph-node2
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  rgw                           : [('ceph-node2', 'rgw.ceph-node2')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f56cd7c2cf8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function rgw at 0x7f56cde08050>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts ceph-node2:rgw.ceph-node2
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2 
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph_deploy.rgw][INFO  ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.rgw][DEBUG ] remote host will use systemd
[ceph_deploy.rgw][DEBUG ] deploying rgw bootstrap to ceph-node2
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node2][WARNIN] rgw keyring does not exist yet, creating one
[ceph-node2][DEBUG ] create a keyring file
[ceph-node2][DEBUG ] create path recursively if it doesn't exist
[ceph-node2][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-rgw --keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring auth get-or-create client.rgw.ceph-node2 osd allow rwx mon allow rw -o /var/lib/ceph/radosgw/ceph-rgw.ceph-node2/keyring
[ceph-node2][INFO  ] Running command: sudo systemctl enable ceph-radosgw@rgw.ceph-node2
[ceph-node2][WARNIN] Created symlink from /etc/systemd/system/ceph-radosgw.target.wants/ceph-radosgw@rgw.ceph-node2.service to /usr/lib/systemd/system/ceph-radosgw@.service.
[ceph-node2][INFO  ] Running command: sudo systemctl start ceph-radosgw@rgw.ceph-node2
[ceph-node2][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph_deploy.rgw][INFO  ] The Ceph Object Gateway (RGW) is now running on host ceph-node2 and default port 7480

# 3. In addition, starting RGW automatically creates its associated pools
[root@ceph-node2 ~]# ceph osd lspools
... # other pools omitted here
3 django-web
4 .rgw.root
5 default.rgw.control
6 default.rgw.meta
7 default.rgw.log

# 4. ceph -s also shows one active rgw daemon
[root@ceph-node2 ~]# ceph -s 
  cluster:
    id:     f1da3a2e-b8df-46ba-9c6b-0030da25c73e
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
            too few PGs per OSD (29 < min 30)
 
  services:
    mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
    mgr: ceph-node2(active), standbys: ceph-node3, ceph-node1
    osd: 9 osds: 9 up, 9 in
    rgw: 1 daemon active
 
  data:
    pools:   7 pools, 88 pgs
    objects: 211  objects, 37 MiB
    usage:   9.2 GiB used, 36 GiB / 45 GiB avail
    pgs:     88 active+clean
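
At this point the gateway only answers anonymous requests; to actually store objects an S3-style user and its keys are needed. A hedged sketch of a quick smoke test (the uid "testuser", the bucket name, and the s3cmd client are my own choices for illustration, not something the deployment above created):

# create an S3 user on any node with an admin keyring; note the access_key/secret_key in the JSON output
radosgw-admin user create --uid=testuser --display-name="test user"

# then, from any machine with s3cmd installed, point it at the RGW node on port 7480
s3cmd --access_key=<access_key> --secret_key=<secret_key> \
      --host=ceph-node2:7480 --host-bucket=ceph-node2:7480 --no-ssl mb s3://test-bucket
s3cmd --access_key=<access_key> --secret_key=<secret_key> \
      --host=ceph-node2:7480 --host-bucket=ceph-node2:7480 --no-ssl ls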
 

Highly available RGW (making the object store highly available to clients)

So far only ceph-node2 has been added as an RGW node; that is a single node with no high availability.

In production, multiple nodes are usually added as RGW nodes, and nginx reverse-proxies port 7480 on all of them, as sketched below.
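
Once ceph-node1 has also been added as an RGW node (done just below), a minimal sketch of such an nginx front end might look like the following, assuming nginx runs as root on a separate machine that can resolve the node names and that both RGW daemons still listen on the default port 7480:

cat > /etc/nginx/conf.d/rgw.conf <<'EOF'
upstream rgw_backends {
    server ceph-node1:7480;
    server ceph-node2:7480;
}
server {
    listen 80;
    location / {
        proxy_pass http://rgw_backends;
        proxy_set_header Host $host;
    }
}
EOF
nginx -s reload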

# Switch to the ceph-deploy node and add ceph-node1 as an RGW node
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy rgw create ceph-node1
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy rgw create ceph-node1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  rgw                           : [('ceph-node1', 'rgw.ceph-node1')]
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fb193e27cf8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function rgw at 0x7fb19446d050>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts ceph-node1:rgw.ceph-node1
[ceph-node1][DEBUG ] connection detected need for sudo
[ceph-node1][DEBUG ] connected to host: ceph-node1 
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph_deploy.rgw][INFO  ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.rgw][DEBUG ] remote host will use systemd
[ceph_deploy.rgw][DEBUG ] deploying rgw bootstrap to ceph-node1
[ceph-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node1][WARNIN] rgw keyring does not exist yet, creating one
[ceph-node1][DEBUG ] create a keyring file
[ceph-node1][DEBUG ] create path recursively if it doesn't exist
[ceph-node1][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-rgw --keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring auth get-or-create client.rgw.ceph-node1 osd allow rwx mon allow rw -o /var/lib/ceph/radosgw/ceph-rgw.ceph-node1/keyring
[ceph-node1][INFO  ] Running command: sudo systemctl enable ceph-radosgw@rgw.ceph-node1
[ceph-node1][WARNIN] Created symlink from /etc/systemd/system/ceph-radosgw.target.wants/ceph-radosgw@rgw.ceph-node1.service to /usr/lib/systemd/system/ceph-radosgw@.service.
[ceph-node1][INFO  ] Running command: sudo systemctl start ceph-radosgw@rgw.ceph-node1
[ceph-node1][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph_deploy.rgw][INFO  ] The Ceph Object Gateway (RGW) is now running on host ceph-node1 and default port 7480

# Check the cluster status; there are now 2 active rgw daemons
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph -s
  cluster:
    id:     f1da3a2e-b8df-46ba-9c6b-0030da25c73e
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
 
  services:
    mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
    mgr: ceph-node2(active), standbys: ceph-node1, ceph-node3
    mds: cephfs-test-1/1/1 up  {0=ceph-node3=up:active}, 1 up:standby
    osd: 9 osds: 9 up, 9 in
    rgw: 2 daemons active
 
  data:
    pools:   9 pools, 120 pgs
    objects: 234  objects, 37 MiB
    usage:   9.2 GiB used, 36 GiB / 45 GiB avail
    pgs:     120 active+clean

Changing port 7480 of the civetweb server embedded in radosgw

# vim /etc/ceph/ceph.conf  (on the RGW node; or edit ceph.conf in the ceph-deploy working directory and push it out)

[client.rgw.<node-name>]
rgw_host=<node name or node IP address>
rgw_frontends="civetweb port=8880" # this only changes the HTTP port; to serve HTTPS as well, use:
rgw_frontends="civetweb port=8880+8443s"

# Attach an SSL certificate
rgw_frontends="civetweb port=8880+8443s ssl_certificate=<path to PEM certificate>"

# Other civetweb options are appended to the same rgw_frontends string, for example:
# num_threads=50            -> number of request-handling threads (50 is the default)
# access_log_file=<log path> -> where to write the access log
# error_log_file=<log path>  -> where to write the error log
# e.g. rgw_frontends="civetweb port=8880 num_threads=100 access_log_file=<log path>"

# Restart the RGW service on the node
systemctl restart ceph-radosgw@rgw.<node-name>
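
# A quick hedged way to confirm the new port took effect (an anonymous GET against RGW
# normally returns a small ListAllMyBuckets XML document):
ss -tnlp | grep radosgw
curl http://<node-name>:8880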

Filesystem storage - ceph-fs (CephFS)

Its performance is lower than block storage; if block storage fits the use case, prefer block storage.
It is similar to NFS, except that ceph-fs mounts the storage over the Ceph protocol; compared with NFS its performance is better.

CephFS requires the MDS service (ceph-mds, the metadata server).

Note: once the ceph-fs is created, it is reached through port 6789 on the mon nodes. You can mount from port 6789 of any mon node, put haproxy or nginx in front as a proxy, or simply list several mon nodes at mount time.

Example - building and using cephfs

In this example the MDS service is a single point of failure; the single-MDS problem is addressed later.

1. Install the ceph-mds package on the node that will become the MDS

yum install -y ceph-mds

2. From the ceph-deploy node, add a node as a ceph-mds node

# 2. From the ceph-deploy node, add a node as a ceph-mds node
# This creates the service ceph-mds@<node-name> on the server that becomes the MDS node
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy mds create ceph-node3
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy mds create ceph-node3
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f93208594d0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mds at 0x7f9320aa7ed8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  mds                           : [('ceph-node3', 'ceph-node3')]
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-node3:ceph-node3
[ceph-node3][DEBUG ] connection detected need for sudo
[ceph-node3][DEBUG ] connected to host: ceph-node3 
[ceph-node3][DEBUG ] detect platform information from remote host
[ceph-node3][DEBUG ] detect machine type
[ceph_deploy.mds][INFO  ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-node3
[ceph-node3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node3][WARNIN] mds keyring does not exist yet, creating one
[ceph-node3][DEBUG ] create a keyring file
[ceph-node3][DEBUG ] create path if it doesn't exist
[ceph-node3][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-node3 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-node3/keyring
[ceph-node3][INFO  ] Running command: sudo systemctl enable ceph-mds@ceph-node3
[ceph-node3][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-node3.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-node3][INFO  ] Running command: sudo systemctl start ceph-mds@ceph-node3
[ceph-node3][INFO  ] Running command: sudo systemctl enable ceph.target

3. The MDS cannot be used yet; storage pools must be created to hold its data

# 3. The MDS cannot be used yet; storage pools must be created to hold its data
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph mds stat
, 1 up:standby

# 3.1 Create the dedicated metadata pool
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool create cephfs-metadata-pool 16
pool 'cephfs-metadata-pool' created

# 3.2 Create the pool that stores the actual data
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph osd pool create cephfs-data-pool 16
pool 'cephfs-data-pool' created

4. Create the ceph-fs filesystem

# 4. Create the ceph-fs filesystem
# make new filesystem using named pools <metadata> and <data>
# Syntax: ceph fs new <fs_name> <metadata> <data> {--force} {--allow-dangerous-metadata-overlay}
# 	<metadata>: the metadata pool
# 	<data>: the data pool

# 4.1 Create a filesystem named cephfs-test from the two pools
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs new cephfs-test cephfs-metadata-pool cephfs-data-pool
new fs with metadata pool 8 and data pool 9

5. Inspect the ceph fs

# 5. Inspect the ceph fs
# Via ceph fs status
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs status
cephfs-test - 0 clients
===========
+------+--------+------------+---------------+-------+-------+
| Rank | State  |    MDS     |    Activity   |  dns  |  inos |
+------+--------+------------+---------------+-------+-------+
|  0   | active | ceph-node3 | Reqs:    0 /s |   10  |   13  |
+------+--------+------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
|         Pool         |   type   |  used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 2286  | 11.1G |
|   cephfs-data-pool   |   data   |    0  | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+
MDS version: ceph version 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)

# 5.1 Via ceph fs ls
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs ls
name: cephfs-test, metadata pool: cephfs-metadata-pool, data pools: [cephfs-data-pool ]

# 5.2 Via ceph mds stat
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph mds stat
cephfs-test-1/1/1 up  {0=ceph-node3=up:active}

6. Mount ceph-fs on a client

# 6. Mount ceph-fs on a client
# 6.1 The client doing the mounting needs ceph-common installed
yum install -y ceph-common

# 6.2 It also needs a keyring; here the admin one is copied over again. As with RBD and radosgw above, do not use admin in production.
scp 

# 6.3 Mount ceph-fs with mount.ceph
[ceph@ceph-deploy ceph-cluster-deploy]$ mount.ceph -h
mount.ceph monaddr1[,monaddr2,...]:/[subdir] dir [-o options ]
	src specifies the server(s) and port; if the port is omitted it defaults to 6789
		1. hostname, or hostname:port
		2. IP, or IP:port
options:
        -h: Print this help
        -n: Do not update /etc/mtab
        -v: Verbose
        ceph-options: refer to mount.ceph(8)
        	commonly used ceph-options:
        	name=xxx # the cephx user to authenticate as; defaults to guest if omitted
        	secret=xxx # the key of that cephx user
        	secretfile=xx # points at a key file so that secret= does not have to be given on the command line, which is also safer. (This file is not the keyring; it is the base64 string after "key =" inside the keyring -- copy it out by hand, or print it with ceph auth print-key <entity>.)

[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mount.ceph 192.168.2.121:6789:/ /mnt/ceph-fs-dir -o name=admin,secret=AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg==
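
# A hedged alternative using secretfile= instead of putting the key on the command line
# (/etc/ceph/admin.secret is just the path I chose for this example):
# sudo ceph auth print-key client.admin | sudo tee /etc/ceph/admin.secret
# sudo mount.ceph 192.168.2.121:6789:/ /mnt/ceph-fs-dir -o name=admin,secretfile=/etc/ceph/admin.secret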

# 6.4 Write some test data
[root@ceph-deploy ceph-cluster-deploy]$ echo 123456 > /mnt/ceph-fs-dir/123.txt

# 6.5 Check the mount:
[root@ceph-deploy ~]# df -Th
Filesystem              Type      Size  Used Avail Use% Mounted on
devtmpfs                devtmpfs  898M     0  898M    0% /dev
tmpfs                   tmpfs     910M     0  910M    0% /dev/shm
tmpfs                   tmpfs     910M  9.6M  901M    2% /run
tmpfs                   tmpfs     910M     0  910M    0% /sys/fs/cgroup
/dev/mapper/centos-root xfs        20G  2.4G   18G   12% /
/dev/sda1               xfs       509M  144M  366M   29% /boot
tmpfs                   tmpfs     182M     0  182M    0% /run/user/0
/dev/rbd0               ext4      976M  2.6M  907M    1% /mnt/ceph-rbd-dir
192.168.2.121:6789:/    ceph       12G     0   12G    0% /mnt/ceph-fs-dir


# 7. Extra: add the mount to the fstab so it is mounted automatically at boot.
# Tired of mounting it by hand every time? Either put the command in rc.local or add an fstab entry.
[ceph@ceph-deploy ceph-cluster-deploy]$ sudo mount.ceph 192.168.2.121,192.168.2.122,192.168.2.123:/ /mnt/ceph-fs-dir/ -o name=admin,secret=AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg==

# Add the following to the fstab:
# _netdev: required for any network filesystem (NFS included); without it boot can hang because the mount is attempted before the network is up.
# noatime: do not update the file atime on every access; under heavy concurrency this avoids wasted work. Use it according to your needs.
# defaults: the default mount options, namely: rw, suid, dev, exec, auto, nouser, and async.
# Format:
# <mon node:port>,[<mon node:port>,...]:/    <mount point>    ceph    defaults,_netdev,noatime,name=<ceph user>,secretfile=<key file of the user>|secret=<key of the user>    0 0

# Back up the fstab first
[root@ceph-deploy ~]# cp /etc/fstab{,.bak}

# Add the following to the fstab
#
# /etc/fstab
# Created by anaconda on Thu Dec 21 23:51:13 2023
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root /                       xfs     defaults        0 0
UUID=4b1bb372-7f34-48f6-8852-036ee6dfd125 /boot                   xfs     defaults        0 0
# mount ceph-fs, added by swq
192.168.2.121,192.168.2.122,192.168.2.123:/    /mnt/ceph-fs-dir    ceph    defaults,_netdev,noatime,name=admin,secret=AQCAGoVlpj0zERAA5dhEHlg/a5TyQhPPlTigUg==  0  0

# Unmount the ceph-fs that was mounted by hand earlier
[root@ceph-deploy ~]# umount /mnt/ceph-fs-dir/

# First test by hand whether the fstab entry mounts correctly
[root@ceph-deploy ~]# mount -a

# No errors; now check whether it is mounted
[root@ceph-deploy ~]# df -Th
Filesystem                                  Type      Size  Used Avail Use% Mounted on
devtmpfs                                    devtmpfs  898M     0  898M    0% /dev
tmpfs                                       tmpfs     910M     0  910M    0% /dev/shm
tmpfs                                       tmpfs     910M  9.6M  901M    2% /run
tmpfs                                       tmpfs     910M     0  910M    0% /sys/fs/cgroup
/dev/mapper/centos-root                     xfs        20G  2.4G   18G   12% /
/dev/sda1                                   xfs       509M  144M  366M   29% /boot
tmpfs                                       tmpfs     182M     0  182M    0% /run/user/0
192.168.2.121,192.168.2.122,192.168.2.123:/ ceph       12G     0   12G    0% /mnt/ceph-fs-dir
[root@ceph-deploy ~]# ll /mnt/ceph-fs-dir/
total 1
-rw-r--r-- 1 root root 7 Dec 25 21:17 123.txt

MDS high availability and high performance

Ceph-fs is accessed through the mon nodes on port 6789. The mon nodes themselves can be made highly available, but so far only one node actually provides the MDS service.

We therefore need to add more MDS daemons for high availability, or even run two active/standby pairs at once.

High availability: one active, multiple standbys.

	This is simple to achieve: just add more MDS nodes; one active with the rest standing by is the default behaviour.

High availability + high performance: two active/standby MDS pairs serving at the same time.

	This requires setting a few parameters and configuration options by hand.

Implementing high availability

# Check the MDS services currently in the cluster
[root@ceph-deploy ~]# ceph -s
  cluster:
    id:     f1da3a2e-b8df-46ba-9c6b-0030da25c73e
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
 
  services:
    mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
    mgr: ceph-node2(active), standbys: ceph-node1, ceph-node3
    # notice that only ceph-node3 is running an MDS service
    mds: cephfs-test-1/1/1 up  {0=ceph-node3=up:active}
    osd: 9 osds: 9 up, 9 in
    rgw: 1 daemon active
 
  data:
    pools:   9 pools, 120 pgs
    objects: 234  objects, 37 MiB
    usage:   9.2 GiB used, 36 GiB / 45 GiB avail
    pgs:     120 active+clean

# ceph fs status shows the same thing.
[root@ceph-deploy ~]# ceph fs status
cephfs-test - 0 clients
===========
+------+--------+------------+---------------+-------+-------+
| Rank | State  |    MDS     |    Activity   |  dns  |  inos |
+------+--------+------------+---------------+-------+-------+
|  0   | active | ceph-node3 | Reqs:    0 /s |   11  |   14  |
+------+--------+------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
|         Pool         |   type   |  used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 7179  | 11.1G |
|   cephfs-data-pool   |   data   |    7  | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+

# Go to the ceph-deploy node and switch to the ceph user.
# Add ceph-node2 as an additional MDS
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph-deploy mds create ceph-node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ceph/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /bin/ceph-deploy mds create ceph-node2
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f9c6190e4d0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mds at 0x7f9c61b5ced8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  mds                           : [('ceph-node2', 'ceph-node2')]
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-node2:ceph-node2
[ceph-node2][DEBUG ] connection detected need for sudo
[ceph-node2][DEBUG ] connected to host: ceph-node2 
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph_deploy.mds][INFO  ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-node2
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node2][WARNIN] mds keyring does not exist yet, creating one
[ceph-node2][DEBUG ] create a keyring file
[ceph-node2][DEBUG ] create path if it doesn't exist
[ceph-node2][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-node2 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-node2/keyring
[ceph-node2][INFO  ] Running command: sudo systemctl enable ceph-mds@ceph-node2
[ceph-node2][WARNIN] Created symlink from /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-node2.service to /usr/lib/systemd/system/ceph-mds@.service.
[ceph-node2][INFO  ] Running command: sudo systemctl start ceph-mds@ceph-node2
[ceph-node2][INFO  ] Running command: sudo systemctl enable ceph.target
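
With ceph-node2 now standing by, a simple hedged failover test is to stop the active MDS and watch the standby take over (run the systemctl commands on ceph-node3 itself, or via ssh):

sudo systemctl stop ceph-mds@ceph-node3    # simulate failure of the active MDS
ceph fs status                             # after a short delay, ceph-node2 should become active
sudo systemctl start ceph-mds@ceph-node3   # bring the old MDS back; it rejoins as a standby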

Implementing high performance (multiple active MDS)

Current environment:
ceph-node2 and ceph-node3 are both MDS nodes.

Add two more nodes as MDS nodes:
ceph-node1 and ceph-deploy (I ran out of machines, hence ceph-deploy; do not casually mix roles on one server in production...)

# Run the commands to add the two nodes as MDS nodes.
ceph-deploy mds create ceph-node1
ceph-deploy mds create ceph-deploy

# Check the ceph-fs status
# It is still in a one-active, multiple-standby state.
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs status
cephfs-test - 0 clients
===========
+------+--------+------------+---------------+-------+-------+
| Rank | State  |    MDS     |    Activity   |  dns  |  inos |
+------+--------+------------+---------------+-------+-------+
|  0   | active | ceph-node3 | Reqs:    0 /s |   11  |   14  |
+------+--------+------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
|         Pool         |   type   |  used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 7179  | 11.1G |
|   cephfs-data-pool   |   data   |    7  | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|  ceph-node2 |
|  ceph-node1 |
| ceph-deploy |
+-------------+

# Raise the maximum number of active MDS daemons so that several actives serve at once
# Syntax: ceph fs set <ceph-fs-name> max_mds <NUM>
ceph fs set cephfs-test max_mds 2

# Check the ceph-fs status again
# There are now 2 active ranks serving the filesystem.
[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs status
cephfs-test - 0 clients
===========
+------+--------+-------------+---------------+-------+-------+
| Rank | State  |     MDS     |    Activity   |  dns  |  inos |
+------+--------+-------------+---------------+-------+-------+
|  0   | active |  ceph-node3 | Reqs:    0 /s |   11  |   14  |
|  1   | active | ceph-deploy | Reqs:    0 /s |    0  |    0  |
+------+--------+-------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
|         Pool         |   type   |  used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 7835  | 11.1G |
|   cephfs-data-pool   |   data   |    7  | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|  ceph-node2 |
|  ceph-node1 |

Manually assigning MDS active/standby relationships

[ceph@ceph-deploy ceph-cluster-deploy]$ ceph fs status
cephfs-test - 0 clients
===========
+------+--------+-------------+---------------+-------+-------+
| Rank | State  |     MDS     |    Activity   |  dns  |  inos |
+------+--------+-------------+---------------+-------+-------+
|  0   | active |  ceph-node3 | Reqs:    0 /s |   11  |   14  |
|  1   | active | ceph-deploy | Reqs:    0 /s |    0  |    0  |
+------+--------+-------------+---------------+-------+-------+
+----------------------+----------+-------+-------+
|         Pool         |   type   |  used | avail |
+----------------------+----------+-------+-------+
| cephfs-metadata-pool | metadata | 7835  | 11.1G |
|   cephfs-data-pool   |   data   |    7  | 11.1G |
+----------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|  ceph-node2 |
|  ceph-node1 |
+-------------+

By default the mon nodes decide which daemons become active MDS and which stay standby, taking over when an active MDS fails.

Suppose we now want ceph-node2 to be the standby for ceph-node3, and ceph-node1 to be the standby for ceph-deploy.

Ceph provides a few options that control how a standby MDS takes over from an active one.

  • mds_standby_replay: true | false
    • true means this MDS daemon continuously replays the metadata journal of a specific up rank, keeping a warm metadata cache for that rank and speeding up failover when that rank fails;
    • an up rank can only have one replay daemon; any extras are automatically demoted to ordinary, non-replay MDS daemons.
  • mds_standby_for_name: names the MDS this daemon is the standby for (i.e. which node is its active peer).
  • mds_standby_for_rank: makes this MDS stand by only for the given rank; it will not take over any other failed rank. In clusters with multiple CephFS filesystems it can be combined with the option below to say which filesystem's rank to cover. (Normally one cluster should have a single ceph-fs with separate directories per workload; this parameter only matters if you really create several filesystems.)
  • mds_standby_for_fscid: takes effect together with mds_standby_for_rank;
    • with mds_standby_for_rank set: stand by for the specified rank of the specified fscid;
    • without mds_standby_for_rank: stand by for any rank of the specified fscid.

These active/standby relationships can be declared by editing ceph.conf.

# vim ceph.conf  (edit the cluster config in the ceph-deploy working directory so it can be pushed out below)

# Add the following configuration
[mds.ceph-node2] # settings for the ceph-node2 MDS daemon
mds_standby_replay=true 
mds_standby_for_name=ceph-node3

[mds.ceph-node1]
mds_standby_replay=true 
mds_standby_for_name=ceph-deploy

# Push the updated configuration file to the nodes
ceph-deploy --overwrite-conf config push ceph-node2
ceph-deploy --overwrite-conf config push ceph-node1
ceph-deploy --overwrite-conf config push ceph-node3
ceph-deploy --overwrite-conf config push ceph-deploy

# Restart the MDS service on the standby nodes
systemctl restart ceph-mds@<node-name>
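
After the restarts, ceph fs status should again show two active ranks plus the standbys; whether the standby-replay relationship took effect can be cross-checked with a hedged look at the MDS map:

ceph fs status
ceph fs dump | grep -i standby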