Ceph: Adding and Removing OSDs and Grouping Disks by CRUSH Device Class

Removing an OSD

Start by setting the OSD's CRUSH weight to 0 so that Ceph drains the data it holds onto the remaining OSDs:

[root@ceph-node1 ~]# ceph osd tree   
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 120.07668 root default
-3 40.02556 host ceph-node1
0 hdd 3.63869 osd.0 up 1.00000 1.00000
1 hdd 3.63869 osd.1 up 1.00000 1.00000
2 hdd 3.63869 osd.2 up 1.00000 1.00000
3 hdd 3.63869 osd.3 up 1.00000 1.00000
4 hdd 3.63869 osd.4 up 1.00000 1.00000
5 hdd 3.63869 osd.5 up 1.00000 1.00000
6 hdd 3.63869 osd.6 up 1.00000 1.00000
7 hdd 3.63869 osd.7 up 1.00000 1.00000
8 hdd 3.63869 osd.8 up 1.00000 1.00000
9 hdd 3.63869 osd.9 up 1.00000 1.00000
10 hdd 3.63869 osd.10 up 1.00000 1.00000
-5 40.02556 host ceph-node2
11 hdd 3.63869 osd.11 up 1.00000 1.00000
12 hdd 3.63869 osd.12 up 1.00000 1.00000
13 hdd 3.63869 osd.13 up 1.00000 1.00000
14 hdd 3.63869 osd.14 up 1.00000 1.00000
15 hdd 3.63869 osd.15 up 1.00000 1.00000
16 hdd 3.63869 osd.16 up 1.00000 1.00000
17 hdd 3.63869 osd.17 up 1.00000 1.00000
18 hdd 3.63869 osd.18 up 1.00000 1.00000
19 hdd 3.63869 osd.19 up 1.00000 1.00000
20 hdd 3.63869 osd.20 up 1.00000 1.00000
21 hdd 3.63869 osd.21 up 1.00000 1.00000
-7 40.02556 host ceph-node3
22 hdd 3.63869 osd.22 up 1.00000 1.00000
23 hdd 3.63869 osd.23 up 1.00000 1.00000
24 hdd 3.63869 osd.24 up 1.00000 1.00000
25 hdd 3.63869 osd.25 up 1.00000 1.00000
26 hdd 3.63869 osd.26 up 1.00000 1.00000
27 hdd 3.63869 osd.27 up 1.00000 1.00000
28 hdd 3.63869 osd.28 up 1.00000 1.00000
29 hdd 3.63869 osd.29 up 1.00000 1.00000
30 hdd 3.63869 osd.30 up 1.00000 1.00000
31 hdd 3.63869 osd.31 up 1.00000 1.00000
32 hdd 3.63869 osd.32 up 1.00000 1.00000
[root@ceph-node1 ~]# ceph osd crush reweight osd.10 0
reweighted item id 10 name 'osd.10' to 0 in crush map

During this phase Ceph automatically migrates the data stored on osd.10 to the other healthy OSDs. After running the command, watch the migration with ceph -w; once the OBJECT_MISPLACED warning clears and the recovery output stops, the data migration is complete.

[root@ceph-node1 ~]# ceph -w
cluster:
id: 9d734902-fd68-4943-b263-11b515918ed4
health: HEALTH_WARN
35959/1031274 objects misplaced (3.487%)
application not enabled on 1 pool(s)

services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
mgr: ceph-node1(active)
osd: 33 osds: 33 up, 33 in; 58 remapped pgs
rgw: 1 daemon active

data:
pools: 15 pools, 1232 pgs
objects: 343.8 k objects, 1.3 TiB
usage: 3.8 TiB used, 116 TiB / 120 TiB avail
pgs: 35959/1031274 objects misplaced (3.487%)
1174 active+clean
48 active+remapped+backfill_wait
10 active+remapped+backfilling

io:
client: 2.2 MiB/s rd, 6.4 MiB/s wr, 37 op/s rd, 414 op/s wr
recovery: 129 MiB/s, 0 keys/s, 29 objects/s


2019-06-19 14:40:02.973752 mon.ceph-node1 [WRN] Health check update: 36020/1031271 objects misplaced (3.493%) (OBJECT_MISPLACED)
2019-06-19 15:05:49.970861 mon.ceph-node1 [WRN] Health check update: 2472/1032033 objects misplaced (0.240%) (OBJECT_MISPLACED)
2019-06-19 15:11:44.998352 mon.ceph-node1 [WRN] Health check update: 815/1032162 objects misplaced (0.079%) (OBJECT_MISPLACED)
2019-06-19 15:17:00.165353 mon.ceph-node1 [INF] Health check cleared: OBJECT_MISPLACED (was: 12/1032399 objects misplaced (0.001%))
[root@ceph-node1 ~]# ceph -s
cluster:
id: 9d734902-fd68-4943-b263-11b515918ed4
health: HEALTH_OK

services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
mgr: ceph-node1(active)
osd: 33 osds: 33 up, 33 in
rgw: 1 daemon active

data:
pools: 15 pools, 1232 pgs
objects: 344.2 k objects, 1.3 TiB
usage: 3.8 TiB used, 116 TiB / 120 TiB avail
pgs: 1232 active+clean

io:
client: 68 KiB/s rd, 14 MiB/s wr, 45 op/s rd, 572 op/s wr
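
If you would rather not watch ceph -w interactively, the wait can be scripted. A minimal sketch, assuming the only outstanding health warning is the OBJECT_MISPLACED backfill triggered by the reweight:

    # Poll until no objects are reported as misplaced, then show the final status.
    while ceph health detail | grep -q OBJECT_MISPLACED; do
        sleep 60
    done
    ceph -s

When the cluster is back to HEALTH_OK, stop the OSD daemon and finish removing the OSD from the CRUSH map, the authentication database, and the OSD map:
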
[root@ceph-node1 ~]# systemctl stop ceph-osd@10
[root@ceph-node1 ~]# ceph osd tree|grep osd.10
10 hdd 0 osd.10 down 1.00000 1.00000
[root@ceph-node1 ~]# ceph osd out osd.10
marked out osd.10.
[root@ceph-node1 ~]# ceph osd tree|grep osd.10
10 hdd 0 osd.10 down 0 1.00000
[root@ceph-node1 ~]# ceph osd crush remove osd.10
removed item id 10 name 'osd.10' from crush map
[root@ceph-node1 ~]# ceph auth del osd.10
updated
[root@ceph-node1 ~]# ceph osd rm 10
removed osd.10
[root@ceph-node1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 116.43799 root default
-3 36.38687 host ceph-node1
0 hdd 3.63869 osd.0 up 1.00000 1.00000
1 hdd 3.63869 osd.1 up 1.00000 1.00000
2 hdd 3.63869 osd.2 up 1.00000 1.00000
3 hdd 3.63869 osd.3 up 1.00000 1.00000
4 hdd 3.63869 osd.4 up 1.00000 1.00000
5 hdd 3.63869 osd.5 up 1.00000 1.00000
6 hdd 3.63869 osd.6 up 1.00000 1.00000
7 hdd 3.63869 osd.7 up 1.00000 1.00000
8 hdd 3.63869 osd.8 up 1.00000 1.00000
9 hdd 3.63869 osd.9 up 1.00000 1.00000
-5 40.02556 host ceph-node2
11 hdd 3.63869 osd.11 up 1.00000 1.00000
12 hdd 3.63869 osd.12 up 1.00000 1.00000
13 hdd 3.63869 osd.13 up 1.00000 1.00000
14 hdd 3.63869 osd.14 up 1.00000 1.00000
15 hdd 3.63869 osd.15 up 1.00000 1.00000
16 hdd 3.63869 osd.16 up 1.00000 1.00000
17 hdd 3.63869 osd.17 up 1.00000 1.00000
18 hdd 3.63869 osd.18 up 1.00000 1.00000
19 hdd 3.63869 osd.19 up 1.00000 1.00000
20 hdd 3.63869 osd.20 up 1.00000 1.00000
21 hdd 3.63869 osd.21 up 1.00000 1.00000
-7 40.02556 host ceph-node3
22 hdd 3.63869 osd.22 up 1.00000 1.00000
23 hdd 3.63869 osd.23 up 1.00000 1.00000
24 hdd 3.63869 osd.24 up 1.00000 1.00000
25 hdd 3.63869 osd.25 up 1.00000 1.00000
26 hdd 3.63869 osd.26 up 1.00000 1.00000
27 hdd 3.63869 osd.27 up 1.00000 1.00000
28 hdd 3.63869 osd.28 up 1.00000 1.00000
29 hdd 3.63869 osd.29 up 1.00000 1.00000
30 hdd 3.63869 osd.30 up 1.00000 1.00000
31 hdd 3.63869 osd.31 up 1.00000 1.00000
32 hdd 3.63869 osd.32 up 1.00000 1.00000
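
The individual removal steps above can also be wrapped into a small script. A sketch of a hypothetical helper, assuming the OSD has already been drained (CRUSH weight 0) and that the script runs on the host that owns the OSD:

    #!/bin/bash
    # remove_osd.sh <id> -- hypothetical wrapper around the steps shown above.
    set -e
    ID=${1:?usage: $0 <osd-id>}
    systemctl stop ceph-osd@"$ID"     # stop the daemon
    ceph osd out osd."$ID"            # mark it out of the data distribution
    ceph osd crush remove osd."$ID"   # drop it from the CRUSH map
    ceph auth del osd."$ID"           # delete its cephx key
    ceph osd rm "$ID"                 # remove it from the OSD map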

Locating a drive's physical slot

  • Check the server's RAID controller model

    # cat /proc/scsi/scsi
    Host: scsi0 Channel: 02 Id: 00 Lun: 00
    Vendor: AVAGO    Model: MR9361-8i        Rev: 4.68
      Type:   Direct-Access                    ANSI  SCSI revision: 05
  • Download and install the controller management tool (StorCLI)

    [root@ceph-node1 ~]# wget https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/MR_SAS_Unified_StorCLI_007.1017.0000.0000.zip
    [root@ceph-node1 ~]# unzip MR_SAS_Unified_StorCLI_007.1017.0000.0000.zip
    [root@ceph-node1 ~]# rpm -ivh Linux/storcli-007.1017.0000.0000-1.noarch.rpm
  • Look up the disk's serial number

    [root@ceph-node1 ~]# smartctl -a /dev/sdl|grep "Serial Number"
    Serial Number: ZC1AC3KZ
  • Dump the controller's drive information

    [root@ceph-node1 ~]# cd /opt/MegaRAID/storcli
    [root@ceph-node1 ~]# ./storcli64 /call/eall/sall show all > slot.txt
  • Find the slot that matches the serial number (a small helper that automates this lookup is sketched after this list)

    [root@ceph-node1 ~]# grep -2 ZC1AC3KZ slot.txt
    Drive /c0/e8/s5 Device attributes :
    =================================
    SN = ZC1AC3KZ
    Manufacturer Id = ATA
    Model Number = ST4000NM0035-1V4107
  • Turn on the drive's locate LED

    [root@ceph-node1 ~]# storcli64 /c0/e8/s5 start locate
  • Turn off the drive's locate LED

    [root@ceph-node1 ~]# storcli64 /c0/e8/s5 stop locate
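
The lookup above can be collapsed into a single step. A sketch of a hypothetical helper that maps a block device to its StorCLI slot by serial number (the script name and the storcli64 path are assumptions):

    #!/bin/bash
    # locate_slot.sh /dev/sdX -- print the /cX/eY/sZ drive entry whose serial matches the device.
    DEV=${1:?usage: $0 /dev/sdX}
    SN=$(smartctl -a "$DEV" | awk '/Serial Number/{print $3}')
    /opt/MegaRAID/storcli/storcli64 /call/eall/sall show all | grep -2 "SN = $SN"

The "Drive /cX/eY/sZ" line in the output is the argument to pass to storcli64 start locate / stop locate.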

Adding an OSD

[root@ceph-node1 ~]# ceph-deploy --ceph-conf /etc/ceph/ceph.conf disk zap ceph-node1 /dev/sde
[root@ceph-node1 ~]# ceph-deploy --ceph-conf /etc/ceph/ceph.conf --overwrite-conf osd create --data /dev/sde ceph-node1
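
If several new disks are being brought in at once, the two ceph-deploy calls can simply be looped over the devices. A sketch, where the device list is an assumption for this host:

    # Zap and create an OSD for each newly installed disk on ceph-node1.
    for dev in /dev/sde /dev/sdf /dev/sdg; do
        ceph-deploy --ceph-conf /etc/ceph/ceph.conf disk zap ceph-node1 "$dev"
        ceph-deploy --ceph-conf /etc/ceph/ceph.conf --overwrite-conf osd create --data "$dev" ceph-node1
    done
    ceph osd tree   # confirm the new OSDs appear and come up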

Grouping OSDs by CRUSH device class

  • Set the CRUSH device class of the SSD disks to ssd (a batched variant with verification commands is sketched after this list)

    [root@ceph-node1 ~]# ceph osd crush rm-device-class osd.10
    done removing class of osd(s): 10
    [root@ceph-node1 ~]# ceph osd tree|grep osd.10
    10 3.63869 osd.10 up 1.00000 1.00000
    [root@ceph-node1 ~]# ceph osd crush set-device-class ssd osd.10
    set osd(s) 10 to class 'ssd'
    [root@ceph-node1 ~]# ceph osd tree|grep osd.10
    10 ssd 3.63869 osd.10 up 1.00000 1.00000
    [root@ceph-node1 ~]# ceph osd crush class ls
    [
    "hdd",
    "ssd"
    ]
  • Create CRUSH rules bound to the hdd and ssd classes

    [root@ceph-node1 ~]# ceph osd crush rule create-replicated replicated_hdd_rule default host hdd
    [root@ceph-node1 ~]# ceph osd crush rule create-replicated replicated_ssd_rule default host ssd
    [root@ceph-node1 ~]# ceph osd crush rule dump
    [
        {
            "rule_id": 0,
            "rule_name": "replicated_rule",
            "ruleset": 0,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                {
                    "op": "take",
                    "item": -1,
                    "item_name": "default"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 0,
                    "type": "host"
                },
                {
                    "op": "emit"
                }
            ]
        },
        {
            "rule_id": 2,
            "rule_name": "replicated_ssd_rule",
            "ruleset": 2,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                {
                    "op": "take",
                    "item": -12,
                    "item_name": "default~ssd"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 0,
                    "type": "host"
                },
                {
                    "op": "emit"
                }
            ]
        },
        {
            "rule_id": 3,
            "rule_name": "replicated_hdd_rule",
            "ruleset": 3,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                {
                    "op": "take",
                    "item": -2,
                    "item_name": "default~hdd"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 0,
                    "type": "host"
                },
                {
                    "op": "emit"
                }
            ]
        }
    ]
  • Create pools with a specific crush_rule

    [root@ceph-node1 ~]# ceph osd pool create test_hdd_rule 12 12 replicated_hdd_rule
    [root@ceph-node1 ~]# ceph osd pool create test_ssd_rule 12 12 replicated_ssd_rule
    [root@ceph-node1 ~]# ceph osd pool set test_ssd_rule size 1
  • Change an existing pool's crush_rule

    [root@ceph-node1 ~]# ceph osd pool set pool_name crush_rule replicated_hdd_rule
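
The per-OSD class change shown earlier can be batched, and the end result verified, with standard commands. A sketch, where the list of SSD-backed OSD ids is an assumption:

    # Re-class a list of SSD-backed OSDs and check the result.
    for id in 10 21 32; do
        ceph osd crush rm-device-class osd."$id"
        ceph osd crush set-device-class ssd osd."$id"
    done
    ceph osd crush class ls-osd ssd              # OSDs now carrying the ssd class
    ceph osd pool get test_ssd_rule crush_rule   # should report replicated_ssd_rule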
