Ceph Nautilus Installation Guide

Ceph Version 14 - Nautilus

Installation Guide for CentOS 7.

Release Version 1.4

CHAPTER 1 - WHAT IS CEPH?

Ceph is a free, open-source storage platform that ties together multiple storage servers to provide object, block, and file-level storage interfaces in a single, horizontally scalable storage cluster with no single point of failure.

Ceph clusters consist of several types of services, which are explained below:

1.1 - ANSIBLE ADMINISTRATOR NODE

This type of node is where Ansible is configured and run from. Any node in the cluster can function as the Ansible node. This node provides the following functions:

Centralized storage cluster management
Ceph configuration files and keys
Optionally, local repositories for installing Ceph on nodes that cannot access the Internet

1.2 - MONITOR NODES

Each monitor node runs the monitor daemon ( ceph-mon ), which maintains a master copy of the cluster map. The cluster map includes the cluster topology. A client connecting to the Ceph cluster retrieves the current copy of the cluster map from a monitor, which enables the client to read data from and write data to the cluster. Ceph can run with one monitor; however, three monitors are strongly recommended to ensure high availability.

1.3 - OSD NODES

Each Object Storage Device (OSD) node runs the Ceph OSD daemon ( ceph-osd ), which interacts with the logical disks attached to the node. Simply put, an OSD node is a server, and an OSD itself is an HDD or SSD inside the server. Ceph stores data on these OSDs. Ceph can run with as few as three OSD nodes, but production clusters realize better performance beginning at modest scales, for example 5 OSD nodes in a storage cluster. Ideally, a Ceph cluster has multiple OSD nodes, which allows failure domains to be isolated through the CRUSH map.

1.4 - MANAGER NODES

Each Manager node runs the MGR daemon ( ceph-mgr ), which maintains detailed information about placement groups, process metadata and host metadata in lieu of the Ceph Monitor—significantly improving performance at scale. The Ceph Manager handles execution of many of the read-only Ceph CLI queries, such as placement group statistics. The Ceph Manager also provides the RESTful monitoring APIs. The manager node is also responsible for dashboard hosting, giving the user real time metrics, as well as the capability to create new pools, exports, etc.

1.5 - MDS NODES

Each Metadata Server (MDS) node runs the MDS daemon ( ceph-mds ), which manages metadata related to files stored on the Ceph File System (CephFS). The MDS daemon also coordinates access to the shared cluster. The MDS daemon maintains a cache of CephFS metadata in system memory to accelerate IO performance. This cache can be grown or shrunk based on workload, allowing performance to scale linearly as data grows. The MDS service is required for CephFS to function.

1.6 - OBJECT GATEWAY NODES

Each Ceph Object Gateway node runs the Ceph RADOS Gateway daemon ( ceph-radosgw ), an object storage interface built on top of librados that provides applications with a RESTful gateway to Ceph storage clusters. The Ceph Object Gateway supports two interfaces:

S3 - Provides object storage functionality with an interface that is compatible with a large subset of the Amazon S3 RESTful API.
Swift - Provides object storage functionality with an interface that is compatible with a large subset of the OpenStack Swift API.

Below is a diagram showing the architecture of the above services, and how they communicate on the networks.

The cluster network relieves OSD replication and heartbeat traffic from the public network.

CHAPTER 2 - REQUIREMENTS FOR INSTALLING CEPH

Before starting the installation and configuration of your Ceph cluster, there are a few requirements that need to be met.

2.1 - HARDWARE REQUIREMENTS

As mentioned before, there are minimum quantities required for the different types of nodes. The table below shows the minimum number of each required to achieve a highly available Ceph cluster. Note that the MONs, MGRs, MDSs, FSGWs, and RGWs can be either virtualized or on physical hardware.

Pool Type	OSD	MON	MGR	MDS	FSGW	RGW
2 Rep / 3 Rep	3	3	2	2	2	2
Erasure Coded	3	3	2	2	2	2

2.2 - OPERATING SYSTEM

45Drives requires that Ceph Nautilus be deployed on a minimal installation of CentOS 7.6 or newer. Every node in the cluster should run the same version to ensure uniformity.

2.3 - NETWORK CONFIGURATION

As seen in the Figure in Chapter 1, all Ceph nodes require a public network. It is required to have a network interface card configured to a public network where Ceph clients can reach Ceph monitors and Ceph OSD nodes.

45Drives recommends having a second network interface card configured as a backend private network so that Ceph can conduct heartbeating, peering, replication, and recovery on a network separate from the public network. It is recommended to configure network bonding on each network interface card across the cluster. The choice of bonding mode depends on your needs; 45Drives recommends either bonding mode 1 (Active-Backup) or mode 4 (LACP) for the cluster nodes.

2.4 - CONFIGURING FIREWALLS

By default, when Ceph is installed using these Ansible packages, the required firewall ports are opened on the appropriate nodes using firewalld.

If you use iptables or need to configure the firewall manually, the table below lists the default ports and port ranges required by each Ceph daemon, as well as by the services used for real-time metrics. These must be open before you begin installing the cluster.

The Cluster Role column determines which hosts need the ports opened. It corresponds to the group names in the Ansible inventory file.

Ceph Daemon / Service	Firewall Port	Protocol	Firewalld Service Name	Cluster Role
ceph-osd	6800-7300	TCP	ceph	osds
ceph-mon	6789,3300	TCP	ceph, ceph-mon	mons
ceph-mgr	6800-7300	TCP	ceph	mgrs
ceph-mds	6800	TCP	ceph	mdss
ceph-radosgw [1]	8080	TCP		rgws
Ceph Prometheus Exporter	9283	TCP		mgrs
Node Exporter	9100	TCP		metric
Prometheus Server	9090	TCP		metric
Alertmanager	9091	TCP		metric
Grafana Server	3000	TCP		metric
nfs	2049	TCP	nfs	nfss
rpcbind	111	TCP/UDP	rpc-bind	nfss
corosync	5404-5406	UDP		nfss
pacemaker	2224	TCP		nfss
samba	137,138	UDP	samba	smbs
samba	139,445	TCP	samba	smbs
CTDB	4379	TCP/UDP	ctdb	smbs
iSCSI Target	3260	TCP		iscsigws
iSCSI API Port	5000	TCP		iscsigws
iSCSI Metric Exporter	9287	TCP		iscsigws
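
For hosts where the ports have to be opened by hand, rows of the table above can be turned into firewall-cmd calls. The sketch below is illustrative only: the function names ports_for_role and print_firewall_cmds are made up for this example, only four roles are covered, and the commands are printed rather than run so that they can be reviewed first.

```shell
# Print the TCP ports for a cluster role, in firewalld port syntax.
# The mapping is copied from the table above; only four roles are covered.
ports_for_role() {
    case "$1" in
        osds) echo "6800-7300" ;;
        mons) echo "6789 3300" ;;
        mgrs) echo "6800-7300 9283" ;;
        mdss) echo "6800" ;;
        *)    echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the firewall-cmd commands for one role.
print_firewall_cmds() {
    ports=$(ports_for_role "$1") || return 1
    for port in $ports; do
        echo "firewall-cmd --permanent --add-port=${port}/tcp"
    done
    echo "firewall-cmd --reload"
}

print_firewall_cmds mons
```

Once the printed commands look right for a host, they can be run there as root. Roles that use UDP (for example the samba and corosync rows) would need a protocol argument added to this sketch.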

2.5 - CONFIGURING PASSWORDLESS SSH

Generate an SSH key pair on the Ansible administrator node and distribute the public key to all other nodes in the storage cluster so that Ansible can access the nodes without being prompted for a password.

Perform the following steps from the Ansible administrator node, as the root user.

1. Generate the SSH key pair, accept the default filename, and leave the passphrase empty:

[root@cephADMIN ~]# ssh-keygen

2. Copy the public key to all nodes in the storage cluster:

[root@cephADMIN ~]# ssh-copy-id root@$HOST_NAME

Replace $HOST_NAME with the host name of each Ceph node.

Example

[root@cephADMIN ~]# ssh-copy-id root@cephOSD1
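
With more than a handful of nodes, the copy step is easier to run as a loop. The following is a minimal sketch: the function name print_copy_id_cmds is made up for this example, and cephOSD2 and cephOSD3 are placeholder host names. It prints one ssh-copy-id command per host so the list can be checked before anything is executed.

```shell
# Print one ssh-copy-id command per host name given as an argument.
# Nothing is executed; pipe the output to "sh" to run the commands.
print_copy_id_cmds() {
    for host in "$@"; do
        echo "ssh-copy-id root@${host}"
    done
}

print_copy_id_cmds cephOSD1 cephOSD2 cephOSD3
```

Each ssh-copy-id call still prompts once for the root password of the target node; after that, Ansible can log in without a password.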

2.6 - INSTALL CEPH-ANSIBLE-45D

45Drives provides a slightly modified ceph-ansible repository on GitHub. From the Ansible administrator node, add the 45Drives repository, install the ceph-ansible-45d package, and link the inventory file into /etc/ansible:

[root@cephADMIN ~]# cd /etc/yum.repos.d/
[root@cephADMIN ~]# curl -LO http://images.45drives.com/ceph/rpm/ceph_45drives.repo
[root@cephADMIN ~]# yum install ceph-ansible-45d

[root@cephADMIN ~]# touch /usr/share/ceph-ansible/hosts

[root@cephADMIN ~]# ln -s /usr/share/ceph-ansible/hosts /etc/ansible

This will set up the Ansible environment for a 45Drives Ceph cluster.

CHAPTER 3 - DEPLOYING CEPH NAUTILUS

This chapter describes how to use the Ansible application to deploy a Ceph cluster and other components, such as Metadata Servers, File System Gateways, Ceph Object Gateways etc.

3.1 - PREREQUISITES

Prepare the cluster nodes. On each node verify:

Passwordless SSH configured from Ansible Node to all other nodes
Networking configured
All nodes must be reachable from each other on the public network
All OSD nodes must be reachable on the cluster network as well

3.2 - INSTALLING A CEPH CLUSTER

Navigate to the /usr/share/ceph-ansible/ directory

[root@cephADMIN ~]# cd /usr/share/ceph-ansible/

Edit the hosts file and place the hostnames under the correct blocks. It is common to collocate the Ceph Manager ( ceph-mgr ) with the Ceph Monitor nodes. If you have a lengthy list with sequential naming, you can use a range such as OSD_[1:10]. See hosts.sample for a full list of the available host groups.

[mons]
MONITOR_1
MONITOR_2
MONITOR_3

[mgrs]
MONITOR_1
MONITOR_2
MONITOR_3

[osds]
OSD_1
OSD_2
OSD_3

Ensure that Ansible can reach the Ceph hosts. Do not continue until this command succeeds; if it fails, verify the connectivity to each host in your hosts file.

[root@cephADMIN ceph-ansible]# ansible all -m ping

Edit the group_vars/all.yml file:

[root@cephADMIN ceph-ansible]# vim group_vars/all.yml

Below is a table that includes the minimum parameters that have to be updated.

Option	Value	Required	Notes
monitor_interface	The interface that the Monitor nodes listen on (eth0, bond0, etc.)	1 of the 3	One of monitor_interface, monitor_address, or monitor_address_block is required
public_network	The IP address and netmask of the Ceph public network	Yes	In the form of: 192.168.0.0/16
cluster_network	The IP address and netmask of the Ceph cluster network	No, defaults to public_network	In the form of: 10.0.0.0/24
hybrid_cluster	Are there both SSD and HDD OSDs in the cluster?	No, defaults to false	In the form of true or false
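
To make the table concrete, the sketch below prints the four settings as they might appear in group_vars/all.yml. The function name print_minimal_all_yml is made up for this example, and the interface and network values are the sample values from the table, not recommendations. The output goes to the terminal only; it does not modify all.yml.

```shell
# Print the minimum group_vars/all.yml settings from the table above.
# Arguments: monitor interface, public network (CIDR), cluster network (CIDR).
# hybrid_cluster is printed with its documented default of false.
print_minimal_all_yml() {
    cat <<EOF
monitor_interface: $1
public_network: $2
cluster_network: $3
hybrid_cluster: false
EOF
}

print_minimal_all_yml bond0 192.168.0.0/16 10.0.0.0/24
```

Compare the printed lines with the matching lines in your own all.yml and edit the file by hand.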

Run the device-alias.yml playbook to configure device aliases.

[root@cephADMIN ceph-ansible]# ansible-playbook device-alias.yml

Run the generate-osd-vars.yml playbook to populate the device variables. This will use every disk present in the chassis. If you want to exclude certain drives, manually remove them from the host_vars/ file.

[root@cephADMIN ceph-ansible]# ansible-playbook generate-osd-vars.yml

Run the ceph-ansible playbook to build the cluster. When it finishes, the core components of the cluster are deployed and can be verified by running “ceph -s” from one of the monitor nodes.

[root@cephADMIN ceph-ansible]# ansible-playbook core2.yml
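
The commands in this section can be collected into one short script. The sketch below only prints that script (the function name print_deploy_script is made up for this example); the playbook names and their order are the ones given above.

```shell
# Print the deployment steps from this section as a small shell script.
# Nothing is executed here. "set -e" makes the printed script stop at the
# first command that fails, for example an unreachable host during the ping.
print_deploy_script() {
    echo "set -e"
    echo "cd /usr/share/ceph-ansible/"
    echo "ansible all -m ping"
    for playbook in device-alias.yml generate-osd-vars.yml core2.yml; do
        echo "ansible-playbook ${playbook}"
    done
}

print_deploy_script
```

If drives need to be excluded, run the steps by hand instead, so that the host_vars/ files can be edited between generate-osd-vars.yml and core2.yml.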

3.3 - INSTALLING METADATA SERVERS (CephFS)

Metadata Server daemons are necessary for deploying a Ceph File System. This section shows how to install a Ceph Metadata Server (MDS) using Ansible.

Add a new section, [mdss], to the /usr/share/ceph-ansible/hosts file:

[mdss]
cephMDS1
cephMDS2
cephMDS3

Run the cephfs.yml playbook to install and configure the Ceph Metadata Servers.

[root@cephADMIN ceph-ansible]# ansible-playbook cephfs.yml

Verify the file system from one of the cluster nodes:

[root@cephMON ~]# ceph fs status

3.4 - INSTALLING THE CEPH OBJECT GATEWAY

The Ceph Object Gateway, also known as the RADOS gateway, is an object storage interface built on top of the librados API to provide applications with a RESTful gateway to Ceph storage clusters.

Add a new section, [rgws], to the /usr/share/ceph-ansible/hosts file. Be sure to define the IP address for each RGW as well.

[rgws]
cephRGW1 radosgw_address=192.168.18.53
cephRGW2 radosgw_address=192.168.18.56

The default port is 8080. To use a different port, change the following line in group_vars/all.yml:

radosgw_civetweb_port: 8080

Below is an example section of the group_vars/all.yml.

## Rados Gateway options
#
radosgw_frontend_type: beast
radosgw_civetweb_port: 8080
radosgw_civetweb_num_threads: 100
radosgw_civetweb_options: "num_threads={{ radosgw_civetweb_num_threads }}"
# For additional civetweb configuration options available such as SSL, logging,
# keepalive, and timeout settings, please see the civetweb docs at
# https://github.com/civetweb/civetweb/blob/master/docs/UserManual.md
radosgw_frontend_port: "{{ radosgw_civetweb_port if radosgw_frontend_type == 'civetweb' else '8080' }}"
radosgw_frontend_options: "{{ radosgw_civetweb_options if radosgw_frontend_type == 'civetweb' }}"

Run the radosgw.yml playbook to install the RGWs.

[root@cephADMIN ceph-ansible]# ansible-playbook radosgw.yml

Verify with:

[root@cephADMIN ceph-ansible]# curl -g http://cephRGW1:8080
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>

3.4.1 Configure Haproxy for RGW Load Balancing

Since each object gateway instance has its own IP address, HAProxy and keepalived can be used to balance the load across Ceph Object Gateway servers.

HAProxy can also be used to terminate HTTPS: clients connect to the HAProxy server over HTTPS, and HAProxy uses HTTP between itself and the RGWs.

Add a new section, [rgwloadbalancers], to the /usr/share/ceph-ansible/hosts file. The RGW nodes themselves can be used, or other CentOS servers.

[rgwloadbalancers]
cephRGW1
cephRGW2

Edit group_vars/rgwloadbalancers.yml and specify:
Virtual IP(s)
Virtual IP Netmask
Virtual IP interface

haproxy_frontend_port: 80
haproxy_frontend_ssl_port: 443
haproxy_frontend_ssl_certificate:
haproxy_ssl_dh_param: 4096
haproxy_ssl_ciphers:
  - EECDH+AESGCM
  - EDH+AESGCM
haproxy_ssl_options:
  - no-sslv3
  - no-tlsv10
  - no-tlsv11
  - no-tls-tickets
virtual_ips:
  - 192.168.18.57
virtual_ip_netmask: 16
virtual_ip_interface: eth0

3.5 - CONFIGURING THE SMB GATEWAYS

There are two cases that will be covered in this section. They are:

CephFS + Samba + Active Directory Integration
CephFS + Samba + Local Users

The SMB Gateways can be physical hardware or virtual machines. CTDB will be configured with a floating IP for access to the Samba share.

The CephFS volume will be mounted on each gateway at /mnt/cephfs/, and the gateways then share out that directory via SMB.

Edit the hosts file to include the File System Gateways in the [smbs] block.

[smbs]
smb1
smb2

3.5.1 CephFS + Samba + Active Directory Integration

Edit the group_vars/smbs.yml file to choose the samba role:

# Roles
samba_server: true
samba_cluster: true
domain_member: true

Edit group_vars/smbs.yml to set active_directory_info:

active_directory_info:
  workgroup: 'SAMDOM'
  idmap_range: '100000 - 999999'
  realm: 'SAMDOM.COM'
  winbind_enum_groups: yes
  winbind_enum_users: yes
  winbind_use_default_domain: yes
  domain_join_user: ''
  domain_join_password: ''

Edit group_vars/smbs.yml to set ctdb_public_addresses:

ctdb_public_addresses:
  - vip_address: '192.168.103.10'
    vip_interface: 'eth0'
    subnet_mask: '16'
  - vip_address: '192.168.103.11'
    vip_interface: 'eth0'
    subnet_mask: '16'

Edit group_vars/smbs.yml to set samba_shares:

samba_shares:
  - name: 'share1'
    path: '{{ shared_storage_mountpoint }}/fsgw/share1'
    writeable: 'yes'
    guest_ok: 'no'
    comment: "comment for share1"
  - name: 'share2'
    path: '{{ shared_storage_mountpoint }}/fsgw/share2'
    writeable: 'yes'
    guest_ok: 'no'
    comment: "comment for share2"

3.5.2 CephFS + Samba + Local Users

If AD is not being used, set domain_member to false.

Edit the group_vars/smbs.yml file to choose the samba role:

# Roles
samba_server: true
samba_cluster: true
domain_member: false

Edit group_vars/smbs.yml to set samba_shares:

samba_shares:
  - name: 'share1'
    path: '{{ shared_storage_mountpoint }}/fsgw/share1'
    writeable: 'yes'
    guest_ok: 'no'
    comment: "comment for share1"
  - name: 'share2'
    path: '{{ shared_storage_mountpoint }}/fsgw/share2'
    writeable: 'yes'
    guest_ok: 'no'
    comment: "comment for share2"

Edit group_vars/smbs.yml to set ctdb_public_addresses:

ctdb_public_addresses:
  - vip_address: '192.168.103.10'
    vip_interface: 'eth0'
    subnet_mask: '16'
  - vip_address: '192.168.103.11'
    vip_interface: 'eth0'
    subnet_mask: '16'

Run the smb.yml playbook:

[root@cephADMIN ceph-ansible]# ansible-playbook smb.yml

3.5.2 Samba Overrides

In either case, all smb.conf global config options can be overridden using “/etc/samba/overrides.conf”. This file is meant for manual (non-Ansible) changes and for options that are not defined in the playbooks. The overrides.conf file uses the same syntax as the main smb.conf.

For example, if you are not using the recommended centralized share management, you could define a share in /etc/samba/overrides.conf:

[global]
log level = 3

[share1]
path = /mnt/cephfs/fsgw/share1
comment = comment for share1
valid users = user1
write list = user1

3.5.3 Centralized Share Management

Samba offers a registry based configuration system to complement the original text-only configuration via smb.conf. The “net conf” command offers a dedicated interface for reading and modifying the registry based configuration.

3.5.3.1 Adding share

To create a new share “share1” using net conf

net conf addshare share1 $PATH writeable=[yes|no] guest_ok=[yes|no] "comment"

To add extra parameters like “valid users” and “write list”:

net conf setparm share1 "valid users" "@readonly,@trusted"

net conf setparm share1 "write list" "@trusted"

3.5.3.2 Removing a share

To remove a share called “share1” using net conf

net conf delshare share1

3.5.3.3 Listing current shares

To show all defined shares

net conf list

To show a specific share named “share1”

net conf show share1

3.7 - CONFIGURING CEPHFS WITH NFS GANESHA

3.7.1 - Prerequisites

An Ansible deployed ceph cluster
Ceph File-System created, called “cephfs” for this example
Node(s) to act as NFS Gateway
NFS Gateways can be physical hardware or virtual machines
Password-less SSH access from ansible node

3.7.2 - Active-Active Configuration

Active-Active NFS

No floating IP. Shares accessible from every gateway IP.
HA only possible if application can multipath.
Useful for highly concurrent use cases.

The first step is to edit the /usr/share/ceph-ansible/group_vars/nfss.yml file. NFS Ganesha can be set up on top of a file system or an object store, so you need to specify which in the file:

nfs_file_gw: true
nfs_obj_gw: false

Set the backend driver to “rados_cluster” in /usr/share/ceph-ansible/group_vars/nfss.yml.

# backend mode, either rados_kv, rados_ng, or rados_cluster
# Default (rados_ng) is for single gateway/active-passive use.
# rados_kv is obsoleted by rados_ng
# rados_cluster is for active-active nfs cluster.
# Requires ganesha-grace-db to be initialized
ceph_nfs_rados_backend_driver: "rados_cluster"

Now add the NFS Gateway hostnames to the /usr/share/ceph-ansible/hosts file:

[nfss]
cephNFS1
cephNFS2

Next run the nfs.yml playbook:

[root@cephADMIN ceph-ansible]# ansible-playbook nfs.yml

This playbook installs all necessary packages and sets up the default export.

All of your exports can be set up from the Ceph dashboard, which is configured in Chapter 5.

3.7.3 - Active-Passive Configuration

Active-Passive

Floating IP. Service only running on 1 of the gateways at a time.
HA possible for all clients

The first step is to edit the /usr/share/ceph-ansible/group_vars/nfss.yml file. NFS Ganesha can be set up on top of a file system or an object store, so you need to specify which in the file:

nfs_file_gw: true
nfs_obj_gw: false

Set the backend driver to “rados_ng” in /usr/share/ceph-ansible/group_vars/nfss.yml.

# backend mode, either rados_kv, rados_ng, or rados_cluster
# Default (rados_ng) is for single gateway/active-passive use.
# rados_kv is obsoleted by rados_ng
# rados_cluster is for active-active nfs cluster.
# Requires ganesha-grace-db to be initialized
ceph_nfs_rados_backend_driver: "rados_ng"

Specify the floating IP at which the NFS-Ganesha Gateway will be reachable in /usr/share/ceph-ansible/group_vars/nfss.yml.

ceph_nfs_floating_ip_address: '192.168.18.73'
ceph_nfs_floating_ip_cidr: '16'

Now add the NFS Gateway hostnames to the /usr/share/ceph-ansible/hosts file:

[nfss]
cephNFS1
cephNFS2

Next run the nfs.yml playbook:

[root@cephADMIN ceph-ansible]# ansible-playbook nfs.yml

This playbook installs all necessary packages and sets up the default export.

All of your exports can be set up from the Ceph dashboard.

Set the following Ceph configuration setting to allow NFS to fail over properly.

[root@cephADMIN ceph-ansible]# ceph config set mds mds_cap_revoke_eviction_timeout 10

3.7 - CONFIGURING RBD + iSCSI

3.7.1 Prerequisites

An Ansible deployed ceph cluster
Node(s) to act as iSCSI Gateway
iSCSI Gateways can be physical hardware or virtual machines.
iSCSI Gateways can be co-located on the OSD nodes.
Password-less SSH access from the ansible node.

3.7.2 Ceph iSCSI Installation

See the Knowledge Base article “Ceph iSCSI Configuration” for more detail.

Add the iSCSI gateway hostnames to /usr/share/ceph-ansible/hosts:

[iscsigws]
iscsi1
iscsi2
iscsi3

Next run the iscsi.yml playbook:

[root@cephADMIN ceph-ansible]# ansible-playbook iscsi.yml

3.7.3 Ceph iSCSI Configuration

iSCSI configuration is done on the iSCSI gateways and in the Ceph dashboard.

From one of the iSCSI nodes, create the initial iSCSI gateways with gwcli. Note that the first time gwcli is run you will be prompted with the warning below; it can be ignored, as gwcli creates an initial preferences file if one is not present.

[root@iscsi1 ~]# gwcli
Warning: Could not load preferences file /root/.gwcli/prefs.bin.
>

Create an iSCSI target for the cluster.

[root@iscsi1 ~]# gwcli
> /> cd /iscsi-target
> /iscsi-target> create iqn.2003-01.com.45drives.iscsi-gw:iscsi-igw

Create the first iSCSI gateway. It has to be the node you are running this command on.

[root@iscsi1 ~]# gwcli
> cd /iscsi-targets/iqn.2003-01.com.45drives.iscsi-gw:iscsi-igw/gateways
/iscsi-target...7283/gateways> create iscsi1.45lab.com 192.168.*.*

The Ceph dashboard is recommended for finishing the iSCSI configuration. See Chapter 5 before proceeding here.

CHAPTER 4 - EXPANDING THE CLUSTER

CHAPTER 5 - CONFIGURING THE MANAGEMENT DASHBOARDS

5.1 Installing Ceph Dashboard

Using Ansible, the steps below install and configure the metric collection and alerting stack. They also configure and start the Ceph management UI.

By default the metric stack is installed on the first node running the manager service in your cluster. Optionally, to specify another server to host this stack, use the group label “metrics”:

[metrics]
metric1

The Ceph Dashboard is hosted by the ceph-mgr service. The dashboard playbook also installs HAProxy on the metric server, so that the dashboard is reachable from any of the nodes running the ceph-mgr service as well as from the metric server itself.

Enable/disable, set port, protocol, and cert in group_vars/all.yml:

dashboard_enabled: true
# When true HAProxy will be installed on the server in the metric group
dashboard_haproxy: true
dashboard_haproxy_port: 80
dashboard_haproxy_protocol: http
dashboard_haproxy_cert:

Run the /usr/share/ceph-ansible/dashboard.yml playbook:

[root@cephADMIN ceph-ansible]# ansible-playbook dashboard.yml

Below is a table of the default ports for the dashboards that were configured. These can be modified in the /usr/share/ceph-ansible/group_vars/all.yml file.

Name	Default Port
Grafana	3000/tcp
Prometheus	9090/tcp
Alertmanager	9091/tcp
Ceph Dashboard	8234/tcp

CHAPTER 6 - MANAGING STORAGE POOLS VIA CLI

It is recommended to create storage pools in the Ceph Dashboard. This chapter goes into detail on the specifics of creating a pool from the command line.

Before creating pools, there are a few things to consider. The first is the number of placement groups. The second is the type of pool to be created: replicated or erasure coded.

6.1 - Placement Groups

Sizing placement groups is very important. The number of placement groups in a pool can be increased later and, starting with Nautilus, decreased as well, but either change causes data to migrate so that it is spread evenly across all placement groups. This puts a strain on cluster performance until the migration is complete.

It is best to use the pg_calculator to determine the proper number of placement groups. If you are unsure what to choose, start with 64 placement groups per pool and increase the number later (ideally before you put data on the pool).

A quick rule of thumb:

Less than 5 OSDs, set pg_num to 128
Between 5 and 10 OSDs, set pg_num to 512
Between 10 and 50 OSDs, set pg_num to 1024
If you have more than 50 OSDs, understand the tradeoffs and use the calculator
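
The rule of thumb above can be written as a small lookup. This is a sketch, not a replacement for the pg_calculator: the function name suggest_pg_num is made up for this example, and because the list leaves the boundary values open, the sketch assumes that exactly 10 OSDs falls into the 512 bracket and exactly 50 into the 1024 bracket.

```shell
# Map an OSD count to the rule-of-thumb pg_num from the list above.
# Assumption: 5-10 OSDs -> 512 and 11-50 OSDs -> 1024 (the list itself is
# ambiguous at the boundaries). Above 50 OSDs no value is suggested.
suggest_pg_num() {
    osds=$1
    if [ "$osds" -lt 5 ]; then
        echo 128
    elif [ "$osds" -le 10 ]; then
        echo 512
    elif [ "$osds" -le 50 ]; then
        echo 1024
    else
        echo "more than 50 OSDs: use the pg_calculator" >&2
        return 1
    fi
}

suggest_pg_num 8
```

For more than 50 OSDs the function prints a reminder on standard error and returns a non-zero status instead of guessing a value.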

6.2 - Pool Types

Ceph stores data in pools and there are two types of pools:

Replicated
Erasure-coded

Ceph uses replicated pools by default, meaning that Ceph copies every object from a primary OSD to one or more secondary OSDs. Erasure coding is a method of storing an object in which the erasure code algorithm breaks the object into data chunks ( k ) and coding chunks ( m ) and stores those chunks on different OSDs. Erasure coding uses storage capacity more efficiently than replication. The n-replication approach maintains n copies of an object (3x by default in Ceph), whereas erasure coding maintains only k + m chunks. For example, 3 data and 2 coding chunks use about 1.67x the storage space of the original object ((k + m) / k = 5/3).

Below is a table showing pool type, required number of OSD nodes, and storage efficiency.

Pool Type	Storage Efficiency	Minimum # of OSD Nodes	Recommended # of OSD Nodes
2-replication	50%	3	3+
3-replication	33%	3	3+
2+1 Erasure Coded	66%	4	5
4+2 Erasure Coded	66%	8	10
8+2 Erasure Coded	80%	12	14
8+4 Erasure Coded	66%	16	20

* NOTE - With the minimum number of OSD nodes, the pool can withstand ‘m’ node failures; after that, I/O stops until the failed OSD node is recovered. The recommended number of OSD nodes gives an extra cushion of protection and keeps I/O going while you recover the failed OSD node.
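
The Storage Efficiency column follows directly from k and m. The sketch below shows the arithmetic; the function names are made up for this example, and percentages are rounded down to whole numbers, which is how the table appears to present them.

```shell
# Usable share of raw capacity for a k+m erasure-coded pool, as a whole
# percentage rounded down: k / (k + m).
ec_efficiency() {
    echo $(( $1 * 100 / ($1 + $2) ))
}

# Usable share of raw capacity for an n-way replicated pool: 1 / n.
rep_efficiency() {
    echo $(( 100 / $1 ))
}

# Raw space consumed per unit of stored data, (k + m) / k, printed with two
# decimals. Integer arithmetic in hundredths, rounded to nearest.
ec_overhead() {
    hundredths=$(( (($1 + $2) * 100 + $1 / 2) / $1 ))
    printf '%d.%02d\n' $(( hundredths / 100 )) $(( hundredths % 100 ))
}

ec_efficiency 8 2
ec_overhead 4 2
```

A 4+2 pool, for example, stores 6 chunks for every 4 chunks of data, so it consumes 1.50 times the size of the data, compared with 3 times for 3-replication.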

6.3 - Creating a Pool

Below is the syntax required for creating a Ceph pool:

ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] [crush-ruleset-name]
ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure [erasure-code-profile]

A few things to note:

Variable	Description	Type	Required?	Default Value
pool-name	Name of the pool. Must be unique.	String	Yes
pg-num	Total number of placement groups.	Integer	Yes	8
pgp-num	Total number of placement groups for placement purposes. pg-num=pgp-num	Integer	Yes	8
replicated|erasure	Pool type. The replication level is set later; the erasure-code profile needs to be defined at the time of pool creation.	String	No	replicated
crush-ruleset-name	Name of a CRUSH ruleset to use for this pool. Ruleset must exist.	String	No	For replicated pools it is the ruleset specified by the osd pool default crush replicated ruleset config variable. This ruleset must exist. For erasure pools it is erasure-code if the default erasure code profile is used or {pool-name} otherwise. This ruleset will be created implicitly if it doesn’t already exist.
erasure-code-profile	It must be an existing profile as defined by osd erasure-code-profile set.	String	No
expected-num-objects	The expected number of objects for this pool.	Integer	No	0

Example: the following commands correspond to a cluster with 120 OSDs and replicated pools.

ceph osd pool create tank_data 8192 8192 replicated
ceph osd pool create tank_metadata 256 256 replicated

CHAPTER 7 - UPGRADING A CEPH CLUSTER

There are two types of updates when it comes to a Ceph cluster.

Minor Updates
Major Updates

Both types can be completed without cluster downtime, but release notes should be reviewed in both cases.

7.1 - MINOR UPDATES

Minor updates are minor bug fixes released every 4-6 months. These are quick updates that can be done safely by simply running yum update.

If a new kernel is installed, a reboot is required for it to take effect. If there is no kernel update, you can stop here.

If there is a new kernel, set the OSD flags noout and norebalance to prevent the cluster from trying to heal itself while the nodes reboot one by one.

ceph osd set noout
ceph osd set norebalance

Then reboot the nodes one at a time. Do not reboot the next node until the previous one is up and back in the cluster. After all nodes have been rebooted, unset the flags that were set earlier.

ceph osd unset noout
ceph osd unset norebalance
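
The reboot procedure can be laid out as a checklist for a given list of nodes. The sketch below prints the steps in order and executes nothing; the function name print_reboot_plan and the node names are placeholders, and rebooting over SSH as root is an assumption based on the passwordless SSH configured in Chapter 2.

```shell
# Print the rolling-reboot steps for the nodes given as arguments.
# Nothing is executed. Lines starting with "#" in the output are reminders
# for the operator, not commands.
print_reboot_plan() {
    echo "ceph osd set noout"
    echo "ceph osd set norebalance"
    for node in "$@"; do
        echo "ssh root@${node} reboot"
        echo "# wait until ${node} is back in the cluster (check: ceph -s)"
    done
    echo "ceph osd unset noout"
    echo "ceph osd unset norebalance"
}

print_reboot_plan cephOSD1 cephOSD2
```

The wait step is left manual on purpose: while the flags are set the cluster reports a health warning, so the check is whether the rebooted node's daemons are up again, not whether the cluster health is back to OK.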

7.2 - MAJOR UPDATES

Major updates are applied with Ansible. To upgrade to the next major release, edit group_vars/all.yml in the ceph-ansible-45d directory. Under the INSTALL heading, find the line ceph_stable_release and replace the existing release with the next stable version. For example, to update from Mimic (13.2.X) to Nautilus (14.2.X):

[root@cephADMIN ~]# vim /root/ceph-ansible-45d-1.2/group_vars/all.yml
> Change "ceph_stable_release: mimic"
> To "ceph_stable_release: nautilus"

Now, run the rolling-updates.yml playbook.

[root@cephADMIN ceph-ansible-45d-1.2]# ansible-playbook infrastructure-playbooks/rolling-updates.yml

It will take some time, but all data remains up and accessible during the update. You can verify that all nodes are running the new version with the following command:

[root@cephADMIN ~]# ceph versions

[1] This port is user specified. It defaults to 8080
