Work in progress: sofree.us 2025Q3 Ceph hands-on
Timeframe
Per the title, Q3 of 2025. Some Saturday. Probably August or September.
Goals
Learners will install and configure a working Ceph cluster and gain some knowledge of its components. The hard way: no automatic container deployment (cephadm) or similar.
- Generate a cluster UUID
- Select server IP addresses
  - A quick diversion into public and cluster (private) networks as used by Ceph
- Install Ceph application packages
- Create a very basic cluster config file
- Use monmaptool to create a single-entry cluster monitor map (monmaptool --create --fsid UUID-FROM-ABOVE ...)
- Deploy a single monitor for the cluster (ceph-mon --mkfs ...)
- Start the monitor up, check on cluster state (the bootstrap steps so far are sketched as example commands after this list)
- Add additional monitors on other nodes
- Install manager components
- Add OSD components (at last, some actual storage!)
- Create some storage pools
- On another (client) system, make use of some of that storage. RBD is the easiest to demonstrate; RGW needs additional components. (See the manager/OSD/RBD sketch after this list.)
- Choose your own adventure (sketches for both paths also follow the list):
  - RGW, the Ceph S3 object gateway (and also OpenStack Swift)
    - Install radosgw
    - Create a user (radosgw-admin)
    - Make sure it is reachable over the internet (this is going to be pre-work on Aaron's part)
    - Install a client program
    - Create a bucket
    - Put some objects
    - List objects
    - Read an object back from S3 storage and compare it with the original
  - CephFS
    - Install MDS components
    - Create a filesystem
    - Create some client identities
    - Put the client keyring on a client server
    - Mount the filesystem and start writing files
- Stretch goals
  - Replicate RBD data from one cluster to another
  - Replicate RGW data from one cluster to another
  - Replicate CephFS data from one cluster to another
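A minimal sketch of the monitor bootstrap steps above, following the upstream manual-deployment procedure. The node name (ceph1), monitor IP (10.0.0.1), and the public/cluster network ranges are placeholders chosen for the example; adjust them to the lab's actual addressing.

 # 1. Cluster identity, and a minimal /etc/ceph/ceph.conf using it
 uuidgen                      # cluster fsid; substitute it for UUID-FROM-ABOVE below
 # /etc/ceph/ceph.conf
 [global]
 fsid = UUID-FROM-ABOVE
 mon initial members = ceph1
 mon host = 10.0.0.1
 public network = 10.0.0.0/24
 cluster network = 10.1.0.0/24
 # 2. Single-entry monitor map, plus mon and admin keyrings
 monmaptool --create --fsid UUID-FROM-ABOVE --add ceph1 10.0.0.1 /tmp/monmap
 ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
 ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin \
     --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
 ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
 chown ceph:ceph /tmp/monmap /tmp/ceph.mon.keyring
 # 3. Create and start the first monitor, then check cluster state
 sudo -u ceph mkdir /var/lib/ceph/mon/ceph-ceph1
 sudo -u ceph ceph-mon --mkfs -i ceph1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
 systemctl start ceph-mon@ceph1
 ceph -s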
 
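A sketch of the manager, OSD, pool, and RBD client steps. The node name (ceph1), data device (/dev/sdc), pool name (rbdpool), and image name (test01) are example values, not requirements.

 # Manager daemon (typically one per monitor node)
 mkdir -p /var/lib/ceph/mgr/ceph-ceph1
 ceph auth get-or-create mgr.ceph1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
     > /var/lib/ceph/mgr/ceph-ceph1/keyring
 chown -R ceph:ceph /var/lib/ceph/mgr/ceph-ceph1
 systemctl start ceph-mgr@ceph1
 # One OSD per data disk; ceph-volume handles LVM, auth keys, and the systemd unit
 ceph-volume lvm create --data /dev/sdc
 ceph osd tree
 # A replicated pool for RBD, then an image a client can use
 ceph osd pool create rbdpool 64
 rbd pool init rbdpool
 rbd create rbdpool/test01 --size 10G
 # On the client system (needs ceph.conf and a keyring with access to rbdpool)
 rbd map rbdpool/test01
 mkfs.xfs /dev/rbd0
 mount /dev/rbd0 /mnt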
 
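Sketches for the two choose-your-own-adventure paths. The RGW example assumes s3cmd as the client program and example user/bucket names; the CephFS example assumes a filesystem named cephfs and a client identity client.learner. Package and systemd unit names vary slightly by distribution.

 # --- RGW path ---
 # install and start the radosgw daemon for this node first (package/unit names vary by distro)
 radosgw-admin user create --uid=learner1 --display-name="Learner One"   # note the access and secret keys in the output
 # configure s3cmd (s3cmd --configure) to point at the gateway endpoint, then:
 s3cmd mb s3://handson
 s3cmd put ./bigfile s3://handson/bigfile
 s3cmd ls s3://handson
 s3cmd get s3://handson/bigfile ./bigfile.copy
 cmp ./bigfile ./bigfile.copy                 # compare with the original
 # --- CephFS path ---
 ceph osd pool create cephfs_data 64
 ceph osd pool create cephfs_metadata 32
 ceph fs new cephfs cephfs_metadata cephfs_data
 # one MDS daemon (keyring plus systemd unit), then a client identity
 mkdir -p /var/lib/ceph/mds/ceph-ceph1
 ceph auth get-or-create mds.ceph1 mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' \
     > /var/lib/ceph/mds/ceph-ceph1/keyring
 chown -R ceph:ceph /var/lib/ceph/mds/ceph-ceph1
 systemctl start ceph-mds@ceph1
 ceph fs authorize cephfs client.learner / rw > /etc/ceph/ceph.client.learner.keyring
 # copy ceph.conf and the learner keyring to the client server, then mount and write
 mkdir -p /mnt/cephfs
 mount -t ceph :/ /mnt/cephfs -o name=learner
 dd if=/dev/zero of=/mnt/cephfs/testfile bs=1M count=100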
Hardware available
The following is at Aaron's home data center, currently unused:
- 6x Dell R630 servers, each with:
  - 2x Intel Xeon E5-2630 v4 processors
  - 64 GiB of RAM
  - 2x 800 GB SAS SSDs (for OS, applications, and Ceph BlueStore WAL/DB devices; see the ceph-volume sketch after this list)
  - 4x 1.8 TB SAS HDDs (Ceph cluster data disks)
  - 1x 2-port Mellanox 56 Gbit/s Ethernet card (and sufficient switch ports and cabling for these)
  - 2x Intel 10GBASE-T twisted pair Ethernet ports (I do not have switch ports for 10GBASE-T, but have plenty of 1000BASE-T)
  - 2x Intel 1000BASE-T twisted pair Ethernet ports (and sufficient switch ports and cabling)
  - iDRAC remote console, power control, etc.
- 3x Dell R720XD servers, each with:
  - 2x (1x?) Intel Xeon E5-26?? processors
  - 32 GiB of RAM
  - 2x 120 GB SATA SSDs (rear-mounted) for the OS and applications
  - 6 or 7x 3 TB SAS HDDs (Ceph cluster data disks)
  - 2x 400 GB SAS SSDs (Ceph BlueStore WAL/DB devices)
  - 2x 10 Gbit/s SFP+ (fiber optic) Ethernet adapters (and sufficient switch ports and cabling for these, but might need Intel-branded SFP+ transceivers)
  - 2x 1000BASE-T twisted pair Ethernet ports (and sufficient switch ports and cabling)
  - iDRAC remote console, power control, etc.
 
 
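A sketch of how the SSDs above could serve as BlueStore WAL/DB devices while the HDDs hold the data. Device names are placeholders; ceph-volume accepts a partition or logical volume for --block.db, and the WAL lives with the DB unless it is split out explicitly.

 # HDD as the OSD data device, an SSD partition for that OSD's RocksDB/WAL
 ceph-volume lvm create --data /dev/sdd --block.db /dev/sdb1
 # repeat per HDD, giving each OSD its own SSD partition or LV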
Potential cluster end states
- Option 1: A single cluster with all 9 servers as members. This probably makes the most sense with only 2 or 3 learners.
- Option 2: Two clusters. One mixed-hardware cluster of 3x R630 and 3x R720XD servers, and a second cluster of 3x R630 servers. Makes sense with 4-6 learners in groups of 2 or 3, and demonstrates Ceph's flexibility in building from non-identical servers (see the CRUSH weight sketch below).
- Option 3: Two clusters, but this time all nodes in each cluster are similar: 6x R630s in one cluster and 3x R720XDs in the second. Makes sense with 4-6 learners in groups of 2 or 3.
- Option 4: Three clusters, each with consistent hardware. Makes the most sense if there are enough learners to form a third group.
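For the mixed-hardware option, CRUSH weights default to each OSD's size in TiB, so the 1.8 TB and 3 TB disks automatically take proportionally different amounts of data. A quick sketch of commands to inspect (and, if desired, override) those weights; the OSD id and weight value are examples only:

 ceph osd df tree                      # per-OSD size, CRUSH weight, and current utilization
 ceph osd crush tree                   # the CRUSH hierarchy that placement follows
 ceph osd crush reweight osd.7 1.6     # example override of a single OSD's CRUSH weight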