Work in progress: sofree.us 2025Q3 Ceph hands-on
Timeframe
Per the title, Q3 of 2025. Some Saturday. Probably August or September.
Goals
Learners will install and configure a working Ceph cluster and gain some knowledge of its components. The hard way: no automatic container deployment (cephadm) or similar.
- Generate a cluster UUID
- Select server IP addresses
  - A quick diversion into public and cluster (private) networks as used by Ceph
- Install Ceph application packages
- Create a very basic cluster config file
- Use monmaptool to create a single-entry cluster monitor map (monmaptool --create --fsid UUID-FROM-ABOVE ...)
- Deploy a single monitor for the cluster (ceph-mon --mkfs ...)
- Start the monitor up, check on cluster state (the bootstrap steps so far are sketched as example commands after this list)
- Add additional monitors on other nodes
- Install manager components
- Add OSD components (at last, some actual storage!)
- Create some storage pools
- On another (client) system, make use of some of that storage. RBD is the easiest to demonstrate; RGW needs additional components. (See the manager/OSD/RBD sketch after this list.)
- Choose your own adventure (sketches for both paths also follow the list):
  - RGW, the Ceph S3 object gateway (and also OpenStack Swift)
    - Install radosgw
    - Create a user (radosgw-admin)
    - Make sure it is reachable over the internet (this is going to be pre-work on Aaron's part)
    - Install a client program
    - Create a bucket
    - Put some objects
    - List objects
    - Read an object back from S3 storage and compare it with the original
  - CephFS
    - Install MDS components
    - Create a filesystem
    - Create some client identities
    - Put the client keyring on a client server
    - Mount the filesystem and start writing files
- Stretch goals
  - Replicate RBD data from one cluster to another
  - Replicate RGW data from one cluster to another
  - Replicate CephFS data from one cluster to another
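A minimal sketch of the monitor bootstrap steps above, following the upstream manual-deployment procedure. The node name (ceph1), monitor IP (10.0.0.1), and the public/cluster network ranges are placeholders chosen for the example; adjust them to the lab's actual addressing.

 # 1. Cluster identity, and a minimal /etc/ceph/ceph.conf using it
 uuidgen                      # cluster fsid; substitute it for UUID-FROM-ABOVE below
 # /etc/ceph/ceph.conf
 [global]
 fsid = UUID-FROM-ABOVE
 mon initial members = ceph1
 mon host = 10.0.0.1
 public network = 10.0.0.0/24
 cluster network = 10.1.0.0/24
 # 2. Single-entry monitor map, plus mon and admin keyrings
 monmaptool --create --fsid UUID-FROM-ABOVE --add ceph1 10.0.0.1 /tmp/monmap
 ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
 ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin \
     --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
 ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
 chown ceph:ceph /tmp/monmap /tmp/ceph.mon.keyring
 # 3. Create and start the first monitor, then check cluster state
 sudo -u ceph mkdir /var/lib/ceph/mon/ceph-ceph1
 sudo -u ceph ceph-mon --mkfs -i ceph1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
 systemctl start ceph-mon@ceph1
 ceph -s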
 
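A sketch of the manager, OSD, pool, and RBD client steps. The node name (ceph1), data device (/dev/sdc), pool name (rbdpool), and image name (test01) are example values, not requirements.

 # Manager daemon (typically one per monitor node)
 mkdir -p /var/lib/ceph/mgr/ceph-ceph1
 ceph auth get-or-create mgr.ceph1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
     > /var/lib/ceph/mgr/ceph-ceph1/keyring
 chown -R ceph:ceph /var/lib/ceph/mgr/ceph-ceph1
 systemctl start ceph-mgr@ceph1
 # One OSD per data disk; ceph-volume handles LVM, auth keys, and the systemd unit
 ceph-volume lvm create --data /dev/sdc
 ceph osd tree
 # A replicated pool for RBD, then an image a client can use
 ceph osd pool create rbdpool 64
 rbd pool init rbdpool
 rbd create rbdpool/test01 --size 10G
 # On the client system (needs ceph.conf and a keyring with access to rbdpool)
 rbd map rbdpool/test01
 mkfs.xfs /dev/rbd0
 mount /dev/rbd0 /mnt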
 
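Sketches for the two choose-your-own-adventure paths. The RGW example assumes s3cmd as the client program and example user/bucket names; the CephFS example assumes a filesystem named cephfs and a client identity client.learner. Package and systemd unit names vary slightly by distribution.

 # --- RGW path ---
 # install and start the radosgw daemon for this node first (package/unit names vary by distro)
 radosgw-admin user create --uid=learner1 --display-name="Learner One"   # note the access and secret keys in the output
 # configure s3cmd (s3cmd --configure) to point at the gateway endpoint, then:
 s3cmd mb s3://handson
 s3cmd put ./bigfile s3://handson/bigfile
 s3cmd ls s3://handson
 s3cmd get s3://handson/bigfile ./bigfile.copy
 cmp ./bigfile ./bigfile.copy                 # compare with the original
 # --- CephFS path ---
 ceph osd pool create cephfs_data 64
 ceph osd pool create cephfs_metadata 32
 ceph fs new cephfs cephfs_metadata cephfs_data
 # one MDS daemon (keyring plus systemd unit), then a client identity
 mkdir -p /var/lib/ceph/mds/ceph-ceph1
 ceph auth get-or-create mds.ceph1 mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' \
     > /var/lib/ceph/mds/ceph-ceph1/keyring
 chown -R ceph:ceph /var/lib/ceph/mds/ceph-ceph1
 systemctl start ceph-mds@ceph1
 ceph fs authorize cephfs client.learner / rw > /etc/ceph/ceph.client.learner.keyring
 # copy ceph.conf and the learner keyring to the client server, then mount and write
 mkdir -p /mnt/cephfs
 mount -t ceph :/ /mnt/cephfs -o name=learner
 dd if=/dev/zero of=/mnt/cephfs/testfile bs=1M count=100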
Hardware available
The following is at Aaron's home data center, currently unused:
- 6x Dell R630 servers, each with:
  - 2x Intel Xeon E5-2630 v4 processors
  - 64 GiB of RAM
  - 2x 800 GB SAS SSDs (for OS, applications, and Ceph BlueStore WAL/DB devices; see the ceph-volume sketch after this list)
  - 4x 1.8 TB SAS HDDs (Ceph cluster data disks)
  - 1x 2-port Mellanox 56 Gbit/s Ethernet card (and sufficient switch ports and cabling for these)
  - 2x Intel 10GBASE-T twisted pair Ethernet ports (I do not have switch ports for 10GBASE-T, but have plenty of 1000BASE-T)
  - 2x Intel 1000BASE-T twisted pair Ethernet ports (and sufficient switch ports and cabling)
  - iDRAC remote console, power control, etc.
- 3x Dell R720XD servers, each with:
  - 2x (1x?) Intel Xeon E5-26?? processors
  - 32 GiB of RAM
  - 2x 120 GB SATA SSDs (rear-mounted) for the OS and applications
  - 6 or 7x 3 TB SAS HDDs (Ceph cluster data disks)
  - 2x 400 GB SAS SSDs (Ceph BlueStore WAL/DB devices)
  - 2x 10 Gbit/s SFP+ (fiber optic) Ethernet adapters (and sufficient switch ports and cabling for these, but might need Intel-branded SFP+ transceivers)
  - 2x 1000BASE-T twisted pair Ethernet ports (and sufficient switch ports and cabling)
  - iDRAC remote console, power control, etc.
 
 
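A sketch of how the SSDs above could serve as BlueStore WAL/DB devices while the HDDs hold the data. Device names are placeholders; ceph-volume accepts a partition or logical volume for --block.db, and the WAL lives with the DB unless it is split out explicitly.

 # HDD as the OSD data device, an SSD partition for that OSD's RocksDB/WAL
 ceph-volume lvm create --data /dev/sdd --block.db /dev/sdb1
 # repeat per HDD, giving each OSD its own SSD partition or LV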
Potential cluster end states
- Option 1: A single cluster with all 9 servers as members. This probably makes the most sense with only 2 or 3 learners.
- Option 2: Two clusters. One mixed-hardware cluster of 3x R630 and 3x R720XD servers, and a second cluster of 3x R630 servers. Makes sense with 4-6 learners in groups of 2 or 3, and demonstrates Ceph's flexibility in building from non-identical servers (see the CRUSH weight sketch below).
- Option 3: Two clusters, but this time all nodes in each cluster are similar: 6x R630s in one cluster and 3x R720XDs in the second. Makes sense with 4-6 learners in groups of 2 or 3.
- Option 4: Three clusters, each with consistent hardware. Makes the most sense if there are enough learners to form a third group.
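For the mixed-hardware option, CRUSH weights default to each OSD's size in TiB, so the 1.8 TB and 3 TB disks automatically take proportionally different amounts of data. A quick sketch of commands to inspect (and, if desired, override) those weights; the OSD id and weight value are examples only:

 ceph osd df tree                      # per-OSD size, CRUSH weight, and current utilization
 ceph osd crush tree                   # the CRUSH hierarchy that placement follows
 ceph osd crush reweight osd.7 1.6     # example override of a single OSD's CRUSH weight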