Filesystem decision matrix: Ceph vs Gluster

Each option below is evaluated on: performance, products, features, support for the configuration, usable capacity, reliability (worst case), rebuild/resilver behaviour, project status, and monitoring/maintenance.

Option: ceph + dm-crypt + ZFS RAIDZ2 OSDs, flash journal (2-replication)
Notes: OSD count per chassis is completely tunable to match the CPUs, rather than one OSD per HDD. Reduced peak IOPS: 27 OSDs total vs 108 in the 3-replication option below.
Performance:
  1MB seq read (32 files): 1.7GB/s
  4KB random read (32 files) IOPS:
  1MB seq write (32 files): 630MB/s (flash write rate limited)
  4KB random write (32 files) IOPS:
Products: CephFS (kernel), RBD, object
Features: full data scrub, ZFS data protection/scrub, compression
Support for configuration: needs dm-crypt under ZFS
Usable capacity: 220TB
Reliability (worst case): <0.01% chance of data loss in 5 years (a rough estimate of this kind is sketched below)
Rebuild/resilver: single drive failure fully managed at the ZFS layer
Project: CephFS maturing, features converging
Monitor/maintenance: no interaction between dm-crypt and Ceph OSD management in nominal operation or an isolated drive failure
Recommend moving journals to NVMe (Intel P3700/Optane-class) flash for performance and flash-endurance reasons in the first year of operation.
Verdict: reduced performance, but reliable data storage and reads.
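The "<0.01% in 5 years" figure above is the kind of number that falls out of a simple failure model. The Python sketch below shows one such back-of-the-envelope estimate; the drive AFR, resilver window, vdev width, and vdev count are illustrative assumptions, not the parameters actually used for this matrix.

from math import comb

# Illustrative, assumed parameters -- not the ones behind the table figure.
drive_afr     = 0.03   # annualized failure rate per HDD (assumed)
resilver_days = 1.0    # time to resilver one drive in a RAIDZ2 vdev (assumed)
vdev_width    = 10     # 8 data + 2 parity per RAIDZ2 vdev (assumed)
vdevs_total   = 9      # RAIDZ2 vdevs (= Ceph OSDs) across the cluster (assumed)
years         = 5

# Probability that a given surviving drive also fails during one resilver window.
p_window = drive_afr * resilver_days / 365.0

# A RAIDZ2 vdev is lost if >= 2 of its remaining drives fail before the
# resilver finishes (simplified: count only the two-additional-failure term).
p_vdev_loss_per_failure = comb(vdev_width - 1, 2) * p_window ** 2

# Expected number of initiating drive failures per vdev over the period.
initiating_failures = vdev_width * drive_afr * years

p_vdev_loss     = initiating_failures * p_vdev_loss_per_failure
p_any_vdev_loss = vdevs_total * p_vdev_loss

print(f"P(lose any RAIDZ2 vdev in {years} y) ~ {p_any_vdev_loss:.2e}")
# Ceph 2-replication sits on top of this, so actual data loss additionally
# requires the second replica's OSD to be lost before Ceph re-replicates,
# pushing the probability well below the quoted 0.01%.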
Option: ceph + dm-crypt + XFS (3-replication)
Performance:
  CephFS 1MB seq read (48 files): 3.6-1.8GB/s
  CephFS 1MB seq write (48 files): 1GB/s
  CephFS 4KB random write (single client): 1850 IOPS
  CephFS 4KB random read (single client): 3500 IOPS
Products: CephFS (kernel), RBD, object
Features: full data scrub
Support for configuration: fully supported at creation
Usable capacity: 198TB
Reliability (worst case): 2-drive failure before OSD migration = offline; simultaneous (before migration) single-drive failures on each DFS server = data loss; 3-drive failure = data loss; 1-drive failure combined with loss of a building = offline. Estimated probability of data loss in 5 years: 0.1-1%.
Rebuild/resilver: internal network used for rebuild; one drive's worth of data crosses the internal network per drive loss for OSD migration, plus additional traffic on rebuild (see the rebuild-time sketch below).
Project: CephFS maturing, features converging
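To give a feel for the "one drive of data over the internal network per drive loss" entry, here is a rough Python estimate of how long re-replication might take after an OSD loss. The drive size, link speed, and backfill efficiency are assumptions for illustration, not measurements from this cluster.

# Rough estimate of the time Ceph needs to restore 3x replication after
# losing one OSD, moving that OSD's data over the internal (cluster) network.
drive_capacity_tb   = 8.0    # data held by the failed OSD (assumed)
internal_net_gbps   = 10.0   # usable internal network bandwidth (assumed)
backfill_efficiency = 0.5    # fraction of the link usable for backfill (assumed)

bytes_to_move = drive_capacity_tb * 1e12
rate_bytes_s  = internal_net_gbps * 1e9 / 8 * backfill_efficiency

hours = bytes_to_move / rate_bytes_s / 3600
print(f"~{hours:.1f} h to restore 3x replication after losing one OSD")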
Option: gluster + ZFS RAIDZ2 + dm-crypt (2-replication + arbiter)
Performance:
  GlusterFS 1MB seq read (12 files): 2.2GB/s
  GlusterFS 1MB seq write (6 files): 1GB/s
Products: GlusterFS (FUSE), libgfapi (block)/iSCSI, Ganesha NFS
Features: full data scrub, compression, snapshots
Support for configuration: no native ZFS encryption; needs a custom solution under ZFS, with attendant races
Usable capacity: 216TB
Reliability (worst case): 3-drive failure on a single DFS server before resilver = loss of building; minimum 4-drive failure across the cluster before data loss
Rebuild/resilver: ZFS/PCIe traffic only for a pool resilver; a 3-drive failure (pool loss) = migration of the entire pool over the service network (compared in the sketch below)
Project: currently refactoring; substantial changes coming, including for small files (DHT2, Gluster 4)
Monitor/maintenance: ZED + smartd; single command for drive replacement
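The rebuild/resilver entry above is the main operational contrast for this option: a single-drive resilver stays on-box, while a pool loss forces a full heal over the service network. The Python sketch below puts rough numbers on that contrast; brick size, drive size, and network speed are assumed values, not measurements from this cluster.

# Why a whole-pool loss hurts much more than a single-drive resilver
# on the gluster+ZFS option. All figures are assumptions for illustration.
drive_tb         = 10.0    # one drive's worth of data (assumed)
brick_tb         = 200.0   # data held by one ZFS pool / gluster brick (assumed)
service_net_gbps = 10.0    # service-network bandwidth available for heal (assumed)

# Single-drive failure: RAIDZ2 resilvers from local parity; traffic stays
# on the server's own ZFS/PCIe path, nothing crosses the network.
local_resilver_tb = drive_tb

# 3-drive failure (pool lost): gluster must re-heal the entire brick from
# the surviving replica over the service network.
heal_seconds = brick_tb * 1e12 / (service_net_gbps * 1e9 / 8)

print(f"local resilver moves ~{local_resilver_tb:.0f} TB on-box")
print(f"full-brick heal moves ~{brick_tb:.0f} TB over the service net "
      f"(~{heal_seconds / 86400:.1f} days at {service_net_gbps:.0f} Gb/s)")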
Option: ceph + dm-crypt + md 3+1 RAID5 OSDs, flash journal (2-replication)
Notes: better match of OSD count per chassis to the CPUs than one OSD per HDD. Reduced peak IOPS: 27 OSDs total vs 108 in the 3-replication option above.
Performance:
  1MB seq read (32 files): 2.7GB/s
  4KB random read (32 files) IOPS: 6100
  1MB seq write (32 files): 630MB/s (flash write rate limited)
  4KB random write (32 files) IOPS: 2000
Products: CephFS (kernel), RBD, object
Features: full data scrub
Support for configuration: needs an md layer under the Ceph-managed dm-crypt/XFS
Usable capacity: 222TB
Reliability (worst case): <0.01% chance of data loss in 5 years
Rebuild/resilver: single drive failure fully managed at the md layer
Project: CephFS maturing, features converging
Monitor/maintenance: no interaction between dm-crypt and Ceph OSD management in nominal operation or an isolated drive failure
Recommend moving journals to NVMe Intel P3700-class flash for performance and flash-endurance reasons in the first year of operation.
Verdict: not an acceptable option. md RAID5 has no data checksumming, so on-disk corruption is transmitted to Ceph users; Ceph scrubs detect the corruption but cannot reliably repair it (illustrated below).
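The verdict above rests on the fact that single-parity md RAID5 cannot tell which member of a stripe is bad, and never checks parity on ordinary reads. The toy Python example below illustrates the point; the chunk contents are made up for the demonstration and do not come from any real array.

# Toy illustration: parity reveals *that* a stripe is inconsistent, but not
# *which* device is wrong, and a normal read never consults parity at all.
from functools import reduce

def xor(blocks):
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data   = [b"AAAA", b"BBBB", b"CCCC"]   # 3 data chunks of a 3+1 RAID5 stripe
parity = xor(data)                     # parity chunk

# Silent corruption of one data chunk (bit rot, bad cable, firmware bug...).
data[1] = b"BxBB"

# A normal read just returns the chunk: the corruption reaches the user
# (and, stacked under Ceph, reaches CephFS clients).
print("read of chunk 1:", data[1])

# A scrub notices the stripe is inconsistent...
print("stripe consistent?", xor(data) == parity)

# ...but any one of the four members could be the bad one; with no per-block
# checksum there is no way to tell, so "repair" typically just recomputes
# parity from the (corrupt) data. ZFS and Ceph replication avoid this by
# checksumming each copy, which identifies the bad replica.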
-- LoriPaniak - 2016-11-01