Meeting: 2016-06-17 11:00

Attended: ldpaniak (project manager), a2brenna, cscflab, gxshen, hchotara

Objectives:

  • Decide on details of DFS configuration, software and hardware.
  • Learn more about existing network filesystems in use in CS and Math.

Already in progress:

  • high-speed dedicated storage network (Devon/Dan/Lori)
  • Options for active-active NFS service from DFS (cscflab-Nathan)
  • (Re-)Build of Ceph cluster with latest version on Ubuntu 16.04 (a2brenna)

To be determined:

  • Expected typical and maximum load of the FileShare/OwnCloud gateway in production: how much hardware is needed for an active/passive configuration, and are containers acceptable?
  • File-size distribution statistics for current CS NetApp usage (gxshen). Reportedly there are many small files in the student environment.
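
One way the file-size statistics could be gathered is sketched below. It is a minimal example rather than an agreed method: it walks a directory tree and counts regular files in power-of-two size buckets. The function names and the idea of running it against a mounted export are assumptions for illustration.

```python
import os
import stat
from collections import Counter


def size_bucket(size: int) -> int:
    """Return b such that 2**(b-1) <= size < 2**b; empty files map to 0."""
    return size.bit_length()


def size_histogram(root: str) -> Counter:
    """Count regular files under root, grouped by power-of-two size bucket."""
    hist = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                hist[size_bucket(st.st_size)] += 1
    return hist


def print_histogram(hist: Counter) -> None:
    """Print one line per bucket with the exclusive upper size bound in bytes."""
    for bucket in sorted(hist):
        upper = 2 ** bucket
        print(f"< {upper:>14} bytes: {hist[bucket]}")
```

Calling `print_histogram(size_histogram("/path/to/export"))` on a mounted share (path hypothetical) would show how heavily the population is weighted toward small files.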

Discussion:

  • Review of Math DFS work with hchotara. Some investigation into Ceph.
  • Latest Math NetApp purchase: 2 controller heads + 24xSSD read accelerator ~$186k.
  • Math is not currently using Kerberized NFS. Using SMB/CIFS.
  • One past limitation of the NetApp was file indexing at logon for Apple clients. The DFS here should be able to handle ~30 workers doing random reads at reasonable performance.
  • Dedup saves 30% capacity for CS environment.
  • Discussion of Ceph configuration on current test cluster:
    • Three nodes, each running an OSD (backing storage presented as a single volume from the RAID controller), an MDS and a monitor daemon.
    • The placement map specified in the configuration sets the redundancy characteristics for a Ceph pool (~filesystem). This produces a CRUSH map that governs the distribution of data across OSDs. Different pools can have different redundancy characteristics. It is unclear whether the distribution of data can change autonomously in unexpected ways.
    • Ceph characteristics: no compression, no dedup; snapshots are supported.
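
On the open question of whether data distribution can shift unexpectedly, the toy sketch below may help frame the discussion. It is NOT Ceph's CRUSH algorithm; it uses rendezvous (highest-random-weight) hashing as a stand-in to show the general idea of hash-based placement: where an object lands is a pure function of the object name and the set of storage targets, so placement changes only when that set changes. The OSD names and function names are illustrative assumptions.

```python
import hashlib


def score(obj: str, osd: str) -> int:
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{obj}:{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(obj: str, osds: list[str], replicas: int) -> list[str]:
    """Pick `replicas` distinct OSDs for obj: the ones with the highest scores.

    Stand-in for CRUSH (rendezvous hashing), for illustration only.
    """
    if replicas > len(osds):
        raise ValueError("not enough OSDs for the requested replica count")
    ranked = sorted(osds, key=lambda osd: score(obj, osd), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    osds = ["osd.0", "osd.1", "osd.2"]  # one OSD per node, as on the test cluster
    for name in ["home/alice", "home/bob", "scratch/job42"]:
        # Two replicas here is an arbitrary choice for the demonstration.
        print(name, "->", place(name, osds, replicas=2))
```

In this scheme, removing one target only relocates replicas that lived on it; objects that never used that target keep their placement. Whether and how closely real CRUSH behaviour matches this would need to be confirmed on the rebuilt cluster.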

Decisions:

  • Gluster and Ceph options for the software layer here will use the same backing hardware. Proceed with quote procurement for the following spec:

    3x large storage servers, each:
      SSG-6048R-E1CR36L
      2x Intel E5-2640v4 CPU
      8x 32GB DDR4 ECC RDIMM
      2x Intel S3510 240GB SSD
      MCP-220-82609-0N rear drive kit
      36x 6TB SAS2 7200RPM 3.5" HDD (e.g. Seagate ST6000NM0034)
      Mellanox dual-port 50GbE x16 MCX416A-GCAT
      2x 2m 50GbE QSFP28 cables MCP1600-C002
      3yr depot warranty (5yr option)

    Spares (whichever is less expensive):
      SSG-6048R-E1CR36L barebone chassis
      or
      2x PSU
      1x motherboard
      1x front drive backplane
      1x rear drive backplane

    Additionally:
      2x sticks of RAM as above
      4x HDD as above
  • The block device model is layered: DFS block product, then LUKS at the service, then the filesystem at the service. a2brenna has demonstrated resizing (grow) at each level for RBD/LUKS/BTRFS.
  • All end-user services will be mediated by systems/containers/VMs attached to the DFS 40GbE ring network. No direct end-user access to DFS products will be provided.
  • Mediating services decouples service upgrades, maintenance, security and configuration from DFS core functionality. This modularity will be invaluable as the number and type of services using DFS products increases.
  • Mediating services allows for per-service level of data encryption and isolation.
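
The grow path through the RBD/LUKS/BTRFS stack can be sketched as an ordered list of commands, bottom layer first. The minutes do not record the exact commands used in the demonstration, so this is an assumed sequence built from each tool's standard resize subcommand; the pool, image, mapping name and mount point are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def grow_commands(pool: str, image: str, new_size_mb: int,
                  luks_name: str, mount_point: str) -> list[list[str]]:
    """Return the resize commands in the order they must run when growing.

    Each layer can only expand into space the layer below already provides,
    so the order is: block image, then encryption mapping, then filesystem.
    """
    return [
        # 1. Grow the RBD image (size given in megabytes).
        ["rbd", "resize", "--size", str(new_size_mb), f"{pool}/{image}"],
        # 2. Grow the LUKS mapping to fill the enlarged block device.
        ["cryptsetup", "resize", luks_name],
        # 3. Grow the BTRFS filesystem to fill the enlarged mapping.
        ["btrfs", "filesystem", "resize", "max", mount_point],
    ]


if __name__ == "__main__":
    # All names below are placeholders, not values from the test cluster.
    for cmd in grow_commands("dfs", "svc-volume", 204800, "svc-crypt", "/srv/data"):
        print(shlex.join(cmd))
```

Depending on how the image is attached, the client may also need to pick up the new device size before step 2; that detail is left out here.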

Next meeting: 2016-06-17-1100

Deliverables:

  • cscflab (Nathan Fish): Investigate options for a multi-server (parallel) NFS system that uses DFS products (block, GlusterFS, CephFS, etc.) to provide Kerberized NFS to Math/CS client systems.
  • a2brenna: Build the latest Ceph system on 16.04 with three nodes. Demonstrate cluster configuration at the command line, including features (e.g. add/remove OSD).
  • ldpaniak: Use Gluster 3.8 to investigate DFS features with NSR https://www.gluster.org/community/roadmap/4.0/

-- LoriPaniak - 2016-06-09

Topic revision: r1 - 2016-06-09 - LoriPaniak
 