Rethinking Storage System Design in Distributed NVRAM+RDMA Clusters

dc.contributor.advisor: Varman, Peter J. (en_US)
dc.creator: Liu, Qingyue (en_US)
dc.date.accessioned: 2020-12-10T17:31:40Z (en_US)
dc.date.available: 2021-12-01T06:01:11Z (en_US)
dc.date.created: 2020-12 (en_US)
dc.date.issued: 2020-12-03 (en_US)
dc.date.submitted: December 2020 (en_US)
dc.date.updated: 2020-12-10T17:31:40Z (en_US)
dc.description.abstract: Recent advances in hardware technologies raise new opportunities for architecting storage systems to exploit emerging NVRAM devices, fast remote-memory RDMA networking, and large numbers of processor cores. These technologies make it possible to build scalable, high-throughput data management systems with low latency and strong consistency guarantees. In this thesis, we investigate the design space for distributed storage systems based on these emerging technologies. The design focuses on three components: a novel high-performance and strongly consistent data access protocol, new communication abstractions, and QoS controls. We present Telepathy, a novel data access protocol for distributed key-value storage systems. Telepathy supports replicated data storage for fault tolerance and guarantees strong consistency while supporting high-volume concurrent read/write access. Our read protocol can perform (largely) silent consistent reads from any of the replica nodes holding an object, while our write protocol exploits remote atomics and non-volatile buffers to silently resolve write contention. For inter-server communication, we present a new distributed communication channel (DCC) that separates control and data communication directly at the RNIC. By using different RDMA semantics, our scheme avoids frequent remote processor interruption and improves latency, throughput, CPU utilization, and memory usage. For QoS control, we design a new algorithm to support QoS for applications using one-sided data access operations. A silent token dispatch mechanism informs storage nodes of the real-time throughput of connected clients and adaptively changes the token distribution to guarantee that clients meet their target reservations with small overhead.
Our experiments on an RDMA-enabled cluster using YCSB benchmarks show that our distributed key-value store can achieve microsecond-range reads and writes with small tail latencies, GBps-range data access bandwidth, low CPU utilization, and strong data consistency guarantees. The system also supports QoS reservations with only minor performance impact. (en_US)
dc.embargo.terms: 2021-12-01 (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Liu, Qingyue. "Rethinking Storage System Design in Distributed NVRAM+RDMA Clusters." (2020) Diss., Rice University. https://hdl.handle.net/1911/109637. (en_US)
dc.identifier.uri: https://hdl.handle.net/1911/109637 (en_US)
dc.language.iso: eng (en_US)
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. (en_US)
dc.subject: Distributed storage (en_US)
dc.subject: NVRAM (en_US)
dc.subject: RDMA (en_US)
dc.subject: QoS (en_US)
dc.title: Rethinking Storage System Design in Distributed NVRAM+RDMA Clusters (en_US)
dc.type: Thesis (en_US)
dc.type.material: Text (en_US)
thesis.degree.department: Electrical and Computer Engineering (en_US)
thesis.degree.discipline: Engineering (en_US)
thesis.degree.grantor: Rice University (en_US)
thesis.degree.level: Doctoral (en_US)
thesis.degree.name: Doctor of Philosophy (en_US)
Files
Original bundle
LIU-DOCUMENT-2020.pdf (5 MB, Adobe Portable Document Format)
License bundle
PROQUEST_LICENSE.txt (5.84 KB, Plain Text)
LICENSE.txt (2.6 KB, Plain Text)