Accountability for distributed systems

Haeberlen, Andreas

Accountability for distributed systems

Files

304987486.pdf (7.9 MB)

Date

2009

Authors

Haeberlen, Andreas

Abstract

Nodes in a distributed system can fail for many reasons, such as bugs, misconfigurations, hardware failures, intrusions, or insider attacks. Once a node has become faulty, its behavior can change arbitrarily. In benign cases, the node might simply stop; in less benign cases, it might actively try to subvert the rest of the system. A reliable distributed system must have a way to handle such faults. In this thesis, we explore a novel approach to this problem, which is based on accountability. In an accountable system, each node records its past actions in a tamper-evident log, and nodes inspect each other's log for signs of misbehavior. When nodes become faulty, the other nodes can eventually detect this, and they can obtain evidence that irrefutably links the fault to a faulty node. At the same time, correct nodes can always defend themselves against any false accusations. We characterize the class of faults that can be detected with our approach, and we show that it includes any fault that causally affects at least one correct node. We also present a set of techniques for enforcing accountability, including an algorithm for tamper-evident logs, and two techniques for detecting faults in the log: One relies on state machine replay to check a node's behavior against a reference implementation, while the other checks the logs against a declarative specification of the expected behavior. Each of these techniques can be applied to a wide range of distributed systems. To demonstrate that accountability is widely applicable, we have added it to several different types of systems, including a decentralized email system, a server-based file system, a peer-to-peer content distribution system, the Internet's interdomain routing system, and two multi-player games. In each case, accountability was able to detect a variety of problems that were previously reported in the literature. This shows that accountability is very general and can supersede a number of existing defenses. Our evaluation shows that accountability is practical, that its overhead is reasonable, and that it can scale to large numbers of nodes.

Advisor

Druschel, Peter

Degree

Doctor of Philosophy

Type

Thesis

Keywords

Computer science, Applied sciences, Accountability, Distributed systems, Fault detection, Reliability Security

Citation

Haeberlen, Andreas. "Accountability for distributed systems." (2009) Diss., Rice University. https://hdl.handle.net/1911/103683.

Rights

Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.

Citable link to this page

https://hdl.handle.net/1911/103683

Collections

Rice University Theses and Dissertations

Full item page