Fault Recovery in Theseus OS

Date
2021-02-10
Abstract

This work describes the implementation and evaluation of fault recovery in the Theseus Operating System (OS), a new OS developed from scratch. Theseus has a modular structure: it is a collection of tiny modules that minimize the state they hold for one another. Theseus is implemented in Rust, a safe programming language, and leverages the compiler to ensure type and memory safety, which in turn provides isolation among tasks. Fault recovery is essential in Theseus because, in the absence of hardware-provided isolation, a faulty task can potentially corrupt any OS structure. We implement a series of fault recovery mechanisms on Theseus that take increasingly drastic measures whenever recovery at the previous stage is unsuccessful. First, we fully unwind and restart faulty tasks. If the fault persists, we replace potentially corrupted modules by loading fresh copies of those modules from disk into a different location in memory. We evaluate Theseus’s ability to recover from faults by stress testing our fault recovery implementation in the presence of hardware faults. Furthermore, we show that Theseus can recover from faults occurring in core OS components, e.g., those that necessarily exist within a microkernel, which goes beyond the capabilities of existing works.
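
The staged escalation described in the abstract can be illustrated with a minimal Rust sketch. It is not taken from the thesis or from the Theseus code base: the names `RecoveryStage`, `Outcome`, `recover`, and the `max_restarts` parameter are hypothetical, and the actual work of unwinding a task or reloading a module is stood in for by a caller-supplied closure that reports whether a stage succeeded.

```rust
/// Recovery stages, ordered from least to most drastic.
/// (Hypothetical names, chosen only for this illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecoveryStage {
    /// Fully unwind the faulty task and restart it.
    RestartTask,
    /// Load fresh copies of potentially corrupted modules at a
    /// different memory location, then restart the task.
    ReplaceModules,
}

/// Result of running the escalation policy.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Recovery succeeded at `stage` after `attempts` total tries.
    Recovered { stage: RecoveryStage, attempts: u32 },
    /// Every stage was tried and none succeeded.
    Unrecoverable,
}

/// Tries `RestartTask` up to `max_restarts` times; if the fault
/// persists, escalates once to `ReplaceModules`.
///
/// `try_stage` is an assumed callback that performs the stage and
/// returns `true` if the task then runs without faulting.
pub fn recover<F>(max_restarts: u32, mut try_stage: F) -> Outcome
where
    F: FnMut(RecoveryStage) -> bool,
{
    let mut attempts = 0;

    // Stage 1: cheap recovery, repeated a bounded number of times.
    for _ in 0..max_restarts {
        attempts += 1;
        if try_stage(RecoveryStage::RestartTask) {
            return Outcome::Recovered {
                stage: RecoveryStage::RestartTask,
                attempts,
            };
        }
    }

    // Stage 2: the fault is treated as persistent, so escalate.
    attempts += 1;
    if try_stage(RecoveryStage::ReplaceModules) {
        return Outcome::Recovered {
            stage: RecoveryStage::ReplaceModules,
            attempts,
        };
    }

    Outcome::Unrecoverable
}
```

Calling `recover(2, |stage| stage == RecoveryStage::ReplaceModules)` models a persistent fault that only module replacement clears: both restarts fail and the third attempt succeeds at the more drastic stage. How many restarts to allow before escalating is a policy choice the abstract does not specify.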

Degree
Master of Science
Type
Thesis
Keywords
Theseus, fault recovery, Rust, OS
Citation

Godawatte Liyanage, Namitha. "Fault Recovery in Theseus OS." (2021) Master’s Thesis, Rice University. https://hdl.handle.net/1911/113886.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.