Title: Virtual Memory Management for Emerging Accelerators and Large-memory Applications
Author: Zhu, Weixi
Advisors: Rixner, Scott; Cox, Alan L.
Date: December 2022
Type: Thesis (doctoral dissertation, Rice University)
Citation: Zhu, Weixi. "Virtual Memory Management for Emerging Accelerators and Large-memory Applications." (2022) Diss., Rice University. https://hdl.handle.net/1911/114178
Keywords: Virtual memory management; Large-memory applications; Accelerators; FreeBSD; Nvidia GPU
Rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.

Abstract:

Today, the operating system (OS) is called upon to support a variety of applications that process large amounts of data using an ever-growing collection of specialized hardware accelerators. Nonetheless, current OSes still fail to (1) ease the development of drivers for new accelerators that need access to in-memory data and (2) provide efficient access to that data for both the CPU and accelerators. Applications need virtual memory abstractions to securely isolate data and hide the hardware details of the CPU and accelerators. Current OS memory management, however, is designed to manage the CPU's memory and cannot be used directly for many accelerators. The absence of better OS memory management support for devices burdens driver authors: they implement ad-hoc, specialized virtual memory management that reinvents many mechanisms already present in OS memory management. Unfortunately, the complexity of virtual memory management makes an efficient implementation difficult, so accelerator users may suffer poor performance. Furthermore, the continued growth of data-set sizes amplifies the performance impact of hardware limitations in both the CPU and accelerators. These limitations can be alleviated independently through optimizations to OS memory management and to each driver's ad-hoc memory management, but doing so independently makes these innovations even harder to share.

This thesis presents GMEM (generalized memory management), which refactors OS memory management to provide a high-level interface through which both the CPU and emerging accelerators share existing memory management mechanisms and innovative optimizations. For instance, the GMEM-based driver for a simulated device takes fewer than 100 lines of hardware-independent code to provide a virtual memory abstraction similar to that of Nvidia's GPU driver. Additionally, this thesis presents two innovative memory management optimizations, one for FreeBSD and one for Nvidia's GPU driver, in response to applications' ever-growing memory footprints. For example, the optimization for Nvidia's GPU driver enables a deep learning application to achieve 60% higher training throughput. These two optimizations are to be merged into mainstream FreeBSD and Nvidia's GPU driver, respectively; more importantly, they are shareable via GMEM.
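To make the abstract's central idea concrete, the following is a minimal, purely hypothetical C sketch of the kind of high-level interface the abstract describes: a driver supplies only its hardware-dependent MMU hooks, while the hardware-independent virtual memory logic lives in a shared manager. Every name below (gm_mmu_ops, gm_map, the toy driver) is invented for illustration; this is not the interface defined in the thesis.

    /* Illustrative sketch only; all names are hypothetical, not from
     * the thesis.  A driver registers its hardware-dependent MMU
     * hooks, and the shared manager performs the hardware-independent
     * checks before delegating to them. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stddef.h>

    /* Hardware-dependent hooks a driver would supply (hypothetical). */
    struct gm_mmu_ops {
        const char *name;
        int  (*map)(uint64_t va, uint64_t pa, size_t len);
        int  (*unmap)(uint64_t va, size_t len);
        void (*tlb_invalidate)(uint64_t va, size_t len);
    };

    /* Shared, hardware-independent path (hypothetical): validate the
     * request once for every device, then call the device's hook. */
    static int gm_map(const struct gm_mmu_ops *ops,
                      uint64_t va, uint64_t pa, size_t len)
    {
        if (len == 0 || ((va | pa | (uint64_t)len) & 0xfff) != 0)
            return -1;              /* require 4 KiB alignment */
        return ops->map(va, pa, len);
    }

    /* A toy "driver": its hooks only program (here: print) the
     * device's page table; the rest is inherited from the manager. */
    static int toy_map(uint64_t va, uint64_t pa, size_t len)
    {
        printf("toy: map va=%#llx -> pa=%#llx len=%zu\n",
               (unsigned long long)va, (unsigned long long)pa, len);
        return 0;
    }
    static int  toy_unmap(uint64_t va, size_t len) { (void)va; (void)len; return 0; }
    static void toy_inval(uint64_t va, size_t len) { (void)va; (void)len; }

    static const struct gm_mmu_ops toy_ops = {
        .name = "toy-accel",
        .map = toy_map,
        .unmap = toy_unmap,
        .tlb_invalidate = toy_inval,
    };

    int main(void)
    {
        /* The driver's hardware-independent footprint is just the
         * hook table plus this registration-style call. */
        return gm_map(&toy_ops, 0x100000, 0x200000, 4096);
    }

The point of the sketch is the division of labor: the per-device code shrinks to a table of hooks, which is consistent with the abstract's claim that a GMEM-based driver needs under 100 lines of hardware-independent code.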