Effective Techniques for Managing Intermediate-Sized Superpages
dc.contributor.advisor | Cox, Alan L | en_US |
dc.creator | Solomon, Eliot Hutton | en_US |
dc.date.accessioned | 2024-08-30T18:38:23Z | en_US |
dc.date.created | 2024-08 | en_US |
dc.date.issued | 2024-08-09 | en_US |
dc.date.submitted | August 2024 | en_US |
dc.date.updated | 2024-08-30T18:38:23Z | en_US |
dc.description | EMBARGO NOTE: This item is embargoed until 2026-08-01 | en_US |
dc.description.abstract | Translation lookaside buffers (TLBs) are pieces of hardware that cache the results of expensive address translations, improving the performance of the virtual memory system. Design constraints make it impossible for TLBs to store more than a few thousand entries, so "superpages" allow the operating system to instruct the TLB to cache a larger block of memory using a single entry. For small, frequently used memory objects like files and shared libraries, it can be difficult for the operating system to weigh the memory fragmentation induced by creating a 2 MB superpage against the performance benefits that doing so provides. Because of this, we investigate emerging hardware support for smaller "intermediate-sized" superpages. The first phase of our work explores PTE Coalescing, a feature of AMD Ryzen processors that transparently forms 16 KB or 32 KB superpages from aligned and contiguous groups of 4 KB base pages. We develop a custom microbenchmark to infer details of PTE Coalescing's hardware implementation. We then determine that the contiguity generated by the Linux and FreeBSD physical memory allocators is insufficient to enable much coalescing and that reservation-based allocation is a good technique for generating additional contiguity to enhance PTE Coalescing. In the second phase of our work, we introduce the first production system capable of simultaneously managing two superpage sizes for file-backed and anonymous mappings. We do so by implementing support in the FreeBSD kernel for non-transparent 64 KB superpages on the ARM architecture using its Contiguous bit feature. We observe a 13.83% improvement in an exec() microbenchmark, a 6.83% boost in Node.js rendering performance, and an 11.18% speedup in a compilation-centric workload. More aggressive superpage promotion policies can further increase the performance benefits; with the right policy, the speedup for the compilation-heavy workload rises to 15.67%. | en_US |
dc.embargo.lift | 2026-08-01 | en_US |
dc.embargo.terms | 2026-08-01 | en_US |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.citation | Solomon, Eliot Hutton. Effective Techniques for Managing Intermediate-Sized Superpages. (2024). Masters thesis, Rice University. https://hdl.handle.net/1911/117838 | en_US |
dc.identifier.uri | https://hdl.handle.net/1911/117838 | en_US |
dc.language.iso | eng | en_US |
dc.rights | Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. | en_US |
dc.subject | computer systems | en_US |
dc.subject | virtual memory | en_US |
dc.subject | address translation | en_US |
dc.subject | TLB | en_US |
dc.subject | translation lookaside buffer | en_US |
dc.subject | huge page | en_US |
dc.subject | superpage | en_US |
dc.subject | FreeBSD | en_US |
dc.title | Effective Techniques for Managing Intermediate-Sized Superpages | en_US |
dc.type | Thesis | en_US |
dc.type.material | Text | en_US |
thesis.degree.department | Computer Science | en_US |
thesis.degree.discipline | Engineering | en_US |
thesis.degree.grantor | Rice University | en_US |
thesis.degree.level | Masters | en_US |
thesis.degree.name | Master of Science | en_US |
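
The abstract describes PTE Coalescing as forming 16 KB or 32 KB superpages from aligned and contiguous groups of 4 KB base pages. The following is a minimal illustrative sketch of that eligibility condition, not the thesis's microbenchmark or the hardware's actual logic. The function name `coalescible_groups`, the list-of-frame-numbers representation, and the requirement that the physical start also be aligned to the group size are assumptions made for illustration.

```python
BASE_PAGE = 4 * 1024  # 4 KB base page size, as stated in the abstract


def coalescible_groups(frames, group_bytes=16 * 1024):
    """Return starting virtual page indices of groups that could coalesce.

    frames[i] is the physical frame number backing virtual page i, or None
    if that page is unmapped. Virtual page 0 is assumed to be aligned to
    group_bytes (hypothetical setup for this sketch).
    """
    n = group_bytes // BASE_PAGE  # pages per group: 4 for 16 KB, 8 for 32 KB
    starts = []
    for start in range(0, len(frames) - n + 1, n):
        group = frames[start:start + n]
        if any(f is None for f in group):
            continue  # an unmapped page rules out the whole group
        first = group[0]
        # Assumption: the physical start must be aligned to the group size.
        aligned = first % n == 0
        # Contiguity: each page's frame directly follows the previous one.
        contiguous = all(f == first + i for i, f in enumerate(group))
        if aligned and contiguous:
            starts.append(start)
    return starts


# Example: four 16 KB candidate groups, each with a different outcome.
frames = [8, 9, 10, 11,    # aligned and contiguous
          20, 21, 23, 22,  # aligned start, but out of order
          5, 6, 7, 8,      # contiguous, but physical start is misaligned
          16, 17, 18, 19]  # aligned and contiguous
print(coalescible_groups(frames))
```

Under these assumptions, only the first and last groups qualify. This illustrates why an allocator that hands out scattered or misaligned frames leaves little for the hardware to coalesce.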