Browsing by Author "Solomon, Eliot Hutton"
Now showing 1 - 1 of 1
Results Per Page
Sort Options
Item Embargo Effective Techniques for Managing Intermediate-Sized Superpages(2024-08-09) Solomon, Eliot Hutton; Cox, Alan LTranslation lookaside buffers (TLBs) are pieces of hardware that cache the results of expensive address translations, improving the performance of the virtual memory system. Design constraints make it impossible for TLBs to store more than a few thousand entries, so "superpages" allow the operating system to instruct the TLB to cache a larger block of memory using a single entry. For small, frequently used memory objects like files and shared libraries, it can be difficult for the operating system to appropriately trade off the memory fragmentation induced by creating a 2 MB superpage with the performance benefits that doing so provides. Because of this, we investigate emerging hardware support for smaller “intermediate-sized” superpages. The first phase of our work explores PTE Coalescing, a feature of AMD Ryzen processors that transparently forms 16 KB or 32 KB superpages from aligned and contiguous groups of 4 KB base pages. We develop a custom microbenchmark to infer details of PTE Coalescing’s hardware implementation. We then determine that the contiguity generated by the Linux and FreeBSD physical memory allocators is insufficient to enable much coalescing and that reservation-based allocation is a good technique for generating additional contiguity to enhance PTE Coalescing. In the second phase of our work, we introduce the first production system capable of simultaneously managing two superpage sizes for file-backed and anonymous mappings by implementing support in the FreeBSD kernel for non-transparent 64 KB superpages on the ARM architecture using the latter’s Contiguous bit feature. We observe a 13.83% improvement in an exec() microbenchmark, a 6.83% boost in Node.js rendering performance, and a 11.18% speedup in a compilation-centric workload. More aggressive superpage promotion policies can further increase the performance benefits; we can boost the speedup to 15.67% using the right policy for the compilation-heavy workload.