Optimizing Compiler Heuristics with Machine Learning
Date
2024
Authors
Grubisic, Dejan
Abstract
Compiler technology is crucial for the performance and efficiency of modern software. The complexity of new computer architectures, the ever-evolving software landscape, and the ever-growing scale of computation have made manual tuning of optimization heuristics increasingly difficult and time-consuming. Machine learning (ML) addresses this by recognizing intricate patterns and automatically tailoring code generation and optimization strategies to specific hardware, significantly improving program performance. This thesis demonstrates these ideas through four contributions.
First, we showcase the use of reinforcement learning to optimize tensor computations with LoopTune. LoopTune optimizes the tensor traversal order while using the ultra-fast, lightweight code generator LoopNest to perform hardware-specific optimizations. With a novel graph-based representation and action space, LoopTune speeds up LoopNest by 3.2x, generating code an order of magnitude faster than TVM, 2.8x faster than MetaSchedule, and 1.08x faster than AutoTVM, consistently performing at the level of the hand-tuned library NumPy. A toy sketch of this search structure follows.
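The search structure can be pictured with the toy Python sketch below. It is not the LoopTune or LoopNest API: the state is the traversal order of the loop nest for C[i][j] += A[i][k] * B[k][j], an action swaps two adjacent loops, and the reward is measured performance. measure_gflops() returns made-up numbers standing in for LoopNest compiling the loop nest and timing it on real hardware, and the greedy loop stands in for the learned reinforcement-learning policy.

# Toy sketch of loop-order search (hypothetical; not LoopTune/LoopNest code).
def measure_gflops(order):
    # Stand-in for generating code with the given traversal order and timing it.
    simulated = {"ijk": 9.0, "ikj": 14.0, "jik": 8.5,
                 "jki": 3.0, "kij": 11.0, "kji": 2.5}   # illustrative values only
    return simulated["".join(order)]

def swap_adjacent(order, pos):
    # Action: swap two adjacent loops in the traversal order.
    o = list(order)
    o[pos], o[pos + 1] = o[pos + 1], o[pos]
    return tuple(o)

def greedy_search(order, steps=10):
    # Greedy stand-in for the RL policy: take the adjacent swap with the best reward.
    best = measure_gflops(order)
    for _ in range(steps):
        candidates = [swap_adjacent(order, p) for p in range(len(order) - 1)]
        nxt = max(candidates, key=measure_gflops)
        if measure_gflops(nxt) <= best:
            break                      # no adjacent swap improves the reward
        order, best = nxt, measure_gflops(nxt)
    return order, best

print(greedy_search(("j", "k", "i")))  # climbs from "jki" toward a faster order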
Second, we pioneer the use of large language models (LLMs) in compiler optimization. Our model generates optimizations in seconds, achieving a 3.0% improvement in reducing instruction counts over the compiler and outperforming two state-of-the-art baselines that require thousands of compilations. Moreover, the model shows surprisingly strong code-reasoning abilities, generating compilable code 91% of the time and perfectly emulating the compiler's output 70% of the time.
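The evaluation of such a model can be pictured with the hypothetical harness below (not the thesis's actual code): predict_passes() stands in for a single LLM inference over the unoptimized LLVM-IR, count_instructions() stands in for compiling with a given pass list and counting the resulting instructions, and the metric is the instruction-count reduction relative to the compiler's -Oz baseline. The pass names and numbers are illustrative only.

# Hypothetical evaluation sketch for the LLM pass-ordering model.
def predict_passes(ir):
    # Stand-in for one LLM inference over the input IR; choice is illustrative.
    return ["mem2reg", "simplifycfg", "instcombine"]

def count_instructions(ir, passes):
    # Stand-in for compiling `ir` with `passes` and counting instructions.
    return 120 if passes == ["Oz"] else 116            # illustrative numbers

def improvement_over_Oz(ir):
    baseline = count_instructions(ir, ["Oz"])
    model = count_instructions(ir, predict_passes(ir))
    return (baseline - model) / baseline               # fraction of instructions removed

print(f"{improvement_over_Oz('<llvm-ir module>'):.1%}")  # 3.3% on this toy input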
Third, we evaluate feedback-directed LLMs that use compiler feedback collected at inference time to improve the generated code. We evaluate three feedback formats with varying degrees of information, all of which outperform the original model, by 0.11%, 0.4%, and 0.53%, respectively. We further combine this approach with temperature-based sampling and iterative compilation. The sampling techniques show superior performance, reaching 98% of the autotuner's performance over the compiler given a budget of 100 samples. A sketch of both strategies follows.
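The two inference-time strategies can be sketched as below. This is a hypothetical illustration: generate(), compiler_feedback(), and instruction_count() are stand-ins for the LLM call and the compiler, and all numbers are synthetic.

# Hypothetical sketch of feedback-directed generation vs. temperature sampling.
import random

def generate(prompt, temperature, seed):
    # Stand-in for one LLM sample (e.g. a predicted pass list or optimized IR).
    return f"<candidate {seed} at T={temperature} for {len(prompt)}-char prompt>"

def instruction_count(candidate):
    # Stand-in for compiling the candidate and counting instructions.
    return 100 - random.Random(candidate).randint(0, 5)

def compiler_feedback(candidate):
    # Stand-in for diagnostics fed back into the prompt (errors, counts, mismatches).
    return f"feedback: instruction count = {instruction_count(candidate)}"

def feedback_directed(prompt, rounds=3):
    # Generate, compile, append the compiler feedback to the prompt, regenerate.
    best = generate(prompt, temperature=0.0, seed=0)
    for r in range(1, rounds):
        prompt += "\n" + compiler_feedback(best)
        cand = generate(prompt, temperature=0.0, seed=r)
        if instruction_count(cand) < instruction_count(best):
            best = cand
    return best

def temperature_sampling(prompt, budget=100):
    # Draw independent samples at non-zero temperature and keep the best
    # verified one (iterative compilation over the sample budget).
    samples = (generate(prompt, temperature=0.8, seed=s) for s in range(budget))
    return min(samples, key=instruction_count)

print(instruction_count(feedback_directed("<llvm-ir>")))
print(instruction_count(temperature_sampling("<llvm-ir>")))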
Fourth, we present Priority Sampling, a simple deterministic LLM sampling technique that produces unique samples ordered by the model's confidence. Priority Sampling outperforms Nucleus Sampling for any number of samples, reducing code size further than the original model and achieving a 5% reduction over -Oz versus the original model's 2.87%. Moreover, with just 30 samples it outperforms the autotuner that was used to generate the labels for training the original model.
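Priority Sampling can be pictured as a deterministic best-first expansion of the model's token tree: the first sample is the greedy decoding, and each later sample resumes greedy decoding from the most probable token that no earlier sample has explored, so samples come out unique and ordered by confidence. The toy sketch below illustrates this; next_token_probs() is a made-up stand-in for the LLM's next-token distribution, and details of the actual implementation are omitted.

# Simplified sketch of Priority Sampling over a toy next-token distribution.
import heapq, math

def next_token_probs(prefix):
    # Stand-in for the LLM's softmax over the vocabulary, keyed by prefix.
    table = {
        (): {"A": 0.6, "B": 0.3, "C": 0.1},
        ("A",): {"x": 0.7, "y": 0.3},
        ("B",): {"x": 0.5, "y": 0.5},
        ("C",): {"x": 0.9, "y": 0.1},
    }
    return table.get(prefix, {"<eos>": 1.0})

def greedy_complete(prefix, frontier, max_len=4):
    # Greedily extend `prefix`, pushing every unexplored alternative token
    # (keyed by its probability) onto the shared priority queue.
    seq = list(prefix)
    while len(seq) < max_len:
        probs = next_token_probs(tuple(seq))
        best = max(probs, key=probs.get)
        for tok, p in probs.items():
            if tok != best and tok != "<eos>":
                heapq.heappush(frontier, (-math.log(p), tuple(seq) + (tok,)))
        if best == "<eos>":
            break
        seq.append(best)
    return tuple(seq)

def priority_sampling(k):
    # Deterministic: sample 1 is the greedy decoding; later samples branch off
    # at the most confident unexplored token, so all samples are unique.
    frontier, samples = [], []
    samples.append(greedy_complete((), frontier))
    while frontier and len(samples) < k:
        _, prefix = heapq.heappop(frontier)
        samples.append(greedy_complete(prefix, frontier))
    return samples

print(priority_sampling(3))  # e.g. [('A', 'x'), ('A', 'y'), ('B', 'x')]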
Description
Advisor
Degree
Doctor of Philosophy
Type
Thesis
Keywords
Citation
Grubisic, Dejan. Optimizing Compiler Heuristics with Machine Learning. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116179