Corpus-Driven Systems for Program Synthesis and Refactoring

Date
2019-04-18
Abstract

Programming is a difficult task. Programmers must manage small details inside highly complex programs, and small mistakes are inevitable. To address this problem, techniques from software engineering and formal methods have been proposed to make programming easier. These techniques include software engineering methodologies, design patterns, sophisticated testing methods, program repair algorithms, model checking algorithms, and program synthesis methods. In this thesis, we propose two additional corpus-driven systems for program synthesis and refactoring.

We first introduce program splicing, a programming methodology that aims to automate the workflow of copying, pasting, and modifying code available online. Here, the programmer starts by writing a “draft” that mixes unfinished code, natural language comments, and correctness requirements. A program synthesizer that interacts with a large, searchable database of program snippets then automatically completes the draft into a program that meets the requirements. Our evaluation applies the system to a suite of everyday programming tasks and includes a comparison with a state-of-the-art competing approach as well as a user study. The results point to the broad scope and scalability of program splicing and indicate that the approach can significantly boost programmer productivity.

Next, we propose an algorithm that automates API refactoring, where the goal is to rewrite an API call sequence into another sequence that uses only the API calls defined in the target library, without changing its functionality. We solve this problem by combining API translation with API sequence synthesis. We evaluated the algorithm on a diverse set of benchmark problems, where it refactors API sequences with high accuracy. In addition, we conducted a user study, which indicates that the algorithm can help human developers with API refactoring.

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Program Synthesis, Big Data, Machine Learning
Citation

Lu, Yanxin. "Corpus-Driven Systems for Program Synthesis and Refactoring." (2019) Diss., Rice University. https://hdl.handle.net/1911/105342.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.