Simon Larsén

Spork: Move-enabled structured merge for Java

Abstract

With software development becoming an increasingly parallel process utilizing the branching and merging features of decentralized version control systems to a high degree, the shortcomings of unstructured merge techniques are becoming more tangible. Structured merge, which typically operates on an abstract syntax tree instead of raw text, presents major improvements over unstructured merge in terms of avoiding unnecessary conflicts. There are however aspects of structured merge that can be improved, including slow execution, problems dealing with refactorings such as renamings and moves of code blocks, as well as poor preservation of source code formatting. In this thesis, we implement and evaluate Spork, a structured merge tool for the Java programming language. Spork is built using move-enabled diff and merge algorithms that can resolve refactoring-related conflicts automatically. The algorithms used are also highly efficient, making Spork scale well with larger merges. We also aim to better preserve source code formatting by reusing the original source code from the different file revisions, as a tool that fundamentally alters source code formatting is not usable in practice. We conduct an experimental evaluation comparing Spork to the state of the art structured merge tool JDime on 890 merge scenarios from 119 different open source Java projects, comprising a total of 1740 file merges. We find that Spork and JDime are equally fast in the median case, with median runtimes of 1.45 and 1.48 seconds per file merge, respectively. Below the median, Spork is slightly slower than JDime, but above the median it is much faster, indicating that Spork scales better with larger merges. Additionally, Spork produces fewer but slightly larger conflicts, mostly due to known issues. We also find that there is no statistically significant difference between the tools' abilities to semantically reproduce merges as committed by the developers, indicating that Spork and JDime are on par when it comes to correctness. We also show that Spork's move-enabled algorithms allows it to automatically resolve merges with renamings that cause conflicts for JDime. Put together, these results constitute strong evidence that move-enabled merge is feasible for use in a structured merge tool for the Java programming language. We also find that Spork's reuse of source code makes it more than 4 times better than JDime at preserving formatting in the median case. Although the results are promising, there are unresolved issues with conflicts related to moves and deletes that reduce Spork's performance and reliability, which necessitates further work.