Guiding Greybox Fuzzing with Mutation Testing. Vasudev Vikram, Isabella Laybourn, Ao Li, Nicole Nair, Kelton O'Brien, Rafaello Sanna, and Rohan Padhye. [PDF|ACM dl] Distinguished paper award!
The Surprising Efficacy of Linear Probabilistic Model Checking. Kelton O'Brien and Markus Kuppe. Should be ready for submission sometime this winter/spring of 2024; a preview of the content is below.
This past summer I worked at Microsoft Research on TLA+, a temporal logic for specifying, verifying, and model checking algorithms. There I extended TLA+ to reason about probabilities and about particular lengths of time. In TLA+, one can generally only reason in absolutes: either something occurs or it does not. There is no notion of probability and no way to talk about how long something takes, so reasoning about probabilistic programs, or even directly about the performance of a system, is impossible (how do you deal with message loss, or the likelihood of particular operations in real-world usage?). My contribution was a new temporal logic based on TLA+, extended with new temporal and probabilistic operators for reasoning about and verifying exactly these kinds of properties. I also wrote a transpiler from this new logic to TLA+, which uses a novel algorithm to determine the number of runs required for statistical model checking. This turns out to be quite difficult due to some properties of TLA+ (mainly its linearity), but with some clever optimizations and workarounds it still ends up being a powerful reasoning and verification tool. One thing that's special about TLA+ is its usability: Markus Kuppe teaches a TLA+ workshop that takes only a couple of days and covers pretty much everything you need to know. Because most of the complexity of verification is forwarded to the model checker, the person writing a specification can focus on just the properties they want to verify. My extensions preserve this (I would say): I add only two new operators to the language, plus a couple of rules specifying how they can interact. I'm currently working on a paper on this work with Microsoft Research.
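The run-count question above has a standard flavor in statistical model checking: to estimate the probability of a property within an additive error ε at confidence 1 − δ, the Okamoto (Chernoff–Hoeffding) bound gives a sufficient number of independent runs. Here is a minimal sketch of that textbook bound — not the transpiler's actual algorithm, which is not reproduced here:

```python
import math

def required_runs(epsilon: float, delta: float) -> int:
    """Okamoto / Chernoff-Hoeffding bound: number of i.i.d. runs sufficient
    to estimate a probability within +/- epsilon with confidence 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

# Estimating within 1% error at 99% confidence:
print(required_runs(0.01, 0.01))  # → 26492
```

Note the quadratic blow-up in 1/ε: tightening the error bound from 5% to 1% multiplies the number of runs by 25, which is why minimizing the required run count matters in practice.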
I am currently an undergraduate student researcher in the MELT research group, where I work on the Silver attribute grammar system. I am developing an extension to the way that parsing and lexing are handled to increase polymorphism and ease of use, by making it possible to have polymorphic concrete syntax. That is, allowing, e.g., a single comma-separated-list nonterminal parameterized by its element type, rather than separate comma-separated lists for expressions, arguments, and so on. More details to come as the project develops!
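The payoff of a parameterized nonterminal is the same as that of a generic function: one definition, many instantiations. As a loose analogy (Silver's actual concrete-syntax notation is quite different), here is a parser-combinator sketch in Python where a single comma-separated-list combinator is parameterized by its element parser:

```python
def comma_list(parse_elem):
    """Return a parser for a comma-separated list of whatever parse_elem
    parses -- one definition instead of one list rule per element type."""
    def parse(tokens):
        items, rest = [], list(tokens)
        item, rest = parse_elem(rest)          # first element is mandatory
        items.append(item)
        while rest and rest[0] == ",":         # then zero or more ", elem"
            item, rest = parse_elem(rest[1:])
            items.append(item)
        return items, rest
    return parse

# One generic definition, instantiated at two "element types":
parse_int = lambda ts: (int(ts[0]), ts[1:])
parse_name = lambda ts: (ts[0], ts[1:])

ints, _ = comma_list(parse_int)(["1", ",", "2", ",", "3"])   # [1, 2, 3]
names, _ = comma_list(parse_name)(["x", ",", "y"])           # ["x", "y"]
```

Doing the analogous thing in an LALR grammar is harder than in combinators, since each instantiation must still yield a valid, conflict-free parse table — which is part of what makes polymorphic concrete syntax an interesting problem.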
In the summer of 2022 I did an REU at Carnegie Mellon, where I worked on Mu2 with Rohan Padhye. Mu2 is a mutation-analysis-based fuzzer designed to generate test cases. Fuzzers work by generating lots of inputs to a program, mutating old inputs to produce new ones, and saving inputs that increase code coverage. Fuzzers have traditionally been used for finding bugs, but a set of inputs with high code coverage also serves as a good test suite! Along this line, Mu2 improves fuzzers as test-suite generators by also tracking mutation score, a metric used for assessing test quality. Computing it is very expensive, requiring the program to be run hundreds of times instead of just once, so my work involved optimizing Mu2 with dynamic analysis. You can check out the paper covering Mu2 above.
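The coverage-guided loop described above fits in a few lines. This is a toy sketch, not Mu2 itself (Mu2 is a Java tool); `program` is an assumed stand-in that runs the target and returns the set of coverage points the input hits:

```python
import random

def mutate(data: bytes) -> bytes:
    """Toy mutator: flip one random bit of one random byte."""
    if not data:
        return bytes([random.randrange(256)])
    i = random.randrange(len(data))
    return data[:i] + bytes([data[i] ^ (1 << random.randrange(8))]) + data[i + 1:]

def fuzz(program, seeds, iterations=10_000):
    """Minimal coverage-guided greybox fuzzing loop: mutate corpus members
    and keep any child input that reaches new coverage."""
    corpus = list(seeds)
    seen = set()
    for data in corpus:                 # baseline coverage from the seeds
        seen |= program(data)
    for _ in range(iterations):
        child = mutate(random.choice(corpus))
        cov = program(child)
        if not cov <= seen:             # new coverage -> save the input
            seen |= cov
            corpus.append(child)
    return corpus
```

Mutation score replaces (or augments) the `cov <= seen` test with "does this input kill a mutant no saved input kills?" — which is exactly why each candidate input must be re-run against many mutated versions of the program, and why Mu2 needed the dynamic-analysis optimizations mentioned above.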
Also at MELT, I worked on the Copper parser generator. My work consisted of implementing the paper Finding Counterexamples from Parsing Conflicts, by Isradisaikul and Myers, in Copper. I worked on this project very early in my research career (when I was 18), so it was my introduction to parsing and to the research process in general. It was a large project that touched much of the LALR parsing apparatus and significantly improved the usability of the parser and of all tools that use it. Copper is used in the PL course at UMN, and these changes let students focus on the important ideas rather than on the details of the LALR parsing algorithm.