Carter Yagemann

Assistant Professor of Computer Science and Engineering at the Ohio State University with interests in automated vulnerability discovery, root cause analysis, exploit prevention, and cyber-physical security.

Bunkerbuster to Appear in CCS'21


My coauthors and I will be presenting the paper, Automated Bug Hunting With Data-Driven Symbolic Root Cause Analysis, at CCS 2021. Below is a preview of the abstract:

The increasing cost of successful cyberattacks has caused a mindset shift, whereby defenders now employ proactive defenses, namely software bug hunting, alongside existing reactive measures (firewalls, IDS, IPS) to protect systems. Unfortunately the path from hunting bugs to deploying patches remains laborious and expensive, requires human expertise, and still misses serious memory corruptions. Motivated by these challenges, we propose bug hunting using symbolically reconstructed states based on execution traces to achieve better detection and root cause analysis of overflow, use-after-free, double free, and format string bugs across user programs and their imported libraries. We discover that with the right use of widely available hardware processor tracing and partial memory snapshots, powerful symbolic analysis can be used on real-world programs while managing path explosion. Better yet, data can be captured from production deployments of live software on end-host systems transparently, aiding in the analysis of user clients and long-running programs like web servers.

We implement a prototype of our design, Bunkerbuster, for Linux and evaluate it on 15 programs, where it finds 39 instances of our target bug classes, 8 of which have never before been reported and have lead to 1 EDB and 3 CVE IDs being issued. These 0-days were patched by developers using Bunkerbuster’s reports, independently validating their usefulness. In a side-by-side comparison, our system uncovers 8 bugs missed by AFL and QSYM, and correctly classifies 4 that were previously detected, but mislabeled by AddressSanitizer. Our prototype accomplishes this with 7.21% recording overhead.

The code and data for this project will be made available on my public Github account.