Barnum is an offline control flow attack detection system that applies deep learning to hardware execution traces to model a program's behavior and detect control flow anomalies. Our implementation analyzes document readers to detect exploits and ABI abuse. Recent work has proposed using deep-learning-based control flow classification to build more robust and scalable detection systems. These proposals, however, were not evaluated against different kinds of control flow attacks, programs, and adversarial perturbations.
We investigate anomaly detection approaches to improve the security coverage and scalability of control flow attack detection. Barnum is an end-to-end system consisting of three major components: 1) trace collection, 2) behavior modeling, and 3) anomaly detection via binary classification. It uses Intel® Processor Trace for low-overhead execution tracing and applies deep learning to the basic block sequences reconstructed from the trace to train a model of normal program behavior. Based on the model's path prediction accuracy, Barnum then determines a decision boundary to classify executions as benign or malicious.
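The classification step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Barnum's code: a frequency table stands in for the deep learning model, and the function names (`train_model`, `prediction_accuracy`, `pick_threshold`, `is_anomalous`), the `window` and `margin` parameters, and the toy block IDs are all hypothetical.

```python
from collections import Counter, defaultdict


def train_model(benign_traces, window=2):
    """Count which basic block follows each window of preceding blocks.

    Assumption: a frequency table stands in for the neural network.
    """
    table = defaultdict(Counter)
    for trace in benign_traces:
        for i in range(window, len(trace)):
            table[tuple(trace[i - window:i])][trace[i]] += 1
    return table


def prediction_accuracy(table, trace, window=2):
    """Fraction of blocks in `trace` that the model predicts correctly."""
    total = len(trace) - window
    if total <= 0:
        return 0.0
    hits = 0
    for i in range(window, len(trace)):
        followers = table.get(tuple(trace[i - window:i]))
        # The prediction is the most frequently seen follower, if any.
        if followers and followers.most_common(1)[0][0] == trace[i]:
            hits += 1
    return hits / total


def pick_threshold(benign_accuracies, margin=0.05):
    """Assumed heuristic: put the boundary just below the worst benign score."""
    return min(benign_accuracies) - margin


def is_anomalous(accuracy, threshold):
    """Flag a trace whose paths the model predicts poorly."""
    return accuracy < threshold


# Toy basic block IDs; real traces come from Intel PT reconstruction.
benign = [[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4], [1, 2, 3, 4, 1, 2, 3, 4]]
model = train_model(benign)
threshold = pick_threshold([prediction_accuracy(model, t) for t in benign])
suspect = [1, 2, 9, 9, 7, 2, 3, 8, 1, 5, 3, 4]
print(is_anomalous(prediction_accuracy(model, suspect), threshold))
```

The intuition the sketch tries to capture is that a model trained only on benign executions predicts benign paths well and unfamiliar paths poorly, so low prediction accuracy serves as the anomaly signal.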
We evaluate Barnum against 8 families of attacks on Adobe Acrobat Reader and 9 on Microsoft Word, both running on Windows 7. Both readers are complex programs with over 50 dynamically linked libraries, just-in-time compiled code, and frequent network I/O. Barnum achieves a 0% false positive rate and a 2.4% false negative rate on a dataset of 1,250 benign and 1,639 malicious PDFs. Barnum is also robust against evasion techniques: it successfully detects 500 adversarially perturbed PDFs.
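For readers reproducing the evaluation, the two error rates quoted above can be computed from ground-truth labels and detector verdicts as in the sketch below. The helper name `error_rates` and the toy inputs are hypothetical and are not Barnum's results.

```python
def error_rates(labels, predictions):
    """Return (false positive rate, false negative rate).

    labels: True if the sample is actually malicious.
    predictions: True if the detector flagged the sample.
    """
    fp = fn = benign = malicious = 0
    for is_malicious, flagged in zip(labels, predictions):
        if is_malicious:
            malicious += 1
            fn += not flagged  # malicious sample that was missed
        else:
            benign += 1
            fp += flagged  # benign sample that was flagged
    fpr = fp / benign if benign else 0.0
    fnr = fn / malicious if malicious else 0.0
    return fpr, fnr


# Made-up toy data: 4 benign samples, 5 malicious samples, one of them missed.
labels = [False] * 4 + [True] * 5
predictions = [False] * 4 + [True] * 4 + [False]
print(error_rates(labels, predictions))  # (0.0, 0.2)
```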
Source Code & Documentation
- Barnum Tracer - Collects PT traces from a KVM hypervisor.
- Barnum Learner - Processes the traces and builds models for classification.
- Barnum MLsploit - Module to integrate Barnum with the MLsploit framework.
Data & Models
Barnum can analyze the control flow of several kinds of programs for anomalies. The following links are to data and models for classifying PDF and Microsoft Word documents. These samples were traced on a Windows 7 virtual machine. The traced applications were Adobe Acrobat Reader 9.3 and Microsoft Word 2010.
Note: We cannot release the entire dataset used in the paper. As a result, users should expect a false negative rate closer to 6%, rather than the paper's 2.4%. We apologize for the inconvenience.
Unfortunately, we cannot release the original document malware used in our evaluation. Below are links to lists of hashes to help other researchers construct similar datasets.