Large-scale cyber attack traffic can present challenges to identify which packets are relevant and what attack behaviors are present. Existing works on Host or Flow Clustering attempt to group similar behaviors to expedite analysis, often phrasing the problem as offline unsuper-vised machine learning. This work proposes online processing to simultaneously segment traffic observables and generate attack behavior models that are relevant to a target. The goal is not just to aggregate similar attack behaviors, but to provide situational awareness by grouping relevant traffic that exhibits one or more behaviors around each asset. The seemingly clustering problem is recast as a supervised learning problem: classifying received traffic to the most likely attack model, and iteratively introducing new models to explain received traffic. A graph-based prior is defined to extract the macroscopic attack structure, which complements security-based features for classification. Malicious traffic captures from CAIDA are used to demonstrate the capability of the proposed attack segmentation and model generation (ASMG) process.
展开▼