Thursday, May 17

CyberLanguage: Applications of Natural Language Processing to CyberSecurity

Thu, May 17, 3:30 PM - 5:00 PM
Lake Fairfax A

Modeling Machine-to-Machine Cyber Data as Discrete Sequences of Activity (304363)

*Bartley Richardson, KeyW

Keywords: cyber language, Markov, sequence analysis, machine learning

As devices continue to gain network connectivity and require less interaction from human operators, the nature of network transmissions is shifting, and data is being created faster and with more fidelity than ever before. One way to view this data is in the context of a cyber language, analogous to a natural language but semantically and syntactically different from one. Machine-to-machine activity is represented as discrete sequences; after these sequences are constructed over a large dataset (e.g., petabytes of flow-like data), unsupervised machine learning and deep learning techniques are used to model communication, identify typical behavior, and flag unlikely events. This is accomplished by learning models that represent typical behavior (e.g., conversations or sequences) and then applying those models to assign a likelihood score to each new incoming sequence. One technique we use is variable-order hidden Markov models to learn machine-to-machine conversations. This work presents the context and foundations of this new approach to cyber anomaly detection, as well as the enabling analytic techniques.
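
The learn-then-score idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it swaps in a fixed first-order, fully observed Markov chain with additive smoothing in place of variable-order hidden Markov models, and the event tokens, function names, and smoothing constant are all hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


def train_markov(sequences, alpha=1.0):
    """Count first-order transitions; return a smoothed log-probability function.

    Assumption: a plain first-order Markov chain stands in for the
    variable-order hidden Markov models named in the abstract.
    """
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for seq in sequences:
        vocab.update(seq)
        for prev, curr in zip(seq, seq[1:]):
            counts[prev][curr] += 1
    vocab_size = len(vocab) + 1  # one extra slot for tokens never seen in training

    def log_prob(prev, curr):
        row = counts.get(prev, {})
        total = sum(row.values())
        # Additive (Laplace-style) smoothing keeps unseen transitions finite.
        return math.log((row.get(curr, 0) + alpha) / (total + alpha * vocab_size))

    return log_prob


def score(sequence, log_prob):
    """Average per-transition log-likelihood; lower values mean less typical."""
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0
    return sum(log_prob(p, c) for p, c in pairs) / len(pairs)


# Hypothetical event tokens standing in for flow-like records.
training = [
    ["dns_query", "tcp_syn", "http_get", "tcp_fin"],
    ["dns_query", "tcp_syn", "http_get", "tcp_fin"],
    ["dns_query", "tcp_syn", "http_post", "tcp_fin"],
]
model = train_markov(training)

typical_score = score(["dns_query", "tcp_syn", "http_get", "tcp_fin"], model)
unusual_score = score(["tcp_fin", "http_get", "dns_query", "tcp_syn"], model)
print(typical_score, unusual_score)
```

In this toy setup the sequence that follows the training order receives a higher average log-likelihood than the reordered one, which is the kind of signal that could be thresholded to flag unlikely events. Averaging over transitions is one possible design choice so that scores for sequences of different lengths remain comparable.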

ASA Meetings Department