All Times EDT

Abstract Details

Activity Number: 301 - Natural Language Processing Applications in Defense and National Security
Type: Topic Contributed
Date/Time: Wednesday, August 5, 2020 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistics in Defense and National Security
Abstract #312745
Title: Classifying Documents Through the Use of Artificial Intelligence
Author(s): Kelly Townsend* and Alex Firpi
Companies: Johns Hopkins University, Applied Physics Laboratory and Johns Hopkins University Applied Physics Lab
Keywords: Artificial Intelligence; Recurrent Neural Network; Long Short-Term Memory; Natural Language Processing

The Johns Hopkins University Applied Physics Laboratory hypothesized that machine learning could be used to determine the security classification level of text documents. To test this, military weapon performance reports were parsed by paragraph, retaining the portion-marking of each paragraph. The classification levels explored were UNCLASSIFIED, CONFIDENTIAL, SECRET, and SECRET//FORMERLY RESTRICTED DATA. The data were then divided into training, validation, and testing datasets. Next, a Long Short-Term Memory (LSTM) network was trained to predict the classification level from the text of each paragraph. Nearly eighty percent accuracy was achieved on both the validation and testing datasets. A review of the paragraphs misclassified by the algorithm showed that some of the apparent errors were in fact mistakes and inconsistencies in the original markings. The ultimate goal is to develop a tool that recommends a portion-marking for a given input text. Such a tool would allow us to portion-mark documents more consistently and could potentially help identify areas where aggregation raises the classification level.
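The data preparation described above (parsing by paragraph while retaining the portion-marking, then dividing into training, validation, and testing sets) might look roughly like the following minimal Python sketch. The abstract does not describe the actual parser or the split proportions, so everything here is an assumption made for illustration: the leading "(U)"-style prefix format, the abbreviation table `PORTION_MARKS`, the function names `parse_paragraphs` and `stratified_split`, and the 70/15/15 fractions are all hypothetical. The LSTM itself is not shown.

```python
import random
import re
from collections import defaultdict

# Hypothetical abbreviation table covering the four levels named in the abstract.
PORTION_MARKS = {
    "U": "UNCLASSIFIED",
    "C": "CONFIDENTIAL",
    "S": "SECRET",
    "S//FRD": "SECRET//FORMERLY RESTRICTED DATA",
}

# Assumed format: each paragraph starts with a marking such as "(U) " or "(S//FRD) ".
MARK_RE = re.compile(r"^\s*\(([A-Z/]+)\)\s*(.+)$", re.DOTALL)


def parse_paragraphs(document):
    """Split on blank lines; return (label, text) pairs for marked paragraphs."""
    examples = []
    for block in re.split(r"\n\s*\n", document):
        match = MARK_RE.match(block)
        if match and match.group(1) in PORTION_MARKS:
            # Collapse internal line breaks and runs of whitespace.
            text = " ".join(match.group(2).split())
            examples.append((PORTION_MARKS[match.group(1)], text))
    return examples


def stratified_split(examples, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle within each label, then cut into train/validation/test lists."""
    by_label = defaultdict(list)
    for label, text in examples:
        by_label[label].append((label, text))
    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = int(len(items) * fractions[0])
        n_val = int(len(items) * fractions[1])
        train += items[:n_train]
        val += items[n_train:n_train + n_val]
        # Whatever remains after the first two cuts goes to the test set.
        test += items[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Placeholder text only; not taken from any real report.
    sample = "(U) First placeholder paragraph.\n\n(C) Second placeholder paragraph."
    for label, text in parse_paragraphs(sample):
        print(label, "|", text)
```

Splitting within each label keeps rare levels represented in all three sets when there are enough paragraphs per level. Whether the original study stratified its split is not stated in the abstract.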

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program