Abstract:
|
Analyzing newswire data and current events is time-consuming and tedious. To automate and systematize the process, prior work has extracted entities (i.e., people, places, locations) from text and formed social networks from them. The resulting networks and entities can be used to quickly assess current events and actors. In this work, we examine the properties of these networks from a statistical and machine learning viewpoint. We consider two tasks: (1) network learning and (2) network inference. For network learning, we investigate methods for constructing a network that reduces spurious connections and works well for an inference task. For network inference, we consider the problem of predicting roles in the network. We apply collective classification techniques to predict roles given partially labeled data sets. We also discuss methods for characterizing classification performance. Experiments on a New York Times data set demonstrate that network characterization provides insight into prediction performance and that automatically constructed networks can be successfully used to predict roles.
|