Keywords: short text, representation learning, attention mechanism, convolutional neural network
In this paper, we address two challenges in domain-specific short text representation: the need for fine-grained semantic distinctions and the limited amount of training data. In some domains, text representations must differentiate meaning at a very detailed level, so general-purpose pretrained language models (e.g., BERT) do not perform well. Misspellings in short text pose a further challenge to word-level language models. To capture both syntactic and semantic similarity, we introduce a character-level language model to learn domain-specific short text representations. For feature extraction, we evaluate Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, attention mechanisms, and combinations thereof. Our experiments show that a CNN combined with an attention mechanism performs best, and that dense character embeddings outperform one-hot vectors for encoding short text. Finally, we address the limited training data in specific domains with one-shot learning, in which a small amount of labeled text yields a large number of text pairs for training a Siamese Neural Network.
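To make the architecture concrete, the following is a minimal numpy sketch of the encoding path described above: dense character embeddings, a 1D convolution over character windows, and attention pooling, with the resulting vectors compared by cosine similarity as in a Siamese setup with shared weights. All hyper-parameters (alphabet, embedding size, filter count, filter width) and the random initialization are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hyper-parameters (not from the paper): a small alphabet,
# 8-dimensional character embeddings, 16 convolutional filters of width 3.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
EMB_DIM, N_FILTERS, WIDTH = 8, 16, 3

char_emb = rng.normal(size=(len(ALPHABET), EMB_DIM))   # dense character embeddings
conv_w = rng.normal(size=(N_FILTERS, WIDTH, EMB_DIM))  # CNN filters
attn_v = rng.normal(size=(N_FILTERS,))                 # attention scoring vector

def encode(text):
    """Map a short text to a fixed-size vector:
    character embedding -> 1D convolution -> attention pooling."""
    ids = [ALPHABET.index(c) for c in text.lower() if c in ALPHABET]
    x = char_emb[ids]                                  # (T, EMB_DIM)
    t_out = len(ids) - WIDTH + 1
    # Valid 1D convolution with ReLU over sliding character windows.
    h = np.array([[np.maximum((x[t:t + WIDTH] * conv_w[f]).sum(), 0.0)
                   for f in range(N_FILTERS)]
                  for t in range(t_out)])              # (T', N_FILTERS)
    # Attention pooling: softmax over positions, weighted sum of features.
    scores = h @ attn_v
    a = np.exp(scores - scores.max())
    a /= a.sum()
    return a @ h                                       # (N_FILTERS,)

def similarity(s1, s2):
    """Cosine similarity between the two Siamese branches (shared weights)."""
    v1, v2 = encode(s1), encode(s2)
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
```

With trained rather than random weights, `similarity` would score candidate short-text pairs; here it only illustrates the data flow through the shared encoder.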
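The pair-formation step behind the one-shot learning claim can be sketched in a few lines: every unordered pair of labeled examples becomes a training pair, labeled positive when both texts share a class, so N examples yield N(N-1)/2 pairs. The toy texts and class labels below are hypothetical placeholders, not data from the paper.

```python
from itertools import combinations

def make_pairs(labeled_texts):
    """Form all unordered text pairs from a small labeled set.
    Label is 1 if both texts share a class, else 0; this turns a small
    dataset into a much larger pairwise set for Siamese training."""
    return [(t1, t2, int(y1 == y2))
            for (t1, y1), (t2, y2) in combinations(labeled_texts, 2)]

# Hypothetical toy examples with class labels "A" and "B".
data = [("acct num", "A"), ("account number", "A"),
        ("open date", "B"), ("dt opened", "B")]
pairs = make_pairs(data)  # 4 examples -> 6 pairs (2 positive, 4 negative)
```

This quadratic growth in training pairs is what lets a Siamese network learn a useful similarity function from only a handful of labeled short texts per class.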