Abstract:
|
Although machine learning-based methods have been widely applied to stock mid-price movement prediction for their high accuracy, the relevant feature engineering strategies are usually ignored. In this paper, we propose three novel strategies that make good use of high-frequency data while mitigating its existing data issues. We design an extensive collection of handcrafted features that take the long-term historical price effect into consideration, and we make use of the information that is otherwise lost in the data thinning process. Moreover, we propose a new prediction framework that enables us to randomly subsample and integrate various training models. Finally, we perform head-to-head experimental evaluations on real data to show the improvement in model efficiency. We find that, by improving the quality of the input data, our modelling strategies enhance not only prediction accuracy but also interpretability and robustness.
|