Abstract:
|
Many longitudinal datasets, in fields ranging from economics to the biomedical sciences and the geosciences, comprise both smooth and irregular elements. We consider scenarios in which an underlying smooth curve is combined not just with Gaussian errors, but also with irregular spikes that (a) are themselves of interest, and (b) can negatively affect our ability to characterize the underlying curve. For such scenarios, we propose an approach that combines regularized spline smoothing with an Expectation-Maximization (EM) algorithm, allowing one to both identify spikes and estimate the smooth component. Under some assumptions on the error distribution, we prove the convergence of the EM estimates to the true population parameters. Next, we demonstrate through simulations the finite-sample performance of our proposal and its robustness to violations of these assumptions. Finally, we apply our proposal to two time series datasets: one concerns the annual heatwave index in the US over the past 100 years, the other the weekly electricity consumption in Ireland. We characterize the underlying smooth trends in both datasets and identify irregular/extreme behaviors.
|