Abstract

Many sensors have been deployed in the physical world, generating massive geo-tagged time series data. In practice, sensor readings are often lost at unexpected moments because of sensor or communication errors. These missing readings not only affect real-time monitoring but also compromise the performance of further data analysis. In this paper, we propose a spatio-temporal multi-view-based learning (ST-MVL) method to collectively fill missing readings in a collection of geo-sensory time series data, considering 1) the temporal correlation between readings at different timestamps in the same series and 2) the spatial correlation between different time series. Our method combines empirical statistical models, consisting of Inverse Distance Weighting and Simple Exponential Smoothing, with data-driven algorithms, comprising User-based and Item-based Collaborative Filtering. The former models handle the general missing cases based on empirical assumptions derived from historical data over a long period, and represent two global views from a spatial and a temporal perspective, respectively. The latter algorithms deal with special cases where the empirical assumptions may not hold, based on the recent context of the data, and represent two local views from a spatial and a temporal perspective, respectively. The predictions of the four views are aggregated into a final value by a multi-view learning algorithm. We evaluate our method on Beijing air quality and meteorological data and find that our model outperforms ten baseline approaches.
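To make the two global views more concrete, the following is a minimal Python sketch of generic Inverse Distance Weighting and an exponentially weighted temporal estimate. It is an illustration under stated assumptions, not the released ST-MVL code: the function names `idw_estimate` and `ses_estimate`, the parameters `alpha` and `beta`, and the exact weighting forms are hypothetical choices made for this example, and missing readings are represented as `None`.

```python
def idw_estimate(readings, distances, alpha=2.0):
    """Spatial view (sketch): weight each neighbor's reading by distance ** -alpha.

    `readings` holds the neighbors' values at the target timestamp (None if missing);
    `distances` holds the positive distance from each neighbor to the target sensor.
    """
    num = 0.0
    den = 0.0
    for value, d in zip(readings, distances):
        if value is None:
            continue  # a neighbor that is also missing contributes nothing
        w = d ** -alpha
        num += w * value
        den += w
    return num / den if den > 0 else None


def ses_estimate(series, t, beta=0.5):
    """Temporal view (sketch): weight readings of the same sensor by closeness in time.

    A reading k steps away from t gets weight beta * (1 - beta) ** (k - 1),
    so nearer timestamps count more. This weighting is an assumption of this example.
    """
    num = 0.0
    den = 0.0
    for j, value in enumerate(series):
        if value is None or j == t:
            continue  # skip missing readings and the target position itself
        w = beta * (1 - beta) ** (abs(j - t) - 1)
        num += w * value
        den += w
    return num / den if den > 0 else None


# Hypothetical usage: two neighbors at distances 1 and 2, and a short series
# whose reading at index 2 is missing.
spatial = idw_estimate([10.0, 20.0], [1.0, 2.0])
temporal = ses_estimate([10.0, 10.0, None, 30.0], t=2)
```

Both functions return `None` when no usable reading is available, which is the kind of case the abstract assigns to the local, collaborative-filtering views. How the four views are combined into a final value is left out here, since the abstract does not spell out the aggregation.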

The data and code have been released!

flyer_IJCAI16_missing values