The increasing popularity of GPS devices has spurred many Web applications where people can upload, browse, and exchange their GPS tracks. In these applications, a spatial or temporal search function provides an effective way for users to retrieve the specific GPS tracks they are interested in. However, existing spatio-temporal indexes for trajectory data do not exploit the characteristics of user behavior in these online GPS track sharing applications. In most cases, when sharing a GPS track, people are more likely to upload GPS data from the recent past than from the distant past. Thus, the interval between the end time of a GPS track and the time it is uploaded, if viewed as a random variable, has a skewed distribution. In this paper, we first propose a probabilistic model to simulate the user behavior of uploading GPS tracks to an online sharing application. We then propose a flexible spatio-temporal index scheme, referred to as the Compressed Start-End Tree (CSE-tree), for large-scale GPS track retrieval. The CSE-tree combines the advantages of the B+ tree and the dynamic array, and maintains different index structures for data with different update frequencies. Experiments using synthetic data show that the CSE-tree outperforms other schemes, requiring a smaller index and lower update cost while maintaining satisfactory retrieval performance.
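
The skew in the upload delay can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that the delay between the end of a track and its upload follows an exponential distribution with a mean of three days. The distribution, the parameter values, and the function names are hypothetical and are not taken from the probabilistic model proposed in this paper.

```python
import random


def simulate_upload_delays(n, mean_delay_days, seed=0):
    """Draw n upload delays (days between a track's end time and its upload).

    ASSUMPTION: an exponential distribution is used only as a simple
    example of a skewed distribution; it is not the paper's model.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_delay_days) for _ in range(n)]


def fraction_within(delays, window_days):
    """Return the fraction of tracks uploaded within window_days of ending."""
    return sum(d <= window_days for d in delays) / len(delays)


# Hypothetical parameters: 10,000 tracks, mean delay of 3 days.
delays = simulate_upload_delays(n=10_000, mean_delay_days=3.0)

recent = fraction_within(delays, window_days=7.0)
mean = sum(delays) / len(delays)
median = sorted(delays)[len(delays) // 2]

print(f"uploaded within 7 days: {recent:.1%}")
print(f"mean delay: {mean:.2f} days, median delay: {median:.2f} days")
```

Under this assumption, most uploads concern tracks that ended within the last week, and the mean delay exceeds the median, which is the signature of a right-skewed distribution. This is the kind of behavior that motivates keeping a separate, update-friendly structure for recent data and a compact structure for older data.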
