The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles and animals. Many techniques have been proposed for processing, managing and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics. Following a roadmap from the derivation of trajectory data, to trajectory data preprocessing, to trajectory data management, and to a variety of mining tasks (such as trajectory pattern mining, outlier detection, and trajectory classification), the survey explores the connections, correlations and differences among these existing techniques. This survey also introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors, to which more data mining and machine learning techniques can be applied. Finally, some public trajectory datasets are presented. This survey can help shape the field of trajectory data mining, providing a quick understanding of this field to the community.
Yu Zheng. Trajectory Data Mining: An Overview. ACM Transactions on Intelligent Systems and Technology (ACM TIST). 2015, vol. 6, issue 3.
Before using trajectory data, we need to deal with a number of issues, such as noise filtering, segmentation, and map-matching. This stage is called trajectory preprocessing, which is a fundamental step of many trajectory data mining tasks.
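As an illustration of the noise-filtering step, the following is a minimal sketch of one simple heuristic: drop any point that would imply an implausible speed relative to the last point that was kept. The function names, the `(t, lat, lon)` tuple layout, and the 50 m/s threshold are assumptions made for this example, not values taken from the survey.

```python
import math

EARTH_RADIUS_M = 6371000.0

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def filter_by_speed(points, max_speed_mps=50.0):
    """Keep points whose speed from the previously kept point is plausible.

    points: list of (t_seconds, lat, lon) tuples sorted by time.
    Assumption: the first point is trusted and always kept.
    """
    if not points:
        return []
    kept = [points[0]]
    for t, lat, lon in points[1:]:
        t0, lat0, lon0 = kept[-1]
        dt = t - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speed = haversine_m((lat0, lon0), (lat, lon)) / dt
        if speed <= max_speed_mps:
            kept.append((t, lat, lon))
    return kept

raw = [
    (0, 39.9000, 116.4000),
    (10, 39.9001, 116.4001),
    (20, 40.9000, 116.4000),   # jumps roughly 111 km in 10 s: treated as noise
    (30, 39.9002, 116.4002),
]
cleaned = filter_by_speed(raw)
```

Because each point is compared with the last kept point, a single bad fix does not cause the good points after it to be discarded. A sketch like this only handles isolated spikes; it does not smooth small measurement errors.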
Many online applications require instant mining of trajectory data (e.g. detecting traffic anomalies), calling for effective data management algorithms that can quickly retrieve trajectories satisfying certain criteria (such as spatio-temporal constraints) from a large trajectory corpus. There are usually two major types of queries: nearest-neighbor queries and range queries. The former depend on a distance metric, e.g. the distance between two trajectories. Additionally, there are two types of trajectories (historical and recent), which need different management methods.
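The two query types can be sketched as follows. This is a minimal linear-scan illustration, assuming planar coordinates and trajectories stored as lists of `(t, x, y)` tuples keyed by an id; the function names and the choice of point-to-trajectory distance (the minimum distance to any sample point) are assumptions made for the example. A real system would typically rely on a spatio-temporal index instead of scanning every trajectory.

```python
import math

def range_query(trajs, x_min, x_max, y_min, y_max, t_min, t_max):
    """Return ids of trajectories with at least one sample inside the space-time box."""
    hits = []
    for tid, pts in trajs.items():
        if any(x_min <= x <= x_max and y_min <= y <= y_max and t_min <= t <= t_max
               for t, x, y in pts):
            hits.append(tid)
    return sorted(hits)

def point_to_traj_dist(q, pts):
    """Distance from query point q=(x, y) to a trajectory: minimum over its samples."""
    return min(math.hypot(x - q[0], y - q[1]) for _, x, y in pts)

def knn_query(trajs, q, k=1):
    """Return the ids of the k trajectories nearest to q (ties broken by id)."""
    ranked = sorted(trajs, key=lambda tid: (point_to_traj_dist(q, trajs[tid]), tid))
    return ranked[:k]

trajs = {
    "a": [(0, 0.0, 0.0), (1, 1.0, 1.0)],
    "b": [(0, 10.0, 10.0), (1, 11.0, 11.0)],
    "c": [(5, 0.5, 0.5)],
}
in_box = range_query(trajs, 0, 2, 0, 2, 0, 2)   # "c" is inside the box but outside the time window
nearest = knn_query(trajs, (0.4, 0.4), k=2)
```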
Objects move continuously while their locations can only be updated at discrete times, leaving the location of a moving object between two updates uncertain. To enhance the utility of trajectories, one line of research has tried to model and reduce the uncertainty of trajectories. In the opposite direction, another branch of research aims to protect a user’s privacy when the user discloses her trajectories.
The huge volume of spatial trajectories creates opportunities for analyzing the mobility patterns of moving objects, which can be represented by an individual trajectory containing a certain pattern or a group of trajectories sharing similar patterns. In this section, we survey the literature that is concerned with four categories of patterns: moving-together patterns, trajectory clustering, frequent sequential patterns and periodic patterns.
Using supervised learning approaches, we can classify trajectories or segments of a trajectory into some categories, which can be activities (like hiking and dining) or different transportation modes, such as walking and driving.
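A minimal sketch of the supervised setting, assuming trajectories are lists of `(t, x, y)` tuples in seconds and metres: each trajectory is reduced to two speed features, and a nearest-centroid rule assigns the label. The features, the classifier, and all names here are illustrative assumptions; they stand in for whatever feature set and learning model a real study would use.

```python
import math
from collections import defaultdict

def speed_features(points):
    """Return (mean_speed, max_speed) in m/s for a list of (t, x, y) samples."""
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:
            speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    if not speeds:
        return (0.0, 0.0)
    return (sum(speeds) / len(speeds), max(speeds))

def fit_centroids(labelled):
    """labelled: iterable of (points, label). Returns label -> mean feature vector."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for points, label in labelled:
        mean_s, max_s = speed_features(points)
        acc = sums[label]
        acc[0] += mean_s
        acc[1] += max_s
        acc[2] += 1
    return {label: (acc[0] / acc[2], acc[1] / acc[2]) for label, acc in sums.items()}

def predict(centroids, points):
    """Assign the label whose centroid is closest in feature space."""
    feats = speed_features(points)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))

training = [
    ([(0, 0, 0), (10, 14, 0), (20, 28, 0)], "walking"),    # about 1.4 m/s
    ([(0, 0, 0), (10, 150, 0), (20, 300, 0)], "driving"),  # about 15 m/s
]
centroids = fit_centroids(training)
slow_label = predict(centroids, [(0, 0, 0), (10, 12, 0)])
fast_label = predict(centroids, [(0, 0, 0), (10, 200, 0)])
```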
Unlike trajectory patterns, which occur frequently in trajectory data, trajectory outliers (a.k.a. anomalies) can be items (a trajectory or a segment of a trajectory) that are significantly different from other items in terms of some similarity metric. They can also be events or observations (represented by a collection of trajectories) that do not conform to an expected pattern (e.g. traffic congestion caused by a car accident). Section 8 introduces anomaly detection from trajectory data.
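The first kind of outlier (an item that differs from the others under a similarity metric) can be sketched with a simple distance-based rule: a trajectory is flagged when too few other trajectories lie within a given radius of it. The metric used here (mean pointwise Euclidean distance between equal-length trajectories), the parameters `radius` and `min_neighbors`, and the function names are assumptions made for this example.

```python
import math

def traj_distance(a, b):
    """Mean pointwise Euclidean distance; assumes both trajectories have equal length."""
    if len(a) != len(b):
        raise ValueError("trajectories must be resampled to the same length first")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_outliers(trajs, radius, min_neighbors):
    """Flag trajectories with fewer than min_neighbors other trajectories within radius."""
    outliers = []
    for tid, pts in trajs.items():
        neighbors = sum(
            1 for other, other_pts in trajs.items()
            if other != tid and traj_distance(pts, other_pts) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(tid)
    return sorted(outliers)

routes = {
    "r1": [(0, 0.0), (1, 0.0), (2, 0.0)],
    "r2": [(0, 0.1), (1, 0.1), (2, 0.1)],
    "r3": [(0, -0.1), (1, -0.1), (2, -0.1)],
    "detour": [(0, 0.0), (1, 5.0), (2, 0.0)],   # leaves the common corridor mid-route
}
flagged = distance_outliers(routes, radius=0.5, min_neighbors=1)
```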
Besides studying trajectories in their original form, we can transform trajectories into other formats, such as graphs, matrices and tensors. The new representations of trajectories expand and diversify the approaches for trajectory data mining, leveraging existing mining techniques, e.g. graph mining, collaborative filtering (CF), matrix factorization (MF), and tensor decomposition (TD).
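One possible transformation, sketched under simple assumptions: partition the plane into square grid cells, treat each cell as a graph node, and count how often trajectories move from one cell to another. The resulting counts form a weighted directed graph, which can also be written as an adjacency matrix. The grid partition, the `cell_size` parameter, and the function names are illustrative choices for this example only.

```python
def to_cell(x, y, cell_size):
    """Map a planar point to the integer index of its grid cell."""
    return (int(x // cell_size), int(y // cell_size))

def transition_counts(trajs, cell_size):
    """Count cell-to-cell moves over all trajectories (edges of a weighted directed graph)."""
    counts = {}
    for pts in trajs:
        cells = [to_cell(x, y, cell_size) for x, y in pts]
        for src, dst in zip(cells, cells[1:]):
            if src != dst:  # ignore consecutive samples that stay in the same cell
                counts[(src, dst)] = counts.get((src, dst), 0) + 1
    return counts

def to_matrix(counts):
    """Turn edge counts into (sorted node list, adjacency matrix as nested lists)."""
    nodes = sorted({cell for edge in counts for cell in edge})
    index = {cell: i for i, cell in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for (src, dst), n in counts.items():
        matrix[index[src]][index[dst]] = n
    return nodes, matrix

trajs = [
    [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)],
    [(0.2, 0.2), (0.8, 0.8), (1.2, 0.4)],
]
counts = transition_counts(trajs, cell_size=1.0)
nodes, matrix = to_matrix(counts)
```

Rows of `matrix` are source cells and columns are destination cells, in the order given by `nodes`. A matrix like this could then be passed to graph mining or factorization methods.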
Vice President and Chief Data Scientist, JD Technology Group