GeoLife: Building Social Networks Using Human Location History

Established: February 6, 2009

GeoLife is a location-based social-networking service, which enables users to share life experiences and build connections among each other using human location history. Dr. Yu Zheng started this project in 2007 with his team.

Application Scenarios

  • GeoLife enables users to share travel experiences using GPS trajectories.
  • By mining multiple users’ location histories, GeoLife can discover the most interesting locations, classical travel sequences and travel experts in a given geospatial region, and hence enable generic travel recommendation.
  • By understanding individual location history, GeoLife can measure the similarity between users and perform personalized friend & location recommendation.


1. Sharing life experiences with GPS trajectories

Application Scenario:

By uploading your GPS data and associated multimedia content, such as photos, to the GeoLife website, you can replay your trajectory as if you were playing a video. First, you can revisit and enjoy your past experiences on a map. Second, you can share them with your friends, who can then see where you have been, see what you saw and understand the whole journey within a few seconds. This is more intuitive and convenient than writing and reading a blog.

Difficulty: How to identify a user’s transportation modes

First, users change their transportation modes during a trip, e.g., they drive to a place and then start walking. That is, a single GPS trajectory may contain multiple transportation modes. Second, the velocity observed for a given mode varies with traffic conditions, so velocity alone is not a reliable indicator of the mode.

Solution: Learning transportation mode based on GPS data (WWW2008)

First, we propose a method to partition a GPS trajectory into segments of different transportation modes. Second, we identify a set of features that are independent of velocity. Third, these features are fed into a classification model, which outputs the probability of each segment belonging to each transportation mode. Fourth, we learn an implied road map from multiple users’ GPS data and use it for post-processing.
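The first step above, partitioning a trajectory into segments, can be illustrated with a minimal sketch. The text does not spell out the partition rule, so the rule used here is an assumption for illustration only: each step between consecutive GPS points is labeled "walk" or "non-walk" by a hypothetical speed threshold (`WALK_SPEED_MAX`), and consecutive steps with the same label are merged into one segment. The function names and the input layout are also hypothetical.

```python
import math

# Assumed threshold in m/s; not taken from the GeoLife work.
WALK_SPEED_MAX = 2.5


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def split_by_walk(points):
    """Split a trajectory into runs of 'walk' and 'non-walk' steps.

    points: list of (lat, lon, unix_seconds) tuples ordered by time.
    Returns a list of (label, first_point_index, last_point_index).
    """
    segments = []
    for i in range(1, len(points)):
        lat1, lon1, t1 = points[i - 1]
        lat2, lon2, t2 = points[i]
        dt = t2 - t1
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        speed = haversine_m(lat1, lon1, lat2, lon2) / dt
        label = "walk" if speed <= WALK_SPEED_MAX else "non-walk"
        if segments and segments[-1][0] == label:
            # Extend the current segment to include this step.
            segments[-1] = (label, segments[-1][1], i)
        else:
            segments.append((label, i - 1, i))
    return segments


if __name__ == "__main__":
    trip = [
        (0.0, 0.0, 0), (0.00001, 0.0, 1), (0.00002, 0.0, 2),  # about 1.1 m/s
        (0.00022, 0.0, 3), (0.00042, 0.0, 4),                  # about 22 m/s
    ]
    print(split_by_walk(trip))
```

A raw speed threshold is exactly the kind of signal that traffic conditions distort, which is why the later steps rely on velocity-independent features and a classifier; the sketch only shows where segment boundaries might come from.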

2. Generic travel recommendation

Application Scenario:

By mining multiple users’ location histories, GeoLife can automatically discover the top most interesting locations and classical travel sequences in a given geographical region. The information can enable generic travel recommendation, which helps users understand an unfamiliar city within a short period and plan a trip with minimal effort.

Difficulty: (1) how to infer the interest level of a location, (2) how to calculate a user’s travel experience, and (3) how to detect classical sequences in a given geographical region.

First, the interest of a location depends not only on the number of users visiting this location but also on these users’ travel experiences. Intrinsically, different people have different degrees of knowledge about a geospatial region. For instance, local residents of Beijing are more capable than overseas tourists of finding high-quality restaurants and famous shopping malls in Beijing. Second, an individual’s travel experience is region-related. You may be familiar with Seattle but know little about Beijing. I may be a travel expert in Beijing but have no idea about New York City.

Solution: Mining Interesting locations and travel sequences (WWW2009)

A user’s travel experience and a location’s interest have a mutual reinforcement relationship. A user with rich travel experience in a region would visit many interesting places in that region, and a very interesting place in that region might be accessed by many users with rich travel experience. More specifically, a user’s travel experience can be represented by the sum of the interests of the locations they accessed; in turn, the interest of a location can be calculated by integrating the experiences of the users visiting it. Using a power iteration method, each user’s experience and each location’s interest can be calculated.
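The mutual reinforcement described above can be sketched as a small power iteration. This is a minimal illustration rather than the published model: the input layout (`visits`), the use of visit counts as weights, the unit-length normalization and the fixed iteration count are all assumptions made for the example.

```python
import math


def _normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {key: value / norm for key, value in scores.items()}


def mutual_reinforcement(visits, iterations=50):
    """Estimate user experience and location interest by power iteration.

    visits: dict mapping user -> dict mapping location -> visit count
            (hypothetical input layout for this sketch).
    Returns (experience, interest) as two dicts of normalized scores.
    """
    locations = {loc for locs in visits.values() for loc in locs}
    experience = {user: 1.0 for user in visits}
    interest = {loc: 1.0 for loc in locations}
    for _ in range(iterations):
        # A location's interest integrates the experience of its visitors.
        interest = {loc: 0.0 for loc in locations}
        for user, locs in visits.items():
            for loc, count in locs.items():
                interest[loc] += count * experience[user]
        interest = _normalize(interest)
        # A user's experience sums the interest of the locations they accessed.
        experience = _normalize({
            user: sum(count * interest[loc] for loc, count in locs.items())
            for user, locs in visits.items()
        })
    return experience, interest


if __name__ == "__main__":
    sample = {
        "alice": {"A": 1, "B": 1, "C": 1},
        "carol": {"A": 1, "B": 1},
        "bob": {"A": 1},
    }
    experience, interest = mutual_reinforcement(sample)
    print(sorted(experience, key=experience.get, reverse=True))
    print(sorted(interest, key=interest.get, reverse=True))
```

Normalizing after each update keeps the scores bounded, so only their relative order matters. Because experience is region-related, such an iteration would be run per geographical region rather than once globally.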

3. Personalized location & friend recommendation

Application Scenario:

When you log on to MyGeoLife using your Live Messenger account, GeoLife can recommend a group of users to you based on the similarity between your location history and theirs. As people’s location histories imply, to some extent, their tastes and preferences, these users, called potential friends, might share similar interests with you. Without GeoLife, you would never know these potential friends even if you have passed each other in the street many times. With this friend list, you can conveniently send invitations to these people in the community and hence organize, with minimal effort, a social activity such as hiking, cycling, or traveling. As they share similar interests with you, they are more likely to accept your invitation. Further, from these potential friends’ past experiences, you are more likely to discover places that match your tastes but that you have not yet found yourself. This is personalized location recommendation.

Difficulty: (1) How to estimate the similarity between users in terms of location history and (2) how to predict a user’s interest level on an unvisited location.

What counts as a shared location when measuring the similarity between users? A restaurant is a location, and a neighborhood is also a location. Locations at different scales carry different meanings. In addition, the sequences of users’ movements in geographical spaces carry different significance.

Solution: Measure user similarity and collaborative filtering-based inference

We propose a framework, referred to as hierarchical-graph-based similarity measurement (HGSM), which uniformly models people’s location histories and effectively estimates the similarity between users. In this framework, we consider the following three factors.

  1. Sequence property of users’ movement behaviors: We take into account not only the geographic regions users accessed, but also the sequence in which these regions were visited. The longer the similar sequences matched between two users’ location histories, the more related these two users might be.
  2. Popularity of different locations: Analogous to inverse document frequency (IDF), we consider the popularity of a geographical region when measuring the similarity between users. Two users who both accessed a location visited by only a few people might be more correlated than two users who share a location accessed by many people. For instance, many people have visited the Great Wall, a well-known landmark in Beijing, but this does not mean that all these people are similar to one another. If, however, two users both visited a restaurant that is not that famous, they might indeed share some similar preferences.
  3. Hierarchy property of geographic spaces: We mine user similarity by exploring people’s movement behaviors on different scales of geographic spaces. Users who share similar location histories on geographical spaces with fine granularity might be more correlated than others who share location histories on geo-spaces with coarse granularity.
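The three factors above can be combined in a simplified sketch. It is not HGSM itself; it is only a toy scoring function built under stated assumptions: location histories are given as ordered lists of location identifiers per granularity level (`histories_by_level`), sequence matching is approximated by a weighted longest common subsequence, popularity is handled with a plain IDF weight, and the hierarchy is reflected by hypothetical per-level weights (`level_weight`).

```python
import math


def idf_weights(histories):
    """IDF-style weight per location: rarely visited locations weigh more.

    histories: dict mapping user -> list of location ids in visit order.
    """
    n_users = len(histories)
    visitors = {}
    for sequence in histories.values():
        for loc in set(sequence):
            visitors[loc] = visitors.get(loc, 0) + 1
    return {loc: math.log(n_users / count) for loc, count in visitors.items()}


def weighted_lcs(seq_a, seq_b, weight):
    """Largest total weight of a common subsequence of two visit sequences."""
    rows, cols = len(seq_a), len(seq_b)
    best = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            if seq_a[i - 1] == seq_b[j - 1]:
                matched = best[i - 1][j - 1] + weight.get(seq_a[i - 1], 0.0)
                best[i][j] = max(best[i][j], matched)
    return best[rows][cols]


def similarity(user_a, user_b, histories_by_level, level_weight):
    """Sum IDF-weighted sequence matches over all granularity levels.

    histories_by_level: dict mapping level -> {user -> visit sequence}.
    level_weight: dict mapping level -> weight (finer levels set higher).
    """
    score = 0.0
    for level, histories in histories_by_level.items():
        idf = idf_weights(histories)
        score += level_weight[level] * weighted_lcs(
            histories[user_a], histories[user_b], idf
        )
    return score


if __name__ == "__main__":
    data = {
        "fine": {"a": ["great_wall", "cafe"], "b": ["great_wall", "cafe"],
                 "c": ["great_wall", "mall"], "d": ["great_wall"]},
        "coarse": {"a": ["beijing"], "b": ["beijing"],
                   "c": ["beijing"], "d": ["seattle"]},
    }
    weights = {"fine": 2.0, "coarse": 1.0}
    print(similarity("a", "b", data, weights))
    print(similarity("a", "c", data, weights))
```

In the sample data, every user has visited `great_wall`, so it contributes nothing to any pair, while the shared, less popular `cafe` raises the score between users `a` and `b`, mirroring the Great Wall versus restaurant example in the list above.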


  • Yu Zheng, Vice President and Chief Data Scientist, JD Finance Group