Urban Air

Established: July 24, 2012

Using a diversity of big data to infer and predict fine-grained air quality throughout a city, and ultimately to tackle air pollution.

 


http://urbanair.msra.cn/                Install Mobile Apps

Many countries suffer from air pollution. Many cities have built a few air quality monitoring stations to inform people about urban air quality every hour. However, urban air quality is influenced by multiple complex factors and is highly skewed within a city: it varies significantly by location and changes over time differently in different places. As a result, we do not know the air quality of a location that has no monitoring station. We do not know what the air quality at a place will be tomorrow either, let alone the root cause of the air pollution.

This project aims to infer the fine-grained air quality at the current time throughout a city and to forecast the future air quality at each monitoring station. We also expect to identify the root causes of air pollution. For example, what proportion of the PM2.5 in the environment is derived from vehicular emissions? What is the spatio-temporal causal interaction between the air pollution of different cities?

Led by Dr. Yu Zheng, Urban Air is also a sub-project of Urban Computing, which is a research theme that aims to tackle big challenges in cities by using big data.

Framework

The research is publicly available through a “cloud + client” framework, in which the cloud continuously collects real-time data, such as meteorological data and air quality data. A user can access the air quality information through a mobile client or a web client.


(WPhone-En)     (Chinese Mobile Apps)    website: http://urbanair.msra.cn/

Step 1: Infer Fine-Grained Air Quality

The first step of this project is to infer the real-time, fine-grained air quality of an arbitrary location by using two kinds of data. One is the real-time and historical air quality data from existing monitoring stations. The other is five additional data sources observed in a city: meteorological data, traffic, human mobility, POIs, and road network data. We propose a semi-supervised learning approach based on a co-training framework that consists of two separate classifiers. One is a spatial classifier based on an artificial neural network (ANN), which takes spatially-related features (e.g., the density of POIs and the length of highways) as input to model the spatial correlation between the air quality of different locations. The other is a temporal classifier based on a linear-chain conditional random field (CRF), which uses temporally-related features (e.g., traffic and meteorology) to model the temporal dependency of air quality at a location. Read the related publications for more details.
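To make the co-training idea concrete, the following is a minimal sketch and not the implementation from the publications. It assumes each location has two feature views (a spatial one and a temporal one) and that only some locations carry a label from a monitoring station. For brevity, it swaps in a simple nearest-centroid classifier for both the ANN and the CRF. All function names and the toy data are hypothetical.

```python
import math

def fit_centroids(X, y):
    """Return {label: centroid} for feature vectors X with labels y."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    weights = {lab: math.exp(-math.dist(x, c)) for lab, c in centroids.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())

def co_train(spatial, temporal, labels, rounds=5, per_round=1):
    """Co-training over two views.

    spatial, temporal: one feature vector per location (the two views).
    labels: a class label per location, or None where no station exists.
    In each round, each view's classifier labels the unlabeled locations
    it is most confident about, enlarging the training set for the other view.
    """
    labels = list(labels)
    for _ in range(rounds):
        added = False
        for view in (spatial, temporal):
            labeled = [i for i, lab in enumerate(labels) if lab is not None]
            unlabeled = [i for i, lab in enumerate(labels) if lab is None]
            if not unlabeled:
                break
            model = fit_centroids([view[i] for i in labeled],
                                  [labels[i] for i in labeled])
            scored = sorted(((predict(model, view[i]), i) for i in unlabeled),
                            key=lambda t: t[0][1], reverse=True)
            for (label, _conf), i in scored[:per_round]:
                labels[i] = label
                added = True
        if not added:
            break
    return labels

# Toy data (hypothetical): two stations with known labels, two locations without.
spatial_view = [[0.1], [0.9], [0.2], [0.8]]
temporal_view = [[0.2], [0.8], [0.1], [0.9]]
known = ["good", "unhealthy", None, None]
inferred = co_train(spatial_view, temporal_view, known)
```

In this toy run, the two unlabeled locations end up with the label of the station they resemble in both views. A fuller version would keep the two trained classifiers and combine their class probabilities at inference time.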

Publications:

[1] Yu Zheng, Furui Liu, Hsun-Ping Hsieh. U-Air: When Urban Air Quality Inference Meets Big Data. In Proceedings of the 19th SIGKDD conference on Knowledge Discovery and Data Mining (KDD 2013). (Data) (Website) (Mobile App)(Video)

[2] Yu Zheng, Xuxu Chen, Qiwei Jin, Yubiao Chen, Xiangyun Qu, Xin Liu, Eric Chang, Wei-Ying Ma, Yong Rui, Weiwei Sun. A Cloud-Based Knowledge Discovery System for Monitoring Fine-Grained Air Quality. MSR-TR-2014-40.

A dataset has been released for research purposes: download the data.

Step 2: Forecast Air Quality at Each Station

The second step is to forecast the fine-grained air quality over the next 48 hours. Specifically, for the first 6 hours, we predict a real-valued AQI for each kind of air pollutant, at each hour, at each station. For the next 7-12, 12-24, and 24-48 hours, we predict a min-max range of the AQI over the corresponding time interval. Our predictive model consists of four major components: 1) a linear regression-based temporal predictor that models the local factors of air quality, 2) a neural network-based spatial predictor that models the global factors, 3) a dynamic aggregator that combines the predictions of the spatial and temporal predictors according to the meteorological data, and 4) an inflection predictor that captures sudden changes in air quality.
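The shape of the forecast described above, and the role of the dynamic aggregator, can be illustrated with a short sketch. It is an illustration under stated assumptions, not the model from the publication. The blending rule in `aggregate` (a weight driven by wind speed, with a hypothetical `calm_threshold` parameter) is invented for the example, and the later intervals are treated as the non-overlapping hour buckets 7-12, 13-24, and 25-48.

```python
def aggregate(temporal_pred, spatial_pred, wind_speed, calm_threshold=3.0):
    """Blend the two predictors with a weather-dependent weight.

    Hypothetical rule: the weight of the spatial predictor grows linearly
    with wind speed and is capped at 1.0, so the temporal (local) predictor
    dominates in calm weather.
    """
    spatial_weight = min(1.0, wind_speed / (2 * calm_threshold))
    return (1 - spatial_weight) * temporal_pred + spatial_weight * spatial_pred

def summarize_forecast(hourly_aqi):
    """Turn 48 hourly AQI predictions (index 0 = next hour) into the
    reported format: point values for hours 1-6, then (min, max) ranges."""
    if len(hourly_aqi) != 48:
        raise ValueError("expected 48 hourly predictions")
    summary = {"hours_1_6": list(hourly_aqi[:6])}
    buckets = (("hours_7_12", 6, 12), ("hours_13_24", 12, 24), ("hours_25_48", 24, 48))
    for name, start, end in buckets:
        window = hourly_aqi[start:end]
        summary[name] = (min(window), max(window))
    return summary

# Toy input (hypothetical): AQI rising by one unit per hour from 100.
forecast = summarize_forecast(list(range(100, 148)))
```

Reporting a range instead of a point value for the later intervals reflects the growing uncertainty of predictions further into the future.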


Publication:

[1] Yu Zheng, Xiuwen Yi, Ming Li, Ruiyuan Li, Zhangqing Shan, Eric Chang, Tianrui Li. Forecasting Fine-Grained Air Quality Based on Big Data. In Proceedings of the 21st SIGKDD conference on Knowledge Discovery and Data Mining (KDD 2015).

Data Released!!

The service of Urban Air covers 300 cities:


Step 3: Deployment of Air Quality Monitoring Stations

Given a limited budget to build a few additional air quality monitoring stations, where shall we put them? The research solves this problem from the perspective of maximizing the inference accuracy and stability.
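One plausible way to read this goal is to place new stations where the current inference is least certain, since a measurement there would help the most. The sketch below follows that assumption only. It ranks candidate locations by the entropy of their inferred air quality class probabilities and is not the method from the publication. The function names and candidate data are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def recommend_stations(candidates, budget):
    """Pick the `budget` candidate locations whose inferred air quality
    distribution is most uncertain.

    candidates: {location_name: inferred class probabilities}
    """
    ranked = sorted(candidates, key=lambda name: entropy(candidates[name]),
                    reverse=True)
    return ranked[:budget]

# Toy data (hypothetical): inferred probabilities over two AQI classes.
candidates = {"A": [0.9, 0.1], "B": [0.5, 0.5], "C": [0.7, 0.3]}
chosen = recommend_stations(candidates, budget=2)
```

A practical version would re-run the inference after each pick, because adding one station changes the uncertainty of nearby candidates.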


Publication:

[1] Hsun-Ping Hsieh*, Shou-De Lin, Yu Zheng. Inferring Air Quality for Station Location Recommendation Based on Big Data. In Proceedings of the 21st SIGKDD conference on Knowledge Discovery and Data Mining (KDD 2015).

Step 4: Identify the Root Cause of Air Pollution

  1. Study the correlation between vehicular emissions and air quality.
  2. Identify the spatio-temporal causality between air pollutants of different cities.

Publication:

[1] Julie Yixuan Zhu, Chao Zhang, Huichu Zhang, Shi Zhi, Victor O.K. Li, Jiawei Han, and Yu Zheng, pg-Causality: Identifying Spatiotemporal Causal Pathways for Air Pollutants with Urban Big Data. IEEE Transactions on Big Data. DOI: 10.1109/TBDATA.2017.2723899 (Code and Data)

[2] Julie Yixuan Zhu, Chao Zhang, Yu Zheng, Shi Zhi, Victor O.K. Li, Jiawei Han. p-Causality: Identifying Spatio-temporal Causal Pathways for Air Pollutants with Urban Big Data, arXiv

Step 5: Study the Impact of Air Pollution on People's Health

People

  • Yu Zheng

    Vice President and Chief Data Scientist, JD Technology Group

Acknowledgement

We thank our partners from Microsoft product teams who have worked closely with us on this project.

Specifically, Jacky Hsu, Qinying Liao, and their team from the C+E division contributed the YourWeather app. (WPhone-CN; Android-CN, iOS)

We also thank our partners Stella Ye and Sandy Qi (from Bing), who made Urban Air available on Bing Maps http://cn.bing.com/ditu/.

Several interns have worked with us on the Urban Air project. We may not be able to list all of them here.

Yubiao Chen, Xuxu Chen, Hsun-Ping Hsieh, Furui Liu, Zhenni Feng, Zhangqing Shan, Ruiyuan Li, Xiuwen Yi.