YJMob100K: City-Scale and Longitudinal Dataset of Anonymized Human Mobility Trajectories
Creators
- 1. MIT
- 2. Yahoo Japan Corporation
- 3. University of Tokyo
Description
The YJMob100K human mobility datasets (YJMob100K_dataset1.csv.gz and YJMob100K_dataset2.csv.gz) contain the movements of a total of 100,000 individuals over a 75-day period, discretized into 30-minute intervals and 500-meter grid cells. The first dataset contains the movements of 80,000 individuals over a 75-day business-as-usual period, while the second contains the movements of 20,000 individuals over a 75-day period whose last 15 days fall during an emergency with unusual behavior.
While the name and location of the city are not disclosed, points-of-interest (POI; e.g., restaurants, parks) data for each grid cell (a vector of roughly 85 dimensions) are provided as supplementary information (cell_POIcat.csv.gz).
For details of the dataset, see arXiv preprint https://arxiv.org/abs/2307.03401
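The time discretization described above can be made concrete with a little arithmetic: 30-minute intervals give 48 slots per day, so 75 days span 3,600 time steps. The sketch below is only illustrative. The function names are hypothetical, and it assumes that days are numbered 0-74 (consistent with the "days 60-74" wording in the challenge description) and that slot 0 starts at midnight, which this page does not state.

```python
SLOTS_PER_DAY = 24 * 60 // 30  # 30-minute intervals -> 48 slots per day
NUM_DAYS = 75                  # length of the observation period
TOTAL_STEPS = NUM_DAYS * SLOTS_PER_DAY  # 3600 time steps in total


def to_step(day: int, slot: int) -> int:
    """Flatten a (day, slot) pair into a single time-step index."""
    if not (0 <= day < NUM_DAYS and 0 <= slot < SLOTS_PER_DAY):
        raise ValueError(f"out of range: day={day}, slot={slot}")
    return day * SLOTS_PER_DAY + slot


def from_step(step: int) -> tuple:
    """Invert to_step: recover (day, slot) from a flat time-step index."""
    if not 0 <= step < TOTAL_STEPS:
        raise ValueError(f"out of range: step={step}")
    return divmod(step, SLOTS_PER_DAY)


def slot_to_clock(slot: int) -> str:
    """Start time of a slot as HH:MM, ASSUMING slot 0 begins at 00:00."""
    minutes = slot * 30
    return f"{minutes // 60:02d}:{minutes % 60:02d}"
```

A flat step index of this kind is convenient when a trajectory is stored as a fixed-length sequence with one position per time step.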
Researchers may use this dataset for publications and reports under two conditions: 1) users shall not carry out activities that involve unethical usage of the data, including attempts at re-identifying data subjects, harming individuals, or damaging companies; and 2) the Data Descriptor paper (https://arxiv.org/abs/2307.03401; citation below) must be cited when using the data for research and/or commercial purposes. Downloading this dataset implies agreement with both conditions.
- Yabe, T., Tsubouchi, K., Shimizu, T., Sekimoto, Y., Sezaki, K., Moro, E., & Pentland, A. (2023). Metropolitan scale and longitudinal dataset of anonymized human mobility trajectories. arXiv preprint arXiv:2307.03401. https://arxiv.org/abs/2307.03401
--- Details about the Human Mobility Prediction Challenge 2023 (ended November 13, 2023) ---
The challenge is set in a mid-sized, highly populated metropolitan area somewhere in Japan. The area is divided into 500 meter x 500 meter grid cells, resulting in a 200 x 200 grid cell space.
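A 200 x 200 grid of 500-meter cells covers a 100 km x 100 km area. The sketch below shows how a metric offset could be mapped to a grid cell and how the distance between two cells could be computed. It is a minimal illustration under stated assumptions: the function names are hypothetical, and the grid origin, axis orientation, and 0-based cell indexing are assumptions not specified on this page.

```python
import math

CELL_SIZE_M = 500  # side length of one grid cell, in meters
GRID_SIZE = 200    # number of cells along each axis


def cell_of(east_m: float, north_m: float) -> tuple:
    """Map an offset in meters from an ASSUMED grid origin to a 0-based (x, y) cell."""
    x = int(east_m // CELL_SIZE_M)
    y = int(north_m // CELL_SIZE_M)
    if not (0 <= x < GRID_SIZE and 0 <= y < GRID_SIZE):
        raise ValueError(f"offset ({east_m}, {north_m}) lies outside the grid")
    return x, y


def cell_distance_km(a: tuple, b: tuple) -> float:
    """Straight-line distance between the centers of two cells, in kilometers."""
    dx_m = (a[0] - b[0]) * CELL_SIZE_M
    dy_m = (a[1] - b[1]) * CELL_SIZE_M
    return math.hypot(dx_m, dy_m) / 1000
```

Because positions are only known at cell resolution, any distance computed this way carries an uncertainty on the order of one cell width.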
The human mobility datasets (task1_dataset.csv.gz and task2_dataset.csv.gz) contain the movements of a total of 100,000 individuals across a 90-day period, discretized into 30-minute intervals and 500-meter grid cells. The first dataset contains the movements of a 75-day business-as-usual period, while the second dataset contains the movements of a 75-day period during an emergency with unusual behavior.
There are 2 tasks in the Human Mobility Prediction Challenge.
In Task 1, participants are provided with the full time series (75 days) for 80,000 individuals and a partial time series (only 60 days) for the remaining 20,000 individuals (task1_dataset.csv.gz). Given these data, Task 1 is to predict the movement patterns of those 20,000 individuals during days 60-74. Task 2 is a similar task but uses a smaller dataset of 25,000 individuals in total, 2,500 of whom have their locations during days 60-74 masked and need to be predicted (task2_dataset.csv.gz).
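The task setup above (full histories for most individuals, with days 60-74 hidden for a target subset) can be reproduced on any complete trajectory set for local validation. The following is a minimal sketch, not the organizers' procedure. The `Record` fields and the `split_for_task` name are hypothetical, since the actual file schema is not described on this page.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Record(NamedTuple):
    """One observation; field names are ASSUMED, not taken from the dataset files."""
    uid: int   # individual identifier
    day: int   # day index, 0-74
    slot: int  # 30-minute slot within the day
    x: int     # grid cell column
    y: int     # grid cell row


def split_for_task(
    records: Iterable[Record],
    target_uids: Set[int],
    first_hidden_day: int = 60,
    last_hidden_day: int = 74,
) -> Tuple[List[Record], List[Record]]:
    """Split records into (visible, hidden).

    Hidden records are those of target individuals that fall within the
    masked day range; everything else stays visible as model input.
    """
    visible, hidden = [], []
    for rec in records:
        masked = rec.uid in target_uids and first_hidden_day <= rec.day <= last_hidden_day
        (hidden if masked else visible).append(rec)
    return visible, hidden
```

The hidden list then serves as ground truth against which predictions for the target individuals can be scored.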
While the name and location of the city are not disclosed, participants are provided with points-of-interest (POI; e.g., restaurants, parks) data for each grid cell (a vector of roughly 85 dimensions) as supplementary information (cell_POIcat.csv.gz); its use in the challenge is optional.
For more details, see https://connection.mit.edu/humob-challenge-2023
Files (530.8 MB)
MD5 checksum | Size |
---|---|
93af4bb0b417cd20e95fdc1d8f7b459f | 626.6 kB |
900a6817ea800f6fc0146d70acf519b7 | 419.0 MB |
3e87860d96c57e89e126ffcd12955848 | 111.2 MB |
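The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below uses Python's standard `hashlib` module. The helper names are hypothetical, and because this listing does not say which file each checksum belongs to, the path and expected value must be supplied by the user.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Compare a file's MD5 with an expected value, tolerating an 'md5:' prefix."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of(path) == expected_hex
```

For example, `verify(path_to_download, "md5:...")` returns `True` only when the file on disk matches the published checksum.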