2407 09820v1
Abstract
The rise of dockless bike-sharing systems has led to increased interest in using bike-sharing
data for urban transportation and travel behavior research. However, few studies have
focused on the individual daily mobility patterns, hindering their alignment with the
increasingly refined needs of urban active transportation planning. To bridge this gap, this
study presents a two-layer framework, integrating improved flow clustering methods and
multiple rule-based decision trees, to mine individual cyclists' daily home-work commuting
patterns from vast dockless bike-sharing trip data with users’ IDs. The effectiveness and
trip records in Shenzhen. Ultimately, based on the mining results, we obtain two categories
with-transit commuters) and some interesting findings about their daily commuting patterns.
For instance, many bike-sharing commuters live near urban villages and old communities
with lower costs of living, especially in the central city. Only-biking commuters have a higher
proportion of overtime than Biking-with-transit commuters, and the Longhua Industrial Park,
day). A large number of commuters use bike-sharing for commuting to work more frequently than
for returning home, which is closely related to the over-demand for bike-sharing around
workplaces during commuting peak. Overall, this framework offers a cost-effective way to
understand residents' non-motorized mobility patterns. Moreover, it paves the way for
socio-economic attributes.
1. Introduction
and sustainable mode of transportation, which has a beneficial effect on reducing traffic
congestion, energy consumption, and air pollution (DeMaio, 2009; Handy et al., 2014). In
the past decade, the spread of bike-sharing programs has further expanded the benefits of
cycling. For example, the convenience of mobile payments and the flexibility of station-less
rental services have made dockless bike-sharing, one of the innovative bike-sharing systems,
widely accepted and utilized worldwide (Zhang & Mi, 2018; Si et al., 2019). These bike-
sharing programs enable cycling to play an essential role in solving the first-and-last-mile
trip problem and enhancing urban transport resilience (Fishman, 2016; Cheng et al., 2021;
Teixeira et al., 2021). Therefore, how to increase the cycling willingness of residents to
promote the development of sustainable and active transportation has received extensive
research attention.
have the limitations of high cost, low timeliness, and small sample size (Li et al., 2021). With
the advent of the big data era and new bike-sharing systems, the availability of GPS datasets
from bike-sharing operators has opened opportunities for cycling-related research. Existing
literature has proven that such GPS trajectory data have the advantages of objectivity, high
spatiotemporal resolution, and large sample volume (Lu & Liu, 2012). Meanwhile, many
scholars have used these data for cycling influence mechanisms analysis (Shen et al., 2018;
Ma et al., 2020; Gao et al., 2021a), travel pattern mining (Zhou, 2015; Du et al., 2019; Yao
et al., 2019; Cao et al., 2020; Zhang et al., 2021a; Gao et al., 2022), purpose inference (Xing
et al., 2020; Li et al., 2021; Ross-Perez et al., 2022), and benefit assessment (Zhang & Mi,
2018; Luo et al., 2019; Gao et al., 2021b). For instance, Shen et al. (2018) explored the factors
influencing bike usage based on nine consecutive days of bike-sharing trip records in
Singapore, and found that high land use mixtures, easy access to public transportation, and
more available cycling facilities are positively correlated with bike-sharing usage. In a study
that used a week of bike-sharing data collected in Shenzhen, Li et al. (2021) proposed a
framework for inferring the trip purpose of cyclists based on gravity models and Bayesian
rules, and revealed the spatiotemporal patterns of nine categories of travel activities.
Additionally, Zhang & Mi (2018) extracted bike-sharing usage and trip distances in Shanghai,
studies are meaningful as they deepen current understanding of the role bike-sharing plays in
focused on the daily travel habits of individual cyclists, despite some leveraging datasets that
contain user IDs. To date, the most relevant research has been conducted by a limited number
of scholars who attempt to explore the travel characteristics of different user groups, utilizing
user attributes information (e.g., age, gender, and membership) available within the docked
bike-sharing trip dataset (Zhou, 2015; Yao et al., 2019; Ma et al., 2020). For example, Zhou
(2015) constructed bike flow similarity graphs and used community detection techniques to
discover the different travel trends for customers and subscribers. Although these studies
contributed to understanding the differences in travel patterns within the cycling groups,
these methods are not applicable to dockless bike-sharing trip datasets that include little
individual attribute information (for privacy concerns). Moreover, the above research merely
categorizes cycling user groups based on the similarities in user attributes, rather than
Notably, mining the individual daily mobility patterns of bike-sharing users holds
significant implications for the increasingly refined planning of active transportation systems
(Ferretto et al., 2021). For instance, it can serve as a low-cost, high-coverage technique to
active travel patterns or cycling needs. Furthermore, if bike-sharing users' residential and
workplace information can be identified from individual daily mobility patterns, it would
enable the integration of various socio-economic data (e.g., housing price) to explore fine-
scale studies of cycling behaviors considering population differentiations (Xu et al., 2018;
bicycle-friendly environments.
So far, there have been some studies proposing methodological frameworks for mining
individual daily mobility patterns based on specific geotagged big data, such as cellphone
call detail records (CDR) data (Kung et al., 2014; Jiang et al., 2017; Yin et al., 2021), check-
in data (Cheng et al., 2011; Li et al., 2013; Wu et al., 2023), and smart card data (Sari Aslam
et al., 2019; Zhang et al., 2020). However, the geotagged data used in these studies do not
include any fields related to cycling trips, and thus we cannot identify individual cycling
mobility patterns from their mining results. Additionally, due to differences in data features,
travel characteristics, and influencing factors, dockless bike-sharing trip data are not suitable
as inputs for these frameworks. For example, Jiang et al. (2017) developed an integrated
pipeline that can parse, filter, and expand the CDR data to extract human mobility patterns.
However, since most bike-sharing trip data only record cycling origins and destinations
(ODs), rather than capturing continuous trajectory like CDR data, CDR-based extraction
methods are not suitable for bike-sharing data. In a study leveraging Twitter check-in data,
Cheng et al. (2011) proposed a recursive grid search method to detect users' homes and
sharing data into check-in-like data by delineating the origin and destination of each trip, this
approach results in the loss of key cycling attributes (e.g., trip distance and duration). Thus,
applying check-in-based mining methods to bike-sharing data can only exploit part of the
information in the data. Compared with the geotagged data mentioned above, the features of smart card
data are closer to those of bike-sharing data. Recently, several studies have proposed methods
Sari Aslam et al. (2019) for detecting the residence and workplace of individuals, and a
decision tree method presented by Zhang et al. (2020) for identifying the individual stay areas.
Nevertheless, note that the locations of transit stations in the smart card data are fixed, which
is significantly different from dockless bike-sharing. Moreover, the travel characteristics and
influencing factors of public transport also differ from those of cycling (e.g., cycling trips are
shorter and more affected by weather). Hence, there are limitations in using the
extraction method based on smart card data for dockless bike-sharing data.
In summary, to address the gaps in related studies, this paper will present a two-layer
framework that aims to capture the most dominant daily mobility pattern of individual
develop flow clustering methods with improved spatiotemporal constraints tailored to the
flow clusters, effectively representing the daily travel trajectories of individuals, from the
biking records that lack geocoding information. However, these trajectories identified in
decision trees that incorporate round-trip journeys, working hours, and public transportation
transfers for identifying daily commuting trips within individual spatiotemporal flow clusters
examine the effectiveness and applicability of this two-layer framework, this paper conducts
an empirical study using comparative analysis and residence location test in Shenzhen, China,
a metropolis with over one million daily bike-sharing trips. Finally, based on the extracted
results of individual commuting trips of bike-sharing users, we further analyze their daily
Preprint submitted to Elsevier Page 6
commuting characteristics and spatiotemporal patterns and discuss some meaningful findings
the most densely populated and economically prosperous regions in China. By the end of
2021, Shenzhen had a permanent population of over 17 million and a regional GDP of over
300 billion RMB (Guangdong Statistical Yearbook, 2021). The high-frequency population
mobility and booming economic activities are accompanied by huge travel demand. The
well-established public transportation systems (11 metro lines and 927 bus lines had been
opened as of 2021; Transportation Bureau of Shenzhen, 2021) and shared mobility services
(e.g., bike-sharing and ride-sharing) have played an important role in meeting residents'
travel needs. Among them, the dockless bike-sharing system was first introduced to Shenzhen
in 2016. After the initial period of market dominance and the subsequent period of policy
regulation, bike-sharing services have recently been integrated into the daily mobility of local
residents. As of July 2022, Shenzhen had over 41,000 dockless shared bikes with an average
of approximately 1.29 million daily trips (Statistics Bureau of Shenzhen, 2022). Usage
hotspots are notably in the Futian, Nanshan, Luohu, Bao'an, and Longhua districts (Fig.1),
with the extensive bike-sharing trip data offering a rich resource for this paper to mine
The dockless bike-sharing data used in this study are collected from the Shenzhen
government data open platform (https://opendata.sz.gov.cn/). The dataset stores over 244
million riding records between January and August 2021, which include the user IDs and
the coordinates and time information of each trip's OD. Notably, all user IDs are encrypted and no
personal privacy information can be obtained (Table 1). In addition, considering the integrity
and continuity of the raw dataset, we finally extract approximately 146 million records that
occurred on all weekdays between April 8 and August 28, 2021 for the empirical study below
(Fig.1). The exclusion of bike-sharing records during weekends and holidays is due to the
substantial occurrence of non-commuting trips during these periods, which could increase
data noise.
Moreover, this study acquired historical daily weather data for the study period
and passing bus or metro routes information) in 2021 (https://lbs.amap.com/). The former is
3. Methodology
The flowchart of this paper is depicted in Fig.2. Initially, the "Data filtering and identify active
bike-sharing users" step aims to eliminate abnormal cycling records and inactive bike-sharing
users to enhance data quality. Then, the novel two-layer framework is employed to mine
individual daily cycling commuting patterns, which is the centerpiece of this paper.
Subsequently, the "Evaluation and validation" step intends to examine the effectiveness and
applicability of our framework through the comparative analyses of flow clustering methods
and the testing of users' residences. Finally, we aggregate and visualize the mining results of
bike-sharing users to reveal their commuting regularities and spatiotemporal patterns in the
study area.
To ensure the accuracy and authenticity of the bike-sharing data applied to this study,
anomalous or redundant records need to be cleaned. First, with reference to existing studies
(Shen et al., 2018; Gao et al., 2021), cycling trips with unrealistically long or short distances or
durations, caused by GPS drift errors or user misoperation, are eliminated. Afterwards, we
aggregate the trips of each bike-sharing user based on user IDs and exclude duplicate records
of the same user. Ultimately, the cycling records of about 2.53 million users are
extracted.
Moreover, by tallying the number of active days (i.e., days with at least one trip record)
for users in the filtered weekday data (Fig.3(a)), we also observe the issue of data
sparsity. While some users heavily rely on bike-sharing for their daily activities, others, such
as tourists or occasional cyclists, contribute sporadically to the dataset. For the latter, their
limited trips cannot adequately capture their daily cycling habits. Hence, it is necessary to
In a related study, Xu et al. (2018) defined active users in CDR data as those with at
least one record for at least half of the study period. However, for bike-sharing datasets, it is
crucial to consider the influence of weather on daily cycling. Existing studies have indicated
that rainfall can significantly restrict cycling during commuting hours, as people tend to
choose other safer transport modes (Reiss & Bogenberger, 2016; Shen et al., 2018). Similarly,
in the dataset we used, bike-sharing usage is generally observed to be lower on drizzly and
rainy days (Fig.3(b)). Hence, expanding on the approach of Xu et al. (2018), we exclude the
rain-impacted weekdays to establish the threshold for active bike-sharing users, calculated as
half of the total number of weekdays in the data collection period, minus the number of drizzly
and rainy days. In this study, with 100 weekdays and 21 drizzly and rainy days, the threshold is
set at 29 days (100/2 − 21), thus identifying approximately 0.75 million active bike-sharing users for
subsequent processing.
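The threshold arithmetic above can be sketched as follows. This is only an illustrative reading of the rule: the function names are hypothetical, and both the use of integer division for an odd number of weekdays and the "at least" comparison are assumptions rather than details stated in the text.

```python
def active_day_threshold(total_weekdays: int, rainy_weekdays: int) -> int:
    """Half of all weekdays in the study period, minus the rain-affected weekdays."""
    # Integer division is an assumption for periods with an odd number of weekdays.
    return total_weekdays // 2 - rainy_weekdays


def is_active_user(active_days: int, total_weekdays: int = 100,
                   rainy_weekdays: int = 21) -> bool:
    """Assumed rule: a user is active if active_days reaches the threshold."""
    return active_days >= active_day_threshold(total_weekdays, rainy_weekdays)


print(active_day_threshold(100, 21))  # 100 weekdays, 21 drizzly or rainy days
```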
Fig.3 (a) Histogram of the number of active days for bike-sharing users on weekdays; (b)
Fig. 4 shows the diagram of our two-layer framework. In Layer 1, to address the lack of
(ISTFCs) representing the user's daily cycling trajectories. In Layer 2, multiple rule-based
decision trees that integrate round-trip journeys, working hours, and public transportation
transfers are built to identify bike-sharing commuting behaviors from the ISTFCs extracted
in Layer 1.
In this paper, the methods in Layer 1 can be divided into three essential steps: Spatial
This step aims to extract the daily trajectories of individual bike-sharing users from the
spatial perspective. In this study, we apply the spatial flow clustering method proposed by
Gao et al. (2020) and make enhancements based on the travel characteristics of bike-sharing.
In the original method, spatial dissimilarity 𝑆𝐷𝑖𝑗 is the key indicator for clustering, which is
calculated as follows:
$$SD_{ij} = \sqrt{sd_{ijo}^{2} + sd_{ijd}^{2}} \qquad [1]$$
where 𝑠𝑑𝑖𝑗𝑜 and 𝑠𝑑𝑖𝑗𝑑 respectively represent the spatial dissimilarity between the OD of
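$$\begin{cases} sd_{ijo} = \dfrac{dist(O_i, O_j)}{\alpha \times \min(len_i, len_j)} \\[6pt] sd_{ijd} = \dfrac{dist(D_i, D_j)}{\alpha \times \min(len_i, len_j)} \end{cases}, \quad \alpha \times \min(len_i, len_j) \le 200 \qquad [2]$$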
where 𝑑𝑖𝑠𝑡(𝑂𝑖 , 𝑂𝑗 ) and 𝑑𝑖𝑠𝑡(𝐷𝑖 , 𝐷𝑗 ) denote the Euclidean distance between the same
endpoints of two flows. 𝑙𝑒𝑛𝑖 and 𝑙𝑒𝑛𝑗 are the lengths of two flows, respectively. 𝛼 is a size
coefficient which sets the radius of the boundary circle together with 𝑚𝑖𝑛(𝑙𝑒𝑛𝑖 , 𝑙𝑒𝑛𝑗 ), as
displayed in Fig.4(a). In this paper, referring to existing research (Gao et al., 2020, Liu et al.,
However, note that the formula of 𝑆𝐷𝑖𝑗 determines that the radius of the boundary circle
clustering. Although this feature has limited impact on regional-level flow studies, for
individual-level studies, it introduces noise into clustering results and increases
the uncertainty in the extent of an individual's daily activities. For example, with
𝑚𝑖𝑛(𝑙𝑒𝑛𝑖 , 𝑙𝑒𝑛𝑗 )=3000 m, the boundary circle radius is 900 m, covering an area of 2.54 km².
Hence, to obtain more realistic individual biking flows, we cap the maximum radius of the
boundary circle at 200 m, following precedents in bike-sharing research (Yang et al., 2019;
Li et al., 2021). After setting the parameters, flows 𝑖 and 𝑗 are deemed spatially similar
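As a minimal sketch of Eqs. [1] and [2] with the 200 m cap, the capped spatial dissimilarity could be computed as below. Several points are assumptions for illustration: coordinates are taken to be in a projected system with metre units, the flow length is taken as the straight-line OD distance, 𝛼 = 0.3 is inferred from the 3000 m / 900 m example above, and the function names are hypothetical. The similarity threshold applied to 𝑆𝐷𝑖𝑗 is not reproduced here.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two (x, y) points in metres."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def boundary_radius(len_i, len_j, alpha=0.3, max_radius=200.0):
    """Boundary circle radius: alpha * min(len_i, len_j), capped at max_radius."""
    return min(alpha * min(len_i, len_j), max_radius)


def spatial_dissimilarity(flow_i, flow_j, alpha=0.3, max_radius=200.0):
    """SD_ij = sqrt(sd_ijo^2 + sd_ijd^2) for two flows given as (origin, destination)."""
    (o_i, d_i), (o_j, d_j) = flow_i, flow_j
    radius = boundary_radius(euclidean(o_i, d_i), euclidean(o_j, d_j),
                             alpha, max_radius)
    if radius == 0:
        raise ValueError("zero-length flow: boundary circle radius is undefined")
    sd_o = euclidean(o_i, o_j) / radius  # origin dissimilarity
    sd_d = euclidean(d_i, d_j) / radius  # destination dissimilarity
    return math.hypot(sd_o, sd_d)
```

With `max_radius=float("inf")` the function falls back to the uncapped radius of the original method, which makes it easy to compare the two variants on the same pair of flows.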
Then, taking all riding records of each active bike-sharing user as input, spatial flow
clustering is performed according to the algorithm proposed by Gao et al. (2020). Finally,
each individual spatial flow cluster (ISFC) can be denoted as {𝐼𝐷𝑈𝑆𝐸𝑅 , 𝐼𝐷𝐼𝑆𝐹𝐶 , (𝑂, 𝐷), 𝑛},
where 𝐼𝐷𝐼𝑆𝐹𝐶 is the unique identifier of each ISFC, (𝑂, 𝐷) are the OD medoids of all biking
flows in ISFC, and 𝑛 is the count of flows in ISFC. Notably, given that some ISFCs may
include insufficient trips to represent a user's daily patterns, we set a minimum threshold for
the number of biking flows in each user's ISFCs: one-fifth of the number of weekdays with
recorded bike-sharing usage. Only the ISFCs that satisfy the threshold requirement are
Based on the results of the spatial flow clustering approach, this step further improves
the spatiotemporal flow clustering method proposed by Yao et al. (2018) to extract individual
user's daily mobility patterns from the temporal perspective. The core of their method is the
$$ts_{ij} = \frac{T_i \cap T_j}{T_i \cup T_j} \qquad [3]$$
where 𝑇𝑖 = [𝑜𝑡𝑖 , 𝑑𝑡𝑖 ] and 𝑇𝑗 = [𝑜𝑡𝑗 , 𝑑𝑡𝑗 ] denote the time spans of flows 𝑖 and 𝑗 in the same
time spans of 𝑖 and 𝑗 overlap, 𝑡𝑠𝑖𝑗 is greater than zero. For instance, when 𝑇𝑖 = [8:00, 8:40]
and 𝑇𝑗 = [8:15, 8:50], 𝑇𝑖 ∩ 𝑇𝑗 is 25 min while 𝑇𝑖 ∪ 𝑇𝑗 is 50 min, so 𝑡𝑠𝑖𝑗 is 0.5.
It is noteworthy that, due to the individual-level focus and the average of 3.6 bike-sharing
trips per weekday among active users, we deem it impractical to calculate
𝑡𝑠𝑖𝑗 for travel flows on specific adjacent dates, as conducted by Yao et al. (2018). Instead, this
timeframe. Simultaneously, this strategy is also more conducive to capturing the genuine
mobility of bike-sharing users, because most residents follow regular daily travel patterns,
especially commuting trips. For instance, if 𝑇𝑖 above occurs on a Monday and 𝑇𝑗 on a
Friday, we still treat their time spans as overlapping. Moreover, previous research has
validated the application of the temporal similarity indicator in taxi trip data (Yao et al., 2018).
Nevertheless, bike-sharing trips are typically shorter (the average cycling duration for the
dataset we used is around 10 min), which can result in a zero temporal similarity even if the
travel times of the two biking flows are sufficiently close (e.g., when 𝑇𝑖 =[8:05,8:15] and
𝑇𝑗 (i.e., 𝑇𝑖 = [𝑜𝑡𝑖 − 𝛽, 𝑑𝑡𝑖 + 𝛽] and 𝑇𝑗 = [𝑜𝑡𝑗 − 𝛽, 𝑑𝑡𝑗 + 𝛽]) to ensure that the time-adjacent
cycling flows can be identified and clustered. In this study, 𝛽 is set to 30 min (more details
that the travel times of flows 𝑖 and 𝑗 are adjacent when 𝑡𝑠𝑖𝑗 ≥0.5.
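The temporal similarity of Eq. [3] and the effect of the expansion coefficient 𝛽 can be illustrated with a short sketch. Times are expressed as minutes since midnight and only the time of day is compared, in line with the date-independent strategy described above. The function name and the second time span in the usage example are hypothetical.

```python
def temporal_similarity(span_i, span_j, beta=0.0):
    """ts_ij = |T_i ∩ T_j| / |T_i ∪ T_j| after widening both spans by beta minutes.

    Each span is (departure, arrival) in minutes since midnight.
    """
    start_i, end_i = span_i[0] - beta, span_i[1] + beta
    start_j, end_j = span_j[0] - beta, span_j[1] + beta
    overlap = min(end_i, end_j) - max(start_i, start_j)
    if overlap <= 0:
        return 0.0  # the (expanded) spans do not overlap
    union = max(end_i, end_j) - min(start_i, start_j)
    return overlap / union


# Worked example from the text: [8:00, 8:40] vs. [8:15, 8:50] gives 25 / 50.
print(temporal_similarity((480, 520), (495, 530)))

# Two short trips, [8:05, 8:15] and a hypothetical [8:20, 8:30]:
print(temporal_similarity((485, 495), (500, 510)))           # no overlap
print(temporal_similarity((485, 495), (500, 510), beta=30))  # 55 / 85
```

In the last case the expanded spans are [7:35, 8:45] and [7:50, 9:00], so the two flows pass the 0.5 criterion even though the unexpanded spans do not overlap at all.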
Later, we use the biking records included in each user's ISFC as input and execute the
spatiotemporal flow clustering algorithm by Yao et al. (2018). Ultimately, each ISTFC can
be denoted as {𝐼𝐷𝑈𝑆𝐸𝑅 , 𝐼𝐷𝐼𝑆𝐹𝐶 , 𝐼𝐷𝐼𝑆𝑇𝐹𝐶 , (𝑂, 𝐷), 𝑛′ , 𝑇𝑜 , 𝑇𝑑 } , where 𝐼𝐷𝐼𝑆𝐹𝐶 is the unique
identifier of the ISFC to which the ISTFC belongs, 𝐼𝐷𝐼𝑆𝑇𝐹𝐶 is the unique identifier of each
ISTFC. 𝑛′ is the number of biking flows in the ISTFC, and 𝑇𝑜 and 𝑇𝑑 are the average starting
and ending time of these flows, respectively. The resulting ISTFCs are used in the subsequent
processing.
By observing the result of spatiotemporal flow clustering, we find that some ISTFCs are
spatiotemporally adjacent but not merged, as illustrated in Fig.4(c). The reasons are relevant
to two aspects: First, some bike-sharing users have multiple optional routes to and from the
same daily activity places. The locations of ODs (e.g., different entrances to an industrial
park) and the direction of their trip flows vary with the different routes, which leads to
difficulties in clustering them into the same ISTFC. Second, in "Spatial flow clustering" step,
the restriction of boundary circle may result in dividing the cycling flows into more ISFCs.
individual daily activities. To improve the utilization of biking records for these affected
users, we examine and merge neighboring ISTFCs in the last step of Layer 1.
Given a set of all ISTFCs for an active bike-sharing user 𝐹𝐶 and the size coefficient 𝛼,
the process of neighbor ISTFCs merging is shown in Algorithm 1. In short, two ISTFCs 𝐹𝐶𝑖
(1) The temporal similarity 𝑡𝑠𝑖𝑗 is not less than 0.5, which is consistent with
(2) The boundary circles at the same endpoints of 𝐹𝐶𝑖 and 𝐹𝐶𝑗 must intersect (i.e., the
distance between these endpoints should be less than twice 𝛼 × 𝑚𝑖𝑛(𝑙𝑒𝑛𝑖 , 𝑙𝑒𝑛𝑗 )),
and the radius of the boundary circle is calculated in the same way as in the "Spatial flow
clustering" step.
When 𝐹𝐶𝑖 and 𝐹𝐶𝑗 satisfy the above conditions, 𝐹𝐶𝑗 is merged into 𝐹𝐶𝑖 . Meanwhile, the
Input: 𝐹𝐶 = {𝐹𝐶𝑖 |1 ≤ 𝑖 ≤ 𝑛} ← a set of all ISTFCs for an active bike-sharing user; and 𝛼
Return: A set of all ISTFCs for this user after merging 𝐹𝐶 = {𝐹𝐶𝑖 |1 ≤ 𝑖 ≤ 𝑚}.
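The two merging conditions can be expressed as a single predicate, sketched below. This is not a reproduction of Algorithm 1: it covers only the test applied to a pair of ISTFCs, not how the merged cluster is updated. The `FlowCluster` container, the helper names, the use of the 𝛽-expanded temporal similarity, and the default values 𝛼 = 0.3, 𝛽 = 30 min and a 200 m radius cap are assumptions drawn from the earlier steps.

```python
import math
from dataclasses import dataclass


@dataclass
class FlowCluster:
    origin: tuple   # (x, y) in metres
    dest: tuple     # (x, y) in metres
    t_o: float      # mean departure time, minutes since midnight
    t_d: float      # mean arrival time, minutes since midnight


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def _temporal_similarity(a, b, beta):
    overlap = min(a.t_d, b.t_d) - max(a.t_o, b.t_o) + 2 * beta
    if overlap <= 0:
        return 0.0
    union = max(a.t_d, b.t_d) - min(a.t_o, b.t_o) + 2 * beta
    return overlap / union


def can_merge(fc_i, fc_j, alpha=0.3, max_radius=200.0, beta=30.0, ts_min=0.5):
    """True if the two clusters satisfy conditions (1) and (2)."""
    # Condition (1): temporal similarity is not less than 0.5.
    if _temporal_similarity(fc_i, fc_j, beta) < ts_min:
        return False
    # Condition (2): boundary circles at both pairs of endpoints intersect,
    # i.e. the endpoint distance is less than twice the (capped) radius.
    min_len = min(_dist(fc_i.origin, fc_i.dest), _dist(fc_j.origin, fc_j.dest))
    radius = min(alpha * min_len, max_radius)
    return (_dist(fc_i.origin, fc_j.origin) < 2 * radius
            and _dist(fc_i.dest, fc_j.dest) < 2 * radius)
```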
Similarly, through the above flow clusters merging process, some ISTFCs still have
they contain. To do so, we set a minimum threshold for filtering ISTFCs: an ISTFC must
contain at least 20% of the number of biking flows within its corresponding ISFC (i.e., 𝑛′ ≥
0.2 × 𝑛). Only the ISTFCs that satisfy this threshold are deemed reliable and employed as
inputs to Layer 2.
While the ISTFCs acquired in Layer 1 capture the spatiotemporal patterns of individual
daily mobility, they lack semantic information about the associated activities. In Layer 2, we
build three rule-based decision trees considering round-trip journeys, working hours, and
public transportation transfers, aiming to mine individual commuting patterns from the
Initially, we develop the "Candidate commuting flow identifier" decision tree (Fig.5),
focusing on round-trip frequencies and working hours. This identifier aims to extract latent
daily commuting patterns, i.e., individual candidate commuting flows (ICCFs), from the
(1) The commuting behavior should be characterized by frequent and symmetrical (i.e.,
(2) There should be a substantial time interval between the biking flows in opposite
Hence, in "Candidate commuting flow identifier", given two ISTFCs 𝐹𝐶𝑖 and 𝐹𝐶𝑗 for
Specifically, we require that both 𝑑𝑖𝑠𝑡(𝑂𝑖 , 𝐷𝑗 ) and 𝑑𝑖𝑠𝑡(𝑂𝑗 , 𝐷𝑖 ) be less than boundary circle
adjacency, we further evaluate whether the time interval between them exceeds the minimum
working hours threshold 𝑇𝑤ℎ , as shown in Fig.4(d). Following Sari Aslam et al. (2019), this
paper sets 𝑇𝑤ℎ at 4 hours to effectively capture the daily working behaviors of full-time,
part-time, and shift workers. If 𝐹𝐶𝑖 and 𝐹𝐶𝑗 satisfy the above two conditions, they are
{𝐼𝐷𝑈𝑆𝐸𝑅 , 𝐼𝐷𝐼𝐶𝐶𝐹 , 𝐼𝑆𝑇𝐹𝐶𝑒 , 𝐼𝑆𝑇𝐹𝐶𝑙 } , where 𝐼𝑆𝑇𝐹𝐶𝑒 and 𝐼𝑆𝑇𝐹𝐶𝑙 denote the ISTFCs with
earlier and later travel time, respectively. Notably, before inputting the user's ISTFCs into
the identifier, we sort them by the number of cycling flows they encompass in descending
order. This step is taken because larger flow clusters tend to encapsulate a user's daily activity
patterns. Meanwhile, by prioritizing the traversal of these clusters, we can also expedite the
of a pair of ISTFCs into a single flow (Fig.4(e)). The direction of simplified ICCF (SICCF)
is set to match the ISTFC with earlier travel time (i.e., 𝐼𝑆𝑇𝐹𝐶𝑒 ). Thus, for each SICCF, its
origin (i.e., 𝑂𝐼𝐶𝐶𝐹 ) is the midpoint of the origin of 𝐼𝑆𝑇𝐹𝐶𝑒 and the destination of 𝐼𝑆𝑇𝐹𝐶𝑙 .
Conversely, 𝐷𝐼𝐶𝐶𝐹 is the midpoint of the origin of 𝐼𝑆𝑇𝐹𝐶𝑙 and the destination of 𝐼𝑆𝑇𝐹𝐶𝑒 .
Additionally, we define the following eight attributes for each SICCF to identify and analyze
(1) Departure time for the ISTFC with earlier travel time (𝑇𝑒 ): the 𝑇𝑜 of 𝐼𝑆𝑇𝐹𝐶𝑒 ;
(2) Departure time for the ISTFC with later travel time (𝑇𝑙 ): the 𝑇𝑜 of 𝐼𝑆𝑇𝐹𝐶𝑙 ;
(3) Cycling time for the ISTFC with earlier travel time (𝐶𝑇𝑒 ): the difference between 𝑇𝑜
and 𝑇𝑑 of 𝐼𝑆𝑇𝐹𝐶𝑒 ;
(4) Cycling time for the ISTFC with later travel time (𝐶𝑇𝑙 ): the difference between 𝑇𝑜
and 𝑇𝑑 of 𝐼𝑆𝑇𝐹𝐶𝑙 ;
(5) Cycling commuting distance (𝐶𝐷): the Euclidean distance between 𝑂𝐼𝐶𝐶𝐹 and 𝐷𝐼𝐶𝐶𝐹 ;
(6) Working hours (𝑊𝐻): the difference between the 𝑇𝑑 of 𝐼𝑆𝑇𝐹𝐶𝑒 and the 𝑇𝑜 of 𝐼𝑆𝑇𝐹𝐶𝑙 ;
(7) Total number of biking flows (𝑛𝑡 ): the sum of the biking flows in 𝐼𝑆𝑇𝐹𝐶𝑒 and 𝐼𝑆𝑇𝐹𝐶𝑙 ;
(8) Cycling round-trip rate (𝑅𝑟𝑡 ): the ratio of the number of biking flows in 𝐼𝑆𝑇𝐹𝐶𝑙 to
𝑛𝑡 ; this indicator can measure the imbalance in commuting frequencies between the
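A rough sketch of how a pair of ISTFCs might be tested as a candidate commuting pair and then collapsed into an SICCF with the eight attributes listed above is given here. The `ISTFC` container and function names are hypothetical, coordinates are assumed to be in metres and times in minutes since midnight, and the adjacency radius is left as a parameter whose 200 m default is an assumption based on the boundary circle cap.

```python
import math
from dataclasses import dataclass


@dataclass
class ISTFC:
    origin: tuple  # (x, y) in metres
    dest: tuple    # (x, y) in metres
    t_o: float     # mean departure time, minutes since midnight
    t_d: float     # mean arrival time, minutes since midnight
    n: int         # number of biking flows in the cluster


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def _midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def is_candidate_pair(fc_i, fc_j, radius=200.0, t_wh=240.0):
    """Symmetrical OD pattern plus a gap longer than the working-hours threshold."""
    fc_e, fc_l = sorted((fc_i, fc_j), key=lambda fc: fc.t_o)
    symmetrical = (_dist(fc_i.origin, fc_j.dest) < radius
                   and _dist(fc_j.origin, fc_i.dest) < radius)
    return symmetrical and (fc_l.t_o - fc_e.t_d) > t_wh


def simplify(fc_i, fc_j):
    """Collapse a candidate pair into one flow (SICCF) with its attributes."""
    fc_e, fc_l = sorted((fc_i, fc_j), key=lambda fc: fc.t_o)
    origin = _midpoint(fc_e.origin, fc_l.dest)
    dest = _midpoint(fc_l.origin, fc_e.dest)
    n_t = fc_e.n + fc_l.n
    return {
        "O": origin, "D": dest,
        "T_e": fc_e.t_o, "T_l": fc_l.t_o,
        "CT_e": fc_e.t_d - fc_e.t_o, "CT_l": fc_l.t_d - fc_l.t_o,
        "CD": _dist(origin, dest),
        "WH": fc_l.t_o - fc_e.t_d,
        "n_t": n_t, "R_rt": fc_l.n / n_t,
    }
```

Under this definition, a round-trip rate below 0.5 means the later (typically homeward) direction is cycled less often than the earlier one.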
accounting for public transit transfers, to identify latent daily transfer commuting behaviors
from users' SICCFs. This consideration arises from research indicating that transferring to
public transportation, especially the metro, is an important travel purpose of bike-sharing
(Xing et al., 2020; Li et al., 2021). Additionally, the integrated use of bike-sharing and public
transportation has attracted significant research attention recently (Ma et al., 2019; Guo &
He, 2020; Kim, 2023). Therefore, it is crucial to determine whether bike-sharing users
regularly cycle to connect with public transit for their daily commuting. The workflow of the
(1) Take the public transport station data and a user's SICCF as input, and set a maximum
referring to Liu et al. (2022), and 30 m for buses, which are deemed less attractive
(2) If the SICCF's departure time is outside the public transportation operating hours
(from 6:00 to 23:30 in our study area), it is considered not connected to public
(3) Identify the nearest public transport stations to the OD of the SICCF (i.e., 𝑂𝐼𝐶𝐶𝐹 and
both exceed 𝑇𝐷, this SICCF is deemed not connected to public transit. Otherwise,
we proceed.
(4) If 𝑑𝑖𝑠𝑡(𝑂𝐼𝐶𝐶𝐹 , 𝑆𝑜 ) < 𝑑𝑖𝑠𝑡(𝐷𝐼𝐶𝐶𝐹 , 𝑆𝑑 ), i.e., the SICCF's origin is closer to its nearest
public transport station, we still cannot conclude that this user regularly commutes
transport stations often coexist with various activity places, especially around metro
stations (Liu et al., 2022). In this case, we need to further compare the distance from
the SICCF's other endpoint (i.e., 𝐷𝐼𝐶𝐶𝐹 ) to its nearest public transport station (i.e.,
𝑆𝑑 ) with this SICCF's length (i.e., 𝐶𝐷). If 𝑑𝑖𝑠𝑡(𝐷𝐼𝐶𝐶𝐹 , 𝑆𝑑 ) < 𝐶𝐷 and 𝑆𝑜 and 𝑆𝑑 are
on the same public transit line, the SICCF is considered not connected to public
transit, because the user has chosen a longer cycling route instead of a shorter
Note that for each SICCF, we employ the "Transfer commuting flow identifier" to
assess connections with both the bus and metro systems. When a SICCF qualifies for connectivity
with both, the metro is prioritized over the bus (Guo & He, 2020).
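The workflow of the identifier can be approximated by the following sketch for a single transit mode. It is a simplified reading under stated assumptions, not the decision tree in the figure: the station records (`xy`, `lines`), the function names, and the SICCF dictionary keys are hypothetical; the threshold distance `td` is left as a required argument; the operating-hours check is applied to both departure times; and the case where the destination is the closer endpoint is assumed to mirror the origin case described in step (4).

```python
import math


def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def _nearest(point, stations):
    return min(stations, key=lambda s: _dist(point, s["xy"]))


def transfer_endpoint(siccf, stations, td, open_min=6 * 60, close_min=23 * 60 + 30):
    """Return "origin", "destination", or None (not connected to transit)."""
    # Step 2: departures outside operating hours cannot be transit transfers.
    if not all(open_min <= siccf[k] <= close_min for k in ("T_e", "T_l")):
        return None
    # Step 3: nearest stations to both endpoints; reject if both are too far.
    s_o = _nearest(siccf["O"], stations)
    s_d = _nearest(siccf["D"], stations)
    d_o = _dist(siccf["O"], s_o["xy"])
    d_d = _dist(siccf["D"], s_d["xy"])
    if d_o > td and d_d > td:
        return None
    # Step 4: take the endpoint closer to its station, then check whether the
    # other endpoint could have been reached by riding the same transit line.
    near, far_dist = ("origin", d_d) if d_o < d_d else ("destination", d_o)
    same_line = bool(s_o["lines"] & s_d["lines"])
    if far_dist < siccf["CD"] and same_line:
        return None
    return near
```

Running the function once with metro stations and once with bus stations, each with its own `td`, and preferring the metro result when both are non-empty would follow the prioritization described above.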
Finally, we build the "Biking commuting user classifier" decision tree to identify and
categorize the most predominant daily commuting patterns among individual bike-sharing
users (Fig.7). In our study, the SICCF with the highest count of biking records (i.e., 𝑛𝑡 ) is
deemed most representative of a bike-sharing user's daily commuting patterns during the
study period and is designated as the individual daily commuting flow (IDCF). Users are
classified into two main categories: Only-biking and Biking-with-transit commuters, and the
commuters, drawing insights from relevant studies (Singleton & Clifton, 2014; Guo et al.,
2021). The definitions of the different user categories in Fig.7 are outlined as follows:
(2) If the IDCF lacks a connection to public transit, this user is classified as an Only-
The OD of the IDCF represent this user's residence and workplace, respectively.
(3) If the IDCF's origin is connected to public transit, it signifies that the IDCF represents
the user's daily "last-mile" commuting to work by bicycling from a transit station (or
the "first-mile" commuting from his/her workplace to the transit station after work).
The IDCF's origin indicates the transit station where the user starts his/her daily
cycling to work, while the destination stands for his/her workplace. However, in this
scenario, the user's daily commuting process is incomplete, as it lacks the segment
where the user travels between the residence and another transfer station. Thus, we
need to search for his/her remaining SICCFs that satisfy the following conditions to
The transfer station of this SICCF and the IDCF are different;
This SICCF is temporally close to the IDCF, meaning the time difference between
this SICCF's and the IDCF's 𝑇𝑒 and the time difference between this SICCF's
additional daily commuting flow (IADCF), and the process proceeds to the next step.
biking for the "last mile" from transit station to his/her workplace (or the "first mile"
from his/her workplace to transit station after work). Similarly, if the IDCF's origin
exclusively on bicycling for the "first mile" from his/her residence to transit station
(or the "last mile" from transit station to his/her residence after work).
(4) If the origin of the user's IDCF is connected to a transit station and an IADCF is
identified, this IADCF represents the user's daily "first mile" commuting by bicycling
from his/her residence to another transit station (or the "last mile" commuting when
returning home from another transfer station after work). In this scenario, by
combining the IDCF with the IADCF, the complete daily commuting pattern,
is connected to public transit while an IADCF is found, this user is also classified as a
Biking-transit-biking commuter.
commuters. Moreover, there are differences in the commuting characteristics of the various
Supplemental files.
this paper will evaluate the performance of the improved spatiotemporal flow clustering
commuters.
For Layer 1, we contrast the clustering results of the original methods with our enhanced
methods using multiple indicators. For the spatial flow clustering method, we compute four
indicators: the average number of biking records included in each ISFC, the average length
of ISFCs, and the average distance from the OD of each biking record to the OD of its
corresponding ISFC (later abbreviated as the average distances to ISFCs' origins and
destinations, respectively). These indicators are used to highlight the positive effect of restricting
coefficients 𝛽 on the average number of biking records contained in each ISTFC and the
average maximum time interval within each ISTFC (i.e., the mean difference between the
earliest departure and the latest arrival times of the trip records within each ISTFC). This
For Layer 2, due to the unavailability of travel survey data on cycling habits within the
study area and the significant bias between the census population and the bike-sharing users,
this work employs residential land use data (accessed from Shenzhen Municipal
our extracted users’ residences. Specifically, we first extract the users who have identifiable
residential locations, then measure the distances from their residences to the actual
boundaries of residential land parcels. If a user's residence falls within residential land, the
distance is set to zero. Lastly, we plot the cumulative percentage of users whose identified
residences are within 0 to 300 meters of the actual residential land. If the majority of users'
residences are located within or near the actual boundaries of residential land, it would
demonstrate a close match between the bike-sharing commuters' residences and the actual
residential land use distribution. Notably, this paper does not perform the same validation for
the identified workplaces, as residents in different occupations have diverse workplaces not
limited to office spaces or industrial parks, which is more likely to lead to omissions and
misclassifications.
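The cumulative percentage used in the residence location test can be computed as in the sketch below. It assumes the distance from each identified residence to the nearest residential land parcel has already been obtained (with 0 for residences inside a parcel), for example with a GIS tool. The function name and the 50 m step are hypothetical choices for illustration.

```python
def cumulative_share(distances_m, max_distance=300, step=50):
    """Percentage of users whose residence lies within d metres of residential land,
    for d = 0, step, 2 * step, ..., max_distance."""
    if not distances_m:
        raise ValueError("no users with identifiable residences")
    n = len(distances_m)
    return {
        d: 100.0 * sum(x <= d for x in distances_m) / n
        for d in range(0, max_distance + 1, step)
    }


# Hypothetical distances (metres) for five users; 0 means inside a residential parcel.
print(cumulative_share([0, 0, 30, 120, 400]))
```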
aggregate and analyze their daily commuting characteristics (i.e., commuting duration and
distance, working hours, and cycling round-trips rate) and spatiotemporal patterns (i.e.,
In Table 2, we compare the evaluation indicators between the original and improved
spatial flow clustering methods. Compared to the enhanced method, the original
method exhibits a slight increase in the average number of trip records within each ISFC (an
average of 3 additional biking records) due to the absence of the boundary circle
constraints. Meanwhile, notable changes are observed in the average distances to ISFC's OD,
Recognizing the significant impact of the boundary circle radius constraint on longer
ISFCs, we conduct an additional comparison for ISFCs exceeding lengths of 1500 m and 3000
m. The results reveal that while the average distances to the ISFC's OD remain nearly
constant with increasing ISFC length in the enhanced method, they rise substantially in the
original method. However, the magnification of these two indicators implies significant
enhancement of spatial flow clustering introduced in this study are essential, ultimately
extracting reliable ISFCs from 95.1% (~0.71 million) of active bike-sharing users.
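To make the Layer 1 spatial indicators concrete, the following sketch computes them for a set of flow clusters. It is an illustration under stated assumptions, not the study's implementation: each ISFC's origin and destination are assumed to be the centroids of its member trips' origins and destinations, coordinates are assumed to be planar and in meters, and `isfc_indicators` and its `min_length` filter are hypothetical names.

```python
from math import hypot
from statistics import mean


def centroid(points):
    """Mean x and y of a list of (x, y) points."""
    return (mean(p[0] for p in points), mean(p[1] for p in points))


def isfc_indicators(clusters, min_length=0.0):
    """clusters: list of ISFCs, each a list of trips ((ox, oy), (dx, dy)).

    Only clusters whose length is at least min_length are evaluated.
    """
    sizes, lengths, o_dists, d_dists = [], [], [], []
    for trips in clusters:
        o_c = centroid([o for o, _ in trips])  # assumed ISFC origin
        d_c = centroid([d for _, d in trips])  # assumed ISFC destination
        length = hypot(d_c[0] - o_c[0], d_c[1] - o_c[1])
        if length < min_length:
            continue
        sizes.append(len(trips))
        lengths.append(length)
        o_dists += [hypot(o[0] - o_c[0], o[1] - o_c[1]) for o, _ in trips]
        d_dists += [hypot(d[0] - d_c[0], d[1] - d_c[1]) for _, d in trips]
    return {
        "avg_records": mean(sizes),
        "avg_length": mean(lengths),
        "avg_dist_to_origin": mean(o_dists),
        "avg_dist_to_destination": mean(d_dists),
    }


# Toy data (hypothetical): a 1000 m cluster and a 3000 m cluster.
clusters = [
    [((0, 0), (1000, 0)), ((0, 20), (1000, 20))],
    [((0, 0), (0, 3000)), ((40, 0), (40, 3000)), ((20, 0), (20, 3000))],
]
all_stats = isfc_indicators(clusters)
long_stats = isfc_indicators(clusters, min_length=1500)
```

Calling the function with different `min_length` values mirrors the comparison for longer ISFCs: if a method lets distant trips join a cluster, the two distance indicators grow with cluster length.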
Table 2 Comparison of evaluation indicators between the original and improved spatial flow
Similarly, Table 3 displays the comparative results of the original and enhanced
spatiotemporal flow clustering methods. Clearly, in contrast to the original method (𝛽=0),
the improved method, incorporating an expansion coefficient 𝛽, can extract ISTFCs that
contain more trip records (on average, 7.1 more records in each ISTFC when
𝛽=30 min). This substantiates the promoting effect of 𝛽 in mining daily spatiotemporal
trajectories from bike-sharing data, given the generally shorter travel durations for bicycle
trips. However, as 𝛽 increases further, the average number of trip records within ISTFCs
Preprint submitted to Elsevier, Page 29
shows diminishing returns, with an increase of only 0.9 records at 𝛽=90 min compared to
𝛽=60 min. Meanwhile, the average maximum time interval for each ISTFC continues to
increase as 𝛽 grows. Yet, an excessively large average maximum time interval
could introduce biking records from other time periods into the extracted ISTFCs, potentially
elevating data noise. Hence, this study refers to the China Urban Transportation Report 2021
30 min, which is remarkably close to the average commuting duration in Shenzhen (37 min).
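The average maximum time interval used in this comparison can be sketched as follows, under the definition given earlier (the difference between the earliest departure and the latest arrival among the trip records of an ISTFC). Times are assumed to be expressed as minutes since midnight, and the function names and toy clusters are hypothetical.

```python
from statistics import mean


def max_time_interval(trips):
    """trips: list of (depart_min, arrive_min) times of day in minutes."""
    earliest_departure = min(depart for depart, _ in trips)
    latest_arrival = max(arrive for _, arrive in trips)
    return latest_arrival - earliest_departure


def avg_max_time_interval(clusters):
    """Mean of the per-cluster maximum time intervals."""
    return mean(max_time_interval(trips) for trips in clusters)


# Toy data (hypothetical): a morning cluster and an evening cluster.
morning = [(480, 492), (485, 500), (505, 515)]  # 08:00 to 08:35
evening = [(1080, 1095), (1090, 1100)]          # 18:00 to 18:20
intervals = [max_time_interval(morning), max_time_interval(evening)]
average = avg_max_time_interval([morning, evening])
```

A larger 𝛽 admits trips that start or end further apart in time, so this indicator grows with 𝛽, which is the trade-off against noise discussed above.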
Ultimately, through the processes of Spatiotemporal flow clustering and Neighbor ISTFC
merging, we successfully identify reliable ISTFCs from 74.4% (~0.56 million) of active
bike-sharing users. Notably, ~0.11 million reliable ISTFCs from over 90,000 users are
obtained through the Neighbor ISTFC merging step. Collectively, these results underscore
the critical role of the aforementioned enhancements in Layer 1 in improving the quality of
spatiotemporal flow clustering methods (see Section 3.3 for descriptions of below indicators)
Avg. number of biking records 12.5 (10.3) 19.6 (14.8) 21.3 (15.3) 22.4 (15.6)
Furthermore, utilizing the rule-based decision trees from Layer 2, we have successfully
are Only-biking commuters and 25.62% are Biking-with-transit commuters. The percentage
of Biking-with-transit commuters is slightly higher than the results for transfer trips in the
studies of Xing et al. (2019) and Li et al. (2020) regarding the purpose of bike-sharing trips,
although those studies considered more kinds of travel activities. Within these Biking-with-transit
commuters, the share of Biking-transit-biking commuters is only 1.75% due to the stringent
filtering rules, while Biking-transit commuters (14.39%) are more prevalent than Transit-
biking commuters (9.48%), aligning with the findings of Guo et al. (2020), which suggests
that more users rely on cycling for the "first mile" from residence to transit station (or the
"last mile" from transit station to home after work). Meanwhile, given that the majority of
bike-sharing commuters transfer daily to the metro (over 96%) rather than to buses, our
Fig.8 Schematic diagram (a) and percentage (b) of different categories of bike-sharing
commuters.
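As a quick consistency check on the reported shares, the three Biking-with-transit subcategories should add up to the 25.62% stated for the group as a whole. The short sketch below performs that check; the variable names are illustrative only.

```python
# Shares (in percent) of the Biking-with-transit subcategories reported in the text.
subcategory_shares = {
    "Biking-transit-biking": 1.75,
    "Biking-transit": 14.39,
    "Transit-biking": 9.48,
}

# Round to two decimals to avoid floating-point noise in the sum.
biking_with_transit_total = round(sum(subcategory_shares.values()), 2)

# Ratio of first-mile (Biking-transit) to last-mile (Transit-biking) commuters.
first_to_last_mile_ratio = (
    subcategory_shares["Biking-transit"] / subcategory_shares["Transit-biking"]
)
```

The total matches the reported 25.62%, and the ratio of roughly 1.5 reflects the greater prevalence of Biking-transit commuters noted above.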
a comparison is made between the distribution of their identified residences and the actual
residential land use boundaries (see Section 3.3 for details), as shown in Fig.9. The result
illustrates that 51.5% of inferred users' residences are within residential land, and 93.5%
are within 100 m of residential land. These findings indicate that most of the
identified users’ homes are within or adjacent to the actual residential land parcels, reflecting
Fig 9. Cumulative percentage of bike-sharing users whose identified residences fall within
Fig.10 shows the distribution of commuting duration and distance for Only-biking and
transit commuters are incomplete, we cannot discuss these commuting characteristics for
them. For Only-biking commuters (Fig.10(a, b)), we find that over three-quarters have a daily
commuting duration under 10 min and distance within 1.8 km, aligning with previous
biking commuters are more likely to reside near their workplaces. For Biking-transit-biking
commuters (Fig.10(c, d)), we observe an average commuting duration exceeding 45 min and
distance over 13 km, indicating that these users tend to complete their daily home-work
commuting across districts. Moreover, when comparing the commuting duration distribution
for different trip purposes (Fig.10(a, c)), we notice that both bike-sharing commuting groups
tend to spend more time commuting home from work, consistent with the results of Kung et
al. (2014), which can be attributed to having more intervening opportunities for other
activities (e.g., recreation and shopping) during their journey home.
Fig. 10 Distributions of commuting duration (a, c) and commuting distance (b, d) for Only-
Fig.11(a) displays the working hours distribution for all bike-sharing commuters,
excluding Biking-transit commuters, for whom we only obtain the commuting chains
between their residence and transfer station. Specifically, we identify three distinct peaks for
Only-biking commuters' working hours. The largest peak occurs at 10 hours, which is longer
than the conventional eight-hour work schedule. However, note that the working hours
we calculate are the total time from a user's daily arrival at the workplace to departure from it,
potentially including non-working hours like lunch breaks. Thus, the actual working hours
for many individuals are likely 1 to 2 hours less than the working hours we calculated. This
indicates that the working hours for most users are in accordance with legal regulations. The
second highest peak, observed at approximately 12.5 hours, implies that some users are
actually working overtime, even if their calculated working hours include breaks. Lastly, the
smallest peak, appearing at around 5 hours, which is significantly lower than the first two
commuters exhibit a single prominent peak in their working hours, which is consistent with
the largest peak for Only-biking commuters. Furthermore, while some Biking-with-transit
commuters also work overtime, as indicated by a slight peak after 12 hours, the proportion is
far lower than among Only-biking commuters. This suggests that Only-biking commuters are more
tolerant of overtime than Biking-with-transit commuters, one reason for which may be their
lower commuting costs. Lastly, we discover that bike-sharing users who are involved in part-time
or shift work rarely connect to public transportation for commuting. That is reasonable,
as they work around 5 hours and choosing a biking-with-transit commuting mode represents an
excessively high proportion of their commuting duration relative to working hours
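The two quantities discussed in this paragraph can be sketched as follows. Working hours follow the definition used above (time from arrival at the workplace to departure from it, so breaks are included). The commuting-to-working ratio is an assumed illustration of "commuting duration relative to working hours", taken here as the daily round-trip commuting time divided by working time; the function names and example values are hypothetical.

```python
def working_hours(arrive_min, depart_min):
    """Hours between arrival at and departure from the workplace.

    Inputs are times of day in minutes since midnight; breaks are included.
    """
    return (depart_min - arrive_min) / 60


def commute_to_work_ratio(one_way_commute_min, hours_worked):
    """Assumed ratio: round-trip commuting time over working time."""
    return 2 * one_way_commute_min / (hours_worked * 60)


# A commuter arriving at 09:00 and leaving at 19:00.
full_day = working_hours(9 * 60, 19 * 60)

# A 45 min one-way commute weighs twice as much on a 5 h shift as on a 10 h day.
ratio_part_time = commute_to_work_ratio(45, 5)
ratio_full_time = commute_to_work_ratio(45, 10)
```

Under these assumptions, a 45 min one-way commute takes up 30% of a 5-hour shift but 15% of a 10-hour day, which is consistent with the observation that part-time or shift workers rarely choose a biking-with-transit mode.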
Cycling round-trip rate is an indicator that measures the regularity differences between
commuting to and from work by bike. Generally, as shown in Fig.11(b), there is little
difference in the cycling round-trip rates among various kinds of commuters (due to the small
average cycling round-trip rate is around 0.6, with the lower quartile roughly 0.5, indicating
that for nearly three-quarters of bike-sharing commuters, riding to work is more regular than
riding home. In other words, the behavior of cycling to work is more likely to be observed
for most users. One possible reason is that residents have fewer time constraints and more autonomous
activities after work. Additionally, it could also be due to an insufficient supply of shared
bikes, which leads some users to choose alternatives for the return journey.
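The interpretation of the 0.5 threshold can be illustrated with a small sketch. The exact formula of the cycling round-trip rate is not restated in this section, so the code assumes one definition that is consistent with the reading above: the share of a user's bike commutes that are trips to work, so that values above 0.5 mean riding to work is observed more often than riding home. The function name and the toy counts are hypothetical.

```python
def round_trip_rate(n_to_work, n_to_home):
    """Assumed definition: trips to work divided by all bike commuting trips."""
    total = n_to_work + n_to_home
    return n_to_work / total if total else None


# Toy data (hypothetical): (trips to work, trips home) per user.
users = [(12, 8), (10, 10), (15, 5), (9, 6)]
rates = [round_trip_rate(w, h) for w, h in users]

# Share of users who ride to work more often than they ride home.
share_above_half = sum(r > 0.5 for r in rates) / len(rates)
```

In this toy sample, three of the four users have a rate above 0.5, which corresponds to a lower quartile near 0.5 as reported for Fig.11(b).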
Fig. 11 (a) Distribution of working hours for Only-biking, Transit-biking and Biking-
transit-biking commuters; (b) Distribution of cycling round-trip rate for different kinds of
bike-sharing commuters.
In Fig.12, we present the daily commuting temporal patterns for different kinds of bike-
the sharp peaks for Biking-transit and Biking-transit-biking commuters both occur before 8:00,
while the peak for Only-biking commuters is around 8:30. This result combined with the
observation in Fig.10 suggests that users with higher commuting costs tend to depart earlier,
consistent with Kung et al. (2014). Moreover, the departure time of Biking-transit commuters
is slightly later than that of Biking-transit-biking commuters, indicating that their commuting
durations are shorter overall and their workplaces are closer to transfer stations.
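The departure peaks compared above can be located by binning departure times, as in the following sketch. It assumes departure times expressed as minutes since midnight and a 30 min bin width; `peak_departure_bin` and the toy samples are hypothetical and only mimic the ordering described in the text.

```python
from collections import Counter


def peak_departure_bin(depart_minutes, bin_width=30):
    """Return the start (HH:MM) of the most populated departure-time bin."""
    counts = Counter((m // bin_width) * bin_width for m in depart_minutes)
    # Highest count wins; ties are resolved in favor of the earlier bin.
    start = max(counts, key=lambda k: (counts[k], -k))
    return f"{start // 60:02d}:{start % 60:02d}"


# Toy data (hypothetical): departure times in minutes since midnight.
only_biking = [505, 512, 515, 520, 535, 450]
biking_transit = [455, 460, 470, 475, 500]

peak_only_biking = peak_departure_bin(only_biking)
peak_biking_transit = peak_departure_bin(biking_transit)
```

With real data, the same binning applied per commuter category would give curves like those in Fig.12, and the bin width controls how sharply the peaks are resolved.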
Regarding the temporal patterns of commuting back home (Fig.12(b)), we find that the
peaks for all three kinds of commuters appear around 18:30, which reflects the standard off-duty
commuting time for most bike-sharing users. However, this also means a massive
demand for bike-sharing during the same period, especially around workplaces. If the
bicycle supply is insufficient, some users have to choose alternative transportation modes,
which helps explain why the cycling round-trip rate for most users is above 0.5 (Fig.11(b)).
Furthermore, compared to commuting to work, the smoother curve and extended tail
(20:00-23:00) for commuting back home once again reflect the phenomenon of some users working
overtime, with a high proportion of Only-biking commuters, which echoes the discussion in
Fig. 12 (a) Daily temporal patterns of commuting to work for Only-biking, Biking-transit-
biking and Biking-transit commuters; (b) Daily temporal patterns of commuting home for
Fig.13(a) illustrates the density distribution of residential locations for all bike-sharing
Likewise, Fig.13(b) shows the distribution of workplaces for all commuters except for Biking-transit
commuters. Generally, the spatial distributions of residential and employment areas for
bike-sharing commuters are similar, with widespread dispersion and local concentrations.
Specifically, the employment hotspots are predominantly in the Futian FTZ – Futian CBD –
Luohu CBD, High-tech Park – Bao'an Center and Longhua Industrial Park, with most
residential hotspots distributed near these employment zones. This result is in line with the
mixed land use patterns in Shenzhen. Interestingly, we discover that the main residential
hotspots are in urban villages and old communities, especially in the central city. These areas
living cost (Liu et al., 2010). Concurrently, Guo et al. (2019) found that this demographic is
also the main force of dockless bike-sharing users. Moreover, the narrow roads, high-density
buildings, and mixed land use in these areas are more suitable for flexible and convenient
bicycle trips. Thus, despite the difficulties of managing and dispatching bikes within complex
urban villages and old communities, the substantial mobility demand (especially for
Moreover, we calculate the average working hours in the major job centers in Shenzhen.
It is worth noting that the calculated working hours are longer than the actual working hours
for most users, as explained in Subsection 4.2.1, yet this discrepancy does not impede inter-regional
comparison. The results show that central city areas generally have shorter working
hours than suburban areas (Fig.13(b)). Specifically, in the central city, employment
centers dominated by commercial and service industries (e.g., Luohu CBD and Futian CBD)
exhibit shorter working hours compared to those focused on high-tech industries (e.g., High-
tech Park). Notably, Huaqiang North Commercial Area has the shortest average working
hours (9.89 h). Conversely, in the suburbs, Longhua Industrial Park, which is dominated by
manufacturing, has the longest average working hours (10.89 h), implying a higher likelihood
the metro stations they use daily, with the station size on the maps representing the number of
transit commuters (Fig.14(a)), while the spatial distribution of metro stations for them is
similar to the residential hotspots in Fig.13(a), the metro stations with high cycling-transfer
rate are mainly concentrated in the outskirts of the central city (e.g., Gushu, Minzhi,
Hongshan, etc.). This result aligns with the finding of Guo et al. (2020), revealing the
distribution of the main residences of groups that use bike-sharing transfer services for
cross-district commuting. As for Transit-biking commuters (Fig.14(b)), most metro stations with
high cycling-transfer rates are concentrated in central areas near major employment centers
(especially Nanshan and Futian districts). However, only a few stations aggregate over 900
Transit-biking commuters. This is likely because the central area's high accessibility and the
proximity of businesses to metro stations (e.g., High-tech Park, Keyuan, etc.) facilitate direct walking
to work, reducing the need for bike-sharing. Moreover, we observe that some metro stations
(e.g., Bihaiwan, Gushu, Xili, etc.) serve many Biking-transit and Transit-biking
commuters alike, which reflects a mixed use of living and working spaces. Thus, bike-sharing
operators should closely monitor the bicycle supply and demand around these
stations.
the biking commuting chains for Only-biking and Biking-transit-biking commuters by linking
the origin and workplace as the destination. Utilizing the spatial clustering method by Gao et
al. (2020), we present the results of primary commuting flow clusters in Fig. 15 and 16.
For Only-biking commuters (Fig.15), we discover that the commuting flow clusters are
generally short in length, averaging 1.28 km, and regularly converge from the hotspots of
residence to the nearest employment centers, in agreement with the observations in Fig. 10(b)
and Fig.13. This result suggests that dockless bike-sharing plays a significant role in short-distance
commuting for residents in inner-city and suburban areas, further extending the
findings of previous studies (Li et al., 2021; Gao et al., 2022). As for Biking-transit-biking
commuters (Fig.16), the commuting flow clusters predominantly extend from the suburbs to
the central city, with an average length of over 15 km. Specifically, these users mostly live
in Bao'an and Longhua districts and commute daily by cycling to transfer to metro lines that
link the suburban and central areas (especially Shenzhen Metro Lines 1, 4, 5, and 11),
echoing the actual situation in Shenzhen (e.g., many tech workers live near Pingzhou Station
5. Conclusion
Mining individual daily travel patterns of bike-sharing users is vital for the increasingly
refined planning of active transportation systems but remains a complex endeavor. To address
this gap, this paper presents a two-layer framework that integrates spatiotemporal flow
clustering and rule-based decision trees, which is validated and applied to a dataset of over
200 million dockless bike-sharing trips in Shenzhen. In Layer 1, to overcome the lack of
methods with improved spatiotemporal constraints to identify users' daily trajectories from
their disordered travel records, and confirm their performance through comparative analysis
with the original methods. To the best of our knowledge, this is the first attempt to extract
individual daily mobility using spatiotemporal flow clustering models, which can be
extended to relevant studies on other travel data (e.g., taxi trip data). In Layer 2, considering
transportation transfer to construct rule-based decision trees. These decision trees can
identify the commuting behavior from users' daily cycling trajectories, thus deriving
individual daily commuting patterns. Such information can help urban planners and bike-sharing
operators rapidly understand residents' daily cycling patterns and demands.
Moreover, it serves as a data foundation for fine-scale research on bicycle behavior by fusing
Moreover, by applying the two-layer framework to the case study of Shenzhen, we have
obtained some encouraging findings. First, the residential and workplace locations of bike-sharing
commuters exhibit a mixed distribution pattern of widespread dispersion with local
concentrations. Most commuters live in the urban villages and old communities (especially
the outskirts of the inner-city areas (e.g., near the Gushu and Hongshan Stations). Second,
some bike-sharing users show noticeable overtime patterns, with a higher proportion of Only-
centers of the study area, Longhua Industrial Park, dominated by manufacturing, has the
longest average working hours, exceeding 10 hours. Finally, we find that the majority of active
users utilize bike-sharing for commuting to work more frequently than for returning home,
which is closely related to increased discretionary activities after work and the excessive
bike-sharing demand around workplaces during the commuting peak. These insights deepen our
understanding of the daily mobility patterns of the cycling community in megacities and provide
However, there are still some limitations that warrant further improvement in future
research. First, our framework is limited to weekday commuting patterns of bike-sharing users,
not accounting for weekend trips or non-commuting activities like exercise and leisure.
Subsequent studies can leverage place data (e.g., Points of Interest) to explore the cycling
characteristics in these contexts and develop more nuanced travel chain models. Second, it is
necessary to validate mobility patterns with travel survey data, but regrettably, this
remains unattainable in our study due to the challenges in acquiring relevant data
covering the study area's cycling population. Lastly, note that urban transportation still
includes a group of private bicycle (including electric bike) users. Investigating whether their mobility
6. Funding
This study was supported by the National Science Fund for Distinguished Young
Scholars (Grant No. 42225107) and the National Natural Science Foundation of China (Grant
No. 42271467).
7. References
Cao, M., Huang, M., Ma, S., Lü, G., & Chen, M., 2020. Analysis of the spatiotemporal riding
modes of dockless shared bicycles based on tensor decomposition. Int. J. Geogr. Inf. Sci.
34(11), 2225-2242.
Chen, W., Liu, X., Chen, X., Cheng, L., & Chen, J., 2023. Deciphering flow clusters from
Transportation. 1-30.
Cheng, Z., Caverlee, J., Lee, K., & Sui, D., 2011. Exploring millions of footprints in location
Cheng, L., Mi, Z., Coffman, D. M., Meng, J., Liu, D., & Chang, D., 2021. The role of bike
Du, Y., Deng, F., Liao, F., 2019. A model framework for discovering the spatio-temporal
usage patterns of public free-floating bike-sharing system. Transp. Res. Part C Emerg.
DeMaio, P., 2009. Bike-sharing: History, impacts, models of provision, and future. J. Public
Transp. 12(4).
Ferretto, L., Bruzzone, F., & Nocera, S., 2021. Pathways to active mobility planning.
Fishman, E., 2016. Bikeshare: A review of recent literature. Transp. Rev. 36(1), 92-113.
Gao, X., Liu, Y., Yi, D., Qin, J., Qu, S., Huang, Y., & Zhang, J., 2020. A Spatial Flow
Gao, F., Li, S., Tan, Z., Wu, Z., Zhang, X., Huang, G., & Huang, Z., 2021a. Understanding
the modifiable areal unit problem in dockless bike sharing usage and exploring the
interactive effects of built environment factors. Int. J. Geogr. Inf. Sci. 35(9), 1905-1925.
Gao, K., Yang, Y., Li, A., Li, J., & Yu, B., 2021b. Quantifying economic benefits from free-
Gao, F., Li, S., Tan, Z., & Liao, S., 2022. Visualizing the Spatiotemporal Characteristics of
Dockless Bike Sharing Usage in Shenzhen, China. J. Geovis. Spat. Anal. 6(1), 1-15.
Gu, T., Kim, I., & Currie, G., 2019. To be or not to be dockless: Empirical analysis of
dockless bikeshare development in China. Transp. Res. A Policy Pract. 119, 122-147.
Guangdong Statistics Bureau., 2021. Guangdong Statistical Yearbook 2021. Available at:
Guo, Y., & He, S. Y., 2020. Built environment effects on the integration of dockless bike-
sharing and the metro. Transp. Res. Part D: Transp. Environ. 83, 102335.
Guo, Y., Yang, L., Lu, Y., & Zhao, R., 2021. Dockless bike-sharing as a feeder mode of
metro commute? The role of the feeder-related built environment: Analytical framework
Handy, S., Van Wee, B., & Kroesen, M., 2014. Promoting cycling for transport: research
Heinen, E., Van Wee, B., & Maat, K., 2010. Commuting by bicycle: an overview of the
Jiang, S., Ferreira, J., & Gonzalez, M. C., 2017. Activity-based human mobility patterns
inferred from mobile phone data: A case study of Singapore. IEEE Trans. Big Data 3(2),
208-219.
Kim, K., 2023a. Investigation of modal integration of bike-sharing and public transit in Seoul
Kung, K. S., Greco, K., Sobolevsky, S., & Ratti, C., 2014. Exploring universal patterns in
human home-work commuting from mobile phone data. PloS One 9(6), e96180.
Li, S., Zhuang, C., Tan, Z., Gao, F., Lai, Z., & Wu, Z., 2021. Inferring the trip purposes and
Li, L., Goodchild, M. F., & Xu, B., 2013. Spatial, temporal, and socioeconomic patterns in
the use of Twitter and Flickr. Cartogr. Geogr. Inf. Sci. 40(2), 61-77.
Liu, Y., He, S., Wu, F., & Webster, C., 2010. Urban villages under China's rapid urbanization:
Liu, Y., Gao, X., Yi, D., Jiang, H., Zhao, Y., Xu, J., & Zhang, J., 2022. Investigating Human
Travel Patterns from an Activity Semantic Flow Perspective: A Case Study within the
Fifth Ring Road in Beijing Using Taxi Trajectory Data. ISPRS Int. J. Geoinf. 11(2), 140.
Liu, S., Zhang, X., Zhou, C., Rong, J., & Bian, Y., 2022. Temporal heterogeneous effects of
Liu, Y., Wang, S., Wang, X., Zheng, Y., Chen, X., Xu, Y., & Kang, C., 2024. Towards
Lu, Y., & Liu, Y., 2012. Pervasive location acquisition technologies: Opportunities and
challenges for geospatial studies. Comput. Environ. Urban. Syst. 36(2), 105-108.
Luo, H., Kou, Z., Zhao, F., & Cai, H., 2019. Comparative life cycle assessment of station-
based and dock-less bike sharing systems. Resour. Conserv. Recycl. 146, 180-189.
Ma, X., Ji, Y., Yang, M., Jin, Y., & Tan, X., 2018. Understanding bikeshare mode as a feeder
to metro by isolating metro-bikeshare transfers from smart card data. Transp. Policy 71,
Ma, X., Ji, Y., Yuan, Y., Van Oort, N., Jin, Y., & Hoogendoorn, S., 2020. A comparison in
travel patterns and determinants of user demand between docked and dockless bike-
sharing systems using multi-sourced data. Transp. Res. A Policy Pract. 139, 148-173.
Reiss, S., & Bogenberger, K., 2016. Validation of a relocation strategy for Munich's bike
Ross-Perez, A., Walton, N., & Pinto, N., 2022. Identifying trip purpose from a dockless bike-
Sari Aslam, N., Cheng, T., & Cheshire, J., 2019. A high-precision heuristic model to detect
home and work locations from smart card data. Geo-Spat. Inf. Sci. 22(1), 1-11.
Schwanen, T., & Dijst, M., 2002. Travel-time ratios for visits to the workplace: the
relationship between commuting time and work duration. Transp. Res. A Policy Pract.
36(7), 573-592.
Shen, Y., Zhang, X., & Zhao, J., 2018. Understanding the usage of dockless bike sharing in
Singleton, P. A., & Clifton, K. J., 2014. Exploring synergy in bicycle and transit use:
Si, H., Shi, J. G., Wu, G., Chen, J., & Zhao, X., 2019. Mapping the bike sharing research
published from 2010 to 2018: A scientometric review. J. Clean. Prod. 213, 415-427.
http://jtys.sz.gov.cn/zwgk/ztzl/msss/2022gjcxxcz/jbqk/content/post_10150527.html
Teixeira, J. F., Silva, C., & e Sá, F. M., 2021. The motivations for using bike sharing during
the COVID-19 pandemic: Insights from Lisbon. Transport. Res. F Traf. 82, 378-399.
Wu, M., Liu, X., Qin, Y., & Huang, Q., 2023. Revealing racial-ethnic segregation with
individual experienced segregation indices based on social media data: A case study in
Xu, Y., Belyi, A., Bojic, I., & Ratti, C., 2018. Human mobility and socioeconomic status:
Xing, Y., Wang, K., & Lu, J. J., 2020. Exploring travel patterns and trip purposes of dockless
Yao, X., Zhu, D., Gao, Y., Wu, L., Zhang, P., & Liu, Y., 2018. A stepwise spatiotemporal
flow clustering method for discovering mobility trends. IEEE Access 6, 44666-44675.
Yao, Y., Jiang, X., & Li, Z., 2019. Spatiotemporal characteristics of green travel: A
Yang, Y., Heppenstall, A., Turner, A., & Comber, A., 2019. A spatiotemporal and graph-
based analysis of dockless bike sharing patterns to understand urban flows over the last
Yin, L., Lin, N., & Zhao, Z., 2021. Mining daily activity chains from large-scale mobile
Zhang, Y., & Mi, Z., 2018. Environmental benefits of bike sharing: A big data-based analysis.
learning framework for Geodemographic inference using transit smart card data.
Zhang, H., Zhuge, C., Jia, J., Shi, B., & Wang, W., 2021a. Green travel mobility of dockless
bike-sharing based on trip data in big cities: A spatial network analysis. J. Clean. Prod.
313, 127930.
Zhang, X., Shen, Y., & Zhao, J., 2021b. The mobility pattern of dockless bike sharing: A
four-month study in Singapore. Transp. Res. Part D: Transp. Environ. 98, 102961.