Community Sense and Response Systems: Your Phone As Quake Detector
Community Sense and Response Systems: Your Phone As Quake Detector
Community Sense and Response Systems: Your Phone As Quake Detector
Matthew Faulkner, Robert Clayton, Thomas Heaton, K. Mani Chandy, Monica Kohler, Julian Bunn,
Richard Guy, Annie Liu, Michael Olson, MingHei Cheng, Andreas Krause
The proliferation of smartphones and other powerful sensor- From collection to action
equipped consumer devices enables a new class of web ap- Acting on community sensor data is fundamentally different
plications: Community Sense and Response (CSR) systems. than acting on data from standard web applications or sci-
These applications are distinguished from standard web ap- entific sensors. The potential scale of raw data is vast, even
plications by the use of community-owned commercial sensor by the standards of large web applications. Data recorded
hardware. Just as social networks connect and share human- by community sensors often include signals produced by the
generated content, CSR systems work to gather, share, and people who operate them. And many of the desired appli-
act on sensory data from people’s internet-enabled devices. cations, while far-reaching, push the limits of our current
In this article, we discuss our work building the Caltech understanding of physical phenomena.
Community Seismic Network as a prototypical CSR system
harnessing accelerometers in smartphones and consumer elec- Scale. The volume of raw data that can be produced by a
tronics. We describe the systems and algorithmic challenges CSR network is astounding by any standard. Smartphones
of designing, building and evaluating a scalable network for and other consumer devices often have multiple sensors,
real-time awareness of dangerous earthquakes. and can produce continuous streams of GPS position, ac-
celeration, rotation, audio, and video data. While events
Nearly 2 million Android and iOS devices are activated ev- of interest (e.g. traffic accidents, earthquakes, disease out-
ery day, each carrying numerous sensors and a high-speed breaks) may be rare, devices must monitor continuously in
internet connection. Several recent sensing projects seek to order to detect them. Beyond obvious data heavyweights
partner with the owners of these and other consumer devices like video, rapidly monitoring even a single accelerometer or
to collect, share, and act on sensor data about phenomena microphone produces hundreds of megabytes per day. Com-
that impact the community. Coupled to cloud computing munity sensing makes possible networks containing tens of
platforms, these networks can reach an immense scale pre- thousands or millions of devices. For example, equipping
viously beyond the reach of sensor networks [6]. [5] provides taxi cabs with GPS devices or air quality sensors could eas-
an excellent overview of how the Social and Mobile Web fa- ily yield a network of 50,000 sensors in a city like Beijing
cilitate crowdsourcing data from individuals and their sensor [22]. At these scales, even collecting a small set of summary
devices. Additional applications of community and partici- statistics becomes daunting: if 500,000 sensors reported a
patory sensing include: understanding traffic flows [14, 20, brief status update once per minute, the total number of
16, 4]; identifying sources of pollution [2, 1], monitoring pub- messages would rival the daily load in the Twitter network.
lic health [18], and responding to natural disasters like hurri-
canes, floods, and earthquakes [8, 9, 11, 15]. These systems Non-traditional sensors. Community devices are also dif-
are made possible by volunteer sensors and low-cost web so- ferent than those used in traditional scientific and industrial
lutions for data collection and storage. However, as these applications. Beyond simply being lower in accuracy (and
systems mature, they will undoubtedly extend beyond data cost) than “professional” sensors, community sensors may be
collection and begin to take real-time action on the com- mobile, intermittently available, and affected by the unique
munity’s behalf. For example, traffic networks may reroute environment of an individual’s home or workplace. For ex-
traffic around an accident, or a seismic network may auto- ample, the accelerometer in a smartphone could measure
matically slow trains to prevent derailing. earthquakes, but will also observe user motion.
Partnering with the community. Community participa- Fundamental challenges. Reliable, real-time inference of
tion is ideal for seismic sensing for several reasons. First, spatial events is a core task of seismic monitoring, and also a
community participation makes possible the densely dis- prototypical challenge for any application utilizing physical
tributed sensors needed for accurately measuring shaking sensors. In the following, we outline a methodology devel-
oped to rapidly detect quakes from thousands of community performs a hypothesis test, but now using the received pick
sensors. As we will see, the computational power of commu- messages instead of the entire raw data. Results from de-
nity devices can be harnessed to overcome the cacophony of centralized hypothesis testing theory state that if the sen-
noise in community-operated hardware, and that on-device sors’ measurements are independent conditional on whether
learning yields a decentralized architecture that is scalable there is an event or not, and if the probability of the mea-
and heterogeneous, while still provides rigorous performance surements is known in each case then the asymptotically
guarantees. optimal strategy is to perform a hierarchical hypothesis test
[21]: each sensor individually performs a hypothesis test, for
2. DECENTRALIZED EVENT DETECTION some threshold τ , and picks only when
Suppose that a strong earthquake begins near a metropoli- P [ one sensor’s measurements | strong quake ]
> τ. (2)
tan area, and that a 0.1% of the population contributes ac- P [ one sensor’s measurements | no quake ]
celerometer data from a personally-owned internet-enabled
device. In Los Angeles county, this means data from 10,000 Similarly, the Cloud server performs a hypothesis test on the
noisy sensors located on a coastal basin of rock and sediment, number of picks S received at a given time, and declares a
striped with fault lines, and cross-hatched with vibration- detection when a threshold τ ′ is exceeded:
producing freeways. How could we detect the quake, and Bin(S; rT ; N )
≥ τ ′, (3)
estimate its location and magnitude as quickly as possible? Bin(S; rF ; N )
The parameters rT and rF are the true positive and false
positive pick rates for a single sensor, and Bin(·, p, N ) is
the probability mass function of the Binomial distribution.
Asymptotically optimal decision performance can be ob-
tained by using the decision rules (2) and (3) with proper
choice of the thresholds τ and τ ′ [21]. Additionally, collect-
ing picks instead of raw data may help preserve user privacy.
A natural next step is a decentralized approach. Suppose It turns out that under mild conditions, a simple anomaly
each device instead only transmits a finite summary of its detection approach that uses only the probability of an ac-
current data, called a pick message. The central server again celeration time series in the absence of a quake can obtain
the same asymptotically optimal performance. A given sen- parameters θ from a constant size coreset C can obtain ap-
sor now picks when proximately the same likelihood as learning the model from
the entire, arbitrarily large D.
P [ one sensor’s measurements | no quake ] < τ. (4)
For an appropriate choice of threshold, this can be shown to But how do we find C? [12] showed that efficient algo-
produce the same picks as the full hypothesis test, without rithms to compute coresets for projective clustering prob-
requiring us to produce a model of sensor data during future, lems (e.g. k-means and generalizations) can provide coresets
unknown quakes. For details, see [11]. for GMMs. A key insight is that while uniformly subsam-
pling the input may miss “important” regions of data, an
Learning on smartphones with Coresets adaptive sampling approach is likely to sample from “enough”
The above anomaly detection scheme makes use of the abun- regions to reliably estimate a mixture of k Gaussians; weight-
dant “normal” data, but leaves us the challenge of computing ing the samples accounts for the sampling bias. Previous
the conditional distribution. In principle, each sensor could work [13] also identified that coresets for many optimiza-
maintain a history of its observations, and periodically es- tion problems can be computed efficiently in the parallel or
timate a probabilistic model describing that data. On a streaming model, and several of those results apply here. In
mobile device, this means logging around 3GB of accelera- particular, a stream of input data can be buffered to some
tion data per month. Storing and estimating models on this constant size, and then compressed into a coreset. Careful
much data is a burden on volunteers’ smartphone resources. merging and compressing of such coresets provides an ap-
Could we accurately model a sensor’s data with (much) less proximation to the entire stream so far, while using space
storage? and update time polynomial in all the parameters, and log-
arithmic in n.
In the CSN system, the local distribution is chosen to be
a Gaussian Mixture Model (GMM) over a feature vector of Learning spatial dependencies
acceleration statistics from short time windows (similar to Quake detection in community networks requires finding a
phonemes in speech recognition). GMMs are flexible, multi- complex spatio-temporal pattern in a large set of noisy sen-
modal distributions that can be practically estimated from sor measurements. The start of a quake may only affect a
data using the simple EM algorithm [3]. In contrast to esti- small fraction of the network, so the event can easily be con-
mating a single Gaussian, which can be fit knowing only the cealed in both single-sensor measurements and network-wide
mean and variance of the data, estimating a GMM requires statistics. Data from recent high-density seismic studies,
access to all the data; formally, GMMs do not admit finite Figure 2, show that localized variations in ground structure
sufficient statistics. This precludes, for example, our abil- significantly impact the magnitude of shaking at locations
ity to compress the 3GB of monthly acceleration data and only a kilometer apart. Consequently, effective quake detec-
still recover the same GMM that would have been learned tion requires algorithms that can learn subtle dependencies
from the full data. Fortunately, it turns out that the picture among sensor data, and detect changes within groups of de-
is drastically different for approximating a GMM: a GMM pendent sensors.
can be fit to an arbitrary amount of data, with an arbitrary
approximation guarantee, using a finite amount of storage! The “classical” approach described at the start of this section
assumes that the sensors provide independent, identically
A tool from computational geometry, called a coreset, makes distributed measurements conditioned on the occurrence or
such approximations possible. Roughly, a coreset for an al- non-occurrence of an event. In this case, the fusion cen-
gorithm is a (weighted) subset of the input, such that run- ter would declare a detection if a sufficiently large number
ning the algorithm on the coreset gives a constant-factor of sensors report picks. However, in many practical appli-
approximation to running the algorithm on the full input. cations, the particular spatial configuration of the sensors
Coresets have been used to obtain approximations for a va- matters, and the independence assumption is violated. Here,
riety of geometric problems, such as k-means and k-medians the natural question arises of how (qualitative) knowledge
clustering. about the nature of the event can be exploited in order to
improve detection performance.
It turns out that many geometric coreset techniques can also
provide approximations for statistical problems. Given an Viewed as transmitting a vector x ∈ Rp through a noisy
input dataset D, we would like to find the maximum like- channel, the signal is mostly zeros (sparse), but many bits in
lihood estimate for the means and variances of a Gaussian the received vector y are flipped due to noise. We should ex-
mixture model, collectively denoted θ. A weighted set C is a pect nearby sensors to be strongly correlated during a quake.
(k, ǫ)-coreset for GMMs if with high probability the log like- If we knew groups of correlated sensors, detection could be
lihood on L(C | θ) is an ǫ approximation to the log likelihood improved by testing each group separately. This intuition
on the full data L(C | θ), for any mixture of k Gaussians: (and some desirable analytic properties) can be captured by
learning a orthonormal change-of-basis matrix that projects
(1 − ε)L(D | θ) ≤ L(C | θ) ≤ φ(D | θ)(1 + ε). the binary messages received by the server onto a coordi-
[12] showed that given input D, it is possible to sample such nate system that, roughly, aggregates groups of strongly
a coreset C whose size is independent of the size of input D correlated sensors. Given such a matrix B with columns
(i.e. only depends polynomially on the dimension of the in- bi , . . . , bp , the server declares an event when
put, the number of Gaussians k, and parameters ε, δ), with
max bTi y > τ
probability at least 1 − δ for all (non-degenerate) mixtures i
θ of k Gaussians. This implies that learning mixture model
To obtain reliable detection when the signal is weak (mea-
√
sured by the ℓ0 pseudo-norm, ||x||0 < p), traditional hy-
pothesis testing requires the error rate of each sensor (each
element of x) to decrease as the number of sensors p in-
creases. This is in stark contrast to our intuition that more
sensors should be better, and in contrast to the “numerous-
but-noisy” approach of community sensing. However, [10]
shows that if the matrix B is sparsifying, i.e. ||BT x||0 = pβ ,
||x||0 = pα , 0 < β < α < 1/2, then the test maxi bTi y > τ
gives probability of miss and false alarm that decays to
zero exponentially as a function of the “sparsification ra-
tio” ||x||0 /||BT x||0 , for any rate rF < 1/2 of pick errors.
Effectively, this allows large numbers of noisy sensors to con-
tribute to reliable detection of signals that are observed only
by a small fraction (||x||0 ) of sensors.
Figure 3: The CSN cloud maintains the persistent
Learning to sparsify. The success of the above result de- state of the network in Datastore, performs real-
pends on B’s ability to concentrate weak signals. We could time processing of pick data via Memcache, and
learn a basis B that optimizes ||BT x||0 by solving serves notifications and data products.
3. BUILDING CSN
Managing a community sensor network and processing its
data in real-time leads to challenges in scalability and data
security. Cloud computing platforms, such as Amazon EC2,
Heroku, or Google App Engine provide practical and cost-
effective resources for reliably scaling web applications. The
CSN network is built upon Google App Engine (GAE). Fig-
ure 3 presents an overview of the CSN architecture. Hetero-
Figure 4: CSN-Droid stores and processes sen-
geneous sensors include cell phones, stand-alone sensors, and
sor data locally on an Android phone or tablet;
accelerometers connected via USB to host computers to the
sends pick messages during potential quakes; re-
cloud. The cloud, in turn, performs event detection and is-
ceives alerts; and responds to data requests.
sues notifications of potential seismic events. An advantage
Figure 6: Eccentric weights oscillate Millikan Li-
Figure 5: CSN sensors produced picks (blue and brary, demonstrating that CSN hardware can ob-
red bars) for both P-wave and S-wave of the Anza serve resonant frequencies in buildings.
M3.6 earthquake. Time series plots are arranged by
distance to quake epicenter.
Our experiments start with an evaluation of whether com-
3.2 CSN in the Cloud munity hardware is adequate for seismic detection. Sev-
Cloud computing services are well-suited for the network eral results indicate that this is indeed the case. Exper-
maintenance and real-time response tasks of CSR systems. iments with a large actuator called a ”shake table” allow
Figure 3 depicts the main data flows through the cloud. us to expose sensors to accurate reproductions of historic,
First, client registration and heartbeat messages are per- moderately large (M4.5-5.5) earthquakes. The shake table
sisted to the geographically-replicated Datastore. Next, in- demonstrates that both USB sensors and the lower quality
coming picks are spatially aggregated via geographic hashing phone accelerometers can detect the smaller initial shaking
into Memcache (a distributed in-memory data cache). While (P-wave) and stronger secondary shaking (S-wave) that pro-
memcache is not persistent (objects can be ejected from the duce the characteristic signature of an earthquake, as shown
cache due to memory constraints), it is much faster than in Figure 7. These laboratory experiments are confirmed by
the datastore. Memcache is also ideal for computations that measurements of actual earthquakes observed by the CSN
need to occur quickly, and, because memcache allows values network; similar signatures are visible in Figure 5, which
to set an expiry time, it is also perfect for data whose use- shows a subset of measurements of a M3.6 quake.
fulness expires after a period of time. Finally, an Associator
performs the final event detection and issues notifications. A second experiment assesses whether community sensors
Implementing this architecture on App Engine offers several can detect changes in the motion of buildings caused by
practical advantages: earthquakes. We oscillated the ten-story Millikan Library
on the Caltech campus, using a large eccentric weight on the
Dynamic scaling. Incoming requests are automatically load- roof of the building. CSN sensors in the library measured
balanced between instances that are created and destroyed the resonant frequency of the building (around 1.7Hz), Fig-
based on current demand levels. This both simplifies algo- ure 6, confirming that low-cost sensors can perform structure
rithmic development, and reduces costs during idle periods. monitoring.
✷ ✷ ✷
✟✞ ✟✞ ✟✞
✝ ✝ ✝
♠ ♠ ♠
✲✷ ✲✷ ✲✷
✺ ✶ ✶✺ ✷ ✺ ✶ ✶✺ ✷ ✺ ✶ ✶✺ ✷
s✁✂✄☎✆s s✁✂✄☎✆s s✁✂✄☎✆s
Figure 7: Android accelerometers accurately record strong shaking during a shake table experiment. (a)
Shake table experimental setup. (b) Ground truth. (c) Android phone. (d) Android phone in backpack.
Figure 8: Attainable true positive and false positive pick rates for (a) USB accelerometer, (b) Android
accelerometer. (c) Coresets allow drastic reduction in data storage, without sacrificing pick performance. (d)
Estimated quake detection rates for a mixture of USB and mobile phone sensors in a given area.
shows the tradeoff of detecting with a mix of sensor types, of the four events, the vertical bars give the time to detec-
while constraining to one false alarm per year. Our results tion for the learned bases, classical hypothesis testing, and a
indicate that approximately 50 phones or 10 Phidgets should competitive scan statistic algorithm. The bases learned from
be enough to detect a nearby magnitude 5 or larger event simple simulations in general achieve faster detection, e.g. 8
with close to 100% success. seconds faster than competitive algorithms in detecting the
Beverly Hills event.
5. CONCLUSION
In this article, we have outlined several algorithmic and sys-
tems principles that facilitate detecting rare and complex
spatial signals using large numbers of low-cost community
sensors. We have found that employing machine learning
at each stage of a decentralized architecture allows efficient
use of sensor-level and cloud-level resources, and is essen-
tial to providing performance guarantees when little can be
said about a particular community sensor, or when little
is known about the events we seek to detect. Community
sensing is applicable to a variety of application domains, in-
cluding disasters like fires, floods, radiation, epidemics, and
traffic accidents, as well as monitoring the pollution, pedes-
Figure 9: The learned sparsifying basis is faster than trian traffic, and acoustic noise levels in urban environments.
a network aggregate or a spatial scan statistic [19] for In all of these cases “responding” may range from taking
detecting four quakes recorded by the CSN network. physical action to merely devoting additional resources to
an event of interest. While the CSN project is motivated
While earthquakes are inherently unpredictable, simulations by detecting and reacting to strong earthquakes, we believe
provide a qualitative idea of spatial dependencies among sen- that community sense and response systems for these do-
sors that can be used to train detectors. Using a prior dis- mains and others will require a similar blueprint of machine
tribution constructed from historic earthquakes available in learning and scalable systems.
the USGS database, and a simulator for community sensors
similar to that in [17], we simulated picks from 128 CSN 6. ACKNOWLEDGEMENTS
sensors during 1000 simulated quakes. These picks are used We acknowledge the support of the Gordon and Betty Moore
as training data for a sparsifying basis, a network-wide hy- Foundation, NSF awards CNS0932392, IIS0953413 and ERC
pothesis test, and a spatial scan statistic. After training, StG 307036. Andreas Krause was supported in part by a
each algorithm is then evaluated on its ability to detect four Microsoft Research Faculty Fellowship. We thank Signal
recent events using real measurements recorded by the net- Hill Petroleum and Nodal Seismic for data from the Long
work. Figure 9 summarizes detection performance: for each
Beach Network, and SCSN for data from the permanent Chandy, and A. Krause. The next big one: Detecting
earthquake network in southern California. earthquakes and other rare events from
community-based sensors. In Proceedings of the 10th
ACM/IEEE International Conference on Information
7. REFERENCES Processing in Sensor Networks. ACM, 2011.
[1] Karl Aberer, Saket Sathe, Dipanjan Chakraborty,
[12] Dan Feldman, Matthew Faulkner, and Andreas
Alcherio Martinoli, Guillermo Barrenetxea, Boi
Krause. Scalable training of mixture models via
Faltings, and Lothar Thiele. Opensense: Open
coresets. In Advances in Neural Information
community driven sensing of environment. In ACM
Processing Systems 24, pages 2142–2150. NIPS, 2011.
SIGSPATIAL International Workshop on
GeoStreaming (IWGS), 2010. [13] Sariel Har-Peled and Soham Mazumdar. On coresets
for k-means and k-median clustering. In Proceedings of
[2] Paul M Aoki, RJ Honicky, Alan Mainwaring, Chris
the thirty-sixth annual ACM symposium on Theory of
Myers, Eric Paulos, Sushmita Subramanian, and
computing, pages 291–300. ACM, 2004.
Allison Woodruff. A vehicle for research: using street
sweepers to explore the landscape of environmental [14] B. Hoh, M. Gruteser, R. Herring, J. Ban, D. Work,
community action. In Proceedings of the 27th J.C. Herrera, A.M. Bayen, M. Annavaram, and
international conference on Human factors in Q. Jacobson. Virtual trip lines for distributed
computing systems, pages 375–384. ACM, 2009. privacy-preserving traffic monitoring. In Proc. of the
International conference on Mobile systems,
[3] Jeff A Bilmes et al. A gentle tutorial of the EM
applications, and services, pages 17–20. Citeseer, 2008.
algorithm and its application to parameter estimation
for Gaussian mixture and hidden Markov models. [15] A. Kapoor, N. Eagle, and E. Horvitz. People, Quakes,
International Computer Science Institute, 4(510):126, and Communications: Inferences from Call Dynamics
1998. about a Seismic Event and its Influences on a
Population. In Proceedings of AAAI Symposium on
[4] Paul Borokhov, Sebastien Blandin, Samitha
Artificial Intelligence for Development, 2010.
Samaranayake, Olivier Goldschmidt, and Alexandre
Bayen. An adaptive routing system for location-aware [16] Andreas Krause, Eric Horvitz, Aman Kansal, and
mobile devices on the road network. In Intelligent Feng Zhao. Toward community sensing. In Proc.
Transportation Systems (ITSC), 2011 14th ACM/IEEE International Conference on Information
International IEEE Conference on, pages 1839–1845. Processing in Sensor Networks (IPSN), 2008.
IEEE, 2011. [17] Annie Liu, Michael Olson, Julian Bunn, and K Mani
[5] Maged N Kamel Boulos, Bernd Resch, David N Chandy. Towards a discipline of geospatial distributed
Crowley, John G Breslin, Gunho Sohn, Russ Burtner, event based systems. In Proceedings of the 6th ACM
William A Pike, Eduardo Jezierski, and Kuo-Yu S International Conference on Distributed Event-Based
Chuang. Crowdsourcing, citizen sensing and sensor Systems, pages 95–106. ACM, 2012.
web technologies for public and environmental health [18] M. Mun, S. Reddy, K. Shilton, N. Yau, J. Burke,
surveillance and crisis management: trends, ogc D. Estrin, M. Hansen, E. Howard, R. West, and
standards and application examples. International P. Boda. Peir, the personal environmental impact
journal of health geographics, 10(1):67, 2011. report, as a platform for participatory sensing systems
[6] J. Burke, D. Estrin, M. Hansen, A. Parker, research. In Proceedings of the 7th international
N. Ramanathan, S. Reddy, and M.B. Srivastava. conference on Mobile systems, applications, and
Participatory sensing. In World Sensor Web services, pages 55–68. ACM, 2009.
Workshop, pages 1–5. Citeseer, 2006. [19] Daniel B. Neill. Fast subset scan for spatial pattern
[7] Xi Chen, Yanjun Qi, Bing Bai, Qihang Lin, and detection. Journal of the Royal Statistical Society:
Jaime G Carbonell. Sparse latent semantic analysis. In Series B (Statistical Methodology), 74(2), 2012.
SIAM 2011 International Conference on Data Mining, [20] A. Thiagarajan, L. Ravindranath, K. LaCurts,
2011. S. Madden, H. Balakrishnan, S. Toledo, and
[8] E.S. Cochran, J.F. Lawrence, C. Christensen, and R.S. J. Eriksson. VTrack: accurate, energy-aware road
Jakka. The Quake-Catcher Network: Citizen Science traffic delay estimation using mobile phones. In
Expanding Seismic Horizons. Seismological Research Proceedings of the 7th ACM Conference on Embedded
Letters, 80(1):26, 2009. Networked Sensor Systems, pages 85–98. ACM, 2009.
[9] Mari Ervasti, Shideh Dashti, Jack Reilly, Jonathan D [21] J.N. Tsitsiklis. Decentralized detection by a large
Bray, Alexandre Bayen, and Steven Glaser. ishake: number of sensors. Mathematics of Control, Signals,
mobile phones as seismic sensors–user study findings. and Systems (MCSS), 1(2):167–182, 1988.
In Proceedings of the 10th International Conference on [22] Xiaoxiao Yu, Qiao Fu, Lin Zhang, Wenzhu Zhang,
Mobile and Ubiquitous Multimedia, pages 43–52. Victor OK Li, and Leo Guibas. Cabsense: Creating
ACM, 2011. high-resolution urban pollution maps with taxi fleets
[10] M. Faulkner, A. Liu, and A. Krause. A fresh (Preprint). In preparation, In preparation.
perspective: Learning to sparsify for detection in
massive noisy sensor networks. In Proceedings of the
12th ACM/IEEE International Conference on
Information Processing in Sensor Networks. ACM,
2013.
[11] M. Faulkner, M. Olson, R. Chandy, J. Krause, K. M.