S2 cells and space-filling curves: Keys to building better digital map tools for cities
An open geospatial library for all the civic hackers out there.
At Sidewalk Labs, we love cities, and our mission is to make them even better using technology. One of the things that makes cities great is their sense of place, and for computers, that begins with being able to recognize where things are located. Some of the most useful apps for city life rely on fast access to geospatial data: think Google Maps, Foursquare, Uber, or Yelp.
Computers actually have a tough time indexing place-based data efficiently. To understand why, consider that we normally represent our three-dimensional Earth as a flat surface shown in two dimensions, either on a piece of paper or a digital screen. Computers lay out data in an even more simplistic way, indexing it using numbers on a one-dimensional line. So for computers to deal efficiently with geographic data, we need a way to map from one dimension to two.
Though no one knew it at the time, this problem took a big step toward resolution in 1891. That’s when the mathematician David Hilbert first described the “space-filling curve” that came to bear his name. Hilbert demonstrated that it was possible to draw a one-dimensional line that filled every part of a two-dimensional space:
It turns out a Hilbert curve is exactly what you need to index Earth with a computer, as Google Maps does. Fast forward to 2005, when Google engineer Eric Veach realized that a Hilbert curve could do for our planet what Larry and Sergey’s search engine had done for the World Wide Web: index it efficiently, and thus make it quickly searchable. The trick is to enclose Earth in a planet-size cube, fill each side of the cube with a Hilbert curve (below, in yellow), and project the Hilbert curve onto the Earth’s surface (below, in red):
Assuming you’re reading this while on Earth (apologies to the astronauts on the International Space Station) your location can be represented as a very specific point on a very long line. (Click here to find out your number!) And because the Hilbert curve is a fractal, less precise versions of the same number represent the same location at greater scales, all the way up to the side of the planet you’re currently on.
Simply put, the Hilbert curve, when projected onto Earth, is a super efficient way to represent a place as a single number.
So what does all this mean for urban technology?
Well, the Hilbert curve enables computers to see space as “S2 cells” — basically tiny units of geography represented by single numbers. S2 cells allow software engineers, civic hackers, or any coder developing digital tools for city life to index and retrieve the location of buildings, streets, parks, trees, park benches, parking spots, buses, and so on. As Sidewalk develops Flow, our transportation coordination system, S2 is an important part of our infrastructure.
Some S2 tools already exist, including Google’s excellent open-access S2 Geometry Library, which is written in C++. Sidewalk’s engineering team took a version of this S2 library that was converted to Python, made some improvements, and now we’re posting it online, free and open for anyone to use. We’ve also created a couple interactive tools that use the S2 library: Region Coverer and Planetary View.
Let’s take a closer look at Region Coverer. The first step in using it is to find the location you’d like to index. We’ll try the area around Grand Central Terminal, near the Sidewalk office:
Using the map toolbar, you can draw a rectangle or circle around the target location. The library spits out a “covering set” of blue, multi-sized S2 cells that blanket the area. On the toolbar, below the sliders, you’ll find a list of integers (in hexadecimal) that represent these cells. A developer can take this code and run a database query asking for, say, every parking garage in these locations.
The parameter sliders let you adjust both the cell “levels” (meaning size) and the maximum number of cells covering an area. If you set the level low and the number high, you see smaller S2 cells running along the perimeter of the chosen area, for greater coverage precision. Here’s that same area as above with a finer covering set of S2 cells:
The size and number of S2 cells represent a tradeoff for programmers. On one hand, more precise coverage means fewer false positives. On the other hand, more cells mean more computation. Still, generally speaking, using the S2 library can save developers an enormous amount of time—processing a query that might normally be very time-consuming in just an instant.
The Planetary View shows how the Hilbert space-filling curve is mapped onto Earth. Earth is surrounded by a cube (calm down, Star Trek fans — the Borg Collective has nothing to do with this), and each face is filled with a Hilbert curve. The curve from each face is connected to two adjacent faces, forming a single line. The curve is then mapped onto the Earth’s surface by drawing an imaginary line from each point on the curve to the center of Earth, and noting where it passes through the Earth’s surface.
Those are the basics. I’ve simplified things a great deal by design, but if you’d like a more technical explanation, you can find one here. And the demo remains a prototype: feel free to reach out with any questions or comments.
We’ve released this updated S2 library for a couple of reasons. As mentioned, S2 is an important piece of the tools we might develop in partnership with cities, such as Flow. More broadly, we believe in giving back to the developer community, especially the civic hackers using open data to improve the everyday experience of city life.
When asked about a student who dropped math to pursue poetry, Hilbert once said, “Good. He did not have enough imagination to become a mathematician.” Hopefully our S2 library can help all the mathematicians, coders, and developers out there put their imaginations to good use.
Follow what Sidewalk Labs is thinking, doing, and reading with our weekly newsletter.