| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "How to make interactive maps with Census and local data in R" |
| 4 | +author: "John Johnson" |
| 5 | +date: "July 21, 2017" |
| 6 | +status: publish |
| 7 | +published: true |
| 8 | +categories: Greenville |
| 9 | +tags: leaflet maps |
| 10 | +--- |
| 11 | + |
| 12 | +The goal here is to return to Greenville County at a finer level of detail. I map median home values by census tract around Greenville and overlay the park data downloaded earlier. For the Census data, I use the recently released `tidycensus` package. Instead of drawing a static map with `ggplot2`, I use the `leaflet` package to build an interactive map that combines data from disparate sources in a convenient way. |
| 13 | + |
| 14 | + |
| 15 | + |
| 16 | + |
| 17 | + |
| 18 | + |
| 19 | +# Download the local park data |
| 20 | + |
| 21 | +The local parks files are available from the [Open Upstate map layers page](https://data.openupstate.org/map-layers), courtesy of a small group of dedicated volunteers and an API that makes publishing GeoJSON files easy. We will download a polygon file for the park boundaries and a point file with the address of each park. |
| 22 | + |
| 23 | + |
| 24 | +{% highlight r %} |
| 25 | +data_url <- "https://data.openupstate.org/maps/city-parks/parks.php" |
| 26 | +data_file <- "parks.geojson" |
| 27 | +# reading straight from the URL failed for me, so download the file first; |
| 28 | +# geojson_read() comes from the geojsonio package |
| 29 | +download.file(data_url, data_file) |
| 30 | +data_park <- geojson_read(data_file, what = "sp") |
| 31 | + |
| 32 | +data_url <- "https://data.openupstate.org/maps/city-parks/geojson.php" |
| 33 | +data_file <- "parks_point.geojson" |
| 34 | +# same workaround for the point file: download it first, |
| 35 | +# then read it from disk |
| 36 | +download.file(data_url, data_file) |
| 37 | +data_park_addr <- geojson_read(data_file, what = "sp") |
| 38 | +{% endhighlight %} |
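
The download-then-read step above is repeated for each file, so it can be wrapped in a small helper. The following is only a sketch: `read_remote_geojson` is a hypothetical name that does not appear in the original post, and it assumes the `geojsonio` package is installed.

```r
library(geojsonio)  # provides geojson_read()

# Hypothetical helper (not part of the original post): download a GeoJSON
# file to disk, then read it from there as an sp object.
read_remote_geojson <- function(url, dest = tempfile(fileext = ".geojson")) {
  # mode = "wb" avoids line-ending conversion on Windows
  download.file(url, dest, mode = "wb", quiet = TRUE)
  geojson_read(dest, what = "sp")
}

# Usage with the two Open Upstate URLs from the post (needs network access):
# data_park      <- read_remote_geojson("https://data.openupstate.org/maps/city-parks/parks.php")
# data_park_addr <- read_remote_geojson("https://data.openupstate.org/maps/city-parks/geojson.php")
```

Defaulting `dest` to a temporary file keeps the working directory clean; pass an explicit path to keep a local copy of the data.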
| 39 | + |
| 40 | +# Download the median home value data |
| 41 | + |
| 42 | +This `tidycensus` call downloads the demographic data *and* the tract geometry in a single object. The variable `B25077_001` is the ACS estimate of the median value of owner-occupied housing units. With `geometry = TRUE`, the result is an `sf` data frame: an ordinary data frame in which one variable is a list column holding the polygon for each census tract. Keeping the demographic and geometric data together eases bookkeeping and, thankfully, `leaflet` understands this format. |
| 43 | + |
| 44 | + |
| 45 | +{% highlight r %} |
| 46 | +gvl_value <- get_acs(geography = "tract", |
| 47 | + variables = "B25077_001", |
| 48 | + state = "SC", |
| 49 | + county = "Greenville County", |
| 50 | + geometry = TRUE) |
| 51 | +{% endhighlight %} |
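
To make the list-column idea concrete without calling the Census API, here is a minimal sketch that builds a toy object with the same shape by hand. The square polygons, names, and values are invented for illustration; only the structure mirrors what `get_acs(..., geometry = TRUE)` returns.

```r
library(sf)

# Two made-up square "tracts" (illustrative only, not real Census geometry)
square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

toy <- st_sf(
  NAME     = c("Tract A", "Tract B"),
  estimate = c(150000, 210000),  # invented values
  geometry = st_sfc(square(0, 0), square(1, 0), crs = 4326)
)

class(toy)             # includes both "sf" and "data.frame"
class(toy$geometry)    # an "sfc" list column, one geometry per row
st_drop_geometry(toy)  # the plain attribute table, geometry removed
```

Because the geometry travels with each row, filtering or sorting the data frame keeps the attributes and shapes in step.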
| 52 | + |
| 53 | +# Plot the census and local data together |
| 54 | + |
| 55 | +Now we bring everything together. The `leaflet` package was written to make extensive use of the `%>%` pipe operator, which comes from `magrittr` and was popularized by `dplyr`. We can set a default data frame for a leaflet map, but each layer we add can draw on a different data source. The following code is one way to do this: the `tidycensus`-generated dataset is the foundation of the map, and the park polygons and markers are added through the `data=` option of `addPolygons` and `addMarkers`. Note the `group=` option, which creates layers that can be toggled on and off interactively. The `popup=` option (used here in `addPolygons`) shows additional information when a shape is clicked, while the `label=` option (used in `addMarkers`) shows it on hover. |
| 56 | + |
| 57 | + |
| 58 | +{% highlight r %} |
| 59 | +pal <- colorNumeric(palette = "viridis", |
| 60 | + domain = gvl_value$estimate) |
| 61 | + |
| 62 | +gvl_value %>% |
| 63 | + st_transform(crs = 4326) %>% |
| 64 | + leaflet(width = "100%") %>% |
| 65 | + addProviderTiles(provider = "CartoDB.Positron") %>% |
| 66 | + addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"), |
| 67 | + stroke = FALSE, |
| 68 | + smoothFactor = 0, |
| 69 | + fillOpacity = 0.5, |
| 70 | + color = ~ pal(estimate), |
| 71 | + group="Median home value") %>% |
| 72 | + addLegend("bottomright", |
| 73 | + pal = pal, |
| 74 | + values = ~ estimate, |
| 75 | + title = "Median home value", |
| 76 | + labFormat = labelFormat(prefix = "$"), |
| 77 | + opacity = 1) %>% |
| 78 | + addPolygons(data=data_park,fillOpacity=0.8,group="Parks") %>% |
| 79 | + addMarkers(data=data_park_addr,group="Parks",label=~title) %>% |
| 80 | + addLayersControl(overlayGroups = c("Parks","Median home value")) |
| 81 | +{% endhighlight %} |
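
The same layering pattern can be tried without the Census or park downloads. The sketch below uses two small invented data frames in place of the tract and park layers, and circle markers in place of polygons; all names and coordinates are made up for illustration.

```r
library(leaflet)

# Made-up data standing in for the tract and park layers (illustrative only)
areas <- data.frame(
  name  = c("Area A, Example County", "Area B, Example County"),
  lng   = c(-82.40, -82.35),
  lat   = c(34.85, 34.88),
  value = c(150000, 210000)
)
parks <- data.frame(
  title = c("Park One", "Park Two"),
  lng   = c(-82.39, -82.36),
  lat   = c(34.86, 34.87)
)

# Keep only the text before the first comma, as the popup in the post does
areas$short_name <- sub(",.*$", "", areas$name)

pal <- colorNumeric(palette = "viridis", domain = areas$value)

m <- leaflet(areas, width = "100%") %>%  # default data: areas
  addProviderTiles("CartoDB.Positron") %>%
  addCircleMarkers(lng = ~lng, lat = ~lat,
                   color = ~pal(value),
                   popup = ~short_name,  # shown on click
                   group = "Values") %>%
  addMarkers(data = parks,               # override the default data
             lng = ~lng, lat = ~lat,
             label = ~title,             # shown on hover
             group = "Parks") %>%
  addLayersControl(overlayGroups = c("Parks", "Values"))

# print(m) renders the widget in an interactive session
```

Formulas such as `~value` are evaluated against whichever data frame is in effect for that layer, which is why `~title` resolves against `parks` and not `areas`.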
| 82 | + |
| 83 | + |
| 84 | + |
| 108 | +{% endhighlight %} |
| 109 | +# Discussion |
| 110 | + |
| 111 | +I'm just starting to learn the `leaflet` package, but in a couple of hours (and standing on the shoulders of giants) I was able to put together an interactive map combining Census data (median home value by census tract) with locally generated data (park locations). Such combinations are an effective way to examine local situations in the context of the rich data already collected at the federal level (assuming the instability at the U.S. Census Bureau is temporary). |