Etiquetas
This is the first post of a short series to show some code I have learnt to produce maps with R.
NOTE: Although the procedure described in this post is valid, there is a newer code version in one of the chapters of the book «Displaying time series, spatial and space-time data with R«
Some time ago I found this infographic from The New York Times (via this page) and I wondered how a multivariate choropleth map could be produced with R. Here is the code I have arranged to show the results of the last Spanish general elections in a similar fashion.
Some packages are needed:
library(maps) library(maptools) ## EDITED: if you have rgeos installed you won't need ## gpclibPermit() below. library(sp) library(lattice) library(latticeExtra) library(colorspace)
Let’s start with the data, which is available here (thanks to Emilio Torres, who «massaged» the original dataset, available from here).
Each region of the map will represent the percentage of votes obtained by the predominant political option. Besides, only four groups will be considered: the two main parties («PP» and «PSOE»), the abstention results («ABS»), and the rest of parties («OTH»).
load(url('http://dl.dropbox.com/u/40293713/spainVotes/votos2011.rda')) votesData <- votos2011[, 12:1023] ##I don't need all the columns votesData$ABS <- with(votos2011, Total.censo.electoral - Votos.validos) ##abstention Max <- apply(votesData, 1, max) whichMax <- apply(votesData, 1, function(x)names(votesData)[which.max(x)]) ## OTH for everything but PP, PSOE and ABS whichMax[!(whichMax %in% c('PP', 'PSOE', 'ABS'))] <- 'OTH' ## Finally, I calculate the percentage of votes with the electoral census pcMax <- Max/votos2011$Total.censo.electoral * 100
The Spanish administrative boundaries are available as shapefiles at the INE webpage (~70Mb):
espMap <- readShapePoly(fn="mapas_completo_municipal/esp_muni_0109") Encoding(levels(espMap$NOMBRE)) <- "latin1" ##There are some repeated polygons which can be dissolved with ## unionSpatialPolygons. ## EDITED: gpclibPermit() is needed for unionSpatialPolygons to work ## but can be ommited if you have rgeos installed ## (recommended, see comment of Roger Bivand below). espPols <- unionSpatialPolygons(espMap, espMap$PROVMUN)
(EDITED, following the question of Sandra). Spanish maps are commonly displayed with the Canarian islands next to the peninsula. First we have to extract the polygons of the islands and the polygons of the peninsula.
canarias <- sapply(espPols@polygons, function(x)substr(x@ID, 1, 2) %in% c("35", "38")) peninsulaPols <- espPols[!canarias] islandPols <- espPols[canarias]
Then we shift the coordinates of the islands:
dy <- bbox(peninsulaPols)[2,1] - bbox(islandPols)[2,1] dx <- bbox(peninsulaPols)[1,2] - bbox(islandPols)[1,2] islandPols2 <- elide(islandPols, shift=c(dx, dy)) bbIslands <- bbox(islandPols2)
and finally construct a new object binding the shifted islands with the peninsula:
espPols <- rbind(peninsulaPols, islandPols2)
The last step before drawing the map is to link the data with the polygons:
IDs <- sapply(espPols@polygons, function(x)x@ID) idx <- match(IDs, votos2011$PROVMUN) ##Places without information idxNA <- which(is.na(idx)) ##Information to be added to the SpatialPolygons object dat2add <- data.frame(prov = votos2011$PROV, poblacion = votos2011$Nombre.de.Municipio, Max = Max, pcMax = pcMax, who = whichMax)[idx, ] row.names(dat2add) <- IDs espMapVotes <- SpatialPolygonsDataFrame(espPols, dat2add) ## Drop those places without information espMapVotes <- espMapVotes[-idxNA, ]
So let’s draw the map. I will produce a list of plots, one for each group. The «+.trellis» method of the latticeExtra package with Reduce
superposes the elements of this list and produce a trellis
object. I will use a set of sequential palettes from the colorspace package with a different hue for each group.
classes <- levels(factor(whichMax)) nClasses <- length(classes) pList <- lapply(1:nClasses, function(i){ mapClass <- espMapVotes[espMapVotes$who==classes[i],] step <- 360/nClasses ## distance between hues pal <- rev(sequential_hcl(16, h = (30 + step*(i-1))%%360)) ## hues equally spaced pClass <- spplot(mapClass['pcMax'], col.regions=pal, lwd=0.1, at = c(0, 20, 40, 60, 80, 100)) }) p <- Reduce('+', pList)
However, the legend of this trellis
object is not valid.
First, a title for the legend of each element pList
will be
useful. Unfortunately, the levelplot function (the engine under the
spplot method) does not allow for a title with its colorkey
argument. The frameGrob
and packGrob
of the grid
package will do the work.
addTitle <- function(legend, title){ titleGrob <- textGrob(title, gp=gpar(fontsize=8), hjust=1, vjust=1) legendGrob <- eval(as.call(c(as.symbol(legend$fun), legend$args))) ly <- grid.layout(ncol=1, nrow=2, widths=unit(0.9, 'grobwidth', data=legendGrob)) fg <- frameGrob(ly, name=paste('legendTitle', title, sep='_')) pg <- packGrob(fg, titleGrob, row=2) pg <- packGrob(pg, legendGrob, row=1) } for (i in seq_along(classes)){ lg <- pList[[i]]$legend$right lg$args$key$labels$cex=ifelse(i==nClasses, 0.8, 0) ##only the last legend needs labels pList[[i]]$legend$right <- list(fun='addTitle', args=list(legend=lg, title=classes[i])) }
Now, every component of pList
includes a legend with a title below. The last step is to modify the legend of the p
trellis object in order to merge the legends from every component of pList
.
## list of legends legendList <- lapply(pList, function(x){ lg <- x$legend$right clKey <- eval(as.call(c(as.symbol(lg$fun), lg$args))) clKey }) ##function to pack the list of legends in a unique legend ##adapted from latticeExtra::: mergedTrellisLegendGrob packLegend <- function(legendList){ N <- length(legendList) ly <- grid.layout(nrow = 1, ncol = N) g <- frameGrob(layout = ly, name = "mergedLegend") for (i in 1:N) g <- packGrob(g, legendList[[i]], col = i) g } ## The legend of p will include all the legends p$legend$right <- list(fun = 'packLegend', args = list(legendList = legendList))
Here is the result with the provinces boundaries superposed (only for the peninsula due to a problem with the definition of boundaries the Canarian islands in the file) and a rectangle to separate the Canarian islands from the
rest of the map (click on the image to get a SVG file):
provinces <- readShapePoly(fn="mapas_completo_municipal/spain_provinces_ag_2") canarias <- provinces$PROV %in% c(35, 38) peninsulaLines <- provinces[!canarias,] p + layer(sp.polygons(peninsulaLines, lwd = 0.5)) + layer(grid.rect(x=bbIslands[1,1], y=bbIslands[2,1], width=diff(bbIslands[1,]), height=diff(bbIslands[2,]), default.units='native', just=c('left', 'bottom'), gp=gpar(lwd=0.5, fill='transparent')))
Francesc dijo:
Hola Oscar,
Primero felicitarte por el gran post, como ya han comentado, creo que es de esos artículos que se convierten en referencia.
Permíteme que te haga una pregunta, ya que me estoy volviendo un poco loco buscando por mi cuenta.
Imagina que tengo un data frame con un conjunto de puntos geográficos con algunos datos. En concreto este: http://www.businessintelligence.info/docs/listado-longitud-latitud-municipios-espana.xls.
Cómo podría visualizar los puntos sobre el mapa? No entiendo el formato de coordenadas del fichero shp.
Bueno, muchas gracias por tu atención y por posts como estos.
Oscar Perpiñán Lamigueiro dijo:
Hola Francesc,
Gracias por tus comentarios. Me alegra que esta contribución sea útil.
He preparado algo de código (aquí) para responder a lo que necesitas.
Saludos
Oscar
Francesc dijo:
Oscar,
Muchas gracias por la respuesta. Después de mirar tu código (y sin entender muy bien todo el tema de las coordenadas), veo que las coordenadas siguen estando en formato latitud/longitud.
No acabo de entender como (si es que en algún punto se hace) se pasan las coordenadas de los polígonos del fichero shape a latitud/longitud. Perdona por dar la lata, pero se podría hacer tal cosa, tener las coordenadas de los dos ficheros en el mismo formato?
Muchas gracias!.
Oscar Perpiñán Lamigueiro dijo:
Hola,
No entiendo bien tu duda, porque no se bien a cuál de los componentes te refieres. Por una parte están los dos «shapefile» (datos vectoriales en forma de polígonos) que están definidos con una proyección «longlat». Lo puedes comprobar con la función
CRS
aplicada a cualquiera de ellos. Por otra parte tú me indicas una tabla excel que, previamente convertida a csv, leo en R como datos vectoriales, pero esta vez en forma de puntos. Tal y como están estos datos es claro que también debo usar la proyección «longlat».Si lo que necesitas son las coordenadas de los centroides de los polígonos, debes usar la función
coordinates
aplicada a cualquiera de los shapefile que te interese.Saludos.
Oscar
Oscar
Francesc dijo:
Hola Oscar.
Perdona que no fuera demasiado claro, como te digo voy un poco pez. Si no entiendo mal, el fichero shp contiene una serie de polígonos que están formados por coordenadas.
Si accedo a las coordenadas, veo que estas no tienen formato lat/lon. Es decir, si hago:
espMap <- readShapePoly(fn="mapas_completo_municipal/esp_muni_0109")
coordinates(espMap)
summary(coords)
Obtengo;
V1 V2
Min. :-994385 Min. :3144115
1st Qu.: 323579 1st Qu.:4419712
Median : 482834 Median :4567058
Mean : 489380 Mean :4516567
3rd Qu.: 652378 3rd Qu.:4673522
Max. :1123631 Max. :4853345
Lo cual me hace pensar que estos datos no están en formato lat/lon. Seguramente haya dejado de entender algo, pero se podría tener estas coordenadas en lat/lon?
Bueno, de nuevo muchas gracias por tu tiempo.
Oscar Perpiñán Lamigueiro dijo:
Hola,
Disculpa, fallo mío. Contesté demasiado rápido, teniendo en la cabeza otra pregunta similar en la que sí se utilizaban coordenadas longitud-latitud. En este caso, como bien has comprobado, no es así.
Los ficheros que proporciona el INE no tienen esta información, pero rebuscando pude averiguar que usan coordenadas UTM en la zona 30. Por tanto, primero hay que asignar esta información:
proj4string(espMap) <- CRS('+proj=utm +zone=30 +ellps=intl +units=m +no_defs')
Para poder convertir a longitud-latitud necesitas la función
spTransform
del paquete rgdal. Pero este objeto tiene muchos polígonos, así que quizás tarde un rato.Si te interesan shapefiles con coordenadas longitud-latitud quizás sea mejor que uses los datos que tienes en GADM o en Natural Earth.
Saludos.
Oscar.
Francesc Guitart dijo:
Hola Oscar,
Muchas gracias por tu ayuda, he podido hacer el plot que queria. Queria poder usar ggplot2. Dejo aquí un código con lo que hecho por si es de utilidad: mapa provincia de Lleida
Oscar Perpiñán Lamigueiro dijo:
Fantástico.
Un pequeño apunte: si usas ggplot2 te sobran las líneas de
library(lattice)
ylibrary(latticeExtra)
.Y otro apunte: para hacer mapas con ggplot2 te puede interesar este curso de Beatriz Martínez.
Francesc dijo:
Muchas gracias y hasta pronto!
Phil dijo:
Hi,
Quick question, how would you remove the «blank space» between each of the colour ramps so that their edges align?
Also, is there any way to remove the border around those color ramps without removing the tick marks on the right side of the figure legend?
Many thanks!
Phil
Oscar Perpiñán Lamigueiro dijo:
Hi,
First question. I have not tested it, but I think the solution is in the «widths» argument of «grid.layout» and the «width» argument of each call to «packGrob».
Second question. The «draw.colorkey» of the lattice package uses the parameter «axis.line» (obtained via trellis.par.get) to define the characteristics of the legend box. Unfortunately, this same parameter is used to define the axis and ticks. Thus, if you choose a transparent line for the legend box, you will also remove the axis and ticks. I am sorry but I have no solution for it.
Best,
Oscar.
Phil dijo:
Thank you for your quick response. I managed to obtain the results i wanted. Best, Phil
Stephen Clermont dijo:
I am trying to make a similar map of a different country, Scotland, and when I reduce the pList, the resulting map panel only covers the portion of the map in the first selection of regions. How do I make the view big enough to cover the entire country?
Oscar Perpiñán Lamigueiro dijo:
Hi Stephen,
The canvas is defined by the first object of the list. You could try to put a dummy spplot at the beginning of the list that covers the whole extent. Or even better, you could modify the xlim and ylim arguments in spplot. Their default values are defined by the bounding box of the object: xlim = bbox(obj)[1, ].
Hope it helps.
Oscar.
ylim = bbox(obj)[2, ]
Stephen Clermont dijo:
Thank you for the quick response. How would I put a dummy spplot at the beginning of the list?
Oscar Perpiñán Lamigueiro dijo:
The trick is to print a
Spatial*
filled withNA
. Let’s saymyMap
is theSpatialPolygonsDataFrame
object that covers the whole extent. If you add a new column/variable to this object and fill it withNA
(for exampledummy
), you will get an empty panel if you print it withspplot
. Thus, you could do something like:spplot(myMap["dummy"]) + p
wherep
is the result ofReduce
.Anyway, I think it’s better if you use
xlim/ylim
.Stephen Clermont dijo:
Thank you so much. Helpful responses and really good book.
Oscar Perpiñán Lamigueiro dijo:
Thanks a lot for your feedback.
Pingback: R - Postly
melodyaross dijo:
Man, I wish I could get a pdf of this!
Oscar Perpiñán Lamigueiro dijo:
Have you tried «Print > Save as a File…»?
Pingback: Una de mapas con R
Pingback: Una de mapas con R » G. R. Serrano
Paco dijo:
Hello, Oscar
I’m a undergraduate student and I’m new using R for maps. Do you know how to remove, in this case, Canadian Islands from the map?
Oscar Perpiñán Lamigueiro dijo:
Hello,
You can drop the Canarian islands if you substitute
espPols
withpeninsulaPols
in the code or if, instead ofespPols <- rbind(peninsulaPols, islandPols2)
you useespPols <- peninsulaPols
.Cheers,
Oscar.
http://google.com dijo:
“Maps with R (I) Omnia sunt Communia!” was in fact a truly awesome blog, .
Continue composing and I’m going to continue to keep reading through! I appreciate it ,Julia
pvaldes dijo:
Es-pec-ta-cular…
la clase de ejemplos que crean escuela
Oscar Perpiñán Lamigueiro dijo:
Muchas gracias.
Pingback: datanalytics » Veinte herramientas de visualización
manuel dijo:
amazing, … but seems a litle problem with «finisterrae»?
Oscar Perpiñán Lamigueiro dijo:
Oops. You are right. And I love that region! I have to find a fix.
jcberman dijo:
Dear Oscar,
I believe I faithfully copied the code, loaded the data, installed the libraries, but after some think time I only get a big white window. I did check some of the variables and they are populated, .and the graphics demos work fine. Any suggestions as I am new to R.
Thanks,
Oscar Perpiñán Lamigueiro dijo:
Hello,
Try to write a PDF file with the graphic. For example:
trellis.device(pdf, file=’map.pdf’)
print(p)
dev.off()
Does it work?
Oscar.
jcberman dijo:
Hi Oscar,
The PDF is perfect. Thank you for the suggestion.
Oscar Perpiñán Lamigueiro dijo:
Then the problem is with your default graphical device. Try to change its settings.
Best,
Oscar.
jcberman dijo:
I was wondering if you could help me. I believe I copied all of the lines of code, got the right files, and installed the appropriate libraries, but all I get after some think time, is a all white window labeled QUartz 1 [*]. The vairables seem to be populated., just no picture. Other graphics work fine, i.e. demo(labels)
Thanks for any suggestions.
Sandra dijo:
Dear Oscar,
I have been working with the Spanish map for quite some time, and now that I want to make a nice map with latitude and longitude coordinates (of Spain mainland) with the Canary island in a box in the bottom left corner I am at loss (I can do that without axes, but with it just doesn’t work). I think it’s a rather stupid problem, easily solved… but just cannot see how!
Thanks for any help… 🙂
Cheers!
Oscar Perpiñán Lamigueiro dijo:
Hello Sandra,
I have edited the post to include the islands. The trick is to use ellide (from maptools) and rbind. If you need the latitude/longitude values use «scales=list(draw=TRUE)» inside spplot. Hope it helps!.
Oscar.
dog snuggie dijo:
Interesting…
Tinoush Jamali dijo:
Dear Oscar,
Many thanks for your post. I have a question. Do you think that there is any possibility to add the direction and size between the areas, regions or countries to the map in R. I add a link here and if you scroll down, you can see what I mean:
http://ucatlas.ucsc.edu/trade/tradeflows.htm
Is it possible to bring the arrows with weight of that for relation among the regions into the map in R.
Sorry for bothering you with this question and thanks for you response.
Regards,
Tinoush
Oscar Perpiñán Lamigueiro dijo:
Hello Tinoush,
I think this post will be useful for your needs.
Best,
Oscar.
Pingback: Mapas con R en Omnia sunt Communia! » Análisis y decisión
margaretrdonald dijo:
Thanks for some really nice stuff. However even though I had loaded rgeos the line
espPols <- unionSpatialPolygons(espMap, espMap$PROVMUN)
failed until I put
require(gpclib)
gpclibPermit()
before it…
Cheers,
Margaret Donald
Oscar Perpiñán Lamigueiro dijo:
I am not sure but perhaps it is only a problem of old versions. Could you try updating both rgeos and maptools?
Oscar.
Roger Bivand dijo:
Very nice! Please note that gpclibPermit()is not needed, as unionSpatialPolygons() will use gUnaryUnion() in rgeos instead of encumbered gpclib functions. Support for gpclib in maptools will go away very soon now, as rgeos is available for all platforms.
Oscar Perpiñán Lamigueiro dijo:
Thanks for the info, Roger. I have edited the post.
However, I found that (at least at CRAN) rgeos is not available for MacOSX:
http://cran.r-project.org/web/packages/rgeos/index.html
Roger Bivand dijo:
It has been made available by Prof. Brian Ripley for OSX Intel architectures. All you need to do is to ensure that the first two repositories are on the search path, so:
setRepositories(ind=1:2)
library(rgeos)
Then you get your shiny rgeos, I think now with the very latest GEOS inside. Maybe I should add this advice to the package description!
MarioLosada dijo:
Excelente Trabajo!, de mucha ayuda 😉
Has llegado a usar el GeoTools for Java?
Oscar Perpiñán Lamigueiro dijo:
Gracias, me alegra que sea útil.
No he usado GeoTools. Me lo apunto en la lista de tareas :-).
Marcelo Lugo dijo:
Hello!
I’m trying to replicate your steps, but I keep getting the following error:
> votesData$ABS <- with(votos2011, Total.censo.electoral – Votos.válidos) ##abstention
Error en eval(expr, envir, enclos) : objeto 'Votos.válidos' no encontrado
Thanks,
Marcelo Lugo
Oscar Perpiñán Lamigueiro dijo:
Hello, Erin, Marcelo.
I think the problem you are having is related to the use of non-ASCII character in the names of «votos2011». I have changed «Votos.válidos» to «Votos.validos» in the dataset and in the code. Please let me know if this fix works for you.
Cheers.
Oscar.
Erin Hodgess dijo:
Hello again!
Thanks for the great fix on the variable names.
However, I’m having trouble with the zip file. I keep getting that the downloaded file is not a valid zip file. I’m using Windows.
Should I be using Linux, please?
Thanks,
Erin
Oscar Perpiñán Lamigueiro dijo:
Hello,
If I am not wrong, the zip from the INE webpage is created with the compression format 7z. Try the 7-zip software to open it.
Oscar.
Erin Hodgess dijo:
Hello!
I’m trying to replicate your steps, but I keep getting the following error:
> votesData$ABS
When I checked the names on votos2011, there is no Votos.validos.
Have you run into this please?
Thanks,
Erin
Luis dijo:
¡Muchas gracias! It is great to find such a detailed explanation of the whole process of map creation. I can think quite a few variants of the map, but your code provides the starting point for all of them.
Saludos desde New Zealand.
Tim Riffe dijo:
brilliant work! I realize that the political landscape in Spain is more than 2 dimensional, but did you think of reducing the variables to a single gradient, e.g. from left to right, rather than choosing a dominant party? In the US it works rather well, with purples dominating in places with an evenly split electorate, and red and blue on the extremes. For Spain you’d have to do some ad hoc party aggregating. One could also make the political choropleth for nationalist vs centralist parties using a continuous color ramp. Not to make your life more difficult, but I think this and other maps from the same data source are of high interest in Spain. Cheers!
Oscar Perpiñán Lamigueiro dijo:
I decided to use this dataset to play with several variables in a map, and thus building a «complex» legend, and to show the relevance of the abstention in Spain (which is the most important choice in several locations).
Thanks for your ideas!