Maps with R (I)

18 sábado Feb 2012

Posted by Oscar Perpiñán Lamigueiro in R-english, visualization

Etiquetas

choropleth, elections, lattice, maps, New York Times, R

This is the first post of a short series to show some code I have learnt to produce maps with R.

NOTE: Although the procedure described in this post is valid, there is a newer code version in one of the chapters of the book «Displaying time series, spatial and space-time data with R«

Some time ago I found this infographic from The New York Times (via this page) and I wondered how a multivariate choropleth map could be produced with R. Here is the code I have arranged to show the results of the last Spanish general elections in a similar fashion.

Some packages are needed:

library(maps)
library(maptools)
## EDITED: if you have rgeos installed you won't need
## gpclibPermit() below.
library(sp)
library(lattice)
library(latticeExtra)
library(colorspace)

Let’s start with the data, which is available here (thanks to Emilio Torres, who «massaged» the original dataset, available from here).

Each region of the map will represent the percentage of votes obtained by the predominant political option. Besides, only four groups will be considered: the two main parties («PP» and «PSOE»), the abstention results («ABS»), and the rest of parties («OTH»).

load(url('http://dl.dropbox.com/u/40293713/spainVotes/votos2011.rda'))
votesData <- votos2011[, 12:1023] ##I don't need all the columns

votesData$ABS <- with(votos2011, Total.censo.electoral - Votos.validos) ##abstention

Max <- apply(votesData, 1, max)
whichMax <- apply(votesData,  1, function(x)names(votesData)[which.max(x)])

## OTH for everything but PP, PSOE and ABS
whichMax[!(whichMax %in% c('PP',  'PSOE', 'ABS'))] <- 'OTH'

## Finally, I calculate the percentage of votes with the electoral census
pcMax <- Max/votos2011$Total.censo.electoral * 100

The Spanish administrative boundaries are available as shapefiles at the INE webpage (~70Mb):

espMap <- readShapePoly(fn="mapas_completo_municipal/esp_muni_0109")
Encoding(levels(espMap$NOMBRE)) <- "latin1"

##There are some repeated polygons which can be dissolved with
## unionSpatialPolygons.
## EDITED: gpclibPermit() is needed for unionSpatialPolygons to work
## but can be ommited if you have rgeos installed
## (recommended, see comment of Roger Bivand below).
espPols <- unionSpatialPolygons(espMap, espMap$PROVMUN)

(EDITED, following the question of Sandra). Spanish maps are commonly displayed with the Canarian islands next to the peninsula. First we have to extract the polygons of the islands and the polygons of the peninsula.

canarias <-  sapply(espPols@polygons, function(x)substr(x@ID, 1, 2) %in% c("35",  "38"))
peninsulaPols <- espPols[!canarias]
islandPols <- espPols[canarias]

Then we shift the coordinates of the islands:

dy <- bbox(peninsulaPols)[2,1] - bbox(islandPols)[2,1]
dx <- bbox(peninsulaPols)[1,2] - bbox(islandPols)[1,2]

islandPols2 <- elide(islandPols, shift=c(dx, dy))
bbIslands <- bbox(islandPols2)

and finally construct a new object binding the shifted islands with the peninsula:

espPols <- rbind(peninsulaPols, islandPols2)

The last step before drawing the map is to link the data with the polygons:

IDs <- sapply(espPols@polygons, function(x)x@ID)
idx <- match(IDs, votos2011$PROVMUN)

##Places without information
idxNA <- which(is.na(idx))

##Information to be added to the SpatialPolygons object
dat2add <- data.frame(prov = votos2011$PROV,
poblacion = votos2011$Nombre.de.Municipio,
Max = Max,  pcMax = pcMax,  who = whichMax)[idx, ]

row.names(dat2add) <- IDs
espMapVotes <- SpatialPolygonsDataFrame(espPols, dat2add)

## Drop those places without information
espMapVotes <- espMapVotes[-idxNA, ]

So let’s draw the map. I will produce a list of plots, one for each group. The «+.trellis» method of the latticeExtra package with Reduce superposes the elements of this list and produce a trellis object. I will use a set of sequential palettes from the colorspace package with a different hue for each group.

classes <- levels(factor(whichMax))
nClasses <- length(classes)

pList <- lapply(1:nClasses, function(i){
  mapClass <- espMapVotes[espMapVotes$who==classes[i],]
  step <- 360/nClasses ## distance between hues
  pal <- rev(sequential_hcl(16, h = (30 + step*(i-1))%%360)) ## hues equally spaced
  pClass <- spplot(mapClass['pcMax'], col.regions=pal, lwd=0.1,
                   at = c(0, 20, 40, 60, 80, 100))
  })

p <- Reduce('+', pList)

However, the legend of this trellis object is not valid.

First, a title for the legend of each element pList will be
useful. Unfortunately, the levelplot function (the engine under the
spplot method) does not allow for a title with its colorkey
argument. The frameGrob and packGrob of the grid package will do the work.

addTitle <- function(legend, title){
  titleGrob <- textGrob(title, gp=gpar(fontsize=8), hjust=1, vjust=1)
  legendGrob <- eval(as.call(c(as.symbol(legend$fun), legend$args)))
  ly <- grid.layout(ncol=1, nrow=2, widths=unit(0.9, 'grobwidth', data=legendGrob))
  fg <- frameGrob(ly, name=paste('legendTitle', title, sep='_'))
  pg <- packGrob(fg, titleGrob, row=2)
  pg <- packGrob(pg, legendGrob, row=1)
  }

for (i in seq_along(classes)){
  lg <- pList[[i]]$legend$right
  lg$args$key$labels$cex=ifelse(i==nClasses, 0.8, 0) ##only the last legend needs labels
  pList[[i]]$legend$right <- list(fun='addTitle',
                                  args=list(legend=lg, title=classes[i]))
}

Now, every component of pList includes a legend with a title below. The last step is to modify the legend of the p trellis object in order to merge the legends from every component of pList.

## list of legends
legendList <- lapply(pList, function(x){
  lg <- x$legend$right
  clKey <- eval(as.call(c(as.symbol(lg$fun), lg$args)))
  clKey
})

##function to pack the list of legends in a unique legend
##adapted from latticeExtra::: mergedTrellisLegendGrob
packLegend <- function(legendList){
  N <- length(legendList)
  ly <- grid.layout(nrow = 1,  ncol = N)
  g <- frameGrob(layout = ly, name = "mergedLegend")
  for (i in 1:N) g <- packGrob(g, legendList[[i]], col = i)
  g
}

## The legend of p will include all the legends
p$legend$right <- list(fun = 'packLegend',  args = list(legendList = legendList))

Here is the result with the provinces boundaries superposed (only for the peninsula due to a problem with the definition of boundaries the Canarian islands in the file) and a rectangle to separate the Canarian islands from the
rest of the map (click on the image to get a SVG file):

provinces <- readShapePoly(fn="mapas_completo_municipal/spain_provinces_ag_2")

canarias <- provinces$PROV %in% c(35, 38)
peninsulaLines <- provinces[!canarias,]

p +
  layer(sp.polygons(peninsulaLines,  lwd = 0.5)) +
  layer(grid.rect(x=bbIslands[1,1], y=bbIslands[2,1],
                  width=diff(bbIslands[1,]),
                  height=diff(bbIslands[2,]),
                  default.units='native', just=c('left', 'bottom'),
                  gp=gpar(lwd=0.5, fill='transparent')))

58 pensamientos sobre “Maps with R (I)”

Francesc dijo:

30/03/2015 en 7:26

Hola Oscar,

Primero felicitarte por el gran post, como ya han comentado, creo que es de esos artículos que se convierten en referencia.

Permíteme que te haga una pregunta, ya que me estoy volviendo un poco loco buscando por mi cuenta.

Imagina que tengo un data frame con un conjunto de puntos geográficos con algunos datos. En concreto este: http://www.businessintelligence.info/docs/listado-longitud-latitud-municipios-espana.xls.

Cómo podría visualizar los puntos sobre el mapa? No entiendo el formato de coordenadas del fichero shp.

Bueno, muchas gracias por tu atención y por posts como estos.
- Oscar Perpiñán Lamigueiro dijo:
  
  30/03/2015 en 8:36
  
  Hola Francesc,
  
  Gracias por tus comentarios. Me alegra que esta contribución sea útil.
  
  He preparado algo de código (aquí) para responder a lo que necesitas.
  
  Saludos
  
  Oscar
  - Francesc dijo:
    
    01/04/2015 en 21:24
    
    Oscar,
    
    Muchas gracias por la respuesta. Después de mirar tu código (y sin entender muy bien todo el tema de las coordenadas), veo que las coordenadas siguen estando en formato latitud/longitud.
    
    No acabo de entender como (si es que en algún punto se hace) se pasan las coordenadas de los polígonos del fichero shape a latitud/longitud. Perdona por dar la lata, pero se podría hacer tal cosa, tener las coordenadas de los dos ficheros en el mismo formato?
    
    Muchas gracias!.
  - Oscar Perpiñán Lamigueiro dijo:
    
    03/04/2015 en 10:31
    
    Hola,
    
    No entiendo bien tu duda, porque no se bien a cuál de los componentes te refieres. Por una parte están los dos «shapefile» (datos vectoriales en forma de polígonos) que están definidos con una proyección «longlat». Lo puedes comprobar con la función CRS aplicada a cualquiera de ellos. Por otra parte tú me indicas una tabla excel que, previamente convertida a csv, leo en R como datos vectoriales, pero esta vez en forma de puntos. Tal y como están estos datos es claro que también debo usar la proyección «longlat».
    Si lo que necesitas son las coordenadas de los centroides de los polígonos, debes usar la función coordinates aplicada a cualquiera de los shapefile que te interese.
    
    Saludos.
    
    Oscar
    Oscar
  - Francesc dijo:
    
    03/04/2015 en 18:03
    
    Hola Oscar.
    
    Perdona que no fuera demasiado claro, como te digo voy un poco pez. Si no entiendo mal, el fichero shp contiene una serie de polígonos que están formados por coordenadas.
    
    Si accedo a las coordenadas, veo que estas no tienen formato lat/lon. Es decir, si hago:
    
    espMap <- readShapePoly(fn="mapas_completo_municipal/esp_muni_0109")
    coordinates(espMap)
    summary(coords)
    
    Obtengo;
    V1 V2
    Min. :-994385 Min. :3144115
    1st Qu.: 323579 1st Qu.:4419712
    Median : 482834 Median :4567058
    Mean : 489380 Mean :4516567
    3rd Qu.: 652378 3rd Qu.:4673522
    Max. :1123631 Max. :4853345
    
    Lo cual me hace pensar que estos datos no están en formato lat/lon. Seguramente haya dejado de entender algo, pero se podría tener estas coordenadas en lat/lon?
    
    Bueno, de nuevo muchas gracias por tu tiempo.
  - Oscar Perpiñán Lamigueiro dijo:
    
    04/04/2015 en 8:12
    
    Hola,
    
    Disculpa, fallo mío. Contesté demasiado rápido, teniendo en la cabeza otra pregunta similar en la que sí se utilizaban coordenadas longitud-latitud. En este caso, como bien has comprobado, no es así.
    Los ficheros que proporciona el INE no tienen esta información, pero rebuscando pude averiguar que usan coordenadas UTM en la zona 30. Por tanto, primero hay que asignar esta información:
    
    proj4string(espMap) <- CRS('+proj=utm +zone=30 +ellps=intl +units=m +no_defs')
    
    Para poder convertir a longitud-latitud necesitas la función spTransform del paquete rgdal. Pero este objeto tiene muchos polígonos, así que quizás tarde un rato.
    
    Si te interesan shapefiles con coordenadas longitud-latitud quizás sea mejor que uses los datos que tienes en GADM o en Natural Earth.
    
    Saludos.
    
    Oscar.
  - Francesc Guitart dijo:
    
    08/04/2015 en 22:26
    
    Hola Oscar,
    
    Muchas gracias por tu ayuda, he podido hacer el plot que queria. Queria poder usar ggplot2. Dejo aquí un código con lo que hecho por si es de utilidad: mapa provincia de Lleida
  - Oscar Perpiñán Lamigueiro dijo:
    
    09/04/2015 en 6:10
    
    Fantástico.
    Un pequeño apunte: si usas ggplot2 te sobran las líneas de library(lattice) y library(latticeExtra).
    Y otro apunte: para hacer mapas con ggplot2 te puede interesar este curso de Beatriz Martínez.
  - Francesc dijo:
    
    09/04/2015 en 8:16
    
    Muchas gracias y hasta pronto!
Phil dijo:

03/10/2014 en 12:14

Hi,

Quick question, how would you remove the «blank space» between each of the colour ramps so that their edges align?
Also, is there any way to remove the border around those color ramps without removing the tick marks on the right side of the figure legend?

Many thanks!
Phil
- Oscar Perpiñán Lamigueiro dijo:
  
  03/10/2014 en 23:11
  
  Hi,
  
  First question. I have not tested it, but I think the solution is in the «widths» argument of «grid.layout» and the «width» argument of each call to «packGrob».
  Second question. The «draw.colorkey» of the lattice package uses the parameter «axis.line» (obtained via trellis.par.get) to define the characteristics of the legend box. Unfortunately, this same parameter is used to define the axis and ticks. Thus, if you choose a transparent line for the legend box, you will also remove the axis and ticks. I am sorry but I have no solution for it.
  
  Best,
  
  Oscar.
  - Phil dijo:
    
    05/10/2014 en 2:14
    
    Thank you for your quick response. I managed to obtain the results i wanted. Best, Phil
Stephen Clermont dijo:

27/09/2014 en 16:55

I am trying to make a similar map of a different country, Scotland, and when I reduce the pList, the resulting map panel only covers the portion of the map in the first selection of regions. How do I make the view big enough to cover the entire country?
- Oscar Perpiñán Lamigueiro dijo:
  
  28/09/2014 en 15:30
  
  Hi Stephen,
  
  The canvas is defined by the first object of the list. You could try to put a dummy spplot at the beginning of the list that covers the whole extent. Or even better, you could modify the xlim and ylim arguments in spplot. Their default values are defined by the bounding box of the object: xlim = bbox(obj)[1, ].
  
  Hope it helps.
  
  Oscar.
  ylim = bbox(obj)[2, ]
  - Stephen Clermont dijo:
    
    29/09/2014 en 17:56
    
    Thank you for the quick response. How would I put a dummy spplot at the beginning of the list?
  - Oscar Perpiñán Lamigueiro dijo:
    
    01/10/2014 en 8:59
    
    The trick is to print a Spatial* filled with NA. Let’s say myMap is the SpatialPolygonsDataFrame object that covers the whole extent. If you add a new column/variable to this object and fill it with NA (for example dummy), you will get an empty panel if you print it with spplot. Thus, you could do something like:
    spplot(myMap["dummy"]) + p where p is the result of Reduce.
    Anyway, I think it’s better if you use xlim/ylim.
  - Stephen Clermont dijo:
    
    01/10/2014 en 15:30
    
    Thank you so much. Helpful responses and really good book.
  - Oscar Perpiñán Lamigueiro dijo:
    
    01/10/2014 en 16:21
    
    Thanks a lot for your feedback.
Pingback: R - Postly
melodyaross dijo:

08/04/2014 en 0:50

Man, I wish I could get a pdf of this!
- Oscar Perpiñán Lamigueiro dijo:
  
  08/04/2014 en 9:11
  
  Have you tried «Print > Save as a File…»?
Pingback: Una de mapas con R
Pingback: Una de mapas con R » G. R. Serrano
Paco dijo:

22/04/2013 en 4:39

Hello, Oscar

I’m a undergraduate student and I’m new using R for maps. Do you know how to remove, in this case, Canadian Islands from the map?
- Oscar Perpiñán Lamigueiro dijo:
  
  22/04/2013 en 11:10
  
  Hello,
  
  You can drop the Canarian islands if you substitute espPols with peninsulaPols in the code or if, instead of espPols <- rbind(peninsulaPols, islandPols2) you use espPols <- peninsulaPols.
  
  Cheers,
  
  Oscar.
http://google.com dijo:

12/02/2013 en 19:58

“Maps with R (I) � Omnia sunt Communia!” was in fact a truly awesome blog, .

Continue composing and I’m going to continue to keep reading through! I appreciate it ,Julia
pvaldes dijo:

31/10/2012 en 18:45

Es-pec-ta-cular…
la clase de ejemplos que crean escuela
- Oscar Perpiñán Lamigueiro dijo:
  
  01/11/2012 en 18:35
  
  Muchas gracias.
Pingback: datanalytics » Veinte herramientas de visualización
manuel dijo:

14/06/2012 en 9:07

amazing, … but seems a litle problem with «finisterrae»?
- Oscar Perpiñán Lamigueiro dijo:
  
  14/06/2012 en 9:24
  
  Oops. You are right. And I love that region! I have to find a fix.
jcberman dijo:

09/05/2012 en 16:58

Dear Oscar,
I believe I faithfully copied the code, loaded the data, installed the libraries, but after some think time I only get a big white window. I did check some of the variables and they are populated, .and the graphics demos work fine. Any suggestions as I am new to R.

Thanks,
- Oscar Perpiñán Lamigueiro dijo:
  
  09/05/2012 en 17:46
  
  Hello,
  
  Try to write a PDF file with the graphic. For example:
  
  trellis.device(pdf, file=’map.pdf’)
  print(p)
  dev.off()
  
  Does it work?
  
  Oscar.
  - jcberman dijo:
    
    09/05/2012 en 18:23
    
    Hi Oscar,
    The PDF is perfect. Thank you for the suggestion.
  - Oscar Perpiñán Lamigueiro dijo:
    
    09/05/2012 en 19:48
    
    Then the problem is with your default graphical device. Try to change its settings.
    
    Best,
    
    Oscar.
jcberman dijo:

09/05/2012 en 16:55

I was wondering if you could help me. I believe I copied all of the lines of code, got the right files, and installed the appropriate libraries, but all I get after some think time, is a all white window labeled QUartz 1 [*]. The vairables seem to be populated., just no picture. Other graphics work fine, i.e. demo(labels)

Thanks for any suggestions.
Sandra dijo:

17/04/2012 en 19:08

Dear Oscar,
I have been working with the Spanish map for quite some time, and now that I want to make a nice map with latitude and longitude coordinates (of Spain mainland) with the Canary island in a box in the bottom left corner I am at loss (I can do that without axes, but with it just doesn’t work). I think it’s a rather stupid problem, easily solved… but just cannot see how!
Thanks for any help… 🙂
Cheers!
- Oscar Perpiñán Lamigueiro dijo:
  
  18/04/2012 en 12:48
  
  Hello Sandra,
  
  I have edited the post to include the islands. The trick is to use ellide (from maptools) and rbind. If you need the latitude/longitude values use «scales=list(draw=TRUE)» inside spplot. Hope it helps!.
  
  Oscar.
dog snuggie dijo:

09/03/2012 en 4:33

Interesting…
Tinoush Jamali dijo:

07/03/2012 en 18:17

Dear Oscar,

Many thanks for your post. I have a question. Do you think that there is any possibility to add the direction and size between the areas, regions or countries to the map in R. I add a link here and if you scroll down, you can see what I mean:

http://ucatlas.ucsc.edu/trade/tradeflows.htm

Is it possible to bring the arrows with weight of that for relation among the regions into the map in R.

Sorry for bothering you with this question and thanks for you response.

Regards,
Tinoush
- Oscar Perpiñán Lamigueiro dijo:
  
  07/03/2012 en 18:49
  
  Hello Tinoush,
  
  I think this post will be useful for your needs.
  
  Best,
  
  Oscar.
Pingback: Mapas con R en Omnia sunt Communia! » Análisis y decisión
margaretrdonald dijo:

26/02/2012 en 4:23

Thanks for some really nice stuff. However even though I had loaded rgeos the line

espPols <- unionSpatialPolygons(espMap, espMap$PROVMUN)

failed until I put
require(gpclib)
gpclibPermit()
before it…

Cheers,
Margaret Donald
- Oscar Perpiñán Lamigueiro dijo:
  
  27/02/2012 en 0:14
  
  I am not sure but perhaps it is only a problem of old versions. Could you try updating both rgeos and maptools?
  
  Oscar.
Roger Bivand dijo:

20/02/2012 en 20:48

Very nice! Please note that gpclibPermit()is not needed, as unionSpatialPolygons() will use gUnaryUnion() in rgeos instead of encumbered gpclib functions. Support for gpclib in maptools will go away very soon now, as rgeos is available for all platforms.
- Oscar Perpiñán Lamigueiro dijo:
  
  21/02/2012 en 12:52
  
  Thanks for the info, Roger. I have edited the post.
  However, I found that (at least at CRAN) rgeos is not available for MacOSX:
  http://cran.r-project.org/web/packages/rgeos/index.html
  - Roger Bivand dijo:
    
    22/02/2012 en 17:53
    
    It has been made available by Prof. Brian Ripley for OSX Intel architectures. All you need to do is to ensure that the first two repositories are on the search path, so:
    
    setRepositories(ind=1:2)
    library(rgeos)
    
    Then you get your shiny rgeos, I think now with the very latest GEOS inside. Maybe I should add this advice to the package description!
MarioLosada dijo:

19/02/2012 en 16:10

Excelente Trabajo!, de mucha ayuda 😉

Has llegado a usar el GeoTools for Java?
- Oscar Perpiñán Lamigueiro dijo:
  
  19/02/2012 en 17:17
  
  Gracias, me alegra que sea útil.
  
  No he usado GeoTools. Me lo apunto en la lista de tareas :-).
Marcelo Lugo dijo:

19/02/2012 en 14:56

Hello!
I’m trying to replicate your steps, but I keep getting the following error:

> votesData$ABS <- with(votos2011, Total.censo.electoral – Votos.válidos) ##abstention
Error en eval(expr, envir, enclos) : objeto 'Votos.válidos' no encontrado

Thanks,

Marcelo Lugo
- Oscar Perpiñán Lamigueiro dijo:
  
  19/02/2012 en 17:09
  
  Hello, Erin, Marcelo.
  
  I think the problem you are having is related to the use of non-ASCII character in the names of «votos2011». I have changed «Votos.válidos» to «Votos.validos» in the dataset and in the code. Please let me know if this fix works for you.
  
  Cheers.
  
  Oscar.
  - Erin Hodgess dijo:
    
    20/02/2012 en 8:32
    
    Hello again!
    
    Thanks for the great fix on the variable names.
    
    However, I’m having trouble with the zip file. I keep getting that the downloaded file is not a valid zip file. I’m using Windows.
    Should I be using Linux, please?
    
    Thanks,
    Erin
  - Oscar Perpiñán Lamigueiro dijo:
    
    20/02/2012 en 12:11
    
    Hello,
    
    If I am not wrong, the zip from the INE webpage is created with the compression format 7z. Try the 7-zip software to open it.
    
    Oscar.
Erin Hodgess dijo:

19/02/2012 en 8:01

Hello!

I’m trying to replicate your steps, but I keep getting the following error:
> votesData$ABS
When I checked the names on votos2011, there is no Votos.validos.

Have you run into this please?
Thanks,
Erin
Luis dijo:

18/02/2012 en 23:58

¡Muchas gracias! It is great to find such a detailed explanation of the whole process of map creation. I can think quite a few variants of the map, but your code provides the starting point for all of them.

Saludos desde New Zealand.
Tim Riffe dijo:

18/02/2012 en 19:32

brilliant work! I realize that the political landscape in Spain is more than 2 dimensional, but did you think of reducing the variables to a single gradient, e.g. from left to right, rather than choosing a dominant party? In the US it works rather well, with purples dominating in places with an evenly split electorate, and red and blue on the extremes. For Spain you’d have to do some ad hoc party aggregating. One could also make the political choropleth for nationalist vs centralist parties using a continuous color ramp. Not to make your life more difficult, but I think this and other maps from the same data source are of high interest in Spain. Cheers!
- Oscar Perpiñán Lamigueiro dijo:
  
  19/02/2012 en 17:23
  
  I decided to use this dataset to play with several variables in a map, and thus building a «complex» legend, and to show the relevance of the abstention in Spain (which is the most important choice in several locations).
  
  Thanks for your ideas!

Related articles

Comparte / Share

Relacionado

58 pensamientos sobre “Maps with R (I)”