1. Introduction

Phylogenetic diversity (PD) is the sum of the branch lengths connecting a set of species to the root of a phylogenetic tree [1]. This vignette demonstrates how our new R package phyloregion [2] handles large biogeographical datasets by analysing the phylogenetic diversity of 4888 plant species in just five steps.
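To make the definition concrete, here is a minimal base R sketch on a made-up four-branch tree. The edge table, the node names and the helper functions `path_to_root` and `toy_pd` are illustrative assumptions, not part of phyloregion; the package's own PD function is used later in this vignette.

```r
# Toy rooted tree stored as an edge table (hypothetical data, not from BIEN).
# Each row is one branch: parent node -> child node, with a branch length.
edges <- data.frame(
  parent = c("root", "root", "n1", "n1"),
  child  = c("n1",   "C",    "A",  "B"),
  length = c(2,      5,      3,    3),
  stringsAsFactors = FALSE
)

# Collect the row indices of the branches on the path from one tip to the root.
path_to_root <- function(tip, edges) {
  rows <- integer(0)
  node <- tip
  while (node %in% edges$child) {
    i <- match(node, edges$child)
    rows <- c(rows, i)
    node <- edges$parent[i]
  }
  rows
}

# PD of a set of tips: every branch is counted once, even when it is shared.
toy_pd <- function(tips, edges) {
  rows <- unique(unlist(lapply(tips, path_to_root, edges = edges)))
  sum(edges$length[rows])
}

toy_pd("A", edges)               # branches n1-A and root-n1: 3 + 2 = 5
toy_pd(c("A", "B"), edges)       # shared branch root-n1 counted once: 3 + 3 + 2 = 8
toy_pd(c("A", "B", "C"), edges)  # all four branches: 13
```

Adding species B to a cell that already holds A raises PD by only 3, because the two species share the branch leading to node n1.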

2. Input data

Species distribution data are commonly available in three formats: polygons, point records and raster layers. We will use the Botanical Information and Ecology Network (BIEN) database to obtain polygon range maps and a phylogeny for 4888 plant species using the R package BIEN [3] as follows:

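First, load the packages for the analysis:

library(phyloregion)
library(raster)  # provides shapefile()
library(BIEN)

Range maps can be downloaded directly from BIEN, for example for the genus Pinus:

s <- BIEN_ranges_genus("Pinus")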

Because downloading range maps for a large number of species with the BIEN R package can take a long time, we downloaded the shapefiles beforehand and saved them to file:

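wd <- getwd()
dir <- tempdir()
setwd(dir)
# The URL of the archive is missing from this copy of the text; supply it here.
download.file(url = "<URL of bien_data.zip>",
    destfile = "bien_data.zip")
unzip(zipfile = "bien_data.zip")
setwd(dir = "bien_data")
list.files()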
## [1] "americas"      "bien_polygons"

Read in the phylogeny

We will download the complete BIEN phylogeny for 81,274 plant species.

tree <- BIEN_phylogeny_complete(1)
## Phylogenetic tree with 81274 tips and 81273 internal nodes.
## Tip labels:
##  Haplomitrium_blumei, Apotreubia_nana, Marchantia_polymorpha, Marchantia_aquatica, Marchantia_domingensis, Marchantia_alpestris, ...
## Rooted; includes branch lengths.

Read in species polygon maps

m <- shapefile("bien_polygons/bien_plants.shp")

# and base map of the Americas for plotting
b <- shapefile("americas/americas.shp")

3. Community matrix

Next, we convert the polygon range maps to a community matrix using the function polys2comm at a grid resolution of 1 degree. Other grain sizes can be specified with the res argument; remember that patterns of biodiversity are scale dependent [4,5].

p <- polys2comm(m, res = 1, species = "Species", trace=0)

# This gives two objects: `comm_dat`, the (sparse) community matrix, and `poly_shp`, a shapefile of grid cells with cell values, i.e. species richness per cell.

comm <- p$comm_dat
shp <- p$poly_shp
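As a rough illustration of what a community matrix holds, the base R sketch below builds one by hand for three made-up rectangular ranges on a 1-degree grid. It is a simplification under stated assumptions: the ranges are bounding boxes rather than polygons, a species counts as present when the cell centre falls inside its box, and none of this is meant to reproduce how polys2comm overlays polygons internally.

```r
# Hypothetical rectangular "ranges" (lon/lat bounding boxes), not BIEN data.
ranges <- data.frame(
  species = c("sp_a", "sp_b", "sp_c"),
  xmin = c(0, 1, 2), xmax = c(2, 3, 3),
  ymin = c(0, 0, 1), ymax = c(2, 1, 2),
  stringsAsFactors = FALSE
)

res <- 1  # grid resolution in degrees

# Cell centres of a 3 x 2 degree study area, labelled v1 ... v6.
centres <- expand.grid(
  lon = seq(0 + res / 2, 3 - res / 2, by = res),
  lat = seq(0 + res / 2, 2 - res / 2, by = res)
)
centres$grids <- paste0("v", seq_len(nrow(centres)))

# Presence (1) or absence (0) of each species in each cell:
# rows are grid cells, columns are species.
toy_comm <- sapply(seq_len(nrow(ranges)), function(i) {
  as.integer(centres$lon > ranges$xmin[i] & centres$lon < ranges$xmax[i] &
             centres$lat > ranges$ymin[i] & centres$lat < ranges$ymax[i])
})
dimnames(toy_comm) <- list(centres$grids, ranges$species)

toy_comm
rowSums(toy_comm)  # species richness per cell; only v2 holds two species
```

Rerunning the sketch with a different `res` changes the number of rows and the richness per cell, which is the scale dependence mentioned above.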

4. Species richness

We can use the function plot_swatch to map cell values using nice continuous color gradients as follows:

plot_swatch(shp, shp$richness, k = 30, border=NA, breaks = "jenks",
            col = hcl.colors(n=30, palette = "RdYlGn", rev = TRUE))
plot(b, add=TRUE, lwd=.5)

[Figure: map of species richness per grid cell]
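The idea behind a binned colour gradient can be sketched in base R: cut the cell values into k classes, then give each class one colour from the palette. The richness values below are made up, and quantile breaks are used as a stand-in for the "jenks" breaks in the call above, so this is not how plot_swatch is implemented.

```r
# Hypothetical richness values for ten grid cells (not BIEN results).
richness <- c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
k <- 5  # number of colour classes

# Quantile class boundaries (a stand-in for natural "jenks" breaks).
brks <- unique(quantile(richness, probs = seq(0, 1, length.out = k + 1)))
bins <- cut(richness, breaks = brks, include.lowest = TRUE, labels = FALSE)

# One colour per class, from the same palette as the map.
pal <- hcl.colors(n = length(brks) - 1, palette = "RdYlGn", rev = TRUE)
cell_cols <- pal[bins]

data.frame(richness, bins, cell_cols)
```

With k = 30, as in the map, the steps between neighbouring classes become small enough that the gradient looks continuous.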

5. Phylogenetic diversity

Phylogenetic diversity can be computed with the function PD and visualized with plot_swatch as follows:

# First, match the phylogeny to the community matrix (one call, reused below):
matched <- match_phylo_comm(tree, comm)
submat <- matched$comm
subphy <- matched$phy

# Phylogenetic diversity of each grid cell
mypd <- PD(submat, subphy)
head(mypd)
##      v100    v10067    v10068    v10072    v10099      v101 
##  730.1314 1657.3708 1657.3708 1530.7859 2173.5121  730.1314
y <- merge(shp, data.frame(grids=names(mypd), pd=mypd), by="grids")
y <- y[!is.na(y@data$pd),]
head(y)
##      grids       lon      lat richness       pd
## 4716   v98 -81.62558 82.80651        1 730.1314
## 4730   v99 -80.62558 82.80651        1 730.1314
## 1     v100 -79.62558 82.80651        1 730.1314
## 6     v101 -78.62558 82.80651        1 730.1314
## 78    v102 -77.62558 82.80651        1 730.1314
## 79    v103 -76.62558 82.80651        1 730.1314
plot_swatch(y, values = y$pd, k = 30, border=NA, breaks = "jenks",
            col = hcl.colors(n=30, palette = "RdYlGn", rev = TRUE))
plot(b, add=TRUE, lwd=.5)

[Figure: map of phylogenetic diversity per grid cell]
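The matching step at the start of this section matters because the tree and the community matrix rarely contain exactly the same species. A minimal base R sketch of the bookkeeping, assuming that matching means keeping only the species present in both objects; the labels and the matrix are made up, and the real work is done by match_phylo_comm:

```r
# Hypothetical tip labels and community matrix (not BIEN data).
tip_labels <- c("Pinus_a", "Pinus_b", "Picea_c", "Abies_d")
comm_toy <- matrix(
  c(1, 0, 1,
    1, 1, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("v1", "v2"), c("Pinus_a", "Picea_c", "Larix_e"))
)

shared    <- intersect(colnames(comm_toy), tip_labels)  # kept in both objects
only_comm <- setdiff(colnames(comm_toy), tip_labels)    # would leave the matrix
only_tree <- setdiff(tip_labels, colnames(comm_toy))    # would be pruned from the tree

# Community matrix restricted to species that also occur in the tree.
submat_toy <- comm_toy[, shared, drop = FALSE]
submat_toy
```

Checking the two `setdiff` results on real data is a quick way to spot species lost to spelling differences between the range maps and the phylogeny.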

View the correlation between Phylogenetic Diversity and Species Richness:

plot(y$richness, y$pd, pch=21, bg="black", col="white",
     lwd=0.31, cex=1, ylab="Phylogenetic Diversity",
     xlab="Species Richness", las=1)
abline(lm(y$pd~y$richness), col="red", lwd=2)

[Figure: scatter plot of phylogenetic diversity against species richness with fitted regression line]
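The strength of the relationship can also be put into numbers. The sketch below uses a small made-up data frame in place of `y`, so the values are purely illustrative; with the real object, the same calls would take `y$richness` and `y$pd`.

```r
# Hypothetical per-cell values (not results from the BIEN analysis).
toy <- data.frame(
  richness = c(1, 2, 4, 8, 16, 32),
  pd       = c(730, 1100, 1650, 2400, 3300, 4500)
)

# Linear fit, the same model as the red line in the scatter plot.
fit <- lm(pd ~ richness, data = toy)
coef(fit)               # intercept and slope
summary(fit)$r.squared  # share of the variance in PD explained by richness

# Rank correlation, which does not assume a straight-line relationship.
rho <- cor(toy$richness, toy$pd, method = "spearman")
rho
```

A rank correlation is a reasonable companion to the linear fit here because PD tends to level off as more closely related species are added to a cell.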

If you find this vignette tutorial useful, please cite in publications as:

Daru, B.H., Karunarathne, P. & Schliep, K. (2020) phyloregion: R package for biogeographic regionalization and macroecology. Methods in Ecology and Evolution doi: 10.1111/2041-210X.13478.


1. Faith, D. P. Conservation evaluation and phylogenetic diversity. Biological Conservation 61, 1–10 (1992).

2. Daru, B. H., Karunarathne, P. & Schliep, K. phyloregion: R package for biogeographical regionalization and macroecology. Methods in Ecology and Evolution (2020). doi: 10.1111/2041-210X.13478.

3. Maitner, B. S. et al. The BIEN R package: A tool to access the Botanical Information and Ecology Network (BIEN) database. Methods in Ecology and Evolution 9, 373–379 (2018).

4. Daru, B. H., Farooq, H., Antonelli, A. & Faurby, S. Endemism patterns are scale dependent. Nature Communications 11, 2115 (2020).

5. Jarzyna, M. A. & Jetz, W. Taxonomic and functional diversity change is scale dependent. Nature Communications 9, 1–8 (2018).