
TREE DIVERSITY ANALYSIS A manual and software for common statistical methods for ecological and biodiversity studies

Using the BiodiversityR software within the R 2.6.1 environment


R Kindt, World Agroforestry Centre, Nairobi (Kenya), January 2008

Introduction
The software accompanying the Tree Diversity Analysis manual was developed for the R 2.1.1 environment. Since the publication of the manual at the end of 2005, new versions of base R and its accompanying packages have become available. These changes made it necessary to modify the BiodiversityR software. While these changes were being implemented, the opportunity was taken to develop BiodiversityR into a package that can be installed and is documented in the same way as other R packages. Some new functions were integrated in the new version of the package. Functions that are not directly associated with the graphical user interface (GUI) provided by BiodiversityR are documented separately. This document shows where the Tree Diversity Analysis manual has become outdated.

Main changes in the software


The main changes in the software include the following:
- Installation
- Using the package and the graphical user interface after the package was installed
- Import of data via Excel workbooks or Access databases

In the new version of the software, the package is installed and loaded like any other package developed within the R statistical environment. Two accompanying documents (Windows Installation and Using BiodiversityR under Windows, both in PDF format and available from the Software folder on the CD-ROM) provide instructions on how BiodiversityR can be installed and used under MS Windows. These accompanying documents replace most of the information that is available in Chapter 3: Doing biodiversity analysis with Biodiversity.R of the manual. An important change from the previous installation instructions is that the step of copying the files Biodiversity.R and Rcmdr-menus.txt is no longer needed (page 34 in the manual).

As the software is now a standard package, use the following command to load it once it has been installed (alternatively, use the menu options Packages > Load package):

    library(BiodiversityR)

To access the graphical user interface of the package (still based on the Rcmdr package), use:

    BiodiversityRGUI()

To learn more about the features of the BiodiversityR package, use the menu options BiodiversityR > Help about BiodiversityR > Help about BiodiversityR, or type:

    ?BiodiversityRGUI

A new feature of the updated package is that data can be imported from Excel workbooks or Access databases. To import the community and environmental datasets (read Chapter 2: Data preparation if you do not know what information is contained in these datasets), the data must be arranged as follows:
- Data for the environmental dataset need to be available from an Excel worksheet (alternatively an Access table) named environmental.
- Data for the community dataset should be available as a matrix (formatted as sites × species, with species abundances as cell entries) from an Excel worksheet (alternatively an Access table) named community, or in a stacked format (with separate columns for sites, species and abundances) from an Excel worksheet (alternatively an Access table) named stacked.
- Both datasets should be available from the same Excel workbook (or Access database).

More information on importing data from Excel or Access is available from the help provided for the import.from.Excel and import.from.Access functions:

    ?import.from.Excel
    ?import.from.Access
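The relationship between the stacked layout and the sites × species matrix layout can be illustrated with a minimal base R sketch. The column names (sites, species, abundance) and the values are made up for illustration, and the sketch does not call import.from.Excel or import.from.Access; it only shows how the same information looks in both layouts.

```r
# Hypothetical stacked data: one row per site-species record.
stacked <- data.frame(
  sites     = c("S1", "S1", "S2", "S3", "S3"),
  species   = c("sp.A", "sp.B", "sp.A", "sp.B", "sp.C"),
  abundance = c(3, 1, 5, 2, 4)
)

# Cross-tabulate into a sites x species table;
# site-species combinations that were not recorded become 0.
community.table <- xtabs(abundance ~ sites + species, data = stacked)

# Convert to a plain data frame with sites as rows and species as columns.
community <- as.data.frame.matrix(community.table)
community
```

Each row of the resulting data frame is one site and each column one species, which is the matrix layout described for the community worksheet.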

Main changes in the examples of the manual


The main change in the examples is that the menu options should now be accessed via BiodiversityR and not Biodiversity. Other changes are listed below by chapter.

Chapter 1: Sampling

No changes.

Chapter 2: Data preparation

- Page 29

To load data from an external file (see Chapter 3 for the required format of a data file):

    Data > Import data > from text file or clipboard
    Enter name for data set: data (choose any name)
    Click OK
    Browse for the file and click on it

Select the species and environmental matrices:

    BiodiversityR > Environmental Matrix > Select environmental data set
    Select the dune.env dataset
    BiodiversityR > Community Matrix > Select community data set
    Select the dune dataset

Chapter 3: Doing biodiversity analysis with BiodiversityR

Most information from this chapter has become outdated. Please see above and consult the accompanying documents Windows Installation and Using BiodiversityR under Windows (both documents are in PDF format and are available from the Software folder on the CD-ROM).

Chapter 4: Analysis of species richness

- Page 51

Select the species and environmental matrices:

    BiodiversityR > Environmental Matrix > Select environmental data set
    Select the dune.env dataset
    BiodiversityR > Community Matrix > Select community data set
    Select the dune dataset

- Page 54

To compare species richness between various subsets in the data using species accumulation curves:

    Accum.6 <- accumcomp(dune, y=dune.env, factor='Management', method='exact')
    Accum.6
    dune.env$site.totals <- apply(dune,1,sum)
    Accum.7 <- accumcomp(dune, y=dune.env, factor='Management', scale='site.totals', method='exact', xlab='pooled individuals')
    (Click in the plot where you want to put the legend)
    Accum.7

Chapter 5: Analysis of diversity

- Page 69

Select the species and environmental matrices:

    BiodiversityR > Environmental Matrix > Select environmental data set
    Select the dune.env dataset
    BiodiversityR > Community Matrix > Select community data set
    Select the dune dataset

- Page 70

To calculate and plot a Rényi diversity profile:

    Renyi.1 <- renyiresult(dune)
    Renyi.1
    renyiplot(Renyi.1, labelit=FALSE, legend=FALSE)
    renyiplot(Renyi.1, labelit=FALSE, legend=FALSE, evenness=TRUE)

To calculate and plot a Rényi diversity profile for each site separately:

    Renyi.2 <- renyiresult(dune, method='s')
    Renyi.2
    renyiplot(Renyi.2, legend=FALSE)
    renyiplot(Renyi.2, legend=FALSE, evenness=TRUE)

To calculate diversity indices for each site:

    Diversity.1 <- diversityresult(dune, index='Shannon', method='s')
    Diversity.1
    Diversity.2 <- diversityresult(dune, index='Simpson', method='s')
    Diversity.2
    Diversity.3 <- diversityresult(dune, index='Logalpha', method='s')
    Diversity.3

- Page 70 (continued)

To compare diversity between subsets of the dataset:

    Renyi.3 <- renyicomp(dune, y=dune.env, factor='Management', permutations=100)
    (Click in the graph where you want to put the legend)
    Renyi.3

To calculate accumulation patterns for the Rényi diversity profile:

    Renyi.4 <- renyiaccum(dune, permutations=100)
    Renyi.4
    persp.renyiaccum(Renyi.4)
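To make the per-site indices in this chapter more concrete, the base R sketch below computes Shannon and Simpson values for a small made-up matrix. It assumes the common definitions (Shannon as -sum(p * ln p), Simpson as 1 - sum(p^2), with p the proportional abundance of each species in a site); the conventions used by diversityresult may differ, so treat the helper functions shannon and simpson as hypothetical illustrations rather than the package's implementation.

```r
# Made-up community matrix: rows are sites, columns are species.
toy <- matrix(c(10, 10, 0,
                 5,  0, 0,
                 2,  4, 2),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("sp.A", "sp.B", "sp.C")))

# Shannon index (assumed definition): -sum(p * log(p)) over species present.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Simpson index (assumed definition): 1 - sum(p^2).
simpson <- function(x) {
  p <- x / sum(x)
  1 - sum(p^2)
}

# Apply both functions to every site (row).
indices <- data.frame(Shannon = apply(toy, 1, shannon),
                      Simpson = apply(toy, 1, simpson))
indices
```

A site with a single species (S2) has a value of zero for both indices, while sites with more evenly distributed species score higher.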

Chapter 6: Analysis of counts of trees

- Page 98

Load the datasets Panama species.txt and Panama environmental.txt (available from the Data folder on the CD-ROM) and make them the species and environmental datasets, respectively. Give them the names spec and faramea.

    Data > Import data > from text file or clipboard (Panama species.txt)
    Enter name for data set: spec
    Data > Import data > from text file or clipboard (Panama environmental.txt)
    Enter name for data set: faramea
    BiodiversityR > Community Matrix > Select community data set
    Data set: spec
    BiodiversityR > Environmental Matrix > Select environmental data set
    Data set: faramea

As an alternative, load the dataset Faramea.txt (available from the Data folder on the CD-ROM) and make it both the species and environmental dataset (as both the species and environmental information are in the same dataset).

    Data > Import data > from text file or clipboard (Faramea.txt)
    Enter name for data set: faramea
    BiodiversityR > Community Matrix > Select community data set
    Data set: faramea
    BiodiversityR > Environmental Matrix > Select environmental data set
    Data set: faramea

- Page 100

Load the datasets Panama species.txt and Panama environmental.txt (these are available from the Data folder on the CD-ROM). Give them the names spec and faramea, respectively. Alternatively, load the dataset Faramea.txt (available from the Data folder on the CD-ROM) and give it the name faramea.

    # Alternative 1
    spec <- read.table(file.choose())
    attach(spec)
    faramea <- read.table(file.choose())
    faramea$Faramea.occidentalis <- spec$Faramea.occidentalis
    attach(faramea)

    # Alternative 2
    faramea <- read.table(file.choose())
    attach(faramea)

To calculate a linear regression model:

    Count.model1 <- lm(Faramea.occidentalis ~ Precipitation, data=faramea, na.action=na.exclude)
    summary(Count.model1)
    fitted(Count.model1)
    predict(Count.model1, interval='confidence')
    residuals(Count.model1)
    shapiro.test(residuals(Count.model1))
    ks.test(residuals(Count.model1), pnorm)
    anova(Count.model1,test='F')
    Count.model2 <- lm(Faramea.occidentalis ~ Age.cat, data=faramea, na.action=na.exclude)
    levene.test(residuals(Count.model2), faramea$Age.cat)

- Page 100 (continued)

To plot a linear regression model:

    par(mfrow=c(2,2))
    plot(Count.model1)
    par(mfrow=c(1,1))
    termplot(Count.model1, se=T, partial.resid=T, rug=T, terms='Precipitation')
    library(effects)
    plot(effect('Precipitation',Count.model1))

To check for the spatial distribution of residuals:

    surface.1 <- residualssurface(Count.model1, na.omit(faramea), 'UTM.EW', 'UTM.NS', gam=F, npol=1, plotit=T, bubble=F, fill=F)
    surface.2 <- residualssurface(Count.model1, na.omit(faramea), 'UTM.EW', 'UTM.NS', gam=F, npol=2, plotit=T, bubble=F, fill=F)
    surface.2 <- residualssurface(Count.model1, na.omit(faramea), 'UTM.EW', 'UTM.NS', gam=F, npol=2, plotit=T, bubble=T, fill=F)
    surface.gam <- residualssurface(Count.model1, na.omit(faramea), 'UTM.EW', 'UTM.NS', gam=T, npol=2, plotit=T, bubble=F, fill=T)
    summary(surface.1)
    anova(surface.1)
    correlogram(surface.1, nint=10)
    summary(surface.gam)

- Page 101

To calculate a generalised linear regression model (GLM):

    Count.model3 <- glm(formula = Faramea.occidentalis ~ Precipitation, family = poisson(), data=faramea, na.action=na.exclude)
    summary(Count.model3)
    anova(Count.model3,test='F')
    predict(Count.model3, type='response', se.fit=T)
    Count.model4 <- glm(formula = Faramea.occidentalis ~ Precipitation, family = quasipoisson(), data=faramea, na.action=na.exclude)
    library(MASS)
    Count.model5 <- glm.nb(Faramea.occidentalis ~ Precipitation, maxit = 5000, init.theta = 1, data=faramea, na.action=na.exclude)

To calculate a generalised additive regression model (GAM):

    library(mgcv)
    Count.model6 <- gam(Faramea.occidentalis ~ s(Precipitation), family=poisson(), data = na.omit(faramea))
    summary(Count.model6)
    predict(Count.model6, type='response', se.fit=T)

To calculate a multiple regression model:

    Count.model7 <- glm.nb(Faramea.occidentalis ~ Precipitation + I(Precipitation^2), maxit = 5000, init.theta = 1, data=faramea, na.action=na.exclude)
    summary(Count.model7)
    anova(Count.model7, test='F')
    Anova(Count.model7, type='II', test='Wald')
    vif(lm(Faramea.occidentalis ~ Precipitation + I(Precipitation^2), data=faramea, na.action=na.exclude))

Chapter 7: Analysis of presence or absence of species

- Page 118

Load the datasets Panama species.txt and Panama environmental.txt (available from the Data folder on the CD-ROM) and make them the species and environmental datasets, respectively. Give them the names spec and faramea.

    Data > Import data > from text file or clipboard (Panama species.txt)
    Enter name for data set: spec
    Data > Import data > from text file or clipboard (Panama environmental.txt)
    Enter name for data set: faramea
    BiodiversityR > Community Matrix > Select community data set
    Data set: spec
    BiodiversityR > Environmental Matrix > Select environmental data set
    Data set: faramea

As an alternative, load the dataset Faramea.txt (available from the Data folder on the CD-ROM) and make it both the species and environmental dataset (as both the species and environmental information are in the same dataset).

    Data > Import data > from text file or clipboard (Faramea.txt)
    Enter name for data set: faramea
    BiodiversityR > Community Matrix > Select community data set
    Data set: faramea
    BiodiversityR > Environmental Matrix > Select environmental data set
    Data set: faramea

- Page 119

To calculate a generalised additive regression model (GAM):

    BiodiversityR > Analysis of species as response > Species presence-absence as response
    Model options: gam model
    Response: Faramea.occidentalis
    Explanatory: s(Precipitation) + Geology + Age.cat + s(Elevation)
    print summary

- Page 120

Load the datasets Panama species.txt and Panama environmental.txt (these are available from the Data folder on the CD-ROM). Give them the names spec and faramea, respectively. Alternatively, load the dataset Faramea.txt (available from the Data folder on the CD-ROM) and give it the name faramea.

    # Alternative 1
    spec <- read.table(file.choose())
    attach(spec)
    faramea <- read.table(file.choose())
    faramea$Faramea.occidentalis <- spec$Faramea.occidentalis
    attach(faramea)

    # Alternative 2
    faramea <- read.table(file.choose())
    attach(faramea)

To analyse presence or absence by cross-tabs (skip the first step given in the manual):

    table1 <- table(Faramea.occidentalis>0, Age.cat)
    Presabs.1 <- chisq.test(table1)
    Presabs.1
    Presabs.1$observed
    Presabs.1$expected

- Page 120 (continued)

To calculate a generalised linear regression model (GLM):

    Presabs.model2 <- glm(formula = Faramea.occidentalis>0 ~ Age.cat, family = binomial(link=logit), data = faramea, na.action = na.exclude)
    summary(Presabs.model2)
    anova(Presabs.model2,test='F')
    predict(Presabs.model2, type='response', se.fit=T)
    null.model <- glm(formula = Faramea.occidentalis>0 ~ 1, family = binomial(link=logit), data = na.omit(faramea), na.action = na.exclude)
    anova(null.model, Presabs.model2, test='Chi')
    par(mfrow=c(2,2))
    plot(Presabs.model2)
    par(mfrow=c(1,1))
    termplot(Presabs.model2, se=T, partial.resid=T, rug=T, terms='Age.cat')
    library(effects)
    plot(effect('Age.cat', Presabs.model2))
    Presabs.model3 <- glm(formula = Faramea.occidentalis>0 ~ Age.cat, family = quasibinomial(link=logit), data = faramea, na.action = na.exclude)
    Presabs.model4 <- glm(formula = Faramea.occidentalis>0 ~ Elevation, family = quasibinomial(link=logit), data = faramea, na.action = na.exclude)

To calculate a generalised additive regression model (GAM):

    library(mgcv)
    Presabs.model5 <- gam(formula = Faramea.occidentalis>0 ~ s(Precipitation) + Geology + Age.cat + s(Elevation), family = quasibinomial(link=logit), data = faramea, na.action = na.exclude)
    summary(Presabs.model5)

Chapter 8: Analysis of differences in species composition

- Page 137

Select the species and environmental matrices:

    BiodiversityR > Environmental Matrix > Select environmental data set
    Select the dune.env dataset
    BiodiversityR > Community Matrix > Select community data set
    Select the dune dataset

- Page 138

Transformations of the species data:

    community.hel <- disttransform(dune, method='hellinger')
    hellinger.distance <- vegdist(community.hel, method='euclidean')
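The two steps on page 138 (transform, then take Euclidean distances) can be sketched in base R on a small made-up matrix. The sketch assumes the usual definition of the Hellinger transformation, the square root of each abundance divided by its site total, and uses dist from base R in place of vegdist; the object toy and its values are hypothetical.

```r
# Made-up community matrix: rows are sites, columns are species.
toy <- matrix(c(4,  0, 12,
                9, 16,  0),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("S1", "S2"),
                              c("sp.A", "sp.B", "sp.C")))

# Hellinger transformation (assumed definition):
# square root of the abundance relative to the site (row) total.
hellinger <- sqrt(toy / rowSums(toy))
hellinger

# Euclidean distance between the transformed sites
# gives the Hellinger distance.
hellinger.distance <- dist(hellinger, method = "euclidean")
hellinger.distance
```

After the transformation the squared values in each row sum to 1, which is why the resulting distance is bounded regardless of the total abundance recorded per site.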

Chapter 9: Analysis of ecological distance by clustering

- Page 149

Select the species and environmental matrices:

    BiodiversityR > Environmental Matrix > Select environmental data set
    Select the dune.env dataset
    BiodiversityR > Community Matrix > Select community data set
    Select the dune dataset

- Page 161

Calculate and plot agglomerative clustering:

    library(cluster)
    distmatrix <- vegdist(dune, method='bray')
    distmatrix
    Cluster.1 <- agnes(distmatrix, method='single')
    summary(Cluster.1)
    plot(Cluster.1, which.plots=2)
    plot(Cluster.1, which.plots=2, hang=-1)
    Cluster.2 <- agnes(distmatrix, method='single')
    summary(Cluster.2)
    Cluster.3 <- agnes(distmatrix, method='complete')
    summary(Cluster.3)

Selecting cluster membership from a hierarchical clustering:

    cutree(Cluster.1,k=4)
    plot(Cluster.1, which.plots=2)
    rect.hclust(Cluster.1, k=4)

Chapter 10: Analysis of ecological distance by ordination

- Page 191

Select the species and environmental matrices:

    BiodiversityR > Environmental Matrix > Select environmental data set
    Select the dune.env dataset
    BiodiversityR > Community Matrix > Select community data set
    Select the dune dataset

Calculating a Principal Component Analysis (PCA):

    BiodiversityR > Analysis of ecological distance > Unconstrained ordination
    Ordination method: PCA (or PCA (prcomp))
    scaling: 1
    Plot method: ordiplot
    Plot method: text sites
    Plot method: text species
    Plot method: ordiequilibriumcircle

Calculating a Non-metric Multidimensional Scaling (NMS):

    BiodiversityR > Analysis of ecological distance > Unconstrained ordination
    Ordination method: metaMDS (or NMS (standard))
    Distance: bray
    NMS axes: 2
    NMS permutations: 100

Plotting quantitative environmental characteristics onto an ordination graph:

    BiodiversityR > Analysis of ecological distance > Unconstrained ordination
    Ordination method: PCoA
    Distance: bray
    Plot variable: A1
    Plot method: ordiplot
    Plot method: envfit
    Plot method: ordibubble
    Plot method: ordisurf

- Page 191 (continued)

Plotting qualitative environmental characteristics onto an ordination graph:

    BiodiversityR > Analysis of ecological distance > Unconstrained ordination
    Ordination method: PCoA
    Distance: bray
    Plot variable: Management
    Plot method: ordiplot
    Plot method: envfit
    Plot method: ordihull
    Plot method: ordispider
    Plot method: ordiellipse
    Plot method: ordisymbol

- Page 194

Conducting a PCA on a transformed matrix:

    Community.1 <- disttransform(dune, method='hellinger')
    Ordination.model2 <- rda(Community.1)
    summary(Ordination.model2, scaling=1)
    plot3 <- ordiplot(Ordination.model2, scaling=1, type="text")

- Page 196

Plotting clustering results onto an ordination graph:

    distmatrix <- vegdist(dune, method='bray')
    cluster <- hclust(distmatrix, method='single')
    Ordination.model3 <- cmdscale(distmatrix, k=nrow(dune)-1, eig=T, add=F)
    plot4 <- ordiplot(Ordination.model3, type='text')
    ordicluster(plot4, cluster)

- Page 197 (continued)

Plotting quantitative environmental characteristics onto an ordination graph:

    plot4 <- ordiplot(Ordination.model3, type='text')
    fitted <- envfit(plot4, data.frame(A1), permutations=100)
    fitted
    plot(fitted, col='blue', cex=1)
    ordibubble(plot4,A1,fg='blue')
    ordisurf(plot4,A1,add=T)

Plotting qualitative environmental characteristics onto an ordination graph:

    plot4 <- ordiplot(Ordination.model3, type='text')
    fitted <- envfit(plot4, data.frame(Management), permutations=100)
    plot(fitted, col='blue', cex=1)
    fitted
    ordispider(plot4,Management,col='blue')
    ordiellipse(plot4,Management,col='blue')
    ordisymbol(plot4,dune.env,'Management', legend=F, rainbow=T, cex=1)
