Correlate biological and physical patterns

 

A number of methods are available to examine the correlation between biological and physical patterns. Two commonly used computer packages with a variety of multivariate methods are PRIMER and CANOCO.

 

PRIMER

PRIMER comprises a wide range of univariate, graphical and multivariate routines for analysing biological and physical/environmental data (Clarke and Warwick, 2001; Clarke and Gorley, 2006). ‘BEST’ and ‘LINKTREE’ are two routines targeted at linking multivariate biological patterns with single or multiple environmental variables.
 
The BEST routine available in PRIMER v6 combines the BIO-ENV and BVSTEP procedures found in PRIMER v5. BIO-ENV uses all the available environmental variables to find the combination that ‘best explains’ the patterns in the biological data. However, when large numbers (>15 or 16) of environmental variables are used the procedure can become impractical, as computation time may be excessive. In such cases the BVSTEP option can be employed to carry out a stepwise search of the variables, employing both forward selection and backward elimination. Starting with the variable showing the maximum matching coefficient, variables are successively added, the combinations tested and (at each stage) the variable contributing least eliminated. Several iterations of the procedure are carried out from a random selection of (e.g., ≤6) variables to ensure that the ‘best’ match is found.
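The exhaustive BIO-ENV search described above can be sketched in a few lines. This is an illustrative sketch only, not PRIMER's implementation. It assumes that the matching coefficient is a Spearman rank correlation between a Bray-Curtis dissimilarity matrix for the fauna and a Euclidean distance matrix for the standardised environmental variables. The function names and the small data set are hypothetical.

```python
from itertools import combinations
from math import sqrt

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(x + y for x, y in zip(a, b))
    return sum(abs(x - y) for x, y in zip(a, b)) / total if total else 0.0

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise(rows, dist):
    """Distances for every pair of samples (i < j), in a fixed order."""
    return [dist(rows[i], rows[j]) for i, j in combinations(range(len(rows)), 2)]

def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy) if sxx and syy else 0.0

def standardise(column):
    """Centre a variable on its mean and scale it by its standard deviation."""
    mean = sum(column) / len(column)
    sd = sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd if sd else 0.0 for v in column]

def bio_env(abundance, env, names):
    """Try every subset of variables; return (best rho, variable names)."""
    bio = pairwise(abundance, bray_curtis)
    cols = [standardise([row[k] for row in env]) for k in range(len(names))]
    best = (-2.0, ())
    for size in range(1, len(names) + 1):
        for subset in combinations(range(len(names)), size):
            rows = [[cols[k][s] for k in subset] for s in range(len(env))]
            rho = spearman(bio, pairwise(rows, euclidean))
            if rho > best[0]:
                best = (rho, tuple(names[k] for k in subset))
    return best

# Hypothetical data: 4 stations, 2 species, 3 environmental variables.
abundance = [[0, 10], [1, 9], [3, 7], [7, 3]]
names = ["gravel", "silt", "depth"]
env = [[0, 5, 30], [1, 2, 10], [3, 9, 40], [7, 4, 20]]
rho, chosen = bio_env(abundance, env, names)
print(chosen, round(rho, 3))
```

The number of subsets doubles with each added variable (2^n - 1 subsets for n variables), which is why the exhaustive search becomes impractical for large variable sets and a stepwise search is preferred.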
 
GMHM4-22_Example_table_of_BIO-ENV.jpg
An example table of a BIO-ENV analysis in relation to benthic polychaete distributions in the Irish Sea (from Mackie et al., 1997). A combination of three environmental variables (%gravel, %silt and depth) provides the best match to the patterns observed in the biological data.
 
The LINKTREE routine takes the combination of variables identified as ‘best’ in BIO-ENV, together with the faunal inter-station similarities, to find the most effective way of describing the biological-environment relationships relative to the successive use of single variables. Starting with the group of all samples, it divides them into two groups (a binary split), determined by the most influential environmental variable(s). For example, the first split could be made because the two resulting groups are most dissimilar in terms of their salinity. By iteratively repeating this procedure on the resulting groups, the samples are divided into a number of groups, within which all the samples have similar biological and physical characteristics. Expressed more technically, the group of samples is successively divided according to the environmental variable(s) that maximise the separation between the groups in multidimensional space. Sometimes more than one variable is identified at a split (if each variable gives the same result). A statistical test is used to examine the significance (5% level) before each split, and division stops when the test is non-significant. An output value (B%, see table) provides an absolute measure of group differences; low values occur when samples are most similar.
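A single binary split of the kind described above can be sketched as follows. This is a simplified illustration under stated assumptions, not PRIMER's routine. Separation between the two candidate groups is measured here with the ANOSIM R statistic computed on ranked dissimilarities, candidate thresholds are midpoints between adjacent observed values, and the significance test and the B% value are omitted. The function names and data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def rank_pairs(diss):
    """Rank all pairwise dissimilarities (1 = most similar), averaging ties."""
    pairs = sorted(combinations(range(len(diss)), 2),
                   key=lambda p: diss[p[0]][p[1]])
    rank, i = {}, 0
    while i < len(pairs):
        j = i
        value = diss[pairs[i][0]][pairs[i][1]]
        while j + 1 < len(pairs) and diss[pairs[j + 1][0]][pairs[j + 1][1]] == value:
            j += 1
        for k in range(i, j + 1):
            rank[pairs[k]] = (i + j) / 2 + 1
        i = j + 1
    return rank

def best_split(diss, env, names):
    """Return (R, variable, threshold, left, right) for the best binary split."""
    n = len(diss)
    if n < 3:
        raise ValueError("need at least three samples")
    rank = rank_pairs(diss)
    half_m = n * (n - 1) / 4  # M / 2, where M is the number of sample pairs
    best = None
    for k, name in enumerate(names):
        values = sorted({row[k] for row in env})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [s for s in range(n) if env[s][k] <= threshold]
            right = [s for s in range(n) if env[s][k] > threshold]
            side = set(left)
            within, between = [], []
            for pair, r in rank.items():
                same = (pair[0] in side) == (pair[1] in side)
                (within if same else between).append(r)
            # ANOSIM R: difference in mean rank, scaled to lie in [-1, 1]
            r_stat = (mean(between) - mean(within)) / half_m
            if best is None or r_stat > best[0]:
                best = (r_stat, name, threshold, left, right)
    return best

# Hypothetical data: 6 stations forming two faunal groups.
groups = [0, 0, 0, 1, 1, 1]
diss = [[0.0 if i == j else (0.2 if groups[i] == groups[j] else 0.8)
         for j in range(6)] for i in range(6)]
names = ["depth", "salinity"]
env = [[10, 30], [50, 31], [20, 32], [40, 35], [15, 36], [45, 34]]
r_stat, variable, threshold, left, right = best_split(diss, env, names)
print(variable, threshold, left, right, round(r_stat, 3))
```

Applying `best_split` again to each resulting group, with the dissimilarities re-ranked within that group, would give the successive divisions of the tree.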
 
This is divisive clustering, as opposed to the agglomerative approach of conventional cluster analysis, and inversions can sometimes occur in the clustering pattern. Unlike in BIO-ENV, the environmental variables are non-additive. One advantage is that a variable can be identified as important in part of the overall faunal distribution, yet not in other parts (BIO-ENV, by contrast, examines the overall situation). The LINKTREE procedure also has potential for prediction: if the environmental conditions are known for a new sample station, then the LINKTREE results may allow it to be assigned to a particular assemblage or group of sites.
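The predictive use mentioned above amounts to following the splits of a fitted tree with the environmental values of the new station. The sketch below assumes the tree has been stored as nested dictionaries. The variables, thresholds and group labels are invented for illustration and are not taken from any of the analyses cited here.

```python
# Hypothetical tree: variables, thresholds and group labels are invented.
tree = {
    "variable": "salinity", "threshold": 33.0,
    "left": {"group": "A"},
    "right": {
        "variable": "sand_percent", "threshold": 60.0,
        "left": {"group": "B"},
        "right": {"group": "C"},
    },
}

def assign_group(node, sample):
    """Follow the binary splits until a terminal group is reached."""
    while "group" not in node:
        side = "left" if sample[node["variable"]] <= node["threshold"] else "right"
        node = node[side]
    return node["group"]

# Environmental measurements for a new (hypothetical) station.
new_station = {"salinity": 34.8, "sand_percent": 72.0}
print(assign_group(tree, new_station))  # prints "C"
```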
 
GMHM4-23_Example_of_LINKTREE.jpg
An example of a LINKTREE analysis in PRIMER, with repeated binary splits dividing the samples into groups with similar biological and physical properties. The following table shows that at node G the samples are split into two further groups, AG and H, determined by their percentage sand content (see the first line of the table).
 
GMHM4-24_CANCO.png
An example table showing part of the descriptive information for a LINKTREE analysis of benthic macrofaunal distributions in the Outer Bristol Channel (from Mackie et al., 2006).
 

CANOCO

CANOCO is a computer program for CANOnical Community Ordination by (partial/detrended/canonical) correspondence analysis, principal components analysis and redundancy analysis (ter Braak, 1986 and 1988), that originated as an extension of DECORANA (Hill, 1979b). Over the last 20 years it has evolved to include a variety of multivariate ordination methods and the current version (4.5) is available with a Microsoft Windows interface (ter Braak and Smilauer, 2002). Jongman et al. (1995) provide a detailed account of the theory and implementation of the various techniques.
 
Ordinations, like cluster analysis, are ‘indirect’ methods of analysing species-environment relationships since additional procedures are necessary to correlate the biological patterns to the environmental variables. Canonical (or constrained) analyses overcome this by integrating ordination with regression.
 
The methods available fall into four categories:
 
  1. Unconstrained ordinations describe the structure in a single data set
  2. Canonical ordinations explain one data set by another data set (ordinations are constrained by explanatory variables)
  3. Partial ordinations describe the structure in a data set after accounting for variation explained by a second data set (co-variable data)
  4. Partial canonical ordinations explain one data set by another data set after accounting for variation by a third data set (co-variable data)
 
ter Braak and Verdonschot (1995) examine the use of Canonical Correspondence Analysis (CCA) in aquatic ecology; this technique is the most commonly used direct gradient analysis method. It has been widely used in marine benthic situations, from the intertidal to deep water (Ysebaert and Herman, 2002; Narayanaswamy et al., 2003; Bergquist et al., 2005). In CCA the ordination axes are derived from linear combinations of the environmental variables such that the dispersion of the species (and sample) scores is maximised. Environmental variables are shown on the ordinations as arrows directed from the origin of the plot, where the origin represents the grand mean of each variable. Variables with longer arrows are more strongly correlated with the ordination axes than those with shorter arrows.
 
In the following example, CCA was employed to investigate the species-environment relationships of benthic polychaetes in the Irish Sea (Mackie et al., 1997). Forward selection of the variables revealed seven that ‘best’ explained the data. At each step, a Monte Carlo permutation test was used to determine the significance of each variable. The first five variables were highly significant (P<0.0001), the others less so (P<0.05). The seven variables collectively explained 34.75% of the total inertia.
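The logic of a Monte Carlo permutation test can be sketched as follows. This is a simplified illustration, not CANOCO's procedure. It assumes the test statistic is the absolute Pearson correlation between one environmental variable and one response (for example a set of sample scores), whereas the statistic used in a constrained ordination differs. The function names, the seed and the data are hypothetical.

```python
import random
from math import sqrt

def abs_pearson(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / sqrt(sxx * syy) if sxx and syy else 0.0

def permutation_p_value(x, y, n_perm=999, seed=0):
    """Share of permutations whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = abs_pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between x and y
        if abs_pearson(x, shuffled) >= observed - 1e-12:
            hits += 1
    # The observed arrangement counts as one of the possible outcomes.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: a variable closely tied to the response, and one that is not.
gravel = [1, 2, 3, 4, 5, 6, 7, 8]
response = [2 * g for g in gravel]
unrelated = [1, -1, 1, -1, 1, -1, 1, -1]
p_strong = permutation_p_value(gravel, response)
p_weak = permutation_p_value(gravel, unrelated)
print(p_strong, p_weak)
```

With this formula the smallest attainable P value is 1/(n_perm + 1), so reporting P<0.0001 requires more than 9,999 permutations.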
 
GMHM4-25_Example_table_of_forward_selection.jpg
An example table of the forward selection of variables in a study of the distribution of benthic polychaetes in the Irish Sea
 
In the ordination, Axes I and II were the most important, accounting for 21.3% of the species variance and 61.2% of that explained by the variables.
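The two percentages can be cross-checked against the total explained by the seven selected variables. The short calculation below assumes that the 34.75% quoted earlier is the share of total inertia explained by all the canonical axes together, so that the second percentage is the ratio of the two figures. The variable names are hypothetical.

```python
# Values quoted in the text (percentages of total inertia).
explained_by_variables = 34.75   # all seven selected variables together
explained_by_axes_1_2 = 21.3     # species variance accounted for by Axes I and II

# Share of the explained variation that is captured by the first two axes.
share = 100 * explained_by_axes_1_2 / explained_by_variables
print(f"{share:.1f}%")
```

The result agrees with the quoted 61.2% to within the rounding of the input figures.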
 
GMHM4-26_Example_table_of_CCA_ordination.jpg
An example table of CCA ordination summary for the polychaete-environment relationship
 
As can be seen from the ordination plot and the correlation table, sediment gravel content was most influential for axis I, while depth and latitude were most important in defining axis II. Variables such as depth (and latitude) may, however, act as proxies for other co-varying factors (e.g. temperature, pressure, currents) rather than being influential in themselves.
 
GMHM4-27_ordination_table.jpg
 
GMHM4-28_example_table_of_ontraset_correllations
An example table of intraset correlations of environmental variables for axes I-IV
 
Although omitted from the CCA plot displayed here, species can also be displayed, either on the same plot alongside the sample stations or (for clarity) separately. The species displayed can be restricted to those showing the best relationships with the environmental factors. The species-environment relationships could also be investigated further through partial CCA, as Oug (1998) demonstrated in a study of the benthic macrofauna near Tromsø, Norway.
 
 

All material variously copyrighted by MESH project partners 2004-2010
