Population genetics and Geographic Information Systems
GEO 565: Annotated Bibliography
By
Phylogeography and especially the emerging field of landscape genetics are experiencing rapid growth due to the increased availability of spatial data and improved methods for analyzing these data using GIS. The goal of phylogeography is to understand the distribution of a taxon or taxa in a biogeographic and phylogenetic context. Analysis of current distributions can reveal the geographic features which limit range distributions, promote speciation or lead to secondary contact. The emerging field of ecological niche modeling is particularly promising, as it uses locality records to extract values from overlaid layers of climatic and ecological data and uses these values to construct a model of the ecological niche of a species. This allows for prediction of species distributions and secondary contact, and reveals the climatic variables which influence species distributions. Landscape genetics seeks to identify the landscape features that determine population structure on a much smaller scale than phylogeographic studies. My annotations are primarily focused on the analytical methods used in each paper so that I can quickly identify papers which will be useful for my own research applications. Consequently, I focus primarily on methods and results, including the names of software used when applicable.
DATA SOURCES
Global Biodiversity Information Facility
The Global Biodiversity Information Facility (GBIF) is a large catalog of locality records from museums across the world. The website provides open access to records for a wide variety of taxa, with particularly good representation of vertebrates due to integration with taxon-specific databases such as HerpNET, FishBase and MaNIS. While each of these databases has an extremely useful website of its own, integrating all of them into one site makes it easy to query entire communities of organisms. These georeferenced data can be downloaded and imported into a GIS, complete with attributes such as coordinate precision, locality name, collector, museum where the specimen is held, year of collection and comments about the specimen (e.g. age, sex, gravidity). Importing into a GIS allows for spatial analysis of population distributions, past and present, using existing data. The search option allows you to specify the scientific, common or English name of a taxon and to narrow the search to a single country of origin. In addition, there is a Google Earth page that allows you to obtain all the records of a taxon using its scientific name in a .kml file, which can then be opened in Google Earth for visualization of population distributions. The query builder allows you to choose predefined icons to represent localities, or import your own graphics. You can also use the website to create your own .kml files from existing data, or build advanced queries in which you specify dates of collection and coordinate precision and display different values of these attributes differently in the .kml file. While these features are great for visualization of population distributions, attributes are not easily accessed once in the Google Earth environment.
Each data point opens a pop-up window in which you can click “View GBIF record”; however, this opens a web page that typically takes a while to load and does not allow you to open a single table with all locality records (or, clearly, to run spatial analyses).
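Before importing downloaded records into a GIS, I find it useful to screen them on attributes such as coordinate precision and year of collection. The following is only a minimal sketch of that screening step using the Python standard library. The tab-delimited layout, the column names, the thresholds and the three example rows are all hypothetical and do not reflect the actual GBIF export format.

```python
import csv
import io

# Hypothetical tab-delimited export; column names and values are made up
# for illustration and are NOT the real GBIF download schema.
EXPORT = (
    "scientific_name\tlatitude\tlongitude\tcoordinate_precision_m\tyear\n"
    "Thamnophis elegans\t40.45\t-121.30\t100\t1998\n"
    "Thamnophis elegans\t40.51\t-121.12\t5000\t1936\n"
    "Thamnophis elegans\t\t\t\t1975\n"
)


def usable_records(handle, max_precision_m=1000.0, min_year=1950):
    """Keep georeferenced rows that are precise and recent enough."""
    kept = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Skip records that were never georeferenced.
        if not (row["latitude"] and row["longitude"]
                and row["coordinate_precision_m"]):
            continue
        # Skip records whose coordinate uncertainty is too large.
        if float(row["coordinate_precision_m"]) > max_precision_m:
            continue
        # Skip records collected before the cutoff year.
        if int(row["year"]) < min_year:
            continue
        kept.append((row["scientific_name"],
                     float(row["latitude"]),
                     float(row["longitude"])))
    return kept


records = usable_records(io.StringIO(EXPORT))
```

The thresholds are arbitrary here; in practice they would depend on the resolution of the layers the localities are overlaid on.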
JOURNAL ARTICLES
POPULATION GENETICS
Key methods/concepts:
Mantel test, Kriging interpolation on allele frequency PC1 scores
In this study, the authors examine variation at 7 mitochondrial loci in 14 populations of L. l. scoticus in
Key methods/concepts:
Habitat modeling, Ecological niche modeling, predicting species distributions, planning re-introduction sites
Extensive destruction of habitat for the endangered shrub Triunia robusta has resulted from clearing of the
Key methods/concepts:
Multiple regression analysis, Spatial analysis of molecular variance
Population genetic structure was determined from allelic variation at microsatellite loci for a vertebrate metacommunity consisting of the competing predators Thamnophis elegans and T. sirtalis and the prey species Bufo boreas in a Lassen Co,
Key methods/concepts:
GIS modeling of dispersal paths, Least-cost paths using ArcINFO, Partial Mantel tests, BIOENV
The authors use microsatellite markers in the salamander Ambystoma tigrinum to identify the landscape characteristics that determine genetic structure via GIS data. A total of 10 ponds were sampled across
Kidd DM, and MG Ritchie. 2000. Inferring the patterns and causes of geographic variation in Ephippiger ephippiger (Orthoptera, Tettigoniidae) using geographical information systems (GIS). Biological Journal of the Linnean Society 71:269-295.
Key concepts/methods:
Principal component analysis of traits, surface creation, exploratory data heuristics, Idrisi surfaces
The Ephippiger crickets in the Alps of Europe form a species complex. This study combines trait data from previous studies, which were geocoded with varying levels of precision and accuracy (this study is rather early, and many papers of the time did not give exact coordinates for localities). Digital elevation models (DEMs) were downloaded from USGS, as were solar irradiation and annual average precipitation data. Irradiation and precipitation were interpolated into continuous surfaces using the Idrisi linear contour interpolator, which is a distance weighting method. Using the Idrisi PCA function, the authors determined the axes across the trait surfaces that explained the most variation in the observed traits. PC2 consistently divided the populations into northern and southern groups. Discriminant analysis was performed to test the classification of northern and southern populations. They also implemented multiple regression of body size on altitude, irradiation, precipitation, latitude and longitude, and distance from the sea to create a surface of body size across the landscape. The authors found several patterns of correlation between phenotypic traits and environmental clines. These patterns could be divided into two categories: general environmental clines causing ecotypic variation, and historical divergence resulting in trait divergence. This paper is also very important as it is the first instance I have seen of a journal article that includes a 3D titillator plot.
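To remind myself what a distance weighting interpolation does, here is a minimal sketch of generic inverse distance weighting (IDW) in plain Python. This is not the Idrisi linear contour interpolator itself, whose internals I have not checked. The station coordinates, precipitation values and the power parameter are all hypothetical.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations is a list of (station_x, station_y, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The query point sits exactly on a station: use its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical precipitation stations: (x, y, mm per year).
stations = [(0.0, 0.0, 600.0), (10.0, 0.0, 900.0), (0.0, 10.0, 1200.0)]

# Interpolate onto a coarse 3 x 3 grid to form a continuous surface.
surface = [[idw(stations, gx, gy) for gx in (0, 5, 10)] for gy in (0, 5, 10)]
```

Nearby stations dominate each estimate, and the interpolated values always stay between the smallest and largest observed values, which is one reason such surfaces look smooth but cannot extrapolate beyond the data.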
Ritchie MG, Kidd DM, and JM Gleason. 2001. Mitochondrial DNA variation and GIS analysis confirm a secondary origin of geographical variation in the bushcricket Ephippiger ephippiger (Orthoptera: Tettigonioidea), and resurrect two subspecies. Molecular Ecology 10:603-611.
Key concepts/methods:
RFLPs compared to predictions from GIS analysis for contact zones between geographical variants, partial Mantel tests
This paper examines concordance of interpolated character clines in putative secondary contact zones of the bushcricket. The data include behavioral, morphological and allozyme data that are interpolated across the geographic surface and compared with bioclimatic layers for covariance and concordance. Environmental variables explained only body size, and few clines were concordant with subspecific designations. In this paper, the authors generated matrices of geographic data including geographic distance (generated using ArcInfo and mapping predicted distances for cricket dispersal), environmental dissimilarity (obtained from Kidd and Ritchie 2000) and vicariant models based on potential refugia during the last ice age. The response matrix for this study was the genetic distance matrix based on RFLPs. Each predictor matrix represented a different independent variable and was tested against the response matrix with a partial Mantel test. Isolation by distance is not supported, while environmental variables only approach significance. Historical refugial models perform the best, indicating that past isolation has the greatest explanatory power for genetic distances. This paper has a cool figure of a neighbor-joining phylogeny overlaid on the geography of the region, which was likely created in GIS software. In conclusion, the authors find strong support for vicariant hypotheses using GIS and Mantel tests, resulting in their resurrection of subspecific names to indicate geographic variants.
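Since partial Mantel tests come up in several of these papers, here is a minimal sketch of one in plain Python. It correlates a response distance matrix with a predictor matrix while controlling for a third matrix, and assesses significance by permuting the rows and columns of the response matrix together. It is my own illustration, not the software the authors used. The five populations, their coordinates, refugium assignments and genetic distances are hypothetical.

```python
import math
import random


def upper(matrix, order=None):
    """Flatten the upper triangle, optionally reordering rows and columns."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_r(g, x, z):
    """Correlation of g and x after removing the linear effect of z."""
    r_gx, r_gz, r_xz = pearson(g, x), pearson(g, z), pearson(x, z)
    return (r_gx - r_gz * r_xz) / math.sqrt((1 - r_gz ** 2) * (1 - r_xz ** 2))


def partial_mantel(response, predictor, control, n_perm=999, seed=0):
    rng = random.Random(seed)
    x, z = upper(predictor), upper(control)
    observed = partial_r(upper(response), x, z)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(len(response)))
        rng.shuffle(order)  # permute populations, keeping the matrix symmetric
        if partial_r(upper(response, order), x, z) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical data: 5 populations, 3 in refugium "A" and 2 in refugium "B".
coords = [(0, 0), (1, 0), (0, 2), (5, 5), (6, 4)]
refugium = ["A", "A", "A", "B", "B"]
geo = [[math.hypot(a[0] - b[0], a[1] - b[1]) for b in coords] for a in coords]
vicariance = [[0.0 if p == q else 1.0 for q in refugium] for p in refugium]
genetic = [
    [0.000, 0.020, 0.030, 0.100, 0.110],
    [0.020, 0.000, 0.025, 0.120, 0.090],
    [0.030, 0.025, 0.000, 0.100, 0.110],
    [0.100, 0.120, 0.100, 0.000, 0.020],
    [0.110, 0.090, 0.110, 0.020, 0.000],
]

# Genetic distance vs. the vicariance model, controlling for geography.
r, p = partial_mantel(genetic, vicariance, geo)
```

With only five populations there are just 120 distinct permutations, so the p-value here is coarse. Real data sets with more populations give the permutation test much finer resolution.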
ECOLOGICAL NICHE MODELING
Key methods/concepts:
Maximum entropy niche modeling, Maxent
A common problem with species distribution data is that only presence data are likely to be available, while absence data are almost always lacking. Consequently, the Maxent approach to modeling species distributions with only presence data is particularly valuable. The performance of this model is tested against the other commonly used presence-only model, GARP, and other methods of estimating species ranges. An interesting complication with presence-only data presented in this paper is that localities are always assumed to be source populations, never sink populations, which affects the accuracy of predicted distributions. Furthermore, care must be taken when choosing layers and locality data (e.g. a current landcover layer would not work well with a collection locality from the 1700s!). Maxent seeks to approximate the desired species probability distribution using everything that is known via locality data about the habitat requirements (extracted from layers), choosing the distribution of maximum entropy subject to the constraints of what is known. “It agrees with everything that is known, but carefully avoids assuming anything that is not known”. This paper goes over the many advantages of the Maxent approach, and paints a compelling picture for its use in habitat niche modeling. Maximum entropy modeling is a rapidly growing field of statistics with applications in numerous diverse fields, and thus has a robust literature associated with it. As an initial test of the Maxent approach, it is compared to the GARP model for Bradypus variegatus and Microryzomys minutus in
Key methods/concepts:
Maxent, Utilizing both presence and absence data, testing hypotheses of speciation with a GIS
Ecological niche modeling could produce three possible outcomes for a pair of sister species isolated from one another: (1) Niche conservatism results in spatial overlap of predicted ranges (2) Niche divergence results in no spatial overlap of predicted ranges (3) Other factors influencing range distributions result in spatial overlap in the current ranges as well as the intervening areas of species absence. The authors tested these predictions using 16 pairs of sister species of montane North American salamanders. Specimen localities for each species were imported into ArcGIS and ranges were estimated by enclosing the points with a minimum convex polygon. Degree of overlap was determined using Lynch’s method to predict ancestral distributions by summing the ranges of all species in a given clade, as well as ARCs using “the nested averages of pairwise overlaps between all species in a clade” (quotes because I don’t understand enough to paraphrase!). Degree of overlap was then regressed against age of most recent common ancestor. To test hypotheses about the factors that promote speciation, the authors utilized ecological niche modeling using maximum entropy methods as implemented by the software Maxent. This program computes probability distributions of habitat suitability over an entire grid by utilizing all the information contained in known species localities while avoiding unfounded constraints. It expresses the probability of finding a species in a grid cell as a function of environmental variables at known localities, which is then used to generate a species probability distribution over the entire data frame. Climatic data were extracted from the WorldClim dataset based on applicability to amphibian life history. Predicted ranges were imported from Maxent into DIVA-GIS v. 5.2.
Minimum convex polygons were then constructed for absence localities in between the minimum convex polygons constructed earlier, using localities known to have salamander species but not the species pair of interest. To test the niche conservatism hypothesis, known localities of a species were mapped onto the predicted range distribution of its sister species and the cumulative probability of occurring at each site was extracted (a low probability of occurrence would reject the hypothesis). Similarity of climatic niches was evaluated using PCA to generate a “climate distance” between sister taxa. ARCs and Lynch’s method supported allopatric speciation as the primary mechanism for species divergence. Most allopatric taxa exhibited the pattern expected by niche conservatism. PC1 generally represents climatic variables that are typical of montane environments (temperature stability, precipitation and low temperatures) and explains the majority of the variation. However, some species pairs show significant niche divergence and support the niche divergence hypothesis. Parapatric sister species, in contrast to allopatric taxa, demonstrate niche divergence but share borders of relatively broad overlap. This study shows how the climatic variables that promote speciation can be determined using GIS and ecological niche modeling.
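The range estimation step can be sketched without a GIS. Below is a minimal illustration in plain Python: a minimum convex polygon built with Andrew's monotone chain, its area from the shoelace formula, and the overlap of two polygons from Sutherland-Hodgman clipping. This is not the ArcGIS workflow used in the paper. The localities are hypothetical planar coordinates, and dividing the overlap by the smaller range is just one simple overlap index I chose for the example.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Minimum convex polygon (monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def area(poly):
    """Polygon area by the shoelace formula."""
    if len(poly) < 3:
        return 0.0
    pairs = zip(poly, poly[1:] + poly[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2.0


def clip(subject, clipper):
    """Intersection of two convex polygons (clipper counter-clockwise)."""
    out = list(subject)
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):
        inp, out = out, []
        for p, q in zip(inp, inp[1:] + inp[:1]):
            dp, dq = cross(a, b, p), cross(a, b, q)
            if dq >= 0:                      # q is inside this edge
                if dp < 0:                   # entering: add crossing point
                    t = dp / (dp - dq)
                    out.append((p[0] + t * (q[0] - p[0]),
                                p[1] + t * (q[1] - p[1])))
                out.append(q)
            elif dp >= 0:                    # leaving: add crossing point
                t = dp / (dp - dq)
                out.append((p[0] + t * (q[0] - p[0]),
                            p[1] + t * (q[1] - p[1])))
        if not out:
            break
    return out


# Hypothetical localities for two sister species (planar units, e.g. km).
species_a = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
species_b = [(2, 2), (6, 2), (6, 6), (2, 6)]

range_a, range_b = convex_hull(species_a), convex_hull(species_b)
overlap_area = area(clip(range_a, range_b))
overlap_index = overlap_area / min(area(range_a), area(range_b))
```

Real localities are in latitude and longitude, so they would need to be projected to a planar coordinate system before areas computed this way mean anything.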
Key methods/concepts:
Bioclimatic modeling applied to identifying secondary contact, testing endogenous vs. exogenous adaptation
Key concepts/methods:
Testing hypotheses of competitive exclusion and competitive release using GARP
The authors use the Genetic Algorithm for Rule-Set Predictions (GARP; http://biodi.sdsc.edu) to model the distribution of pocket mice in South America. GARP utilizes bioclimatic-envelope rules as well as logistic regression to generate predicted distributions of species. The competitive exclusion principle hypothesizes that two species occupying the same niche will not be found in the same geographic area. Thus, sister species that occupy the same niche will not occupy the same geographic region, even though their predicted ranges will overlap. Competitive release is the idea that if the dominant species is removed from such a site, then the weaker competitor will expand its range into habitat that it is not able to inhabit in the presence of the competitor. Physical, biotic and climatic variables such as elevation, slope, aspect, soil, vegetation, solar radiation, temperature and precipitation were extracted from GIS coverages using known localities for the two species of mice. Predicted distributions revealed areas of limited overlap between the two species. Although the two species occupied different bioclimatic regimes, there was considerable overlap in the predicted envelope for most variables examined. The results of this study supported the competitive release hypothesis for Heteromys anomalus. This was demonstrated by the significantly different bioclimatic variables of H. anomalus and H. australis in areas of predicted contact, while there is no significant difference in areas where H. australis is absent due to historical reasons. In areas of predicted sympatry, H. australis was the predominant species found, again supporting competitive exclusion of H. anomalus by H. australis.
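To make the idea of a bioclimatic-envelope rule concrete, here is a minimal sketch in plain Python: the envelope is the minimum and maximum of each variable at known localities, and a grid cell is predicted suitable if every variable falls inside it. GARP itself combines several rule types through a genetic algorithm, so this shows only the simplest kind of rule and is not GARP. The two species, the variables and all values are hypothetical.

```python
def fit_envelope(records):
    """Min-max range of each variable across known localities."""
    variables = records[0].keys()
    return {v: (min(r[v] for r in records), max(r[v] for r in records))
            for v in variables}


def in_envelope(envelope, cell):
    """True if every variable of the cell lies within the envelope."""
    return all(low <= cell[v] <= high for v, (low, high) in envelope.items())


# Hypothetical environmental values extracted at known localities.
species_a = [
    {"elevation_m": 200, "annual_precip_mm": 1500},
    {"elevation_m": 900, "annual_precip_mm": 2200},
]
species_b = [
    {"elevation_m": 700, "annual_precip_mm": 2000},
    {"elevation_m": 1800, "annual_precip_mm": 3500},
]
envelope_a, envelope_b = fit_envelope(species_a), fit_envelope(species_b)

# Hypothetical grid cells to classify.
cells = {
    "lowland": {"elevation_m": 300, "annual_precip_mm": 1600},
    "mid_slope": {"elevation_m": 800, "annual_precip_mm": 2100},
    "highland": {"elevation_m": 1500, "annual_precip_mm": 3000},
}

# Cells inside both envelopes are areas of predicted sympatry.
predicted_sympatry = [name for name, cell in cells.items()
                      if in_envelope(envelope_a, cell)
                      and in_envelope(envelope_b, cell)]
```

Comparing which species is actually found in the cells of predicted sympatry is the kind of evidence the paper uses to argue for competitive exclusion.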
SOFTWARE
Coming soon:
DIVA-GIS
MAXENT