Research Article |
Corresponding author: Jaime Fagúndez ( jaime.fagundez@udc.es ) Academic editor: Michael Pirie
© 2024 Iván Rodríguez-Buján, Pilar Díaz-Tapia, Jaime Fagúndez.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Rodríguez-Buján I, Díaz-Tapia P, Fagúndez J (2024) Genetic and morphological evidence support the specific status of the endemic Erica andevalensis (Ericales, Ericaceae). PhytoKeys 244: 57-76. https://doi.org/10.3897/phytokeys.244.120914
|
Assessing the taxonomic status of closely related taxa is crucial in plant systematics and can have important implications for conservation and human plant use. Erica andevalensis Cabezudo & Rivera is a metallophyte endemic species from highly metal-polluted soils of SW Iberian Peninsula, an area with a mining history going back more than 5,000 years. Erica andevalensis is closely related to Erica mackayana Bab., a northern Iberian species also present in western Ireland. The status of E. andevalensis as a species or subspecies subordinated to E. mackayana is subject to debate. Here, we assessed the genetic and phenotypic relationship between both species, including the population structure of E. andevalensis. We used high throughput sequencing to determine genome-wide Single Nucleotide Polymorphisms (SNPs), and morphometric analyses from 35 reproductive and vegetative traits. The morphological analysis showed at least eight characters that can discriminate the two species, from which ovary hairiness and the size of leaf glandular hairs were the most informative. Genetic analyses showed that each species formed a monophyletic cluster with full support, separated by an interspecific genetic distance >4-fold higher than intra-specific distance. Population genetic analyses of E. andevalensis shows that populations are highly structured, with the Portuguese one as the most isolated and less variable. These results support the recognition of E. andevalensis as a distinct species with a highly constrained ecological requirements and a narrow geographic distribution, but with a limited gene flow between populations. We discuss the implications of these outcomes in conservation policies and potential uses of E. andevalensis such as decontamination of polluted soils.
Endemic species, Erica, metallophyte, population structure, species delimitation, systematics
The species is the fundamental unit in taxonomy and systematics of living organisms (
In plant taxonomy, morphological characters have historically been the primary source of evidence for delimitating species (
High throughput sequencing (HTS) is the most significant recent advance in molecular techniques, as it greatly facilitates the generation of large amounts of DNA data. Restriction site-associated DNA (RAD-seq) and similar approaches involve enzymatic fragmentation of genomic DNA coupled with HTS to determine large numbers of molecular markers with genome-wide coverage (
Erica is among the largest genera of seed plants, with 851 accepted species (
Erica andevalensis Cabezudo and Rivera, occurs in a restricted area in the south-west of the Iberian Peninsula, mainly in western Andalusia, Spain, and bordering Portugal (
A Flowering stems of Erica andevalensis B Tinto river shores in Huelva, SW Spain, the habitat of E. andevalensis C São Domingos, Portugal, abandoned mining area with E. andevalensis in front D Erica andevalensis in the shores of Odiel river E flowering stems of Erica mackayana F wet heathland dominated by E. mackayana in Galicia, NW Spain.
Erica andevalensis is closely related to E. mackayana and both species are morphologically similar. However, they have disjunct distributions, as E. mackayana is restricted to the northern Iberian Peninsula (
In this study, we aim to reassess the taxonomic identity of E. andevalensis as a distinct species from E. mackayana combining molecular and morphological data. By studying the whole geographic range of both species, we aim to identify morphological traits that might reliably distinguish E. andevalensis from E. mackayana. In addition, we aim to study the population structure of E. andevalensis, to understand the level of isolation and gene flow among populations and provide better guidelines for conservation.
We sampled 38 plants from four populations of E. andevalensis in August 2021. Seventy-one plants from ten populations of E. mackayana analysed in
Distribution of the sampled populations of E. andevalensis and E. mackayana A Erica mackayana populations (as in
A subsample of each specimen was dried in silica gel, and the remaining material was prepared for morphological studies. All herbarium vouchers were deposited in SANT herbarium.
Each sample was mounted in a herbarium sheet and scanned at a minimum resolution of 1000 dpi. In addition, a minimum of two fresh flowers per specimen were dissected to measure ovaries and anthers. Pictures of ovaries and anthers were taken using an OLYMPUS C3040-ADU for subsequent digital analysis. For leaf micro-characters, we used an optical microscope at 40–100 magnifications. All images were analyzed using ImageJ software (
The final dataset for morphological analyses comprises 71 individuals of E. mackayana and 38 of E. andevalensis. In the dataset, each trait was represented by one value per specimen, thus the arithmetic mean was calculated for multiple measures (Suppl. material
Schematic representation of the morphological traits measured in the samples of Erica andevalensis and E. mackayana. Leaf traits (a) (LF = Leaf length; LW = Leaf width; LP = Petiole length), variation on leaf morphology (b) (LR = Leaf rolling degree), and indumentum (c) (LSG = length of glandular hairs; LNR = presence of non-glandular hairs). Stamen traits (d) (AL = Anther length; AO = Anther pore; AAL = Anther appendix length; AAC = Anther appendix curvature; AK = Anther knob), flower morphology (e) (FL = Corolla length; FW = Corolla width; FWC = Corolla opening; FS = Style exertion; FHP = Pedicel hairiness, FLP = Pedicel length), and arrangement of a flowering branch (f) (FI = Flowers per inflorescence; LIA = Leaf insertion angle, L5 = 5nate leaves, LVD = Density of whorls). The ovary (g) (OL = Ovary length; OW = Ovary width; ON = Nectary size) and detail of ovary surface (h) (OH = Ovary hairiness) and stem (i) (BLH = Branch longest hair). Coding, measures, and additional information in Suppl. material
We performed a principal component analysis (PCA) with all samples and traits. Since each trait was on a different scale, the prep function from the PCAMETHODS package (
We used genomic data obtained from
Nextera adaptors from sample reads were trimmed using BBDUK (BBMAP TOOLS v.38.79, http://sourceforge.net/projects/bbmap/) with the following parameters: ktrim = r, k = 17, hdist = 1, mink = 8, minlen = 100, ow = t, qtrim = r, trimq = 10. Quality of filter reads was checked using FastQC (v.0.11.9) (
To select the optimal parameters, a preliminary optimization step following the 80% rule (
VCF file from STACKS was filtered with VCFTOOLS retaining the SNPs with less than 60% of missing data and those that were separated by at least 200 bp to avoid linkage disequilibrium issues. Furthermore, only variable SNPs present in a minimum of 4 samples were accepted. The VCF file was exported as interleaved phyllip using TASSEL (
To assess genetic distance between E. andevalensis and E. mackayana, we used Tamura-Nei distance (
A new dataset per putative species was created. VCFTOOLS was used to remove indels, as well as filter SNPs retaining those that were present in at least 80% of samples and minor alleles were present in at least two samples. Also, to avoid linked loci, SNPs at a distance less than 500 base pairs (bp) were excluded. Both datasets were exported to plink format. The function filter_data() from SAMBAR package was used with default thresholds to detect and erase possible paralogous loci (
Observed heterozygosity (Ho) and expected heterozygosity (He) were calculated with function filter_data() in SAMBAR, and nucleotide diversity was calculated with calc_diversity() of the same package. FIS was calculated as 1-Ho/He as in
We further analysed the dataset of E. andevalensis, including a FST pairwise comparison among populations, using calc_diversity() function in SAMBAR and Bayesian population assignment (BPA). BPA probabilities were calculated using the function find_structure() in SAMBAR package. The optimal number of clusters (K) was determined using the elbow method on cross-entropy scores.
Twenty-eight out of 35 morphological characters measured showed statistically significant differences between Erica mackayana and E. andevalensis (Suppl. material
The first and second axes of the PCA explained 30.6% and 12% of the variance in the data, respectively. PC1 clearly delimitate the two species into two separated clusters (Fig.
Selection of traits that most contribute to the two first PCA components. Percentage of variability within traits explained by the first two components. For each trait average values for each species are provided. Significance of correspondent mean differences tests is highlighted with *** if p-value <0.001. OH = Ovary hairiness; LSG = Length of glandular hairs; L5 = Presence of 5nate whorls; LS2G = Presence of more than two rows of glandular hairs; LNR = Presence of non-glandular hairs; BLH = Longest hair length in stem; LR = Leaf rolling degree; FHP = Pedicel hairiness.
Trait | Units/coding | Contribution to PC1 | Contribution to PC2 | E. mackayana mean | E. andevalensis mean | Significance |
---|---|---|---|---|---|---|
OH | 0–5 | 8.584 | 0.000 | 0 | 1.083 | *** |
LSG | µm | 8.173 | 0.103 | 562.025 | 136.448 | *** |
L5 | 0–1 | 7.890 | 0.011 | 0.958 | 0 | *** |
LS2G | 0–1 | 7.452 | 0.021 | 0 | 0.773 | *** |
LNR | 0–1 | 7.412 | 0.187 | 0 | 0.839 | *** |
BLH | µm | 7.369 | 0.246 | 0.986 | 0.333 | *** |
LR | 1–5 | 6.506 | 0.125 | 2.937 | 1 | *** |
FHP | 1–5 | 5.703 | 0.607 | 2.500 | 1.105 | *** |
Biplot showing the first two principal components. Each ellipse represents the area that would encompass 95% of individuals assuming populations follow a normal distribution. White dots and triangles represent E. andevalensis and E. mackayana centroids respectively. Erica andevalensis populations are represented in different colors: Green (São Domingos), blue (Odiel), brown (Tinto1), yellow (Tinto2). Trait acronyms as in Fig.
Boxplots and histograms of the traits with the highest contribution to PC1 of the PCA as in Table
The phylogenetic tree placed the samples of the two species in two distinct clades that received full support (Fig.
Nei’s genetic distance among E. mackayana populations was 0.280 in average, while distance among E. andevalensis populations was 0.268 (Suppl. material
Bayesian population assignment probabilities revealed a highly structured assemblage into three clusters in E. andevalensis. This number of clusters was the optimal aggrupation within the dataset following cross-entropy scores. The assemblage is consistent with geographical distribution of populations from São Domingos and Tinto1, which were the most homogeneous, and admixture with other clusters is nearly absent. These two populations corresponded with two of the clusters (green and red, respectively in Fig.
We also assessed population structure through pairwise FST comparisons. The results align with clustering analyses, providing additional insights. Notably, São Domingos stood out as a genetically distinct population (FST = 0.906; 0.8; 0.652, from Tinto1, Tinto2 and Odiel populations respectively). Spanish populations (Odiel, Tinto1, and Tinto2) showed a lower genetic differentiation, although FST values among populations were relatively high (Odiel-Tinto1, FST = 0.392; Tinto2-Tinto1: 0.466). Only the genetic differentiation between Odiel and Tinto2 was moderate low (Fst = 0.122).
The nucleotide diversity (πi) and heterozygosity (Ho and He) in E. andevalensis exhibited significant variation among populations, with Odiel and Tinto2 ranking as the most diverse, in that order. Additionally, these two populations showed a lower FIS index (Table
Summary diversity statistics of E. andevalensis populations calculated across sampling locations in which more than two samples were collected.
Population | N | Ho | He | FIS | πi |
---|---|---|---|---|---|
Odiel | 11 | 0.155 | 0.289 | 0.464 | 0.219 |
São Domingos | 16 | 0.028 | 0.087 | 0.674 | 0.025 |
Tinto 1 | 5 | 0.049 | 0.106 | 0.537 | 0.044 |
Tinto 2 | 4 | 0.148 | 0.247 | 0.401 | 0.163 |
Total | 36 | 0.083 | 0.169 | 0.508 | 0.10 |
Nucleotide diversity (πi) and heterozygosity (Ho and He) in E. andevalensis was in general similar to E. mackayana (Suppl. material
This study provides evidence supporting the status of Erica andevalensis as an accepted species, clearly distinct from its sister-group E. mackayana. Both morphological and genetic variability between species is much higher than among populations, and populations are highly structured in E. andevalensis, as previously found for E. mackayana (
The analysis of morphological traits provided reliable diagnostic characters for E. mackayana and E. andevalensis (Table
Other diagnostic characters related to leaf indumentum included the presence of short unicellular hairs in E. andevalensis which are nearly absent in E. mackayana, and long, pluricellular glandular trichomes in the two species but much longer in E. mackayana. Diagnostic differences in leaf hairiness are also reflected in the hairiness of other vegetative and non-vegetative organs, such as stems and flower pedicels (
Leaf arrangement also contributed to species delimitation, as E andevalensis consistently shows 4-nate whorls of erect leaves. In E. mackayana, leaves are 4-6-nate and patent. Leaves of E. andevalensis are similar in size to those of E. mackayana, but narrower with more rolled-in margins. Remarkably, leaf length shows minimum contribution to PC1, but the highest weight in PC2, reflecting strong variation at the population level (Suppl. material
Other traits showed limited potential as diagnostic characters, particularly those related to reproductive biology. In E. andevalensis, flowers can be shorter and narrower, with a more pronounced style exertion. However, flower size and proportions can vary at different development stages. This variability is even more pronounced in ovary morphology and size (Suppl. material
Phylogenetic analyses show that both Erica mackayana and E. andevalensis are well supported monophyletic groups. Reciprocal monophyly is accepted as one of the most important lines of evidence on species delimitation (
The much greater branch lengths in E. andevalensis than in E. mackayana (Fig.
Two of the studied populations of E. andevalensis showed low support for internal nodes in the phylogeny, but the population structure analysis showed low levels of admixture, meaning that E. andevalensis populations are highly structured. The genetic differentiation among populations separated by less than 85 km is higher than in the entire distribution area analyzed for the related species E. mackayana (
Soil endemic plants usually have low genetic diversity, but their FIS index clearly diverge between different plants groups (e. g.
The origin and migration history of different Erica species are a subject of debate, especially when there is a potential transportation linked to human activities (
Clarifying cryptic or poorly understood taxa and species delimitation is needed for the design of conservation policies and actions, especially in large genera such as Erica (
Conservation strategies for E. andevalensis should protect a variety of populations in different river basins and sections, and research evaluating its use for restoration should consider infraspecific variability and special issues of working with metallophyte species. For instance, habitat decontamination of heavy metal polluted river shores needs to be carefully considered, as it may have a negative effect in local populations of E. andevalensis (
There is potential for use of E. andevalensis in bioremediation of polluted soils (
We would like to thank M. Sheehy Skeffington, R. Sheppard and A. Montañés for field support. We also acknowledge the computational facilities of Centro de Supercomputación de Galicia (CESGA). APCs were covered by CICA (Centro Interdisciplinar de Química e Bioloxía), UDC.
The authors have declared that no competing interests exist.
No ethical statement was reported.
No funding was reported.
All authors have contributed equally.
Iván Rodríguez-Buján https://orcid.org/0000-0002-9869-4033
Pilar Díaz-Tapia https://orcid.org/0000-0003-4680-4867
Jaime Fagúndez https://orcid.org/0000-0001-6605-7278
All of the data that support the findings of this study are available in the main text or Supplementary Information.
Supplementary tables
Data type: docx
Explanation note: table S1: Information on the populations of the studied species Erica mackayana and E. andevalensis, and one population of E. tetralix used as the outgroup. N1: number of individuals used in the genetic analyses, based on having <50% of missing data in population structure analysis. N2: number of individuals used in the morphological analyses. table S2: Morphological traits measured in plants of Erica mackayana and E. andevalensis, acronym names and methodology. QN = Quantitative trait, O = Ordinal trait, B = Binary trait. N = Number of measures per specimen. table S3: Results of univariant tests between Erica mackayana and E. andevalensis, including the arithmetic mean and range for each of the morphological traits named as in Fig.