Research Article
Corresponding author: Ofelia Vargas-Ponce (vargasofelia@gmail.com). Academic editor: Sandy Knapp
© 2022 Isaac Sandoval-Padilla, María del Pilar Zamora-Tavares, Eduardo Ruiz-Sánchez, Jessica Pérez-Alquicira, Ofelia Vargas-Ponce.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Sandoval-Padilla I, Zamora-Tavares MdP, Ruiz-Sánchez E, Pérez-Alquicira J, Vargas-Ponce O (2022) Characterization of the plastome of Physalis cordata and comparative analysis of eight species of Physalis sensu stricto. PhytoKeys 210: 109-134. https://doi.org/10.3897/phytokeys.210.85668
In this study, we sequenced, assembled, and annotated the plastome of Physalis cordata Mill. and compared it with those of seven other species of the genus Physalis sensu stricto. Sequencing, annotating, and comparing plastomes allow us to understand the evolutionary mechanisms associated with physiological functions, select possible molecular markers, and identify the types of selection that have acted in different regions of the genome. The plastome of P. cordata is 157,000 bp long and presents the typical quadripartite structure with a large single-copy (LSC) region of 87,267 bp and a small single-copy (SSC) region of 18,501 bp, which are separated by two inverted repeat (IR) regions of 25,616 bp each. These values are similar to those found in the other species, except for P. angulata L. and P. pruinosa L., which presented an expansion of the LSC region and a contraction of the IR regions. The plastomes of the Physalis species studied vary in the boundaries of the regions (three distinct types), in the percentage of sequence identity between coding and non-coding regions, and in the number of repetitive regions and microsatellites. Four genes and 10 intergenic regions show promise as molecular markers, and eight genes were under positive selection. The maximum likelihood analysis showed that the plastome is a good source of information for phylogenetic inference in the genus, given the high support values and absence of polytomies. In the Physalis plastomes analyzed here, the differences found, the positive selection of genes, and the phylogenetic relationships do not show trends that correspond to the biological or ecological characteristics of the species studied.
Boundaries, cpDNA, expansion, phylogeny, positive selection
Physalis L. (Solanaceae) includes 95 morphologically and ecologically variable species (
Physalis contains species of economic, nutritional, and medicinal importance. The fruits of some species are edible and contain vitamins, minerals, carotenoids, phytosterols, and phenolic compounds that have nutraceutical and antioxidant properties (
Chloroplasts possess photosynthetic machinery for the transformation of solar energy into chemical energy. They present their own genome, the plastome, which in spermatophytes tends to be between 120 and 180 kb long. Its circular structure consists of a large single-copy (LSC) region and a small single-copy (SSC) region separated by two inverted repeat regions (IRa and IRb), and the order and content of genes and introns are overall conserved (
Comparative plastomic analyses contribute to understanding the evolutionary history of different groups of plants. These comparisons help to identify whether the evolution of a particular group has occurred in parallel, presenting similar evolutionary patterns when homology among genomes is high or has occurred independently showing reticulated evolution (
Several comparative plastomic analyses have been conducted on the family Solanaceae, but for Physalis, few studies of the chloroplast genome have been undertaken.
Fresh leaves of P. cordata were collected in the field and immediately dried with silica gel for further DNA extraction. The cpDNA was isolated based on
Species | GenBank accession | Reference | Voucher specimen or DNA number |
---|---|---|---|
P. angulata | MH019241 | Unpublished | Not available |
P. cordata | ON018728 | This study | JS571 |
P. chenopodiifolia | MN508249 | | OVP539-5112011 |
P. minima | MH045577 | | PHZ3003 |
P. peruviana | MH019242 | Unpublished | Not available |
P. philadelphica | MN192191 | | 021118ISP |
P. pruinosa | MH019243 | Unpublished | Not available |
P. pubescens | MH045576 | | PHZ2001 |
A. officinarum | MH045575 | | PHZ4001 |
The quality of the raw reads was evaluated in FastQC 0.11.7 (
The complete sequence of the plastome of P. cordata was compared with the plastomes of seven Physalis species: P. angulata, P. chenopodiifolia, P. minima, P. peruviana, P. philadelphica, P. pruinosa, and P. pubescens. The cpDNA of P. chenopodiifolia and P. philadelphica was stored at LaniVeg. Accession numbers, references, and voucher or DNA numbers of the Physalis species are listed in Table
The sequences of the eight plastomes were aligned in MAFFT (
Forward, reverse, and palindromic repeat sequences in the plastomes were identified in REPuter (
To investigate the type of selection that has acted on Physalis plastome genes, we calculated the ratio of non-synonymous (Ka) and synonymous (Ks) substitutions. The Ka/Ks ratios of 51 genes that showed variation were evaluated. The aligned sequences were analyzed in KaKs_Calculator 2.0 (
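The interpretation step of a Ka/Ks analysis can be sketched as follows. This is a minimal illustration, not the KaKs_Calculator procedure itself: the function name `classify_selection`, the tolerance, and the example Ka and Ks values are hypothetical, and real estimates of Ka and Ks come from codon substitution models applied to the aligned genes.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Interpret a Ka/Ks ratio (illustrative helper, not from the study).

    ratio > 1  -> positive selection
    ratio == 1 -> neutral evolution (within a small tolerance)
    ratio < 1  -> purifying selection
    """
    if ks <= 0:
        # The ratio is undefined when there are no synonymous substitutions.
        return "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


# Hypothetical Ka and Ks values, chosen only to exercise each branch.
examples = {
    "gene_a": (0.004, 0.002),
    "gene_b": (0.001, 0.010),
    "gene_c": (0.010, 0.010),
    "gene_d": (0.003, 0.0),
}
results = {gene: classify_selection(ka, ks) for gene, (ka, ks) in examples.items()}
for gene, label in results.items():
    print(gene, label)
```

Genes without synonymous substitutions are reported as undefined rather than forced into a category, because the ratio cannot be computed for them.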
To obtain a phylogenetic perspective on the relationships of P. cordata and the other seven species of Physalis sensu stricto, we used A. officinarum as the outgroup. The sequences of the nine plastomes were aligned in MAFFT (
The Physalis cordata plastome is 157,000 bp long and presents a quadripartite structure, with an LSC region of 87,267 bp, an SSC region of 18,501 bp, and two IRs of 25,616 bp (Fig.
Summaries of plastomes of eight Physalis species and Alkekengi officinarum.
Characteristics | P. angulata | P. cordata | P. chenopodiifolia | P. minima | P. peruviana | P. philadelphica | P. pruinosa | P. pubescens | A. officinarum |
---|---|---|---|---|---|---|---|---|---|
Size (bp) | 156,706 | 157,000 | 156,888 | 156,692 | 156,706 | 156,804 | 156,706 | 157,007 | 156,578 |
LSC length (bp) | 90,977 | 87,267 | 87,117 | 86,845 | 86,995 | 87,131 | 88,758 | 87,137 | 88,309 |
SSC length (bp) | 18,395 | 18,501 | 18,451 | 18,503 | 18,393 | 18,483 | 18,394 | 18,500 | 18,363 |
IR length (bp) | 23,667 | 25,616 | 25,660 | 25,672 | 25,695 | 25,595 | 24,777 | 25,685 | 24,953 |
Number of genes | 114 | 115 | 113 | 114 | 114 | 115 | 114 | 114 | 115 |
Protein-coding genes | 79 | 80 | 79 | 80 | 79 | 80 | 79 | 80 | 80 |
tRNA genes | 31 | 31 | 30 | 30 | 31 | 31 | 31 | 30 | 31 |
rRNA genes | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
Genes in IR | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 | 22 |
Genes with introns | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 |
Genes in IR with introns | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
Nucleotide content (%): A | 30.81 | 30.84 | 30.83 | 30.82 | 30.81 | 30.87 | 30.81 | 30.83 | 30.78 |
Nucleotide content (%): C | 19.1 | 19.07 | 19.08 | 19.09 | 19.08 | 19.06 | 19.1 | 19.09 | 19.14 |
Nucleotide content (%): G | 18.45 | 18.45 | 18.44 | 18.45 | 18.46 | 18.45 | 18.46 | 18.45 | 18.52 |
Nucleotide content (%): T | 31.63 | 31.64 | 31.65 | 31.64 | 31.64 | 31.66 | 31.63 | 31.63 | 31.56 |
GC content (%): Total | 37.55 | 37.52 | 37.52 | 37.54 | 37.54 | 37.51 | 37.56 | 37.54 | 37.65 |
GC content (%): LSC | 35.58 | 35.57 | 35.57 | 35.6 | 35.57 | 35.63 | 35.7 | 35.57 | 35.75 |
GC content (%): SSC | 31.4 | 31.26 | 31.36 | 31.4 | 31.36 | 31.32 | 31.37 | 31.36 | 31.88 |
GC content (%): IR | 43.06 | 43.08 | 43.06 | 43.03 | 43.08 | 43.1 | 43.19 | 43.08 | 42.88 |
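As a quick consistency check on the table above, the quadripartite structure implies that the total plastome length equals LSC + SSC + 2 × IR. The sketch below applies this to the P. cordata column and summarizes the sizes of the eight Physalis plastomes. The function and variable names are illustrative assumptions, not part of the original analysis.

```python
from statistics import mean


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) of P. cordata, taken from the table.
cordata_total = quadripartite_length(lsc=87_267, ssc=18_501, ir=25_616)

# Total sizes (bp) of the eight Physalis plastomes from the "Size (bp)" row.
physalis_sizes = {
    "P. angulata": 156_706,
    "P. cordata": 157_000,
    "P. chenopodiifolia": 156_888,
    "P. minima": 156_692,
    "P. peruviana": 156_706,
    "P. philadelphica": 156_804,
    "P. pruinosa": 156_706,
    "P. pubescens": 157_007,
}

mean_size = round(mean(physalis_sizes.values()))
size_range = max(physalis_sizes.values()) - min(physalis_sizes.values())

print(cordata_total)  # 157000
print(mean_size)      # 156814
print(size_range)     # 315
```

The outgroup A. officinarum is left out of the summary so that the mean and range describe Physalis only.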
Plastome gene content and functional classification in Physalis species.
| Gene group | Subgroup | Gene name |
|---|---|---|
| Photosynthesis | Photosystem I | psaA, psaB, psaC, psaI, psaJ, ycf3ΨΨ, ycf4 |
| | Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, lhbA |
| | ATP synthase | atpA, atpB, atpE, atpF, atpH, atpI |
| | NADH dehydrogenase | ndhAΨ, ndhB*Ψ, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
| | Cytochrome b/f complex | petA, petBΨ, petD, petG, petL, petN |
| | Large subunit of RuBisCO | rbcL |
| | Large subunit of ribosome | rpl2*Ψ, rpl14, rpl16Ψ, rpl20, rpl22, rpl23*, rpl32, rpl33, rpl36 |
| Self-replication | RNA polymerase subunits | rpoA, rpoB, rpoC1Ψ, rpoC2 |
| | Small subunit of ribosome | rps3, rps4, rps7*, rps8, rps11, rps12*Ψ, rps12_3end, rps14, rps15, rps18, rps19 |
| | Ribosomal RNA genes | rrn16*, rrn23*, rrn4.5*, rrn5* |
| | Transfer RNA genes | trnA-UGC*Ψ, trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-GCCΨ, trnG-UCC, trnH-GUG, trnI-CAU*, trnI-GAU*Ψ, trnK-UUUΨ, trnL-CAA*, trnL-UAAΨ, trnL-UAG, trnM-CAU, trnN-GUU*, trnP-GGG†, trnP-UGG, trnQ-UUG, trnR-ACG*, trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC*, trnV-UACΨ, trnW-CCA, trnY-GUA |
| Other genes | Hypothetical chloroplast reading frames | orf42*, orf56*, ycf2*, ycf68*, orf188‡ |
| | Subunit of acetyl-CoA carboxylase | accD |
| | c-type cytochrome synthesis | ccsA |
| | Envelope membrane protein | cemA |
| | Protease | clpPΨΨ |
| | Maturase | matK |
| | Pseudogenes | infA, rps2, rps16Ψ, ycf1*, ycf15* |
Plastome map of Physalis cordata. Genes located outside the outer circle are transcribed in the clockwise direction, whereas genes within the circle are transcribed in the counterclockwise direction. Genes with introns are marked with (*). Genes belonging to different functional groups are color-coded. The darker gray dashed area in the inner circle indicates the GC content, while the lighter gray corresponds to the AT content of the plastome.
The comparison of the plastome of P. cordata with those of P. angulata, P. chenopodiifolia, P. minima, P. peruviana, P. philadelphica, P. pruinosa, and P. pubescens showed that all plastomes presented the typical quadripartite structure and genetic organization (Table
The plastome of P. cordata presented 115 genes. This number is shared only with P. philadelphica, since P. angulata, P. minima, P. peruviana, P. pruinosa, and P. pubescens have 114 genes and P. chenopodiifolia has 113 genes. All eight species share 113 genes; P. cordata and P. philadelphica differ from the others in the presence of the trnP-GGG gene, and P. chenopodiifolia lacks orf188. All species presented 22 genes in IRs and the rps12 gene was trans-spliced (Table
The comparison of the limits of the LSC/IR and SSC/IR regions of the eight Physalis plastomes and A. officinarum showed some variations (Fig.
The identity between the plastome of P. cordata and those of the other seven Physalis species was high. Identical sequences were mainly found in coding regions, and the greatest divergence was in the intergenic regions. The comparison between regions showed that the LSC and SSC regions were more divergent than were IRs. Introns also exhibited greater variation than the exons. The most divergent genes were ycf1 and ycf2, as well as the intergenic regions trnH-GUG-psbA and trnL-UAA-trnF-GAA (Fig.
Comparative plots of identity among Physalis species. The percentage of identity ranges from 50 to 100% and is shown in the vertical axis. Gray arrows indicate genes with their orientation and position of their transcription in the reference plastome (P. cordata). Plastome regions are color coded as blue blocks for the conserved coding genes (exon), turquoise for introns and red blocks for non-coding sequences in intergenic regions (CNS).
The sequences of 51 genes and 75 intergenic regions showed variation. The lowest variation in genes was one change in 14 genes, and the highest variation was 173 changes in ycf1. The lowest variation in intergenic regions was one change in 16 of them, and the highest was in trnL-UAA-trnF-GAA with 42. The average value of π was lower in the genes than in the intergenic regions (Suppl. material
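Nucleotide diversity (π) is commonly defined as the average number of pairwise differences per site among aligned sequences. The sketch below shows that definition on a made-up toy alignment. The function name and the sequences are hypothetical, gaps and ambiguous bases are not treated specially, and the software the authors used to compute π is not shown in this section.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise differences per site for equal-length aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(seq_x, seq_y)) for seq_x, seq_y in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical sequences, 8 sites each).
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
pi = nucleotide_diversity(toy_alignment)
print(round(pi, 4))
```

Here the three pairs differ at 1, 1, and 2 sites, so π = 4 / (3 × 8) ≈ 0.1667.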
The repeated sequences in the plastome ranged from 35 in P. philadelphica to 49 in P. cordata (Suppl. material
Of the 51 genes, eight showed values of Ka/Ks > 1, indicating that they are under positive selection (cemA, ndhB, ndhJ, ndhK, psaC, rbcL, rpoA, and ycf1, Fig.
The ML phylogeny recovers P. minima as a sister to the seven other Physalis species included in this study (BS = 100; see Fig.
The plastome of P. cordata analyzed here presents the typical quadripartite structure and the same order of genes as has been found for other species of the genus. However, the species vary in the total size and the size of the regions. In general, the average size of plastomes in Physalis is 156,814 bp, and the difference between the largest (P. pubescens, 157,007 bp) and the smallest plastome (P. minima, 156,692 bp) is 315 bp. Phylogenetically, closely related species tend to be homogeneous in plastome size and in the sizes of their regions (
The Physalis species studied have between 113 and 115 genes; 113 of these are completely shared, with the same distribution and number of introns. The difference in the number of genes is based on the presence of trnP-GGG in P. cordata and P. philadelphica and the absence of orf188 in P. chenopodiifolia. We suggest that the genes that are not shared are the product of loss events during the evolutionary process. In addition, the size of 10 genes was different in at least one of the eight species. For example, in P. philadelphica, the second exon of the petB gene differs by three bp with respect to those of the other species. Additionally, gene sizes are variable among the eight species, as occurs for ycf1, whose length differs among species by six to 114 bp. In the eight species, there were 17 genes containing a total of 19 introns: 15 genes with one intron and two genes with two introns (clpP and ycf3). Physalis does not have an intron in the petD gene (which encodes subunit 4 of the cytochrome b6-f complex), unlike other genera of Solanaceae such as Atropa, Capsicum, Datura L., Nicotiana, Solanum, and Withania Pauquy (
Variation in the boundaries of plastome regions is a relatively common evolutionary process that occurs in different plant groups (
The variation between plastomes, in some cases, is limited due to their low rate of evolution, so repetitive regions and microsatellites can reveal interspecific variation (
The evolutionary history of species is shaped by two main factors: mutation, which generates new genotypes, and selection, which determines the probability that new genotypes will be fixed or eliminated (
Throughout the evolutionary history of the plastome, most genes have been under purifying selection due to functional limitations (
Coding and non-coding regions of plastomes both tend to have a high degree of conservation (
The phylogenetic perspective we obtained confirms the usefulness of the plastome as a source of information for conducting phylogenetic studies in Physalis, despite the limited number of species studied. In comparison with other studies that include partial nucleus and chloroplast sequences (
The plastome of Physalis cordata has the typical quadripartite structure, and its total size and GC content are similar to those of the other Physalis species for which full plastome sequences are available. Physalis plastomes have 113 to 115 genes with the same distribution and number of introns. Comparative analysis among eight Physalis species showed differences in the boundaries of the LSC/IR and SSC/IR regions, and three distinct types were identified, defined by variation in the genes present. The high percentage of conservation of the sequences and the variation observed at the boundaries of the plastome regions, in the ycf1 and ycf2 genes, and in some coding and intergenic regions reflect relatively common evolutionary processes and are seen here in all the Physalis species studied. Likewise, the presence of genes under positive selection, in some or all of the Physalis species analyzed, suggests that they are differentially expressed and could favor the photosynthetic process and environmental adaptation, although this needs to be verified. We have shown that the plastome is potentially useful for further phylogenetic studies if key highly variable genes are used. Finally, we identified that despite the level of conservation in the plastome of Physalis, variation in sequence does exist and probably reflects independent evolutionary processes. Future studies should include a larger number of species representing the variation in biological and ecological characteristics to understand the evolution of the plastome in Physalis.
This work was supported by UDG and CONACyT-Laboratorio Nacional de Identificación y Caracterización Vegetal (LaniVeg) [Grant No. 293833], Universidad de Guadalajara [Grant Prosni-2018 to OVP], and CONACyT-México through a doctoral scholarship for graduate studies in the Doctorado en Biosistemática, Ecología y Manejo de Recursos Naturales y Agrícolas (BEMARENA) [Grant No. 928518 awarded to ISP].
Tables S1, S2 and Figures S1–S3
Data type: Tables and images of plastome data and attributes (MS Word file)
Explanation note: The tables contain data about introns in chloroplast genes and the biological and ecological traits of the Physalis species included in the study. The graphs show data about the types of microsatellites and their frequency.