Research Article
Corresponding authors: Sangho Choi (decoy0@kribb.re.kr), Ritesh Kumar Choudhary (rkchoudhary@aripune.org), Soo-Yong Kim (soodole@kribb.re.kr)
Academic editor: Stephen Boatwright
© 2021 Ashwini M. Darshetkar, Satish Maurya, Changyoung Lee, Badamtsetseg Bazarragchaa, Gantuya Batdelger, Agiimaa Janchiv, Eun Ju Jeong, Sangho Choi, Ritesh Kumar Choudhary, Soo-Yong Kim.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Darshetkar AM, Maurya S, Lee C, Bazarragchaa B, Batdelger G, Janchiv A, Jeong EJ, Choi S, Choudhary RK, Kim S-Y (2021) Plastome analysis unveils Inverted Repeat (IR) expansion and positive selection in Sea Lavenders (Limonium, Plumbaginaceae, Limonioideae, Limonieae). PhytoKeys 175: 89-107. https://doi.org/10.3897/phytokeys.175.61054
The genus Limonium, commonly known as Sea Lavenders, is one of the most species-rich genera of the family Plumbaginaceae. In this study, two new plastomes for the genus Limonium, viz. L. tetragonum and L. bicolor, were sequenced and compared to available Limonium plastomes, viz. L. aureum and L. tenellum, to understand the gene content and structural variations within the family. The loss of the rpl16 intron and pseudogenisation of rpl23 were observed. This study reports, for the first time, expansion of the IRs to include the ycf1 gene in Limonium plastomes, incongruent with previous studies. Two positively selected genes, viz. ndhF and ycf2, were identified. Furthermore, putative barcodes are proposed for the genus, based on the nucleotide diversity of four Limonium plastomes.
Intron loss, IR expansion, positive selection, pseudogenisation, ycf1
The family Plumbaginaceae of the order Caryophyllales is highly diverse, rich in species and displays a cosmopolitan distribution with its maximum diversity in the temperate areas of the northern hemisphere (
The genus Limonium Mill., popularly known as sea lavenders, belongs to the subfamily Limonioideae and tribe Limonieae (
Certain phylogenetic studies tried to resolve the relationships within Limonium at a global scale (
With the advent of sequencing technologies, the availability of large genome-scale data has made it easier to understand phylogeny and detect polyploidy events (
The present study reports the plastome sequences of two Asian Limonium species, viz. L. tetragonum (Thunb.) Bullock and L. bicolor (Bunge) Kuntze, and compares the structure, composition and diversity within the genus by combining them with other available plastomes. L. tetragonum is a biennial species characterised by a spicate inflorescence, a yellow corolla and an acute calyx that is pink at the base and white in the upper parts; it is distributed in Japan, Korea, New Caledonia and Primorye (
Leaf samples of Limonium bicolor and L. tetragonum were collected from Meneng steppe of Dornod Province of Mongolia (Voucher No. KRIB 0070251) in June 2015 and from the coastal area of Ulsan City of the Republic of Korea (Voucher No. KRIB 0086343) in April 2018, respectively. The samples were deposited at the Herbarium of Korea Research Institute of Bioscience and Biotechnology (KRIB). DNA extraction was carried out from dried leaves using the DNeasy Plant Mini Kit (QIAGEN, Cat. No. 69104) according to the manufacturer’s protocol. For both the plastomes, a 550 bp DNA TruSeq Illumina (Illumina, San Diego, CA, USA) sequencing library was constructed. After the library preparation, the DNA samples were run in a single lane of an Illumina HiSeq 10X with a read length of 151 bp.
The raw reads obtained after Illumina sequencing were analysed using FastQC V0.11.7 (
All Plumbaginaceae plastomes available so far were considered for comparison. Four plastomes of Limonium, viz. L. aureum (MN623109), L. tenellum (MK397871), L. tetragonum and L. bicolor, together with those of Ceratostigma willmottianum (MK397862) and Plumbago auriculata (MH286308), were included in the analysis. The plastome of L. sinense was excluded from all analyses because of ambiguities observed in its assembly. The six selected plastomes were aligned using the MAFFT v.7.450 plugin of Geneious prime 2020.2.2 (
Simple Sequence Repeats across the four Limonium plastomes were detected using MISA online server (
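As a rough illustration of what an SSR search involves, the following sketch scans a sequence for perfect tandem repeats of 1 to 6 bp units using regular expressions. It is not MISA itself: the minimum repeat counts in `MIN_REPEATS`, the function names and the handling of overlapping motifs are all assumptions made for this example, and compound repeats are not detected.

```python
import re

# Minimum number of tandem copies per unit length (bp).
# These thresholds are assumed for illustration; they are not the study's settings.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """Return True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, unit_length, motif, copies) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip e.g. "AA" repeats, which are already reported as "A" repeats.
            if is_primitive(motif):
                copies = len(match.group(0)) // unit_len
                hits.append((match.start(), unit_len, motif, copies))
    return sorted(hits)
```

Because the scan is non-overlapping within each unit length, a repeat that begins inside a skipped non-primitive match can be missed; a dedicated tool handles such cases more carefully.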
The four Limonium plastomes were aligned using Geneious prime 2020.2.2 plugin MAFFT v.7.450 (
The percentage codon usage of protein-coding regions of all four Limonium plastomes was calculated using Geneious prime 2020.2.2. To examine the frequency and uniformity of synonymous codon usage and codon bias, the Relative Synonymous Codon Usage (RSCU) was also determined using DnaSP v.6.12.01 (
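For readers unfamiliar with RSCU, the quantity is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it directly; it is an independent illustration and not the DnaSP implementation. The function name and input format (a list of in-frame coding sequences) are assumptions, and the codon assignments are those of the standard genetic code, which the plastid code (NCBI translation table 11) shares.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard codon assignments in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds_sequences):
    """Return {codon: RSCU} for every codon family observed at least once."""
    counts = Counter()
    for cds in cds_sequences:
        cds = cds.upper()
        # Read complete in-frame codons only; codons with ambiguous bases are ignored.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input, RSCU undefined
        expected = total / len(codons)  # equal usage of all synonymous codons
        for c in codons:
            values[c] = counts[c] / expected
    return values
```

A value above 1 means the codon is used more often than expected under equal usage, and a value below 1 means it is used less often. In this sketch the three stop codons are treated as one synonymous family.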
In order to detect protein-coding genes under selection in the genus Limonium, sequences of each gene were aligned using the MAFFT v.7.450 plugin of Geneious prime 2020.2.2. The aligned sequences were again manually checked for an end-to-end alignment. The phylogenetic tree for each protein-coding gene was constructed using the FastTree plugin (
A total of 39 plastome sequences, including six from Plumbaginaceae, were considered as ingroups for phylogenomic analysis. The outgroup was composed of members of Amaranthaceae. All the plastome sequences were aligned using the MAFFT v.7.450 plugin of Geneious prime 2020.2.2. Maximum Likelihood analysis, based on the best-fit model GTR+F+R5, was performed using IQtree 1.6.12-MacOSX (
The average organelle coverage for the plastomes of L. tetragonum and L. bicolor was 1014X and 1009X, respectively. Plastomes of L. tetragonum and L. bicolor exhibited a typical quadripartite structure (Fig.
Circular gene map of plastomes of L. tetragonum and L. bicolor. Genes drawn inside the circle are transcribed clockwise and those outside are counter-clockwise. Genes belonging to different functional groups are shown in different colours. The innermost circle denotes GC content across the plastome.
Species | Limonium tetragonum | Limonium bicolor | Limonium aureum | Limonium tenellum | Ceratostigma willmottianum | Plumbago auriculata |
---|---|---|---|---|---|---|
Accession No. | MW085088 | MW085089 | MN623109 | MK397871 | NC041261 | NC041245 |
Genome size (bp) | 154691 | 154617 | 154661 | 150515 | 164999 | 168765 |
LSC length (bp) | 84568 | 84541 | 84546 | 84634 | 89454 | 91912 |
SSC length (bp) | 12997 | 12964 | 12980 | 23755 | 13491 | 13331 |
IR length (bp) | 28563 | 28556 | 28568 | 21063 | 31027 | 31761 |
No. of genes duplicated in IR | 15 | 15 | 16 | 10 | 15 | 19 |
No. of genes | 128 | 128 | 130 | 124 | 127 | 132 |
No. of protein coding genes | 83 | 83 | 83 | 82 | 82 | 84 |
No. of tRNA genes | 37 | 37 | 37 | 36 | 37 | 37 |
No. of rRNA genes | 8 | 8 | 8 | 6* | 8 | 8 |
Total GC content (%) | 37 | 37 | 37.1 | 37.1 | 37.5 | 37.2 |
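The region lengths in the table can be cross-checked against the reported genome sizes, because a quadripartite plastome consists of one LSC, one SSC and two IR copies. The following sketch performs that check; the dictionary layout and function names are illustrative and are not part of the study.

```python
# (genome size, LSC, SSC, IR) in bp, copied from the comparison table above.
# The data structure and function names are illustrative assumptions.
PLASTOMES = {
    "Limonium tetragonum": (154691, 84568, 12997, 28563),
    "Limonium bicolor": (154617, 84541, 12964, 28556),
    "Limonium aureum": (154661, 84546, 12980, 28568),
    "Limonium tenellum": (150515, 84634, 23755, 21063),
    "Ceratostigma willmottianum": (164999, 89454, 13491, 31027),
    "Plumbago auriculata": (168765, 91912, 13331, 31761),
}


def quadripartite_size(lsc, ssc, ir):
    """Genome size implied by the region lengths: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def size_discrepancies(plastomes):
    """Reported genome size minus the size implied by the regions, per species."""
    return {
        name: total - quadripartite_size(lsc, ssc, ir)
        for name, (total, lsc, ssc, ir) in plastomes.items()
    }


if __name__ == "__main__":
    for name, diff in size_discrepancies(PLASTOMES).items():
        print(f"{name}: {diff:+d} bp")
```

With the tabulated values, five of the six entries reconcile exactly, while the L. aureum figures differ by 1 bp.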
List of genes in the newly-sequenced plastomes of L. tetragonum and L. bicolor.
Category | Group | Name |
---|---|---|
Photosynthesis-related genes | Rubisco | rbcL |
Photosynthesis-related genes | Photosystem 1 | psaA, psaB, psaC, psaI, psaJ |
Photosynthesis-related genes | Photosystem 2 | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
Photosynthesis-related genes | ATP synthase | atpA, atpB, atpE, atpF†, atpH, atpI |
Photosynthesis-related genes | Cytochrome b/f complex | petA, petB, petD, petG, petL, petN |
Photosynthesis-related genes | NADPH Dehydrogenase | ndhA†, ndhB†*, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
Transcription and translation-related genes | Transcription | rpoA, rpoB, rpoC1†, rpoC2 |
Transcription and translation-related genes | Ribosomal proteins | rps2, rps3, rps4, rps7*, rps8, rps11, rps12†*, rps14, rps15, rps16†, rps18, rps19, rpl2†, rpl14, rpl16, rpl20, rpl22, rpl23#*, rpl33, rpl36 |
Transcription and translation-related genes | Translation initiation factor | infA |
RNA genes | Ribosomal RNA | rrn5*, rrn4.5*, rrn16*, rrn23* |
RNA genes | Transfer RNA | trnA-UGC†*, trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-UCC, trnH-GUG, trnI-GAU†*, trnK-UUU†, trnL-CAA*, trnL-UAA, trnL-UAG†, trnM-CAU, trnN-GUU*, trnP-UGG, trnQ-UUG, trnR-ACG*, trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC*, trnW-CCA, trnY-GUA |
Other genes | RNA processing | matK |
Other genes | Carbon metabolism | cemA |
Other genes | Fatty acid synthesis | accD |
Other genes | Proteolysis | clpP† |
Genes of unknown function | Conserved reading frame | ycf1†*, ycf2*, ycf3†, ycf4 |
The IR regions of plastomes are divided by four junctions viz., IRb/LSC, IRb/SSC, IRa/SSC and IRa/LSC. All six plastomes of Plumbaginaceae (including four Limonium) were compared for their IR boundaries. The annotations available on NCBI database were used for Limonium aureum, L. tenellum, Ceratostigma willmottianum and Plumbago auriculata.
The IRb/LSC junction of three Limonium species viz. L. tetragonum, L. bicolor and L. aureum was characterised by the presence of the rpl2 gene (Fig.
The next junction, i.e. IRb/SSC, was characterised by the presence of the ndhF gene in L. tetragonum, L. bicolor, L. aureum and P. auriculata (with 49 bp of the gene lying in the IR region in these Limonium species and 51 bp in Plumbago). In L. tenellum, the junction exhibited rrn23 (IR) and trnR-ACG (SSC), which could be due to an error in assembly or annotation. In Ceratostigma, ndhF appeared to be shifted to the SSC, 118 bp away from the IRb border.
IRa/SSC junction of all compared species was characterised by the presence of rps15 and ycf1 genes, except in L. tenellum. The IRa/LSC junction was characterised by rpl32 and trnH in L. tetragonum, L. bicolor and L. aureum, while in L. tenellum and C. willmottianum, it was characterised by ycf2 and trnH. In P. auriculata, the junction was bordered by rps19 and trnH (Fig.
The plastomes of L. aureum, L. bicolor and L. tetragonum each exhibited two copies of ycf1. In all three, the ycf1 gene is 5,298 bp long and the IR has expanded to accommodate it. Plumbago and Ceratostigma also exhibited ycf1 duplicated in the IRs. In contrast, the annotation provided for L. tenellum (MK397871) exhibits a single copy of ycf1.
Plastomes of Limonium tetragonum, L. tenellum, L. bicolor and L. aureum were compared with two other Plumbaginaceae plastomes, keeping L. tetragonum as a reference. The four Limonium plastomes were similar to one another when compared with those of Plumbago and Ceratostigma. Limonium tenellum exhibited a partial deletion at the IRa/LSC junction in the ycf1 gene (Fig.
The nucleotide diversity (Pi) of four Limonium plastomes was analysed, except for the ycf1 region, which was removed due to ambiguous alignment. Sliding window analysis yielded some regions with higher Pi values. High nucleotide diversity was found in two spacer regions viz. trnY-GUA-trnT-GGU, trnL-UAG-ccsA and one gene ycf3 with Pi values 0.015, 0.043 and 0.02, respectively (Fig.
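Nucleotide diversity (Pi) is the average proportion of sites that differ between all pairs of aligned sequences, and a sliding-window scan simply repeats that calculation along the alignment. The sketch below shows the idea under stated assumptions: alignment columns containing gaps or ambiguous bases are dropped entirely, and the window and step sizes are left to the caller because they are not taken from the study. The function names are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Average pairwise differences per site across equal-length aligned sequences."""
    seqs = [s.upper() for s in aligned]
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")
    # Keep only columns in which every sequence has an unambiguous base.
    columns = [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]
    if not columns:
        return 0.0
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    differences = sum(
        a != b for col in columns for a, b in combinations(col, 2)
    )
    return differences / (n_pairs * len(columns))


def sliding_window_pi(aligned, window, step):
    """Return (window midpoint, Pi) pairs along the alignment."""
    length = len(aligned[0])
    return [
        (
            start + window // 2,
            nucleotide_diversity([s[start:start + window] for s in aligned]),
        )
        for start in range(0, length - window + 1, step)
    ]
```

Windows with unusually high Pi in such a scan are the kind of regions proposed here as candidate barcodes.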
The plastome of L. bicolor exhibited 6 compound, 26 mono-, 9 di-, 5 tri- and 5 tetranucleotide repeats, while that of L. tetragonum exhibited 4 compound, 31 mono-, 9 di-, 4 tri-, 6 tetra- and 1 hexanucleotide repeats (Fig.
All four species of Limonium exhibited only forward and palindromic repeats in the REPuter analysis (Suppl. material
The four Limonium plastomes were compared for their codon usage. The plastomes of L. tetragonum, L. tenellum, L. bicolor and L. aureum exhibited 27,290, 24,093, 26,682 and 27,308 codons, respectively. Leucine was the most abundant while Cysteine was the least abundant amino acid in all the compared plastomes (Fig.
A total of 79 consensus protein-coding genes of four Limonium species were evaluated with respect to selective pressure. Two genes were found to have undergone positive selection viz. ndhF and ycf2 with ω values 2.2278 and 19.657, respectively (Suppl. material
In the phylogenomic analysis, the representatives of Limonium formed a strongly-supported monophyletic group (BS = 100), in which L. bicolor was recovered as sister to L. tetragonum (BS = 73), with L. aureum and L. tenellum being successive sisters (BS = 100) to the clade of L. bicolor and L. tetragonum. Plumbago auriculata and Ceratostigma willmottianum also formed a monophyletic group (BS = 100), sister to the Limonium clade (BS = 100). All these made Plumbaginaceae a strongly-supported (BS = 100) monophyletic group (Fig.
In this study, two Limonium plastomes were assembled and the structure and composition of four Limonium plastomes were compared. The plastomes were conserved in terms of size and structure ranging from 154,617 to 154,691 bp, except for L. tenellum with 150,515 bp. Expansion and contraction of IRs/SSC account for huge variation, evolutionary events and also affect the plastome sizes (
Ribosomal protein L23, encoded in the plastome by rpl23, is a component of the large (50S) subunit of the plastid ribosome. The comprehensive study of plastomes of Caryophyllales (
A ratio of nonsynonymous to synonymous substitution rates (Ka/Ks or ω) above 1 indicates that the corresponding gene is under positive selection, whereas ω values ranging from 0.5 to 1 indicate relaxed selection (
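The interpretation of ω described above can be expressed as a small helper. This is only a sketch of the stated thresholds: the function name is hypothetical, and the label used for values below 0.5 (purifying selection) is an assumption based on common usage, since the text does not name that category.

```python
def classify_omega(omega):
    """Map a Ka/Ks (omega) value to the selection regime described in the text."""
    if omega < 0:
        raise ValueError("omega cannot be negative")
    if omega > 1.0:
        return "positive selection"
    if omega >= 0.5:
        return "relaxed selection"
    # Assumption: values below 0.5 are labelled as purifying selection here.
    return "purifying selection"


if __name__ == "__main__":
    # Omega values reported in this study for the two positively selected genes.
    for gene, omega in {"ndhF": 2.2278, "ycf2": 19.657}.items():
        print(gene, omega, classify_omega(omega))
```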
The sampling for the phylogenomic analysis followed the studies of
The present study has made an effort to understand the structural changes in the plastomes of Plumbaginaceae by including two newly-generated Limonium plastome sequences. The study also confirms the loss of the rpl16 intron in the genus Limonium and pseudogenisation of the rpl23 gene in Plumbaginaceae. Our results also revealed, for the first time, the expansion of the IRs to accommodate the ycf1 gene in Limonium, as in other Plumbaginaceae members. The annotation available for L. tenellum exhibits ycf1 in the SSC region. Hence, the sequencing of more plastomes would aid in identifying the exact position of ycf1. Two positively-selected genes were identified, viz. ndhF and ycf2. The positive selection of ndhF could be linked to adaptation to extreme environmental conditions, such as salt stress, and it would be interesting to identify the adaptive sites in the ndhF amino acid sequence by adding more ndhF sequences of Limonium species. The expansion of the IRs and accommodation of the ycf2 gene could be related to the re-arrangement of the plastome. The function of ycf2 is still not clear, but it would be interesting to study ycf2 evolution and re-arrangement across the whole order. High nucleotide diversity was observed in two spacer regions, trnY-GUA-trnT-GGU and trnL-UAG-ccsA, and in one gene, ycf3, which could be used as potential DNA barcodes for the genus. Future studies will focus on identifying adaptive codon sites in positively-selected genes, correlating those with habitats and environmental conditions and validating the proposed barcodes by including more Limonium species.
AMD, SM and RKC are grateful to the Director, Agharkar Research Institute for providing facilities. AMD is thankful to CSIR for Senior Research Fellowship. This research was supported by the grant from the KRIBB Initiative Program of the Republic of Korea and the Bio and Medical Technology Development Program of the National Research Foundation (NRF) and funded by the Korean Government (MSIT) (NRF-2016K1A1A8A01939075).
The output of the repeat analysis of four Limonium plastomes
Data type: molecular analysis
Relative Synonymous Codon Usage
Data type: molecular data
List of Positively selected genes
Data type: molecular data