Research Article
Corresponding author: Le Thi Thu Hien (hienlethu@igr.ac.vn)
Academic editor: Yasen Mutafchiev
© 2022 Nguyen Nhat Linh, Pham Le Bich Hang, Huynh Thi Thu Hue, Nguyen Hai Ha, Ha Hong Hanh, Nguyen Dang Ton, Le Thi Thu Hien.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Linh NN, Hang PLB, Hue HTT, Ha NH, Hanh HH, Ton ND, Hien LTT (2022) Species discrimination of novel chloroplast DNA barcodes and their application for identification of Panax (Aralioideae, Araliaceae). PhytoKeys 188: 1-18. https://doi.org/10.3897/phytokeys.188.75937
Certain species within the genus Panax L. (Araliaceae) contain pharmacologically valuable ginsenosides, also known as ginseng saponins. Species containing these compounds are of high commercial value and are therefore a conservation priority. However, within this genus, identifying the particular species that contain these compounds by morphological means is challenging. DNA barcoding is considered a promising method for species-level identification. However, in an evolutionarily complex genus such as Panax, commonly used DNA barcodes such as nrITS, matK, psbA-trnH, and rbcL do not provide species-level resolution. A recent in silico study proposed a set of novel chloroplast markers, trnQ-rps16, trnS-trnG, petB, and trnE-trnT, for species-level identification within Panax. In the current study, the discriminatory efficiency of these molecular markers is assessed and validated using 91 reference barcoding sequences and 38 complete chloroplast genomes for seven species, one unidentified species and one sub-species of Panax, and two outgroup species of Aralia L., along with empirical data from Panax taxa present in Vietnam, via both distance-based and tree-based methods. The results show that trnQ-rps16 can classify every clade tested here with species-level resolution, including the highly valuable Panax vietnamensis Ha et Grushv. We thus propose that this molecular marker be used for identification of species within Panax to support both conservation and commercial trade.
DNA barcode, Panax genus, Panax vietnamensis, petB, trnE-trnT, trnQ-rps16, trnS-trnG
The genus Panax L. is well known in the culinary and medicinal traditions of many countries, including China, Korea, Japan, and Vietnam. Its species produce ginsenosides, also known as ginseng saponins, with strong antioxidant, antidiabetic, antitumor, and neuroprotective activities (
Historically, morphological methods have been used to identify ginseng species, though this is challenging due to how similar different ginseng species can appear. Incorrect identification can lead to unintentional or intentional mislabeling and adulteration with low-quality ginsengs, and ultimately affect the consumers’ health and damage the providers’ integrity. Recently, molecular methods have been shown to be efficient for solving problems related to species identification. However, the most commonly used barcoding sequences are challenging to use in the genus Panax, because these often lack sufficient variability to unambiguously identify the species (
A previously performed in silico analysis indicated that the chloroplast DNA markers trnQ-rps16, trnE-trnT, petB, and trnS-trnG had high species identification potential within the genus Panax (
Leaf samples of five taxa belonging to the genus Panax were collected in northern and central Vietnam (Table
Sample ID | Collector | Collection date (MM/DD/YYYY) | Coordinates | District | Province |
---|---|---|---|---|---|
P. vietnamensis | |||||
TL25 | Luong Duc Toan | 10/16/2017 | 15°01.17'N, 108°00.76'E | Nam Tra My | Quang Nam |
CP13 | Luong Duc Toan | 10/16/2017 | 15°01.40'N, 108°03.10'E | Nam Tra My | Quang Nam |
TN22 | Luong Duc Toan | 10/16/2017 | 15°00.94'N, 108°03.08'E | Nam Tra My | Quang Nam |
D42 | Le Thi Thu Hien | 09/28/2018 | 15°00.94'N, 108°02.58'E | Nam Tra My | Quang Nam |
D43 | Le Thi Thu Hien | 09/28/2018 | 15°00.94'N, 108°02.58'E | Nam Tra My | Quang Nam |
D11 | Le Thi Thu Hien | 09/28/2018 | 15°00.94'N, 108°02.58'E | Nam Tra My | Quang Nam |
D6 | Le Thi Thu Hien | 09/28/2018 | 15°00.94'N, 108°02.58'E | Nam Tra My | Quang Nam |
Q1 | Le Thi Thu Hien | 09/28/2018 | 15°02.53'N, 108°02.72'E | Nam Tra My | Quang Nam |
B42 | Le Thi Thu Hien | 09/28/2018 | 15°03.11'N, 107°97.97'E | Nam Tra My | Quang Nam |
ML043 | Luong Duc Toan | 10/11/2017 | 15°03.20'N, 107°97.90'E | Nam Tra My | Quang Nam |
TL27 | Luong Duc Toan | 10/11/2017 | 15°03.18'N, 107°97.91'E | Nam Tra My | Quang Nam |
TT15 | Luong Duc Toan | 10/11/2017 | 14°96.41'N, 108°10.05'E | Nam Tra My | Quang Nam |
TR2 | Luong Duc Toan | 10/11/2017 | 15°07.73'N, 108°00.76'E | Nam Tra My | Quang Nam |
PL073 | Luong Duc Toan | 10/11/2017 | 15°27.50'N, 107°87.90'E | Phuoc Son | Quang Nam |
TG07 | Luong Duc Toan | 10/11/2017 | 15°79.20'N, 107°25.90'E | Tay Giang | Quang Nam |
NLay1 | Le Thi My Hao | 10/11/2017 | 14°59.60'N, 108°14.80'E | Tu Mo Rong | Kon Tum |
MR3 | Le Thi My Hao | 10/11/2017 | 14°97.08'N, 107°99.90'E | Tu Mo Rong | Kon Tum |
TX1 | Le Thi My Hao | 10/11/2017 | 14°96.10'N, 107°95.40'E | Tu Mo Rong | Kon Tum |
MR7 | Le Thi My Hao | 10/11/2017 | 14°97.10'N, 107°89.50'E | Tu Mo Rong | Kon Tum |
NL1 | Le Thi My Hao | 10/11/2017 | 15°06.20'N, 107°94.40'E | Dak Glei | Kon Tum |
X1 | Le Thi My Hao | 10/11/2017 | 15°07.60'N, 107°83.20'E | Dak Glei | Kon Tum |
MH1 | Le Thi My Hao | 10/11/2017 | 15°73.00'N, 107°54.43'E | Dak Glei | Kon Tum |
P. vietnamensis var. fuscidiscus | |||||
SLC | Nguyen Tien Dung | 07/31/2015 | 22°20.00'N, 103°42.40'E | Sin Ho | Lai Chau |
Panax sp. Puxailaileng | |||||
SNA | Nguyen Tien Dung | 12/07/2015 | 19°53.06'N, 104°33.89'E | Ky Son | Nghe An |
P. stipuleanatus | |||||
TTH | Nguyen Tien Dung | 08/26/2015 | 22°40.86'N, 103°80.67'E | Sa Pa | Lao Cai |
P. bipinnatifidus | |||||
SVD | Nguyen Tien Dung | 08/26/2015 | 22°40.86'N, 103°80.67'E | Sa Pa | Lao Cai |
Distribution of Panax in Vietnam and sample locations. P. vietnamensis (green) collected in Quang Nam and Kon Tum Provinces. P. vietnamensis var. fuscidiscus (brown) collected in Lai Chau Province. Panax sp. Puxailaileng (pink) collected in Nghe An Province. P. bipinnatifidus (blue) and P. stipuleanatus (yellow) collected in Lao Cai Province. The natural distributions of P. vietnamensis, P. vietnamensis var. fuscidiscus, and Panax sp. are marked in green, brown, and pink, respectively. The wild habitat of P. bipinnatifidus and P. stipuleanatus is shown in yellow, and the purple area represents the distribution region of P. vietnamensis var. langbiangensis (not included in this study).
Total genomic DNA was extracted from leaf specimens using the GeneJET Plant Genomic DNA Purification Kit (Thermo Fisher Scientific, USA) following the provided protocol. The concentration of genomic DNA was determined using a NanoDrop Spectrophotometer 2000 (Thermo Fisher Scientific, USA). Primer pairs for amplification of the psbA-trnH, matK, and rbcL regions were designed based on available sequences deposited in GenBank, and primers for the ITS region were designed as previously reported (
Region | Primer name | Sequence (5’-3’) | Approximate amplicon length (bp) |
---|---|---|---|
ITS | ITS_AB_101 | ACGAATTCATGGTCCGGTGAAGTGTTCG | 650 |
ITS | ITS_AB_102 | TAGAATTCCCCGGTTCGCTCGCCGTTAC | 650 |
matK | MatK_F1A | ACYGTATTTTATGTTTACGACG | 750 |
matK | MatK_R1A | TCCATHTDGAAATCTTGGTTCA | 750 |
psbA-trnH | PsbA_trnH_PF | ACCCGGTCTTAGTGTATACGAG | 390 |
psbA-trnH | PsbA_trnH_PR | TTCACTGCCTTGATCCACTTGG | 390 |
rbcL | RbcL_PF | AGTGTTGGATTCAAGCTGGTG | 550 |
rbcL | RbcL_PR | TGGTTGTGAGTTCACGTTCT | 550 |
trnQ-rps16 (1) | Pv_trnQ_rps16_F | GAAGATTTAGGTCCTTAGTCGTTCG | 590 |
trnQ-rps16 (1) | Pv_trnQ_rps16_R | GATTCAGCATTCCCAGAGAATTGG | 590 |
trnS-trnG (2) | Pv_trnS_trnG_F | GCCGCTTTAGTCCACTCAGC | 660 |
trnS-trnG (2) | Pv_trnS_trnG_R | GTGTTGACATTTTTCGTGGGGG | 660 |
petB (3) | Pv_petB_F | AATATTCAGACCTCGCGGCC | 580 |
petB (3) | Pv_petB_R | GGCTCAAGCAAAACACCCAA | 580 |
trnE-trnT (4) | Pv_trnE_trnT_F | GAGTGGTTGGTCCGTCAGAA | 520 |
trnE-trnT (4) | Pv_trnE_trnT_R | CATGGCGTTACTCTACCGCT | 520 |
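The two matK primers in the table contain IUPAC degenerate bases (Y, H, D), so each one stands for a small pool of concrete oligonucleotides. The sketch below is only an illustration of how such primers can be expanded and their GC content checked. The helper names `expand_primer` and `gc_range` are hypothetical and are not part of the workflow reported here.

```python
from itertools import product

# Standard IUPAC nucleotide codes; degenerate symbols map to several bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every non-degenerate sequence encoded by an IUPAC primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer.upper()))]

def gc_range(primer):
    """Return (min, max) GC fraction over all expansions of the primer."""
    fractions = [(s.count("G") + s.count("C")) / len(s) for s in expand_primer(primer)]
    return min(fractions), max(fractions)

matk_f = "ACYGTATTTTATGTTTACGACG"  # MatK_F1A from the primer table
matk_r = "TCCATHTDGAAATCTTGGTTCA"  # MatK_R1A from the primer table
petb_f = "AATATTCAGACCTCGCGGCC"    # Pv_petB_F, no degenerate bases

print(len(expand_primer(matk_f)))  # one Y site: 2 variants
print(len(expand_primer(matk_r)))  # one H and one D site: 3 * 3 = 9 variants
print(gc_range(petb_f))            # single sequence, so min equals max
```

A non-degenerate primer such as Pv_petB_F expands to itself, which makes the same functions usable across the whole table.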
Raw sequencing data were checked for quality and cleaned using BioEdit version 7.0.9 (
Pairwise summary and pairwise explorer modules in TaxonDNA version 1.8 (
The best substitution model for each matrix was selected using jModelTest2 (
To evaluate the species discrimination efficiency of both the commonly used and the newly proposed DNA markers for Panax, we assessed amplification success and amplicon lengths. Bidirectional Sanger DNA sequencing of each fragment showed the amplicon lengths to be as follows: ITS 618–619 bp, matK 751 bp, psbA-trnH 352–361 bp, rbcL 521 bp, trnQ-rps16 575–590 bp, trnS-trnG 648–658 bp, petB 576–577 bp, and trnE-trnT 490–514 bp. ITS and matK did not amplify efficiently despite optimization of PCR amplification conditions, while the other chloroplast regions were easily amplified. Despite these challenges, both PCR amplification and sequencing were ultimately successful for all regions (Table
Amplification and sequence information for all analyzed markers and their combinations.
Marker | Amplification/sequencing success rate (%) | Matrix size (bp) | Variable sites (%) | No. of parsimony-informative (PI) sites | Mean pairwise distance | Intraspecific distances, range (mean) | Interspecific distances, range (mean) |
---|---|---|---|---|---|---|---|
ITS | 100/100 | 623 | 17.17 | 75 | 0.0259 | 0.0000–0.0292 (0.0107) | 0.0082–0.0400 (0.0261) |
matK | 100/100 | 751 | 4.26 | 29 | 0.0054 | 0.0000–0.0016 (0.0003) | 0.0000–0.0216 (0.0091) |
psbA-trnH | 100/100 | 362 | 10.22 | 27 | 0.0175 | 0.0000–0.0029 (0.0010) | 0.0000–0.0297 (0.0212) |
rbcL | 100/100 | 521 | 2.50 | 11 | 0.0061 | 0.0000–0.0007 (0.0002)* | 0.0019–0.0101 (0.00615) |
trnQ-rps16 (1) | 100/100 | 657 | 6.54 | 35 | 0.0116 | 0.0000–0.0025 (0.0007) | 0.0067–0.0222 (0.0131) |
trnS-trnG (2) | 100/100 | 674 | 5.34 | 22 | 0.0068 | 0.0000–0.0027 (0.0005) | 0.0017–0.0133 (0.0082) |
petB (3) | 100/100 | 591 | 5.58 | 30 | 0.0164 | 0.0000–0.0025 (0.0004) | 0.0013–0.0340 (0.0196) |
trnE-trnT (4) | 100/100 | 614 | 13.84 | 16 | 0.0075 | 0.0000–0.0004 (0.0001) | 0.0039–0.0274 (0.0108) |
1+2 | 100/100 | 1331 | 5.94 | 57 | 0.0090 | 0.0000–0.0021 (0.0006) | 0.0047–0.0167 (0.0105) |
1+3 | 100/100 | 1248 | 6.09 | 65 | 0.0139 | 0.0000–0.0025 (0.0006) | 0.0040–0.0251 (0.0164) |
1+4 | 100/100 | 1271 | 10.07 | 51 | 0.0096 | 0.0000–0.0014 (0.0004) | 0.0054–0.0238 (0.0120) |
2+3 | 100/100 | 1265 | 5.45 | 52 | 0.0112 | 0.0000–0.0014 (0.0005) | 0.0017–0.0210 (0.0135) |
2+4 | 100/100 | 1288 | 9.39 | 38 | 0.0071 | 0.0000–0.0017 (0.0003) | 0.0034–0.0195 (0.0093) |
3+4 | 100/100 | 1205 | 9.79 | 46 | 0.0121 | 0.0000–0.0013 (0.0003) | 0.0025–0.0240 (0.0154) |
1+2+3 | 100/100 | 1922 | 5.83 | 87 | 0.0113 | 0.0000–0.0016 (0.0005) | 0.0036–0.0196 (0.0134) |
1+2+4 | 100/100 | 1945 | 8.43 | 73 | 0.0086 | 0.0000–0.0016 (0.0005) | 0.0045–0.0199 (0.0106) |
1+3+4 | 100/100 | 1862 | 8.65 | 81 | 0.0119 | 0.0000–0.0017 (0.0004) | 0.0040–0.0213 (0.0146) |
2+3+4 | 100/100 | 1879 | 8.20 | 68 | 0.0101 | 0.0000–0.0011 (0.0004) | 0.0027–0.0186 (0.0127) |
1+2+3+4 | 100/100 | 2536 | 7.77 | 103 | 0.0104 | 0.0000–0.001 (0.0005) | 0.0037–0.0181 (0.0128) |
The nucleotide matrices for the amplified markers, complemented with the 89 reference barcoding sequences and 36 complete chloroplast genomes available in GenBank for the seven species, one unidentified species, and one sub-species of Panax, ranged in size from 362 to 751 bp for individual markers and from 1205 to 2536 bp for concatenated markers (Table
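The reported sizes of the concatenated matrices are consistent with joining the individual alignments of the four new markers end to end. The short sketch below enumerates the eleven combinations and sums the single-marker matrix sizes taken from the table above. The function name `concatenated_sizes` is hypothetical, and end-to-end concatenation is an assumption rather than a documented step.

```python
from itertools import combinations

# Aligned matrix sizes (bp) of the four newly proposed markers, from the table above.
# 1 = trnQ-rps16, 2 = trnS-trnG, 3 = petB, 4 = trnE-trnT
matrix_bp = {1: 657, 2: 674, 3: 591, 4: 614}

def concatenated_sizes(sizes):
    """Map each combination label (e.g. '1+3+4') to its summed matrix length."""
    out = {}
    for k in range(2, len(sizes) + 1):
        for combo in combinations(sorted(sizes), k):
            out["+".join(map(str, combo))] = sum(sizes[m] for m in combo)
    return out

sizes = concatenated_sizes(matrix_bp)
for label, bp in sizes.items():
    print(label, bp)
```

The sums reproduce the tabulated values, from 1205 bp for 3+4 up to 2536 bp for 1+2+3+4.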
Distance-based classification methods rely on intraspecific and interspecific distances to set a threshold that separates distinct species. In this study, genetic distances were calculated between individuals both within and between species using MEGAX and Pairwise Explorer (TaxonDNA). Because of the taxonomic complexity of the species group consisting of P. bipinnatifidus and P. stipuleanatus, these two species were treated as a single group when calculating pairwise distances and assessing the species classification ability of the different markers. For interspecific distances, MEGAX computed the average of all pairwise distances between each pair of species, while TaxonDNA returned the distances for every pair of sequences. According to the distances obtained from MEGAX, a barcoding gap exists in rbcL, trnQ-rps16, trnE-trnT, and all combined markers (Table
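As a worked check of the barcoding gap statement, the sketch below applies a simple criterion to the single-marker ranges in the table above: a gap is taken to exist when the smallest interspecific distance exceeds the largest intraspecific distance. Both this criterion and the function name `has_barcoding_gap` are assumptions made for illustration, and the tabulated ranges are assumed to be the ones the text refers to.

```python
# (max intraspecific, min interspecific) distances for single markers,
# read from the ranges in the table above.
ranges = {
    "ITS":        (0.0292, 0.0082),
    "matK":       (0.0016, 0.0000),
    "psbA-trnH":  (0.0029, 0.0000),
    "rbcL":       (0.0007, 0.0019),
    "trnQ-rps16": (0.0025, 0.0067),
    "trnS-trnG":  (0.0027, 0.0017),
    "petB":       (0.0025, 0.0013),
    "trnE-trnT":  (0.0004, 0.0039),
}

def has_barcoding_gap(max_intra, min_inter):
    """Assumed criterion: no overlap between intra- and interspecific ranges."""
    return min_inter > max_intra

with_gap = [m for m, (intra, inter) in ranges.items() if has_barcoding_gap(intra, inter)]
print(with_gap)  # ['rbcL', 'trnQ-rps16', 'trnE-trnT']
```

Under this criterion the single markers with a gap are rbcL, trnQ-rps16, and trnE-trnT, in agreement with the text.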
The Best Match (BM) and Best Close Match (BCM) analyses in TaxonDNA discriminate species based on similarity between sequences. For individual barcodes, the trnS-trnG and rbcL regions had the strongest discriminatory power, each with 100% correct identification under BM, followed by trnE-trnT (98.76%), trnQ-rps16 (97.53%), and ITS (93.82%). BCM is more stringent than BM and returned equal or lower rates of successfully identified sequences: 100% for trnS-trnG, 98.76% for trnE-trnT, 96.87% for rbcL, and 95.06% for trnQ-rps16. The markers with the lowest identification success rates were petB (BM: 72.83%, BCM: 71.60%), matK (BM: 62.50%, BCM: 60.93%), and psbA-trnH (BM: 60.93%, BCM: 60.93%). Combinations of the four newly proposed markers were also evaluated in species identification tests. The discriminatory abilities of concatenated markers were slightly better than those of most individual barcodes. Combinations 2+3, 2+4, 3+4, and 2+3+4 showed correct classification rates of 100% for both BM and BCM calculations (Fig.
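The logic behind BM and BCM can be sketched on a toy alignment. The code below is a simplified reading of the two criteria, not TaxonDNA's implementation: it uses a plain uncorrected p-distance, the sequences and species labels are invented, and the threshold of 0.15 is an arbitrary illustrative value. The function names are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences,
    ignoring positions where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)

def best_match(query_id, records, threshold=None):
    """Classify one query as 'correct', 'ambiguous', 'incorrect' or 'no match'.

    records: dict id -> (species, aligned_sequence).
    threshold: None gives Best Match; a number gives Best Close Match.
    """
    q_species, q_seq = records[query_id]
    dists = {rid: p_distance(q_seq, seq)
             for rid, (_, seq) in records.items() if rid != query_id}
    best = min(dists.values())
    if threshold is not None and best > threshold:
        return "no match"  # closest sequence is too distant to be trusted
    hits = {records[rid][0] for rid, d in dists.items() if d == best}
    if hits == {q_species}:
        return "correct"
    return "ambiguous" if q_species in hits else "incorrect"

# Toy alignment with invented sequences and species labels (not data from this study).
toy = {
    "a1": ("sp_A", "ACGTACGTAC"),
    "a2": ("sp_A", "ACGTACGTAT"),
    "b1": ("sp_B", "ACGAACGGAC"),
    "b2": ("sp_B", "ACGAACGGAC"),
    "c1": ("sp_C", "TTGAACGGTT"),
}

for rid in toy:
    print(rid, best_match(rid, toy), best_match(rid, toy, threshold=0.15))
```

In this toy set the singleton `c1` has no conspecific reference, so BM assigns it to the wrong species while BCM rejects the match outright. That is one way BCM can report a lower success rate than BM.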
Both separate and concatenated matrices were used to reconstruct ML trees. We found that most of the markers could separate most of the clades with strong bootstrap support, with the exception of P. bipinnatifidus and P. stipuleanatus. These sister species had poor branch structure and weak support values. The taxonomic circumscription of P. bipinnatifidus has been controversial. Recent studies from
Results of mPTP species delimitation analysis for several markers based on ML trees. A Species delimitation for marker trnQ-rps16. B Species delimitation for the combination of markers 1+3+4. Bootstrap values are displayed on the branches. The red branches represent supported species delimitations. Sequences highlighted in orange originate from this study.
Incongruence between genetic distance-based, sequence similarity-based and tree-based methods has led to difficulties in choosing robust markers for species discrimination in complex genera like Panax. Here we examined the identification abilities of two methods for four newly proposed markers and combinations thereof in comparison with four commonly used barcodes (Fig.
Phylogenetic studies on Panax using different DNA barcodes, different reference sequences or samples have resulted in conflicting tree topologies and clade placements for several species (
In the present study, the discriminatory power of four chloroplast markers proposed by
This work was supported by the Ministry of Science and Technology of Vietnam under the project: “Transcriptome sequencing and analysis of Panax vietnamensis Ha et Grushv.” (Grant number 16/2017-HĐ-NVQG). We deeply appreciate Le Thi My Hao and Luong Duc Toan for kindly providing the samples of P. vietnamensis, and Nguyen Tien Dung for providing the samples of P. vietnamensis var. fuscidiscus, Panax sp. Puxailaileng, P. stipuleanatus, and P. bipinnatifidus. We would like to thank Nguyen Tap and Nguyen Quoc Binh for the morphological identification of the samples. Marcella Orwick Rydmark, Hugo J. de Boer, and Nguyen Tuong Van are acknowledged for proofreading the text.
NCBI accession numbers of DNA barcoding sequences, and complete chloroplast genomes used in this study.
Data type: NCBI accession numbers of DNA sequences and complete chloroplast genomes
Explanation note: The NCBI accession numbers of newly obtained and 91 reference barcoding sequences, and 38 complete chloroplast genomes representing seven Panax species, one unidentified species and one sub-species of Panax, and two outgroup Aralia species.