Genetic structure and effective population size of the bamboo locust Ceracris kiangsu

The yellow-spined bamboo locust Ceracris kiangsu Tsai is a famous migratory forestry pest endemic to China. However, few genetic studies have been focusing on bamboo locusts. Here, we used microsatellite markers and the sequences from the mitochondrial COI gene to investigate genetic differences among the C. kiangsu populations from sampling locations of the species’ range in Southern China. High levels of genetic variation were determined for both mtDNA and microsatellites. For mtDNA data, there was a statistical support for a historical population increase for most populations except Jiangxi and Anhui, where no significant signals of expansion were detected. No significant isolation-by-distance pattern was observed by microsatellites data. Analyses of molecular variance indicated a consistent high level of differentiation within populations for COI and microsatellites, with FST = 0.27611 and FST = 0.20010 respectively. Phylogeographic analyses indicated little geographic structure, even though microsatellite-based Bayesian clustering identified two genetic clusters, suggesting a historically wide gene flow among the populations. The high levels of gene flow are likely to reflect the spread of this locust population into the surrounding areas. The lack of population structure in this species is likely caused by recent range expansions or cycles of local extinctions followed by recolonization/expansions. For the effective population size (Ne) in this locust, whether we used software ONeSAMP or LDNe, the results showed that Taoyuan population had a low Ne value, whereas GN had a high value. In general, its Ne s are 211, at a moderate level, by means of a global evaluation in all populations. Correspondence to: Guo-Fang Jiang, Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, Jiangsu 210023, China, E-mail: cnjgf1208@163.com


Introduction
The yellow-spined bamboo locust Ceracris kiangsu Tsai is a famous migratory forestry pest endemic to China. It mainly distributes in South China and is characterised by its behavior of feeding in large groups on the leaves of bamboo plants during all of its life stages [1][2][3]. Due to this feeding habit, their distribution range is more restricted than Locust migratoria [4] and Oxya hylaintricate [5]. However, the C. kiangsu population has dramatically increased recently in many regions in China. Populations of herbivorous insects often fluctuate in response to host plant quality and availability [6,7].
Species with high migratory ability are usually supposed to have minimal population substructure over their distributional ranges because strong gene flow can counteract the isolating effects of geographical distance and physical barriers [8,9]. The current distribution of genetic variation within a species is the result of its demographic history as well as the interaction between mutation, genetic drift and gene flow. It is hard to distinguish historical effects, like climate change or geological accident, from shorter-term consequences of dispersal patterns and limited population size, but the distinction is important [10]. It is the effect of historical changes on the distribution and abundance of organisms left in the current population structure that underpins phylogeography [11]. Often these spatial genetic patterns are placed on a much larger scale than the correlations generated by dispersal of individuals in stable populations. There was defectiveness in prediction of the spread of agricultural pests and inferences about selection for local adaptation. It may be due to the traditional assumption that the current population structure was in equilibrium between drift, mutation and dispersal [10]. In these cases, historical effects may be considered a complicated factor.
The sensitivity to demographic fluctuations of statistics summarizing genetic variation is dependent on dispersal characteristics, effective population sizes, strength of demographic fluctuations, and properties of the genetic markers [12][13][14]. Population genetic structure can provide information to help evaluate the extent of the same potential evolutionary process and genetic differences among different populations, such as genetic drift, population range expansion or fragmentation [15]. Here, genomic data can help to uncover the degree of differentiation and the influence of historical events like climatic shifts and ecological changes. Furthermore, they may reveal patterns of geographic variation within the gene pool of a species or a species complex. To date, there have been a number of worldwide molecular studies on population structure and gene flow in locusts. Chapuis et al. [16] scored 14 microsatellites in 25 L. migratoria populations; the result rejected the hypothesis that L. migratoria consists of two genetically distinct clusters adapted to non-outbreaking or cyclically outbreaking. Chapuis et al. [17] analysed genetic variation within and between 24 population non-outbreaking populations and outbreaking populations in Western Europe, Madagascar and Northern China based on 14 microsatellites. The result showed that the long-term effective population size is similar in outbreaking and non-outbreaking populations with a similar genetic diversity, and F ST values indicated that gene flow is substantially larger among outbreaking populations than among non-outbreaking populations. Ibrahim et al. [18] carried out a genetic survey in solitarious desert locust populations based on a single nuclear DNA marker. The study revealed a considerable genetic diversity within a population and substantial genetic variations among populations, despite the homogenizing potential of outbreak events. Most studies in the past were focused on the biological characteristic, morphological classification, disaster forecast and pest control for C. kiangsu. The latest genetic research has reported that C. kiangsu is a monophyletic group with a high level of genetic diversity, a low genetic differentiation among populations under the 16S rRNA and amplified fragment length polymorphism (AFLP) markers. It was a landmark of C. kiangsu in genetic research [9]. For this species, however, there is still a lack of any hypothesis about population substructure, population migration or mechanism of population outbreaking.
In this study, we first combined microsatellite and mitochondrial COI data to evaluate: (1) what is the degree of genetic diversity and variation? (2) whether microsatellite markers provided a favorable evidence for population structure which is not revealed by mtDNA markers, (3) are genetic differences consistent with geographic distances? and (4) if this species experienced a genetic bottleneck and/ or a population explosion, what factors should be liable for it?

Sampling and dna extraction
Samples of C. kiangsu were collected in south and south-west of China. All samples were gathered in a period of five years. These sampling locations include Anhui, Fujian, Guangxi, Guangdong, Hunan, Jiangxi, Jiangsu, Sichuan, Sichuan, Yunnan and Zhejiang provinces. All specimens were preserved in 100% ethanol and stored at -20°C. Total DNA was extracted from individual hind legs, using the Genomic DNA Purification Kit (Promega, USA) following the user manual.

Mitochondrial DNA sequencing
We amplified the COI gene using primers described by Simon et al. [19] for 231 individuals from 12 geographical populations (Table  1). The reaction mixture (25 μL) contained 50 ng of DNA template, 2.5 μL of 10x buffer, 2 μL dNTP (2.0 mM), 2.5 μL MgCl 2 (2.0 mM), 1. 5μL primers (10 μM) and 0.2μL rTaq DNA polymerase (5U/μL) (TaKaRa Biotechnology, Dalian, Liaoning, China). The thermal cycle profile was as the following: an initial denaturation of 2 min at 94°C; 35 amplification cycles of denaturation for 30 s at 94°C, annealing for 30 s at Tm and extension for 50 s at 72°C; and a final extension for 10 min at 72°C. PCR products were separated by 1% agarose gel, and PCR amplification products were performed by using ABI 373A automatic DNA sequencing (Applied Biosystems). Each fragment was sequenced in both directions, using the above primers and then aligned in the software MEGA 6 [20].

Microsatellites
Ten microsatellites, WJ621, WJ622, WJ623, WJ624, WJ625, WJ626, WJ627, WJ630, WJ631 and WJ632 [3] were examined in this study. Samples in this part were different from the samples for COI gene. Here we used 433 individuals for genotyping in 18 sampling sites (Table 1). PCR amplifications were performed in a total volume of 15 μL containing 20 ng of template genomic DNA, 0.1 μmol/L of each primer (forward and reverse), 1 × PCR buffer, 0.2 mmol/L dNTPs, and 0.5 units of TaKaRa Taq (Takara Bio, Shiga, Japan) with the following cycling profile: 15 min at 94°C; 35 cycles of 30 s at 94°C, annealing for 30 s (Annealing temperature see Table 2), and extension at 72°C for 30 s, and an 10 min final extension step. ABI 3500XL Genetic Analyzer (Applied Biosystems, Foster City, CA) was used to analyse the amplified PCR products, and GeneMapper version 3.5 software (Applied Biosystems, Foster City, Calif) was used to obtain allele designations.

Mitochondrial data analysis
Sequence alignments of mtDNA data were performed using MEGA 6 [20]. Haplotype and nucleotide diversities were calculated with DNASP 5 [21]. We constructed median joining haplotype network with Network 4.611 [22] and used the MEGA6 under kimura 2-parameter model to generate a bootstrap tree. Genetic differentiations within and among populations were measured by analysis of molecular variance (AMOVA) based on traditional F-statistics with 10,000 permutations in Arlequin V3.5 [23]. The demographic histories were examined with Tajima's D and Fu's F neutrality tests [24,25] using 1,000 simulation steps as implemented in Arlequin V3.5. In these tests, negative significant deviations from neutrality might indicate either balancing selection or population expansion.

Microsatellites data analysis
Allele frequency, number of alleles, observed and expected heterozygosities, F ST [26] and Nei's [27] unbiased genetic distances were calculated, using Genetix 4.02 software [28] and POPGENE version 1.3.1 [29]. We did tests for deviation from Hardy-Weinberg equilibrium (HWE) at each locus for each population and genotypic linkage disequilibrium among loci, using a Markov Chain Monte Carlo method performed in Genepop version 3.4 [30], and then used sequential Bonferroni correction to correct P values [31]. An overall inbreeding coefficient (F IS ) [26] was also of the view to measure the HWE departures by evaluating the probabilities through random permutation procedures (minimum 10,000 permutations). Significant value associated with the HWE analysis was adjusted following sequential Bonferroni procedures [31].
We applied a Bayesian analysis implemented in the program in the STRUCTURE program [32,33] for estimating hidden substructure across the studied species' range of C. kiangsu. This program generated clusters of individuals based on their multilocus genotypes. An admixture ancestry model and the correlated allele frequency model were used to calculate the probable number of genetic clusters (K) with a burn-in period of 100 000 iterations and 1 000 000 Markov chain Monte Carlo (MCMC) repetitions. We performed 10 independent runs for each K to confirm consistency across runs, tested K from 1 to 10 and used ΔK method of Evanno et al. [34] to choose the most likely value of K.
We used the ARLEQUIN version 3.0 [35] to conduct the analysis of molecular variance (AMOVA). We defined the population structure on the basis of phylogenetic clusters which we obtained from the STRUCTURE program. We carried out a hierarchical analysis of variance to partition total variance into variance components attributable to interindividual and/or interpopulation differences. Pairwise estimates of F ST were transformed into pairwise estimates of gene flow Nm among populations, again using ARLEQUIN. We also used the Mantel test to evaluate the correlation between F ST and geographical distance at Isolation By Distance Web Service Version 3.23 [36]. Next, we also used the program ONeSAMP with approximate Bayesian computation method to estimate Ne, the effective population number [37]. We set the prior (lower and upper bounds) on Ne at 10 and 10000 (In order to obtain accurate results, we used populations which individual number ≥30). As a comparison, we also applied the program LDNe to estimate Ne based on linkage disequilibrium [38].

Mitochondrial DNA data
Overall, 231 COI fragment sequences covered 671 base pairs (bp) (GenBank Accession no: KJ667224 -KJ667509). The average nucleotide frequencies of adenine (A), thymine (T), guanine (G) and cytosine (C) were 32.0%, 33.2%, 16.7%, and 18.1%, respectively. Of the 231 COI fragments sequences, 178 polymorphic sites were observed, and 113 haplotypes were defined. Of the 178 sites, 43 were single variable sites, 135 were parsimony-information sites, and insertions or deletions were not observed in the examined sequences. The number of haplotypes, the values of haplotype diversity and nucleotide diversity within each and the total population were presented in Table 2.
The test of neutrality based on 1000 simulating samplings was negative in all populations, and in HR and SC significantly negative. The populations of all sample sites were also significantly negative, with Tajima's D value of -2.30157, P < 0.01 and Fu's Fs value of -34.373, P>0.1. Harpending's Raggedness index (HRI) = 0.05799, P<0.1. The negative value of Fu's Fs and Tajima's D commonly indicate that the populations have been through a population expansion. Large differences between θ 0 and θ 1 within all collections indicate a rapid population expansion. The τ-values varied among all populations, indicating that the population expansions date back to very wordy time period. The τ-value for the pooled sequences was calculated to be 2.954. We used the τ=2ut (t is population expansion time, u is nucleotide mutation rate per sequence per generation), u=2μk (μ is mutation rate of each nucleotide). Here, we used μ=2.30% per million years of Melanoplus [39] to calculate the population expansion time for C. kiangsu and got a t value of 0.0478.
The NJ tree of COI haplotypes observed in this study was shown in   The AMOVA test of COI data disclosed that 27.61% occurred among populations within a group with a F ST = 0.27611 (p = 0.000), and 72.39% of the genetic variations occurred within a population, showing a high level of variations within a population. According to the Neighbour-joining phylogenetic tree of COI haplotypes, we grouped all populations into three groups, i.e. group 1(GN), group 2(AH), and group 3 for the AMOVA test (Table 4). The results showed that a larger percent of variations within a population and the fixation indexes in each level were moderate.

Microsatellite data
Genotypes analysis of 433 individuals of C. kiangsu revealed a diverse degree of polymorphism across the 18 sampled locations. All loci showed significant deviation from HWE. There was no evidence of significant levels of linkage disequilibrium for multiple comparisons in the pooled data. The observed allele number per location ranged from 8.2 to 18.8. The number of effective alleles ranged from 4.9779 to 12.8480. All 10 microsatellite loci tested were polymorphic with PIC value ranged from 0.528 to 0.941. Mean H O and H E for each locus were 0.7100 and 0.9465, respectively. The average H O per sampling site ranged from 0.5600 to 0.8543 with the lowest value in the sample from GD (Anhui province) and the highest value in the sample from AA (Hunan province). The average H E per sampling site ranged from 0.7636 to 0.9143 ± 0.0312 with the lowest value in the sample from GD (Anhui province) and the highest value in the sample from TY (Hunan province). F IS (inbreeding coefficient) is the proportion of variance in the subpopulation contained in an individual. F IS value ranged from -0.01084 (AA) to 0.31436 (BG). High F IS indicated a high degree of inbreeding among individuals within populations (Table 5).
Nei's unbiased genetic distances and F ST value for all the population pairs were estimated (Table 6). Genetic distance varied from 0.2880 between WH and CS to 2.4410 between TJ and SC. In brief, F ST values ranged from 0.0177 (HR-RA) to 0.1561 (GD-SC). F ST values for most breed combinations were greater than 0.01 indicated little and moderate differentiation between populations, suggesting a close genetic association among all populations. Gene flow among all populations is shown in Table 7. Nm value ranged from 1.41 (SC-GD) to 18.8 (RA-HR).     Table 7. Nm value between all populations of Ceracris kiangsu (above diagonal).
The STRUCTURE software, which analyzes genetic structure through a Bayesian approach, showed that the number of genetic groups (K value) best fitting our data was K = 2, with most individuals from the SC, TY, AH, RA, HR, ZJ, GN, JX, BG, HC, QZ and FJ populations included in Group I, while most individuals from CS, GD, SP, TJ, WH and AA populations included in the Group II. This indicated that populations of C. kiangsu consist of two genetically differentiated groups (Figure 3). It is to be noted that TY, ZJ, and GN populations consist of two groups, indicating a frequent gene exchange. Mantel tests revealed that there was not evident relationship between geographical and genetic distance (F ST ), R 2 = 0.0288, P = 0.1040 (Figure 4).
The results of AMOVA, based on the groups obtained from structure analysis, indicated that the greatest percentage of variation (97.72%) was contained within a population, 5.93% of the variance was among populations within a group, and 1.35% of the variance was among groups (Table 8). There was a significant genetic differentiation (F ST = 0.20010, P < 0.01) within a population, based on F ST estimates for hierarchical levels analyzed.
ONeSAMP analysis results showed that Ne values ranged from 52 (TY) to 614 (GN); LDNe analysis results showed that Ne ranged from 77 (TY) to 813 (GN) ( Table 9). Global evaluation in all populations showed the Ne of 211, which indicates a moderate level of Ne.

Discussion
The findings of the present study have provided a new light for this bamboo locust, both in ecological and population genetics. Based on both microsatellite and mtDNA, estimations of levels of gene flow revealed little population structure in C. kiangsu, supporting the recent findings of Fan et al. [9]. This is surprising because C. kiangsu is a nonsocial pest, mainly restricted to bamboo forest. This lack of population structure was not attributable to a lack of genetic variation. Although there was not any significant differentiation between populations, there were evidences for higher genetic diversity in all populations. In particular, for microsatellite loci, there are high levels of genetic diversity, but levels of genetic differentiations between populations are moderate. In this respect, the genetic differentiation in C. kiangsu was similar to that (H E =0.80-0.90 and H O =0.60-0.75; F ST =0.00-0.04) reported for the steppe grasshopper Chorthippus parallelus, doing a low dispersal range per generation [40].
Mitochondrial DNA variation in C. kiangsu was similarly uniform, with no apparent population differentiation. There were also unmistakable evidences of inbreeding because H O were always lower than H E and all F IS values were significant. In geographical scales, the F ST values were generally all slightly low, suggesting that, at sometime in the past, there has indeed been sufficiently long-distance gene flow to effectively homogenise population structures. This view is also corroborated by the lack of isolation -by-distance (IBD) and obvious genotypic structuring using the STRUCTURE analyses. Apparently, our results supported the correlation between genetic distances and geographic distances were not distinct. At the same time, we thought that it is important to sample populations over a sufficient spatial scale to detect IBD. More emphasis should be put on regional processes as patterns of IBD over hundreds of kilometres may be due to processes other than an equilibrium between genetic drift and dispersal [41].
Underlying the lack of genetic structure and dispersal, one possible reason could be gene flow. In recent reviews [42,43], some apparently sedentary species displayed extensive gene flow, whereas a small proportion of apparently mobile species exhibited minimal gene flow between populations [44]. There was a number of studies where different genetic markers can produce very different genetic patterning, such as in the case of aphids [45], and/or give results contrary to expectations, such as in the case of the brachypterous saltmarsh planthopper [44], which showed homogenous patterns of population genetic variation despite its apparent inability to fly. Table 7 showed a pair-wise gene flow among all populations, suggesting a frequent gene exchange while moderate F ST exist. Fan et al. [9] took this phenomenon as evidence of preliminary geographical isolation, although the whole population structure of C. kiangsu is still vague.
Population explosion of C. kiangsu was detected via the negative Tajima's D and Fu's Fs values. Our mismatch analysis indicated that the population explosion was likely to be about 0.0478 Ma, which is entirely different from the value of 0.242 Ma inferred in [9]. Although we and [9] used the same nucleotide mutation rate of the genus  Table 9. Estimates of effective population size obtained using microsatellite genotypic data from a single sample of each population, and two different methods of estimation. Melanoplusas reference, the results inferred from the different genes are very different. We think the great difference should be attributed to different nucleotide mutation rate of the different genes. Thus, the practical value of a mitochondrial gene used for explaining population structure was required in further research.

Effective population sizes in C. kiangsu
Using two methods to calculate the effective population sizes, the results obtained had a certain correlation. These data can reflect the effective population sizes of the samples. Our results indicated that GN population had a high level of effective population size, whereas TY's Ne is lower. Suggesting that the reproductive capacity of populations in the local environment is higher than other geographic populations. Thus, we speculated that the bottleneck effect may be the reason for it.
Our results revealed that the differentiation mainly exists within a population rather than between or among populations, whether we used mtDNA or microsatellites. The results indicated there are no significant structuring among all samples. In summary, the data obtained from the analyses of ten polymorphic microsatellite loci demonstrates that there is no unambiguous population structure and indicates the consistent results of microsatellite and mtDNA markers on detecting genetic differentiation of C. kiangsu. Moreover, microsatellite markers provided a new evidence. Despite the fact that genetic differences between populations were small, high levels of gene flow indicated that depletion or extinction in anyone of these populations may not be balanced by recruitment from others.
Distributional patterns of species are molded by a number of factors, including barriers to dispersal, physical and biological factors that make distinct regions of habitat unsuitable for viability and reproduction. What is the distribution pattern of the C. kiangsu populations currently? The complex interaction of environments determined definite geographic distribution, the species' fundamental ecological niche, and particular biological realities and historical events [46,47]. Divergence status can be maintained once populations have become genetically differentiated, if they have differentially adapted to populations. Abscissa is the geographic distance, ordinate is the F ST value regional ecological conditions, since geographic variation in selection can act as a strong barrier to gene flow [48,49]. In this study, we did not find obvious population subdivision. In contrast, our data showed that all populations had a low level of genetic differentiations. That is, different ecological regions have not restrict their distribution and gene flow. Here, our results supported the hypothesis that C. kiangsu populations are in a primary differentiation stage [9]. As a consequence, the current effective gene flow had a genetic consequence; it has wiped out the patterns of differentiation created during historical isolation.
In conclusion, we have found a little genetic differentiation among populations of C. kiangsu. Our study suggests that levels of gene flow of this locust are probably higher than expected, likely reflecting movement of this locust into the surrounding areas. The levels of genetic differentiation were low, which indicated that geographical patterns in quantitative traits of this species are likely to be adaptive.