Which of the following best explains the cause of phenotypic variation observed in the butterflies?

  • Journal List
  • Genome Res
  • v.24(8); 2014 Aug
  • PMC4120085

Genome Res. 2014 Aug; 24(8): 1316–1333.

Nicola J. Nadeau,1,2 Mayté Ruiz,3 Patricio Salazar,1,4 Brian Counterman,5 Jose Alejandro Medina,6 Humberto Ortiz-Zuazaga,6,7 Anna Morrison,1 W. Owen McMillan,8 Chris D. Jiggins,

Which of the following best explains the cause of phenotypic variation observed in the butterflies?
1,8 and Riccardo Papa3

Abstract

Hybrid zones can be valuable tools for studying evolution and identifying genomic regions responsible for adaptive divergence and underlying phenotypic variation. Hybrid zones between subspecies of Heliconius butterflies can be very narrow and are maintained by strong selection acting on color pattern. The comimetic species, H. erato and H. melpomene, have parallel hybrid zones in which both species undergo a change from one color pattern form to another. We use restriction-associated DNA sequencing to obtain several thousand genome-wide sequence markers and use these to analyze patterns of population divergence across two pairs of parallel hybrid zones in Peru and Ecuador. We compare two approaches for analysis of this type of data—alignment to a reference genome and de novo assembly—and find that alignment gives the best results for species both closely (H. melpomene) and distantly (H. erato, ∼15% divergent) related to the reference sequence. Our results confirm that the color pattern controlling loci account for the majority of divergent regions across the genome, but we also detect other divergent regions apparently unlinked to color pattern differences. We also use association mapping to identify previously unmapped color pattern loci, in particular the Ro locus. Finally, we identify a new cryptic population of H. timareta in Ecuador, which occurs at relatively low altitude and is mimetic with H. melpomene malleti.

Natural hybrid zones occur where divergent forms meet, mate, and hybridize. Narrow hybrid zones can be maintained by strong selection that prevents mixing or favors particular forms in particular areas (Barton and Hewitt 1985). Studies of hybrid zones have provided many insights into the origins of diversity and the process of speciation (Mallet et al. 1990; Harrison 1993; Kawakami and Butlin 2001). High-throughput sequencing technologies now provide the opportunity for hybrid zones to fully meet their potential as windows into the evolutionary process by allowing us to move beyond studies of neutral variation at a handful of loci and identify the genetic loci under selection (Rieseberg and Buerkle 2002; Gompert et al. 2012; Crawford and Nielsen 2013).

Butterflies of the Neotropical genus Heliconius are extremely diverse in their wing color patterns and combine within species diversity with convergence among species in wing phenotypes. Their bright wing patterns are used as aposematic warnings to predators and are under positive frequency-dependent selection favoring common color patterns that predators learn to avoid. This strong selection also maintains narrow hybrid zones between subspecies with different patterns (Benson 1972; Mallet and Barton 1989a; Kapan 2001; Langham 2004). In addition, frequency-dependent selection leads to Müllerian mimicry between many distinct species (Müller 1879). For instance, H. erato and H. melpomene are two distantly related species that diverged ∼15–20 million years ago, but have converged on common color patterns across most of the Neotropics. Divergent races of both species meet in parallel hybrid zones (Fig. 1). Evidence suggests that convergent color patterns in these two species have evolved independently (Hines et al. 2011; Supple et al. 2013). It has also been suggested that H. erato is more ancient and H. melpomene diversified more recently to mimic the H. erato forms (Brower 1996; Flanagan et al. 2004; Quek et al. 2010). Nevertheless, it appears that the same handful of genetic loci are responsible for producing most of the color pattern variation in both species (Joron et al. 2006; Baxter et al. 2008; Reed et al. 2011; Martin et al. 2012). This pattern of parallel adaptive radiation makes Heliconius an excellent system in which to address the predictability of the evolutionary process and the extent to which particular genes are re-used when evolving the same phenotypes (Papa et al. 2008a; Nadeau and Jiggins 2010).

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

(A) Distribution in South America of the subspecies included in this study. (B) Maximum likelihood phylogenies with approximate likelihood branch supports. Co-mimics from outside the focal hybrid zones are connected with dotted lines. Focal hybrid zone individuals are shown in color. (Blue) H. m. plesseni and H. e. notabilis; (purple) Ecuador hybrids; (dark red) H. m. malleti and H. e. lativitta; (red) H. m. aglaope and H. e. emma; (orange) Peru hybrids; (yellow) H. m. amaryllis and H. e. favorinus. Additional populations are in black. Country abbreviations: (Ec) Ecuador; (FG) French Guiana; (Co) Colombia; (Pa) Panama.

In this study, we use high-resolution genome scans to investigate patterns of divergence across two pairs of parallel hybrid zones in Peru and Ecuador. These occur between subspecies with different wing color patterns in both H. erato and H. melpomene (Fig. 1). In both regions, the clines in color pattern alleles between species are highly coincident (Mallet et al. 1990; Salazar 2012). The two hybrid zones in Peru have been the focus of several previous studies, whereas those in Ecuador have been less well studied. In Peru, strong natural selection has been shown to maintain color pattern differences (Mallet and Barton 1989a) and loci controlling color patterns show enhanced divergence (Baxter et al. 2010; Counterman et al. 2010; Nadeau et al. 2012; Martin et al. 2013; Supple et al. 2013). However, we still lack a complete picture of how many loci are divergent between subspecies and the extent to which the genomic architecture of divergence is the same between mimetic species.

Extensive genetic mapping using experimental crosses between different color pattern forms has identified the chromosomal regions responsible for color pattern variation (Sheppard et al. 1985; Joron et al. 2006; Baxter et al. 2008; Papa et al. 2013). Three major clusters of loci control most of the color pattern variation observed in both species. The tightly linked B and D loci on chromosome 18 in H. melpomene control the red forewing band, and the red/orange hindwing rays and proximal “dennis” patches on both wings, respectively. These loci are homologous to the D locus in H. erato (Baxter et al. 2008) and appear to be cis regulatory elements of the optix gene (Reed et al. 2011; Supple et al. 2013). The Ac and Sd loci, in H. melpomene and H. erato, respectively, control the shape of the forewing band via regulation of the WntA gene on chromosome 10 (Martin et al. 2012). The presence of most yellow and white elements on the wing is largely controlled by three tightly linked loci, Yb, Sb, and N, on chromosome 15 in H. melpomene (Ferguson et al. 2010), which are homologous to the Cr locus in H. erato (Joron et al. 2006). Quantitative trait locus (QTL) mapping has identified other loci of minor effect, including at least seven additional QTL in H. erato (Papa et al. 2013), and QTL in H. melpomene on chromosomes 2, 7, and 13 that affect forewing band shape (Baxter et al. 2008). In some cases, mapping studies have been followed up by population genetic studies of the mapped intervals across natural hybrid zones, where many generations of backcrossing have led to narrow regions of association, permitting fine scale mapping (Baxter et al. 2010; Counterman et al. 2010; Nadeau et al. 2012; Supple et al. 2013). High-throughput sequencing technologies now provide the feasibility to generate a high density of genomic markers to identify the narrow QTL present in these hybrid zones without the need to perform controlled laboratory crosses (Crawford and Nielsen 2013). Here we test this approach, using a system in which some of the loci responsible for phenotypic differences are known.

The Peru and Ecuador hybrid zones occur across altitudinal gradients (Fig. 2A). Therefore, it is possible that traits other than color pattern may also be differentiated by altitudinal selection; for example, traits related to temperature or changes in larval host plants. Such selection on additional regions of the genome could help to stabilize the geographic location of the hybrid zone (Barton and Hewitt 1985; Mallet and Barton 1989b; Mallet 2010; Bierne et al. 2011). Therefore another important question that we will address is whether there are divergent regions of the genome that are not controlling color pattern. These might be candidates for loci controlling other aspects of ecological adaptation.

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

Population structure at each of the hybrid zones using the reference aligned data. (A) Sampling locations with altitude in meters, sample size in parentheses, and pie charts of the proportion of individuals of each type sampled from each site. Colors are the same as in Figure 1, except black indicates H. timareta in Ecuador. (B) Structure analysis with k = 2 (H. timareta individuals excluded). Each individual is shown as a horizontal bar with the allelic contribution from population 1 (gray) and population 2 (black). (C) Principal components analysis. (D) Distribution of FST values from BayeScan.

In this study we use restriction-associated DNA (RAD) sequencing (Baird et al. 2008) to determine, for the first time:

  1. If association mapping in these hybrid zones can identify known and novel loci underlying phenotypic variation;

  2. How much of the genome is differentiated and under divergent selection between subspecies;

  3. How much of this differentiation is due to loci controlling color pattern variation;

  4. If the same regions are divergent between co-mimetic species.

Although previous studies have touched on questions 2 and 3 (Kronforst 2013; Martin et al. 2013), here we focus on divergence at the subspecies level where hybridization is frequent, rather than between occasionally hybridizing species. Compared to the study by Martin et al. (2013), we explored additional hybrid zones (Ecuador) and species (H. erato) using larger sample sizes, which allowed more robust tests to identify genomic regions under divergent selection. We also investigate the advantages and limitations of alignment and assembly methods when only a single reference genome is available. We compare two widely used approaches: de novo assembly of just the restriction-associated reads, using the program Stacks (Catchen et al. 2011), versus alignment of paired-end reads to the reference H. melpomene genome.

Results

Summary of the data and comparison of alignment and assembly techniques

We sequenced a total of 129 individuals of H. erato and H. melpomene from the four hybrid zones in Peru and Ecuador, including a small number of additional individuals from across the range of H. erato. Using restriction-associated DNA sequencing (RAD-seq), we obtained a total of 1496M 150 bp paired-end reads from the hybrid zone individuals, and an additional 115M 100 bp paired-end reads from the other H. erato populations and outgroups. In our analyses, we also include data from additional H. melpomene populations and outgroups (H. cydno, H. timareta, and H. hecale) already published in a previous study (Nadeau et al. 2013).

Our reference genome for H. melpomene is highly divergent from H. erato. Nonetheless, for both species, alignment of reads to the H. melpomene reference sequence yielded more usable data when compared to de novo assembly carried out independently within each species. De novo assembly produced more bases in assembled contigs (Table 1), but only ∼2% of contigs assembled in the H. erato populations were present in more than 10 individuals, with the figure being ∼7% in H. melpomene. By comparison, when the same data (plus the paired-end reads) were aligned to our reference sequence, ∼38% of aligned bases were found in more than 10 individuals in H. erato and >50% in H. melpomene. We hypothesized that high levels of within-population variation led to homologous reads being separated into distinct contigs in the de novo assembly. We could confirm that this was the case for one region of the H. erato genome for which a high quality reference sequence is available (Supple et al. 2013). Across 960 kb at the D color pattern locus, we observed that RAD-seq reads that were highly divergent between subspecies could be aligned to homologous positions in the reference but were assembled into separate contigs in the de novo assembly. Overall, we also found a higher frequency of single nucleotide polymorphisms (SNPs) in the reference alignments than the de novo assemblies (Table 1). These SNPs were defined as sites that were polymorphic within the sampled populations and so are not inflated by fixed differences from the reference genome.

Table 1.

Summary statistics from alignment and assembly approaches

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

As expected, given that H. erato is ∼15% divergent from H. melpomene in the aligned data, fewer H. erato reads aligned to the H. melpomene genome as compared to those from H. melpomene, leading to fewer confidently called bases. Nevertheless, the use of the reference H. melpomene genome for aligning the H. erato reads resulted in more bases being called across multiple individuals and around 10× more SNPs identified when compared to the de novo assembly approach. In addition, the gaps between aligned RAD loci were similar across both species (Table 1), indicating that the reduced number of bases is not due to fewer RAD loci aligning but to fewer confidently called bases at each RAD locus. The power to detect loci under selection or responsible for phenotypic variation should therefore be similar in both species or slightly reduced in H. erato due to its larger genome (Tobler et al. 2004). Nevertheless, much of the additional genomic sequence in H. erato is likely to be repetitive DNA (Papa et al. 2008b), which would be difficult to align and score variants in, even if a complete reference was available. In summary, it seems that the aligned data should give the most power to detect divergent regions and phenotypic associations for both species. However, we performed outlier and association analyses using the output of both approaches for comparison. It is also possible that the de novo assembly might detect divergent regions important in adaptation that could not be aligned to the H. melpomene reference.

Phylogenetics and population structure

Using the reference aligned sequence data, we constructed maximum likelihood phylogenies for the H. melpomene and H. erato clades, including individuals from additional populations and outgroup taxa (Fig. 1). This revealed remarkably similar patterns of divergence between co-occurring, co-mimetic subspecies in both groups. Population divergence in H. erato is thought to be deeper than that in H. melpomene (Flanagan et al. 2004, but see Cuthill and Charleston 2012), but this was not evident in our tree as branch lengths were very similar between the two species. This may be due to the lower quality of alignments for H. erato, with the H. erato tree based on about a third as many sites as that for H. melpomene. These sites in H. erato are likely to be more conserved, resulting in some compression of the tree topology.

The most striking finding from the phylogenetic reconstruction was that eight of the presumed H. melpomene individuals from Ecuador were strongly supported as clustering within the H. timareta clade (Fig. 1). All these individuals had a H. melpomene malleti-like phenotype with the exception of one individual which had been characterized as a possible hybrid due to a large and rounded yellow forewing band, but was otherwise H. m. malleti-like. This finding was surprising because although populations of H. timareta mimetic with H. m. malleti have previously been described in Colombia (Giraldo et al. 2008) and Northern Peru (Lamas 1997), they are all found in highland areas above ∼1000 m. Similar populations are not known from lowland sites anywhere in the range. To compare our individuals to these and other populations, we also directly sequenced part of the mitochondrial COI gene that overlaps with the region sequenced in previous studies (Giraldo et al. 2008; Mérot et al. 2013). Our phylogeny based on these sequences also robustly supported these eight individuals as being H. timareta and placed them closer to the highland H. timareta timareta in Ecuador than to H. timareta florencia in Colombia that they resemble phenotypically (Supplemental Fig. 1).

The newly identified H. timareta subspecies was also clearly evident in a principal components analysis (PCA) of the combined H. melpomene, H. timareta, and H. cydno data. The first principal component separated the Peruvian H. melpomene from H. timareta and H. cydno (which were very similar on this axis) (Supplemental Fig. 2). The grouping of the Ecuadorian samples was consistent with the phylogeny, with the same eight individuals clustering with H. timareta. No individuals were intermediate between H. melpomene and H. timareta, indicating that the level of genetic isolation between the two species is similar to elsewhere in their range. This was also confirmed by a Structure analysis of the Ecuador “H. melpomene” population, where a model with two populations had the best fit to the data (posterior probability = 1). Under this model, which allowed for admixture between populations, the H. timareta individuals all had 100% of their allelic contribution from population 1, whereas for H. melpomene the maximum contribution of population 1 to any individual’s genotype was 1.8% (Supplemental Table 1). In summary, we can conclude that these are distinct species with little gene flow between them.

We conducted further analyses of the genetic structure of each of the hybrid zone populations using the reference aligned data, excluding the H. timareta individuals. Overall, these results suggest only very low genetic differentiation between any of the parapatric subspecies. Structure analyses of each population generally showed very little structure and strongest support for only a single population cluster being present. The only exception was the Peruvian H. melpomene, where two population clusters gave the highest posterior probability (P = 1). However, these clusters did not correspond to the two subspecies. The genetic diversity was partitioned such that most individuals were admixed with about a quarter of their allelic variation from population 2, except for two “hybrid” individuals that had pure population 2 genotypes and two other individuals (one “hybrid” and one aglaope) that had almost pure population 1 genotypes (Fig. 2B). PCA revealed very similar patterns, with small groups of hybrid phenotype individuals giving the clearest clusters, which in most cases were also identified by Structure (with K = 2) (Fig. 2C). Three of the populations did reveal some separation of the subspecies at one of the first two principal components, but with a gradual change from one genomic “type” to another. The H. melpomene subspecies in Ecuador were separated by PC1, which explained 10% of the variation in this population. The two H. erato populations both showed some separation by subspecies at PC2, which explained 5.7% and 6.7% of the variation in Peru and Ecuador, respectively. We found very similar results with PCA on the de novo assembled data (Supplemental Fig. 3), suggesting that the underlying genetic signal in both data sets is very similar. The lack of strong differentiation between subspecies was also supported by the FST distributions (calculated by BayeScan), which gave very low FST values between subspecies at >99% of the genome, with only a small percentage of SNPs showing high levels of differentiation (Fig. 2D; Supplemental Fig. 3).

Association mapping of loci responsible for phenotypic variation

We performed association mapping to identify genetic regions responsible for the phenotypic variation that segregates across each of the hybrid zones. In general, the expected associations were found at the three major loci known to control color pattern variation on chromosomes 10, 15, and 18 (Figs. 3A,D, 4A,D; Table 2; Supplemental Table 2). The majority of SNPs showing significant phenotypic associations fell within or tightly linked to these loci in all populations except in Peruvian H. erato, where only 26% were tightly linked to the known loci (Supplemental Figs. 4–8; Supplemental Table 2).

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

Association mapping (A,D) and outlier analysis (B,E) for H. melpomene (A–C) and H. erato (D–F) in Peru. Each phenotype used for the association mapping is shown in a different color as illustrated in C and F. For clarity, only the top 20 associated SNPs are shown for each phenotype. The outlier analysis results show FST values for all SNPs, with significant outliers shown in red. Results from the de novo assembled data are shown as crosses (and in orange for the outlier analysis) and positioned based on the top BLAST hit to the H. melpomene genome; those that were not confidently or uniquely assigned to these positions are shown as stars (e.g., those at the end of chromosome 10 in D). (Unmapped) Scaffolds of the H. melpomene reference genome that were not assigned to chromosomes in v1.1 of the genome assembly.

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

Association mapping (A,D) and outlier analysis (B,E) for H. melpomene (A–C) and H. erato (D–F) in Ecuador. See Figure 3 legend for further information.

Table 2.

Accuracy of identification of genomic regions known to control color pattern variation

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

Independent analyses were performed on both the reference alignments and de novo assemblies of the data. In all populations, more associated SNPs were identified in the alignments than in the de novo assemblies (Fig. 5; Table 1). We used BLASTN (Altschul et al. 1990) to place de novo contigs containing associated SNPs onto the H. melpomene genome, and most could be confidently assigned to a unique locus. There was almost no overlap in the particular SNPs detected in the assembled and aligned data sets (Fig. 5), although in many cases the SNPs detected were in similar regions (Figs. 3, 4). There was evidence for a higher false positive rate in the de novo data, because the majority of the SNPs that were uniquely significantly associated in these data were present in the aligned data but did not reach significance. This, rather than detection of novel regions, appears to be the main cause of the higher proportion of associations found scattered across the genome and away from known color pattern loci in the de novo data.

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

Venn diagrams of SNPs detected in the de novo assembled (blue and green) and reference aligned (yellow and red) data by BayeScan outlier detection (red and blue) and association mapping (yellow and green), for each of the four populations.

Red color pattern elements and the B and D loci

Our mapping of red color pattern variation was generally consistent with previous studies (Baxter et al. 2008; Counterman et al. 2010; The Heliconius Genome Consortium 2012) and in almost all populations was mapped to the expected region of chromosome 18 (Table 2; Supplemental Table 2). The only exception was H. melpomene in Ecuador, where SNPs in this region did not reach significance (Fig. 4A). This is likely to be due to the reduced sample size of this population (22) after removing the H. timareta individuals. Color patterns were scored both as independent elements and also using known patterns of segregation to score the predicted genotype at the B/D locus. The red forewing band and red hindwing rays are both dominant traits but are controlled by a single locus (or very tightly linked loci) and inherited in repulsion, meaning that individuals with both traits can be inferred to be heterozygotes (Sheppard et al. 1985; Baxter et al. 2008). This genotypic scoring generally gave stronger associations (Figs. 3, 4), although both methods gave some significant associations for at least one of the traits. In all populations, the strongest associations in this region were >60 kb downstream from the optix gene that controls red color pattern (Reed et al. 2011), overlapping the region identified in previous analyses as likely containing the functional regulatory variation (Table 2; Supplemental Fig. 4; Nadeau et al. 2012; Supple et al. 2013).

In several populations we found additional associations with B/D phenotypes on linked chromosome 18 scaffolds (Supplemental Table 2). The furthest from the B/D locus was HE671488 in Peruvian H. melpomene, which is ∼2 Mb away. This scaffold was also associated with differences in altitude in this population, which were stronger than the associations with color (Fig. 3A; Table 3; Supplemental Fig. 4). This could suggest that this B/D linked region is responsible for ecological adaptation, although color and altitude are strongly correlated, so we do not have much power in the data set to separate the two.

Table 3.

Novel genomic regions showing phenotypic associations or divergence outliers

Which of the following best explains the cause of phenotypic variation observed in the butterflies?

Both the Peruvian and Ecuadorian H. melpomene populations had a SNP at position 97 on an unmapped scaffold, HE670458, that was highly associated with rays (Table 3; Supplemental Table 3). This scaffold appears to consist largely of repetitive elements (BLAST hits match many other regions of the H. melpomene genome), suggesting that there may be a copy of a repetitive element that is associated with the presence of rays in both populations. All rayed individuals were heterozygous, and all nonrayed individuals were homozygous at this SNP in both H. melpomene populations. This would be consistent with multiple alleles aligning to this genomic region or the presence of a unique haplotype sequence linked to the rayed allele. The existence of such a repetitive element is consistent with previous findings that repetitive elements are present in the region of highest divergence at the B/D locus (Papa et al. 2008b; Nadeau et al. 2012).

Surprisingly, in the Peruvian H. erato population the strongest associations with red color pattern elements were not on chromosome 18, but at two scaffolds (HE670771 and HE670235) on chromosome 2 (Fig. 3D; Table 3; Supplemental Fig. 4; Supplemental Table 2). In addition, two SNPs significantly associated with rays and D genotype in the de novo assembled data of this population could not be confidently assigned to a position in the genome.

Yellow color pattern elements and the Yb, N, and Cr loci

In the Peruvian H. melpomene population the presence of the yellow hindwing bar and yellow in the forewing band both mapped to chromosome 15, with positions that were consistent with previous work on the Yb locus (Fig. 3A; Table 2; Supplemental Fig. 4; Supplemental Table 2; Ferguson et al. 2010; The Heliconius Genome Consortium 2012; Nadeau et al. 2012). Associations with altitude were also found at these associated SNPs but were weaker than the association with color and so may simply be due to correlations between the altitude of the sampling site and color pattern (Supplemental Table 2).

In the Peruvian H. erato population we did not recover the expected associations with the yellow hindwing bar, which is known to be controlled by the Cr locus on chromosome 15 (Joron et al. 2006; Counterman et al. 2010). Instead, the strongest association with this phenotype in the reference aligned data was found on chromosome 17 (Fig. 3D; Supplemental Table 2; Table 3). Moreover, we also identified significant associations with the yellow hindwing bar on chromosome 10 in both the reference aligned and de novo assembled data. These associations can be explained by the presence of Sd on this scaffold (Table 2; Martin et al. 2012), which is known to influence the expression of the yellow hindwing bar, particularly in individuals that are heterozygous at the Cr locus (Mallet 1989). The Cr locus is not thought to control any aspects of phenotypic variation in Ecuadorian H. erato (Salazar 2012), and consistent with this expectation, we did not detect any phenotypic associations in this region.

We scored the Ecuadorean H. melpomene individuals for their predicted genotype at the N locus (Fig. 4C), which controls several wing color pattern elements such as the yellow forewing band, the amount and location of red in the forewing, and the length of the orange hindwing “dennis” bar. Its scoring therefore depends on interactions with the red B/D locus (Salazar 2012). Despite the epistatic interaction between the B/D and the N loci, we could still score them independently. Associations with N were found to overlap the Yb region (Ferguson et al. 2010; Nadeau et al. 2012) on chromosome 15 (Table 2; Supplemental Fig. 4), in both the reference aligned and de novo assembled data (Fig. 4A). Although N is known to be tightly linked to Yb (Sheppard et al. 1985; Mallet 1989), these are the first genetic mapping results for the N locus.

Significant associations with yellow in the forewing band were present in the D region in Ecuadorian H. erato (Fig. 4D; Supplemental Table 2), consistent with the fact that the D locus controls both yellow and red coloration in the forewing band in H. erato (Sheppard et al. 1985; Salazar 2012; Papa et al. 2013). No significant associations were found with the presence of yellow color in the forewing band in either Peruvian H. erato or Ecuadorian H. melpomene.

Forewing band shape and the Ac, Sd, and Ro loci

In Peruvian H. melpomene, the strongest associations with forewing band shape (cell spot 8 and cell spot 11) were on chromosome 18, within the B/D region (Fig. 3A). This suggests that the B/D locus controls the shape as well as the color of the forewing band in Peruvian H. melpomene. However, we did also find a cluster of eight SNPs associated with band shape on an unmapped scaffold, HE671554. New mapping analyses suggest that this scaffold is on chromosome 20 (J Davey, pers. comm.) and therefore not linked to any previously described color pattern controlling loci (Table 3).

In Peruvian H. erato, the SNP in the Sd region that was associated with the yellow hindwing bar also showed the expected association with forewing band shape in the de novo assembly but not the reference alignment. This SNP was just 5 kb upstream of the WntA gene (Fig. 3D; Table 2; Supplemental Fig. 4). Associations with forewing band shape were also found on chromosome 2 in this and the Ecuadorian H. erato populations (Figs. 3D, 4D; Supplemental Fig. 4), in similar regions to those associated with red color in Peruvian H. erato (Table 3; Supplemental Table 2).

In both species from the Ecuadorian hybrid zone, we found SNPs associated with forewing band shape (cell spot 7/8/11) within introns of the WntA gene (Fig. 4; Table 2; Supplemental Table 2). In Ecuadorian H. erato, we also found two tightly linked SNPs on chromosome 13 and three tightly linked SNPs on an unmapped scaffold (HE669551) that were associated with forewing band shape and also rounding of the band (Fig. 4D; Supplemental Table 2). More recent mapping analysis suggests that both these scaffolds are on chromosome 13 and within 1 cM of each other (J Davey, pers. comm.), so these associations are most likely due to a single locus on this chromosome. Rounding of the distal edge of the band in this population has previously been described as being under the control of the unmapped Ro locus (Sheppard et al. 1985; Salazar 2012). We have therefore mapped the Ro locus to a region of chromosome 13 (Table 3).

FST outlier detection

Outlier detection provides an alternative method for identification of loci under selection that does not depend on phenotypic association. BayeScan detected <0.06% of SNPs as outliers in each of the analyses (Table 1). In the de novo assembly, Peruvian hybrid zones showed a greater percentage of SNPs as outliers in both H. erato (0.059%) and H. melpomene (0.040%), with no outliers detected in Ecuadorian H. erato and only five in Ecuadorian H. melpomene (0.012%). The overall proportion of SNPs detected in the reference aligned data was similar. However, unlike the de novo assemblies, in the reference alignments the proportions of outliers found within each species were more similar than within each locality. Reference aligned data from H. melpomene contained ∼0.025% outlier SNPs in both Peru and Ecuador, whereas reference aligned data from H. erato had 0.005% outliers in Peru and 0.017% outliers in Ecuador (Table 1). This would be consistent with some of the most rapidly diverging regions being lost in H. erato when aligned against the reference H. melpomene genome.

As suggested by results from the de novo assemblies, there do appear to be differences in population structure between the geographic regions that are consistent across both species. This is also reflected in the FST distributions (from both alignment and assembly approaches), with both H. erato and H. melpomene having higher mean and background levels of FST in Ecuador as compared to Peru (Fig. 2; Table 1; Supplemental Fig. 3), despite the average distance between sampling locations of “pure” subspecies individuals being similar for both hybrid zones (∼56 km in Ecuador and 58–60 km in Peru). However, the altitudinal range across the hybrid zone in Ecuador is greater than that in Peru (931 m versus 318 m, respectively). Within both regions, H. melpomene has a lower mean FST than H. erato, which would be consistent with higher dispersal distances in H. melpomene, as previously suggested (Mallet et al. 1990). Similar outlier regions were detected by both the alignment and assembly approaches (Figs. 3B,E, 4B,E), although only Peruvian H. melpomene gave a good overlap in the specific SNPs detected (Fig. 5). Some of the outlier contigs detected in Peruvian H. erato could not be positioned on the H. melpomene genome with confidence (Fig. 3B).

Overall, there was considerable overlap between the genomic regions containing outlier SNPs and those showing phenotypic associations (Figs. 3, 4), and to some extent in the specific SNPs, with the majority of phenotypically associated SNPs also being outliers (Fig. 5). The exception to this general trend was the Peruvian H. erato population, where a large proportion of the phenotypically associated SNPs were not strongly divergent between subspecies. In general, the majority of outlier SNPs were within 1 Mb of a known color pattern locus (including the newly identified Ro region; excluding these, 37.5% of outliers in Ecuadorian H. erato were within 1 Mb of the D and Sd loci) (Supplemental Table 2). The strongest outliers on chromosome 10 in the Ecuadorian populations and Peruvian H. erato were within introns of the WntA gene, and the strongest outliers on the B/D scaffold were all 3′ of the optix gene (Table 2; Supplemental Fig. 4).

In both H. melpomene populations there was a second strongly divergent region on chromosome 18 ∼2 Mb from the B/D region, which was not divergent in either of the H. erato populations (Fig. 3B; Supplemental Fig. 4). This is the same region on scaffold HE671488 that showed associations with color pattern and altitude in the Peruvian H. melpomene population (Table 3). In the Peruvian H. melpomene population, we detected two clusters of outlier divergent SNPs on chromosome 6, which do not appear to be associated with color pattern (Fig. 3B; Table 3; Supplemental Fig. 4). Outliers were also detected on chromosome 2 in both H. erato populations, some of which were in similar regions to those detected in the association mapping (Table 3; Supplemental Fig. 4).

Discussion

It has long been recognized that convergent and parallel evolution provides a natural experimental system in which to study the predictability of adaptation (Stewart et al. 1987; Wood et al. 2005). This approach has come to the fore with the recent integration of molecular and phenotypic studies of adaptive traits (Stinchcombe and Hoekstra 2007; Nadeau and Jiggins 2010). Here, we have studied parallel divergent clines in two co-mimic species of butterflies, using RAD sequencing to generate an extensive data set covering 1%–5% of the entire genome. Previous genomic studies of these species have sampled only a few individuals of divergent wing pattern races (The Heliconius Genome Consortium 2012; Nadeau et al. 2012, 2013; Kronforst et al. 2013; Martin et al. 2013; Supple et al. 2013), whereas previous hybrid zone studies have yet to integrate next-generation sequencing approaches (Mallet and Barton 1989a; Baxter et al. 2010; Counterman et al. 2010; Salazar 2012). Here we have shown that association mapping in the hybrid zones can be used to find known loci and also to identify previously unmapped loci, such as Ro in H. erato and N in H. melpomene. Divergence observed been Peruvian H. melpomene races in the region of Yb has previously been suggested to be due to divergence at both Yb and N (The Heliconius Genome Consortium 2012; Nadeau et al. 2012), but the location of N and distance from Yb had not been established previous to this study. Moreover, we conducted the first genome-wide scan for divergent loci and identified some that are not wing-color-pattern–related and so may have a role in other aspects of ecological divergence. With these data, we also identify a cryptic population of H. timareta in Ecuador and reveal parallel patterns of divergence between co-mimetic species.

Comparison of de novo assembly and reference alignment of RAD data

Genome-wide association studies (GWAS) are now common in studies of admixed human populations (Visscher et al. 2012). The use of GWAS studies outside of model organisms has mostly been hampered by lack of reference genomes or methods for typing sufficient numbers of markers. However, these limitations are rapidly being eroded as the cost of sequencing decreases and more reference genomes become available. Furthermore, we have shown that alignment of reads to a fairly distantly related reference genome (∼15% divergent) can generate meaningful results. In the absence of a reference genome, de novo assembly also detects the same loci, but with somewhat reduced efficacy.

Alignment of sequence reads to the reference genome produced data for more sites, even in the more distantly related species, H. erato. One drawback of the Stacks pipeline that we used for de novo assembly of the reads is that it does not assemble and call sequence variants in the paired-end reads. Hence the available sequence for analysis is almost double in the reference alignments as compared to the de novo assembly. However, it also seems that data was lost in the de novo assembly due to divergent alleles not being assembled together. This may have had a larger influence on the H. erato assemblies because this species harbors greater genetic diversity than H. melpomene (Hines et al. 2011) and so explain why a much lower proportion of the de novo assembled contigs were present across multiple individuals in H. erato (Table 1). We also found a higher proportion of variable sites in the reference alignments as compared to the de novo assemblies. This may again be due to poor assembly of the de novo contigs, but it could also represent genetic variability contained in the paired-end reads. It is possible that paired-end reads might be located in more variable regions, particularly if restriction-site–associated reads were biased toward more conserved regions (The Heliconius Genome Consortium 2012).

The larger number of SNPs in the reference alignments resulted in larger numbers of outlier and associated SNPs being detected, most of which cluster in the expected genomic regions. Moreover, there appears to be a higher false positive rate in the association mapping using the de novo assembled data. The most likely explanation for this result is that the smaller number of SNPs generated from the de novo assembly gave less power to correct for underlying population structure. Nevertheless, many of the expected associations and outlier regions were detected in the de novo assembled data. The results from assembly and alignment approaches are more concordant in H. melpomene than in H. erato, particularly at the level of individual SNPs (Fig. 5). This is very likely due to the fact that the H. melpomene reference genome was used to generate the sequence alignments in both species. In addition, the lower within-population diversity in H. melpomene, may also have led to improved de novo assemblies in this species.

Overall, our results suggest that detection of loci underlying adaptive change is likely to be more effective where reads can be mapped to a reference genome. This is perhaps most likely to be the case in populations with high levels of polymorphism, which prevents divergent alleles from assembling. The de novo approach could, and no doubt will, be improved by developing methods that allow paired-end reads to be incorporated into the SNP typing pipeline (Baxter et al. 2011). This would not only allow a higher density of SNPs to be detected but could also improve alignment of divergent alleles. In the meantime, one approach that has been used in other studies is to first perform de novo assembly of RAD-seq reads to generate a consensus reference and to then map reads to this reference for SNP calling (Keller et al. 2013).

Association mapping across hybrid zones is a rapid way of detecting loci underlying phenotypic differences

We have successfully used association mapping in hybrid zone individuals to identify virtually all the genomic regions known to control color pattern in these populations (Reed et al. 2011, Martin et al. 2012; Nadeau et al. 2012; Supple et al. 2013). It has commonly been supposed that large sample sizes will be necessary in order to identify genes in wild populations. Here we have confirmed recent theoretical predictions from simulated data (Crawford and Nielsen 2013), that for large effect adaptive loci, even small sample sizes can be highly effective in identification of narrow genomic regions underlying adaptive traits (Table 2; Supplemental Fig. 4). We also confirm the prediction that in populations with low background levels of divergence, both divergence outlier and association mapping approaches are effective in detecting regions under divergent selection. In our study, association mapping has the added benefit of identifying the phenotypic effects of the selected loci. One anticipated pitfall of this method was that many of the phenotypes covary across the hybrid zone. However, it appears that with just 10 individuals with admixed phenotypes, we can disassociate most of the variation and thus find distinct genetic associations for known loci. This therefore gives us some confidence that the novel associations that we have detected are real and not due to covariation with other phenotypes.

In Ecuador, we intentionally sampled from sites at the edges of the hybrid zone where both pure and hybrid individuals were present, because we anticipated that individuals from these sites would have the highest levels of admixture between selected alleles. This may explain the clearer patterns observed in H. erato in Ecuador as compared to Peru (Figs. 3D, 4D; Supplemental Table 2). In Peruvian H. erato, we also find several genomic regions showing phenotypic associations that are not divergence outliers, which may suggest that these are false positives. However, the less-clear signal in Peruvian H. erato could also be due to the reduced sample size in this population (27 individuals). Certainly, the reduced number of Ecuadorian H. melpomene individuals (22) seems to have reduced the power of the association mapping (Fig. 4B). The lack of any signal at the Cr locus in Peruvian H. erato is surprising, and may be because the Cr associated region is very narrow. There were 789 SNPs present in H. erato within the 607-kb region that is associated with the yellow hindwing bar in H. melpomene. This is only slightly below the genome-wide average for H. erato (∼1800 SNPs/Mb), but linkage disequilibrium breaks down rapidly in H. erato (Counterman et al. 2010), and so this may not have been sufficient coverage to identify the Cr locus. Nevertheless, contrary to previous suggestions (Kronforst et al. 2013), the density of RAD markers we have obtained was sufficient to identify many narrow divergent genomic regions.

Although we have clearly demonstrated the utility of this approach for association mapping, it should be noted that scoring of some phenotypes was informed by previous crossing experiments. For example, the N locus in Ecuadorian H. melpomene was scored taking into account the genetic background at B/D (Salazar 2012), and the scoring of the predicted genotype at the B/D locus yielded stronger associations than scoring of individual color pattern elements. Nonetheless, scoring based purely on phenotypic variation did successfully identify color pattern loci in several cases (e.g., Ro, Ac/Sd, and Yb). Overall, the prospects for mapping individual phenotypic components and identifying epistatic relationships without prior knowledge are considerable, especially with larger sample sizes.

The possibility of using hybrid zones for association mapping has long been recognized (Kocher and Sage 1986), but few studies have successfully applied this technique. Studies in younger hybrid zones, for example Helianthus sunflowers, have found that linkage disequilibrium between unlinked genomic regions in early generation hybrids can produce spurious associations (Rieseberg and Buerkle 2002). Heliconius hybrid zones seem ideal in this regard because they appear to be fairly ancient and close to linkage equilibrium. However, it seems likely that many other suitable systems do exist for this type of approach (Lexer et al. 2006; Crawford and Nielsen 2013). An additional benefit of the Heliconius system is that much of the phenotypic variation is controlled by major effect loci, which can be detected with small sample sizes. Although many adaptive phenotypes appear to involve major effect loci (Orr 2005; Nadeau and Jiggins 2010), in order to move beyond these and detect minor effect loci, much larger sample sizes will be required (Beavis 1997). However, by incorporating methods that use a probabilistic framework to infer allele frequencies in low coverage sequencing data (Gompert and Buerkle 2011), it should be feasible to sequence large enough samples for analysis of quantitative traits.

Identification of a novel color pattern locus

Our association mapping results have robustly identified the H. erato Ro locus, that controls the shape of the distal edge of the forewing band, as being on chromosome 13 near gene HE669551. This gene has a predicted Gene Ontology (GO) molecular function of microtubule binding and is similar to other insect Radial Spoke Head 3 proteins, which are components of the cilia (Avidor-Reiss et al. 2004). It is therefore not an obvious candidate for control of color pattern, so may simply be linked to the causative site. Our results are contrary to the suggestion of a recently published QTL study that Ro may be linked to Sd (Papa et al. 2013). However, that study also identified a major unlinked QTL for forewing band shape that could not be assigned to a H. melpomene chromosome and so may be homologous to the locus we detected here. Furthermore, a QTL for several aspects of forewing band shape and size, including the shape of the distal edge, has previously been identified in H. melpomene on chromosome 13 (Baxter et al. 2009). This was located to a fairly broad region but its positioning is consistent with our results for the Ro locus in H. erato. It therefore seems likely that we have identified a new wing patterning locus that is homologous in H. melpomene and H. erato.

Ecological selection across the hybrid zones

Our results support previous assertions that selection acting on color pattern is the most important factor in maintaining these hybrid zones (Mallet and Barton 1989a; Baxter et al. 2010; Counterman et al. 2010; Nadeau et al. 2012; Supple et al. 2013). The most divergent genomic regions correspond to color pattern controlling loci, and at least half of all divergence outliers are in these regions. Nevertheless, some divergent regions do not seem to correspond to color pattern loci and could be candidates for adaptation to other ecological factors. The best candidates appear to be the regions on chromosome 2 in H. erato and chromosome 6 in H. melpomene. The regions on chromosome 2 in the Peruvian H. erato population are also associated with color pattern, but such association could be due to the high covariation of color pattern and sampling location in this population. These regions overlap with predicted genes, including basic metabolic genes and a heat shock protein (Table 3), which could be candidates for adaptation to different temperature regimes. Chemosensory genes were also detected on chromosome 2 and could be candidates for divergent mate preference or host plant adaptation (Briscoe et al. 2013). However, no differences in host plant preference have been observed in Peru, where these outliers were detected, and mating within the hybrid zone appears to be random (Mallet and Barton 1989a), although marginal differences in mate preference have been observed in H. melpomene (Merrill et al. 2011a).

There appear to be multiple dispersed divergent regions on chromosome 2 in H. erato (Supplemental Fig. 4). These could be evidence of divergence hitchhiking, whereby new mutations that cause differential fitness are more likely to be fixed by selection if they arise close to other loci already under divergent selection. This could lead to clustering of divergently selected loci in the genome (Feder et al. 2012; Via 2012). The same process could also have led to additional loci under divergent selection to arise in linkage with the color pattern loci. This could explain the second divergent and altitude-associated region on chromosome 18 (linked to the B/D locus) in H. melpomene (Table 3; Supplemental Fig. 4). One possibility is that this could be the B/D linked mate preference locus that has previously been identified (Merrill et al. 2011b), although it is not clear if the mate preference locus is an additional linked locus or a pleiotropic effect of the wing color locus itself. It is also possible that these apparently distinct but linked divergent regions could simply reflect the heterogeneous nature of FST resulting from divergent selection on a single locus combined with other background and neutral processes (Charlesworth et al. 1997). A broad region of divergence around the B/D locus in H. melpomene would fit with other suggestions that it has undergone stronger or more recent selection than other color pattern loci (Nadeau et al. 2012). The D region in H. erato does not appear to be extended in the same way as in H. melpomene (Supplemental Fig. 4), suggesting that either the architecture or the selective history of this region is different between these species.

Comparison of the genomic architecture of divergence between convergent species

One interesting question that can be addressed with our results is the extent to which species undergoing parallel divergence will show parallel patterns at the genomic level. In order to address this, we first need to know whether the species really have undergone parallel divergence, i.e., that both the phenotypic start and end points have been similar. Several previous studies have suggested that this is not the case and that H. erato diverged earlier and followed a different trajectory compared to H. melpomene (Brower 1996; Flanagan et al. 2004; Quek et al. 2010). However, our phylogenetic results are more consistent with a recent analysis suggesting that the two species do appear to have undergone codivergence in multiple populations across their range (Cuthill and Charleston 2012). Our results are based on significantly more data than any of the previous analyses (>5 Mb in H. melpomene and >1 Mb in H. erato), and should produce a better signal for phylogenetic analysis as compared to AFLPs used previously (Quek et al. 2010). Although the striking similarities in tree topology do seem to support the codivergence hypothesis, alignment to a reference genome means that the evolutionary rates in our data for H. erato and H. melpomene are not directly comparable. In addition to the phylogenetic signal, our data also suggested similar patterns of population structure between species in each of the regions, with higher background divergence levels in Ecuador as compared to Peru (Fig. 2; Table 1; Supplemental Fig. 3).

Although some loci show parallel divergence in both species (B/D in Peru; B/D and Ac/Sd in Ecuador), there is surprisingly little similarity in the other loci that are divergent when comparing parallel hybrid zones. This is contrary to the general perception that there are strong genetic parallels in this system (Joron et al. 2006; Baxter et al. 2008; Papa et al. 2008a; Supple et al. 2013). Some of these differences were known previously, for example, that in Peru the Sd/Ac locus controls band shape variation in H. erato but not in H. melpomene (Mallet 1989). Our results extend this further through the identification of the Ro locus on chromosome 13 in Ecuadorian H. erato, which is not divergent in its co-mimic H. melpomene, and the identification of divergent regions of chromosome 2 in H. erato and chromosome 6 in Peruvian H. melpomene.

In general, it seems that although the same color pattern loci are present in both species (Joron et al. 2006; Baxter et al. 2008; Martin et al. 2012), they are being used in different ways and combinations in order to produce convergent phenotypes. This is particularly surprising given the pattern of codivergence observed in the phylogeny, which would appear to suggest that similar color patterns have arisen at a similar time and from similar ancestral forms in both species. Nonetheless, the apparent pattern of codivergence could simply reflect more recent patterns of gene flow between geographically proximate populations in both species. This has recently been highlighted by studies showing that patterns of divergence at color pattern controlling loci can be very different to those found at the rest of the genome (Hines et al. 2011; The Heliconius Genome Consortium 2012; Pardo-Diaz et al. 2012; Supple et al. 2013). Therefore, the differences that we observe in the use of particular loci in the two species could reflect different mimetic histories that will only be resolved by studies of the evolutionary history of particular loci.

Discovery of a new cryptic H. timareta population

An unexpected finding of our study was the discovery of a previously undescribed population of H. timareta, which appears phenotypically virtually indistinguishable from H. melpomene malleti in Ecuador but is clearly genetically distinct (Fig. 1; Supplemental Figs. 1, 2). H. timareta florencia is a malleti-like population that has previously been described in Colombia and also co-occurs with H. melpomene malleti. In that population, the length of the red line on the anterior edge of the ventral forewing was diagnostic (Giraldo et al. 2008). This character was not diagnostic in our genotyped individuals, with overlapping length distributions between the species (data not shown). We noted a tendency toward H. timareta having a shorter line on average, but given the small sample sizes in the current study, this remains to be confirmed.

A polymorphic high altitude population of H. timareta (H. timareta timareta) also occurs in this area of Ecuador, overlapping in distribution with H. melpomene plesseni. The polymorphism in this population has been something of a puzzle, as none of the forms mimic other co-occurring butterflies (Mallet 1999). Our finding of a new H. timareta population may help to explain the polymorphism in H. timareta timareta if it is being generated in part by gene flow from this newly identified population.

The H. timareta radiation has only been recognized in the last 10 yr (Giraldo et al. 2008; Jiggins 2008; Mérot et al. 2013). The H. timareta individuals in our study were collected from sites at 824 m and 376 m. They appear to be fairly common at low altitude as four out of the five individuals sequenced from the site at 376 m were H. timareta. In a large data set compiled by Rosser et al. (2012) containing 232 H. timareta individuals from all known populations (including H. tristero, now thought to be a subspecies of H. timareta), the lowest sampling location is around 600 m, with 95% of individuals occurring over 800 m. Therefore, the population of H. timareta that we have discovered occurs below the usual altitudinal range of H. timareta. This extends the possible range of this species and suggests that the overlap in distribution of H. timareta and H. melpomene is greater than previously considered.

Conclusions

We have demonstrated that high-resolution genome scans using admixed individuals from hybrid zones can be used to identify loci underlying phenotypic variation. Only a small proportion of the genome (∼0.025%) is strongly differentiated between subspecies, and most of this can be explained by divergence at loci controlling color pattern. This is consistent with previous studies based on smaller numbers of markers (Turner et al. 1979; Baxter et al. 2010; Counterman et al. 2010; Nadeau et al. 2012) and suggests that the hybrid zones are ancient or have formed in primary contact, and are maintained by strong selection on color pattern (Mallet and Barton 1989a; Mallet 2010). However, we also find, for the first time, some divergent loci that do not appear to be associated with color pattern, suggesting that there may be other differences between subspecies. This could explain why several Heliconius hybrid zones occur across ecological gradients (Benson 1982), if they are coupled with extrinsic selection acting on other loci in the genome (Bierne et al. 2011). However, this needs to be confirmed with detailed phenotypic analyses of the subspecies to identify whether differences are present that could be explained by ecological adaptation. In general, we find that although some loci are divergent in all populations, the genomic pattern of divergence between co-mimetic species is not particularly similar, suggesting that the level of parallel genetic evolution between H. erato and H. melpomene is in fact quite low, despite parallel phylogenetic patterns of divergence. Finally, our analysis shows that alignment to a distantly related reference genome can improve analyses over a de novo assembly of the data.

Methods

Samples and sequencing

Thirty H. erato and 30 H. melpomene individuals were selected from a larger sample taken from the hybrid zone region in Peru. Similarly, 30 H. erato and 30 H. melpomene were also selected from a larger study of a subspecies hybrid zone in Ecuador (Salazar 2012). Each set of 30 samples comprised 10 pure forms of each subspecies and 10 hybrids (based on color pattern). See Figure 2 and Supplemental Table 4 for further details of the samples and locations.

RAD sequencing libraries were prepared using previously described methodologies (Baird et al. 2008; Baxter et al. 2011; The Heliconius Genome Consortium 2012). Briefly, DNA was digested with the restriction enzyme PstI prior to ligation of P1 sequencing adaptors with five-base molecular identifiers (MIDs) (Supplemental Table 4). We then pooled samples into groups of six before shearing, ligation of P2 adaptors, amplification, and fragment size selection (300–600 bp). Libraries were then further pooled such that 30 individuals were sequenced on each lane of an Illumina HiSeq 2000 sequencer to obtain 150 bp paired-end sequences. We obtained an average of 374M sequence pairs from each lane. Following sequencing, three of the H. erato individuals from Peru were found to have been incorrectly assigned to this species and were excluded from all further analyses.

In order to compare patterns of phylogenetic divergence of the focal subspecies, we also used sequence data from additional subspecies and closely related species in each group. Two individuals, each from six additional H. erato populations and the closely related H. himera, were also PstI RAD sequenced with five individuals pooled per lane of Illumina GAIIx (100 bp paired-end sequencing). These sequences were obtained in the same run as a comparable set of individuals from the H. melpomene clade, which have been used in previous analyses and also included H. cydno, H. timareta, and H. hecale (The Heliconius Genome Consortium 2012; Nadeau et al. 2013) (European Nucleotide Archive, accession ERP000991). We also obtained whole-genome shotgun sequence data from an outgroup species, H. clysonimus, which was sequenced on a fifth of a HiSeq 2000 lane, giving 53.5M 100 bp read pairs for this individual.

Alignment to reference genome

We separated paired-end reads by MID using the RADpools script in the RADtools (v1.2.4) package (Baxter et al. 2011), which also filters based on the presence of the restriction enzyme cut site, using the option to allow one mismatch within the MID. Reads from each individual were then aligned to the H. melpomene reference genome (The Heliconius Genome Consortium 2012) using Stampy v1.0.17 (Lunter and Goodson 2011), with default parameters except substitution rate, which was set to 0.03 for alignments of H. melpomene and 0.10 for alignments of H. erato.

We then realigned indels and called genotypes using the Genome Analysis Tool Kit (GATK) v1.6.7 (DePristo et al. 2011), outputting all confident sites (those with quality ≥30). This was first run on each set of 30 (or 27) individuals from each population group. These genotype calls were used for analyses of genetic variation within each of the groups, including outlier detection, association mapping, and analyses of subpopulation structure. In addition, genotype calling was also performed on a combined data set of all H. melpomene and outgroup taxa (H. timareta, H. cydno, and H. hecale) as well as a combined set of all H. erato and its outgroups (H. himera and H. clysonimus). These genotype calls were used for the phylogenetic analyses and broader analyses of genetic structure. For all downstream analyses, calls were further filtered to only accept those based on a minimum depth of five reads and minimum genotype and mapping qualities (GQ and MQ) of 30 for H. melpomene and 20 for H. erato.

De novo assembly

We quality-filtered the single-end raw sequence data and separated sequences by MID with the process_radtags program within Stacks (Catchen et al. 2011). This program corrects single errors in the MID or restriction site and then checks quality score using a sliding window across 15% of the length of the read. We discarded sequences with a raw phred score below 10, removed reads with uncalled bases or low quality scores, and trimmed reads to 100 bases to eliminate potential sequencing error occurring at ends of reads. Table 1 shows the mean read numbers per individual obtained after filtering. For each population group, we assembled loci de novo using the denovo_map.sh pipeline in Stacks (Catchen et al. 2011). We set the minimum depth of coverage (m) to 6, allowed 4 mismatches both in creating individual stacks (M) and in secondary reads (N), and removed or separated highly repetitive RadTags. Due to the high level of polymorphism in our data set, we used these parameters to minimize the exclusion of interesting loci with high variability between populations. De novo assembly was conducted both including (for association mapping) and excluding (for BayeScan outlier detection) hybrid individuals in the analysis. Individuals from Ecuador that were identified as being H. timareta were excluded.

Phylogenetics and analysis of population structure

Only the reference aligned data were used for phylogenetics and Structure analyses. We used custom scripts to convert from VCF to PHYLIP format and to filter sites with a minimum of 95% of individuals with confident calls. Maximum likelihood phylogenies were constructed in PhyML (Guindon and Gascuel 2003) with a GTR model using the resulting 5,737,351 sites (including invariant sites) for the H. melpomene group and 1,693,024 sites for the H. erato group. Approximate likelihood branch supports were calculated within the program.

Population structure within and across each of the hybrid zones was analyzed using the program Structure v2.3 (Pritchard et al. 2000). We prepared input files using custom scripts, and only sites with 100% of individuals present for H. melpomene populations or at least 75% of individuals present for H. erato populations and with a minor allele frequency of at least 20% were retained. This reduced the number of sampled sites, keeping just the most informative ones, for easier handling by the program. Initial short runs (103 burn-in, 103 data collection, K = 1) were used to estimate the allele frequency distribution parameter λ. We then ran longer clustering runs (104 burn-in, 104 data collection) with the obtained values of λ for each of the four population groups for K = 1−3. For H. melpomene in Ecuador, the analysis was first run with all individuals included and then excluding the individuals identified as being H. timareta.

We also performed principal components analysis of the genetic variation in each population group. This was done with the “cmdscale” command in R (R Development Core Team 2011), using genetic distance matrices calculated as 0.5-ibs, where ibs was the identity by sequence matrix calculated in GenABEL (see below). As further confirmation that some of the H. melpomene individuals sampled in Ecuador were in fact cryptic H. timareta, we also performed principal components analysis on the combined H. melpomene and outgroup data set. We also ran principal components analysis on the de novo assembled data for each population group to test whether both methods were detecting similar underlying patterns of genetic variation.

In order to compare our newly identified H. timareta individuals to other populations, we Sanger sequenced a 745-bp region of mitochondrial COI that overlapped with the regions sequenced in previous studies (Giraldo et al. 2008; Mérot et al. 2013). This was PCR amplified as in Mérot et al. (2013) with primers “Jerry” and “Patlep” and directly sequenced with “Patlep.” These sequences were then aligned with those available on GenBank, and a maximum likelihood phylogeny was constructed in PhyML (Guindon and Gascuel 2003) with a GTR model and 1000 bootstrap replicates.

Association mapping of loci controlling color pattern variation

We scored components of phenotypic variation that segregate across each of the hybrid zones. The scored phenotypes are shown in Figure 3 (for Peru) and Figure 4 (for Ecuador) and listed in full in Supplemental Table 5. These were scored mostly as binomial (1,0) traits, but in some cases intermediates were also scored (as 0.5). The width and shape of the forewing band was scored based on whether it extended into each of the wing “cells,” demarcated by the major wing veins (as shown in Supplemental Fig. 9). In Peruvian populations, the size and shape of the forewing band was measured as two components (Fig. 3C,F) that extend the band distally (cell spot 8) and proximally (cell spot 11). In Ecuador, three aspects of band shape were scored: cells 8 and 11, which make up the proximal spot in H. m. plesseni and H. e. notabilis, and cell 7, which pushes the band toward the wing margin in H. m. malleti and H. e. lativitta (Fig. 4C,F). In our sample of H. melpomene, the presence of cell spots 8 and 11 were perfectly correlated, whereas in H. erato, the presence of cell spot 7 was perfectly correlated with the absence of cell spot 8. In addition, individuals were also scored for their predicted genotypes at major loci described previously (with predicted heterozygotes scored as 0.5) (Sheppard et al. 1985; Mallet 1989), and the altitude at which they were collected was included as a continuous phenotypic trait.

We performed association mapping using the R Package GenABEL v 1.7-4 (Aulchenko et al. 2007). This was performed on both the de novo assembled and the reference aligned data with a custom script used to convert both from VCF to Illumina SNP format. Individuals identified as being H. timareta were excluded. Filtering was performed within the program to remove sites with >30% missing data and with a minor allele frequency of <3%.

For each population, an analysis of the hindwing ray phenotype using the reference mapped data was first performed using three methods: a straight score test (qtscore), a score test with the first three principal components of genetic variation (calculated as described above) as covariates, and an Eigenstrat analysis (egscore) (Price et al. 2006). The presence of genetic stratification and the ability of these methods to correct for this was analyzed by comparing the inflation factor, λ, which is computed by regression in a Q–Q plot to detect genome-wide skew in association values. In all cases, the analyses incorporating population stratification did not give a reduced value of λ and so were not used for subsequent analyses. As our samples were from hybrid zones with >60% of the samples having extreme values of all scored phenotypes, we would expect similar levels of stratification for all phenotypes, so this test for stratification was not repeated for all phenotypes.

We therefore performed score tests for all scored phenotypes across all population groups. Genome-wide significance was determined empirically from 1000 resampling replicates and corrected for population structure using the test specific λ (Supplemental Table 5).

BayeScan analysis to identify loci under selection

We used the program BayeScan v2.1 (Foll and Gaggiotti 2008) to look for loci with outlier FST values between “pure” individuals of each subspecies type (based on wing color pattern) in each population group. Exclusion of the H. timareta individuals meant that only three pure H. melpomene malleti individuals remained. Therefore, for the purpose of this analysis of H. melpomene in Ecuador, the two hybrid individuals closest to the H. m. malleti side of the hybrid zone (Fig. 2), which also had the most H. m. malleti like phenotypes, were included as H. m. malleti.

The program was run with the prior odds for the neutral model (pr_odds) set to 10, and outlier loci were detected with a false discovery rate (FDR) of 0.05. We ran this analysis using both the de novo assembled and the reference aligned data. Custom scripts were used to convert these to the correct input format. For both analyses, sites were only kept if at least 75% of individuals were sampled for both subspecies in a given comparison.

Data access

DNA sequence reads from this study have been submitted to the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) under accession number ERP003980. COI sequences have been submitted to the EMBL Nucleotide Sequence Database (EMBL-Bank) at ENA under accession numbers HG710096-HG710125. Custom scripts and wing images are available from Data Dryad with doi: 10.5061/dryad.1nc50.

Acknowledgments

We thank Simon Baxter, Doug Turnbull, and William Cresko for their help and advice with RAD library preparation and sequencing. Sequencing was performed at the University of Oregon, Genomics Core Facility and The Gene Pool genomics facility in the University of Edinburgh. We would also like to thank Julian Catchen for his help with Stacks. We thank the governments of Peru and Ecuador for their permission to collect and export specimens. Santiago Villamarín from the Museo Ecuatoriano de Ciencias Naturales provided institutional support in Ecuador. We also thank Ismael Aldás, Carlos Robalino, and Patricia Salazar for their assistance with fieldwork. Joanna Riley assisted with DNA extractions. John Davey gave us access to his unpublished mapping results. We thank three anonymous reviewers for their comments. This project was supported by research grants BBSRC H01439X/1, NSF-CREST #0206200, NSF-DEB-1257839, and NSF-IOS 1305686. N.J.N. was funded by a Leverhulme Trust award to C.D.J.; M.R. was funded through the Ford Foundation Postdoctoral Fellowship Program administered by the National Academies; H.O.Z. and J.A.M. were partially supported by NIH-NIGMS INBRE award P20GM103475 and NSF-EPSCoR award 1002410.

Footnotes

References

  • Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215: 403–410 [PubMed] [Google Scholar]
  • Aulchenko YS, Ripke S, Isaacs A, van Duijn CM. 2007. GenABEL: an R library for genome-wide association analysis. Bioinformatics 23: 1294–1296 [PubMed] [Google Scholar]
  • Avidor-Reiss T, Maer AM, Koundakjian E, Polyanovsky A, Keil T, Subramaniam S, Zuker CS. 2004. Decoding cilia function: defining specialized genes required for compartmentalized cilia biogenesis. Cell 117: 527–539 [PubMed] [Google Scholar]
  • Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, Selker EU, Cresko WA, Johnson EA. 2008. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS ONE 3: e3376. [PMC free article] [PubMed] [Google Scholar]
  • Barton NH, Hewitt GM. 1985. Analysis of hybrid zones. Annu Rev Ecol Syst 16: 113–148 [Google Scholar]
  • Baxter SW, Papa R, Chamberlain N, Humphray SJ, Joron M, Morrison C, ffrench-Constant RH, McMillan WO, Jiggins CD. 2008. Convergent evolution in the genetic basis of Müllerian mimicry in heliconius butterflies. Genetics 180: 1567–1577 [PMC free article] [PubMed] [Google Scholar]
  • Baxter SW, Johnston SE, Jiggins CD. 2009. Butterfly speciation and the distribution of gene effect sizes fixed during adaptation. Heredity (Edinb) 102: 57–65 [PubMed] [Google Scholar]
  • Baxter SW, Nadeau NJ, Maroja LS, Wilkinson P, Counterman BA, Dawson A, Beltran M, Perez-Espona S, Chamberlain N, Ferguson L, et al.. 2010. Genomic hotspots for adaptation: the population genetics of Müllerian mimicry in the Heliconius melpomene Clade. PLoS Genet 6: e1000794. [PMC free article] [PubMed] [Google Scholar]
  • Baxter SW, Davey JW, Johnston JS, Shelton AM, Heckel DG, Jiggins CD, Blaxter ML. 2011. Linkage mapping and comparative genomics using next-generation RAD sequencing of a non-model organism. PLoS ONE 6: e19315. [PMC free article] [PubMed] [Google Scholar]
  • Beavis WD. 1997. QTL analyses: power, precision, and accuracy. In Molecular dissection of complex traits (ed. Paterson AH), pp. 145–162. CRC Press, Boca Raton, FL. [Google Scholar]
  • Benson WW 1972. Natural selection for Miillerian mimicry in Heliconius erato in Costa Rica. Science 176: 936–939 [PubMed] [Google Scholar]
  • Benson WW. 1982. Alternative models for infrageneric diversification in the humid tropics: tests with passion vine butterflies. In Biological Diversification in the Tropics (ed. Prance GT), pp. 608–640. Columbia University Press, New York. [Google Scholar]
  • Bierne N, Welch J, Loire E, Bonhomme F, David P. 2011. The coupling hypothesis: why genome scans may fail to map local adaptation genes. Mol Ecol 20: 2044–2072 [PubMed] [Google Scholar]
  • Briscoe AD, Macias-Muñoz A, Kozak KM, Walters JR, Yuan F, Jamie GA, Martin SH, Dasmahapatra KK, Ferguson LC, Mallet J, et al.. 2013. Female behaviour drives expression and evolution of gustatory receptors in butterflies. PLoS Genet 9: e1003620. [PMC free article] [PubMed] [Google Scholar]
  • Brower AV 1996. Parallel race formation and the evolution of mimicry in Heliconius butterflies: a phylogenetic hypothesis from mitochondrial DNA sequences. Evolution 50: 195–221 [PubMed] [Google Scholar]
  • Catchen JM, Amores A, Hohenlohe P, Cresko W, Postlethwait JH. 2011. Stacks: building and genotyping loci de novo from short-read sequences. G3 (Bethesda) 1: 171–182 [PMC free article] [PubMed] [Google Scholar]
  • Charlesworth B, Nordborg M, Charlesworth D. 1997. The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations. Genet Res 70: 155–174 [PubMed] [Google Scholar]
  • Counterman BA, Araujo-Perez F, Hines HM, Baxter SW, Morrison CM, Lindstrom DP, Papa R, Ferguson L, Joron M, ffrench-Constant RH, et al.. 2010. Genomic hotspots for adaptation: the population genetics of Müllerian mimicry in Heliconius erato. PLoS Genet 6: e1000796. [PMC free article] [PubMed] [Google Scholar]
  • Crawford JE, Nielsen R. 2013. Detecting adaptive trait loci in non-model systems: divergence or admixture mapping? Mol Ecol 22: 6131–6148 [PubMed] [Google Scholar]
  • Cuthill JH, Charleston M. 2012. Phylogenetic codivergence supports coevolution of mimetic Heliconius butterflies. PLoS ONE 7: e36464. [PMC free article] [PubMed] [Google Scholar]
  • DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al.. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43: 491–498 [PMC free article] [PubMed] [Google Scholar]
  • Feder JL, Gejji R, Yeaman S, Nosil P. 2012. Establishment of new mutations under divergence and genome hitchhiking. Philos Trans R Soc Lond B Biol Sci 367: 461–474 [PMC free article] [PubMed] [Google Scholar]
  • Ferguson L, Lee SF, Chamberlain N, Nadeau NJ, Joron M, Baxter S, Wilkinson P, Papanicolaou A, Kumar S, Kee TJ, et al.. 2010. Characterization of a hotspot for mimicry: assembly of a butterfly wing transcriptome to genomic sequence at the HmYb/Sb locus. Mol Ecol 19: 240–254 [PubMed] [Google Scholar]
  • Flanagan NS, Tobler A, Davison A, Pybus OG, Kapan DD, Planas S, Linares M, Heckel D, McMillan WO. 2004. Historical demography of Müllerian mimicry in the neotropical Heliconius butterflies. Proc Natl Acad Sci 101: 9704–9709 [PMC free article] [PubMed] [Google Scholar]
  • Foll M, Gaggiotti O. 2008. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics 180: 977–993 [PMC free article] [PubMed] [Google Scholar]
  • Giraldo N, Salazar C, Jiggins CD, Bermingham E, Linares M. 2008. Two sisters in the same dress: Heliconius cryptic species. BMC Evol Biol 8: 324. [PMC free article] [PubMed] [Google Scholar]
  • Gompert Z, Buerkle CA. 2011. A hierarchical Bayesian model for next-generation population genomics. Genetics 187: 903–917 [PMC free article] [PubMed] [Google Scholar]
  • Gompert Z, Lucas LK, Nice CC, Fordyce JA, Forister ML, Buerkle AC. 2012. Genomic regions with a history of divergent selection affect fitness of hybrids between two butterfly species. Evolution 66: 2167–2181 [PubMed] [Google Scholar]
  • Guindon S, Gascuel O. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52: 696–704 [PubMed] [Google Scholar]
  • Harrison RG. 1993. Hybrid zones and the evolutionary process. Oxford University Press, Oxford, UK. [Google Scholar]
  • The Heliconius Genome Consortium 2012. Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature 487: 94–98 [PMC free article] [PubMed] [Google Scholar]
  • Hines HM, Counterman BA, Papa R, Albuquerque de Moura P, Cardoso MZ, Linares M, Mallet J, Reed RD, Jiggins CD, Kronforst MR, et al.. 2011. Wing patterning gene redefines the mimetic history of Heliconius butterflies. Proc Natl Acad Sci 108: 19666–19671 [PMC free article] [PubMed] [Google Scholar]
  • Jiggins CD 2008. Ecological speciation in mimetic butterflies. Bioscience 58: 541–548 [Google Scholar]
  • Joron M, Papa R, Beltrán M, Chamberlain N, Mavárez J, Baxter S, Abanto M, Bermingham E, Humphray SJ, Rogers J, et al.. 2006. A conserved supergene locus controls colour pattern diversity in Heliconius butterflies. PLoS Biol 4: e303. [PMC free article] [PubMed] [Google Scholar]
  • Kapan DD 2001. Three-butterfly system provides a field test of müllerian mimicry. Nature 409: 338–340 [PubMed] [Google Scholar]
  • Kawakami T, Butlin RK 2001 Hybrid zones. In eLS. John Wiley, New York. http://onlinelibrary.wiley.com/doi/10.1002/9780470015902.a0001752.pub2/abstract.
  • Keller I, Wagner CE, Greuter L, Mwaiko S, Selz OM, Sivasundar A, Wittwer S, Seehausen O. 2013. Population genomic signatures of divergent adaptation, gene flow and hybrid speciation in the rapid radiation of Lake Victoria cichlid fishes. Mol Ecol 22: 2848–2863 [PubMed] [Google Scholar]
  • Kocher TD, Sage RD. 1986. Further genetic analyses of a hybrid zone between leopard frogs (Rana pipiens complex) in Central Texas. Evolution 40: 21–33 [PubMed] [Google Scholar]
  • Kronforst MR, Hansen MEB, Crawford NG, Gallant JR, Zhang W, Kulathinal RJ, Kapan DD, Mullen SP. 2013. Hybridization reveals the evolving genomic architecture of speciation. Cell Rep 5: 666–677 [PMC free article] [PubMed] [Google Scholar]
  • Lamas G 1997. Comentarios taxonomicos y nomenclaturales sobre Heliconiini Neotropicales, con designacion de lectotipos y descripcion de cuatro subespecies nuevas (Lepidoptera: Nymphalidae: Heliconiinae). Rev Per Ent 40: 111–125 [Google Scholar]
  • Langham GM 2004. Specialized avian predators repeatedly attack novel color morphs of Heliconius butterflies. Evolution 58: 2783–2787 [PubMed] [Google Scholar]
  • Lexer C, Buerkle CA, Joseph JA, Heinze B, Fay MF. 2006. Admixture in European Populus hybrid zones makes feasible the mapping of loci that contribute to reproductive isolation and trait differences. Heredity 98: 74–84 [PubMed] [Google Scholar]
  • Lunter G, Goodson M. 2011. Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res 21: 936–939 [PMC free article] [PubMed] [Google Scholar]
  • Mallet J 1989. The genetics of warning colour in Peruvian hybrid zones of Heliconius erato and H. melpomene. Proc R Soc Lond B Biol Sci 236: 163–185 [Google Scholar]
  • Mallet J 1999. Causes and consequences of a lack of coevolution in Müllerian mimicry. Evol Ecol 13: 777–806 [Google Scholar]
  • Mallet J 2010. Shift happens! Shifting balance and the evolution of diversity in warning colour and mimicry. Ecol Entomol 35: 90–104 [Google Scholar]
  • Mallet J, Barton N. 1989a. Strong natural selection in a warning-color hybrid zone. Evolution 43: 421–431 [PubMed] [Google Scholar]
  • Mallet J, Barton N. 1989b. Inference from clines stabilized by frequency-dependent selection. Genetics 122: 967–976 [PMC free article] [PubMed] [Google Scholar]
  • Mallet J, Barton N, Lamas G, Santisteban J, Muedas M, Eeley H. 1990. Estimates of selection and gene flow from measures of cline width and linkage disequilibrium in Heliconius hybrid zones. Genetics 124: 921–936 [PMC free article] [PubMed] [Google Scholar]
  • Martin A, Papa R, Nadeau NJ, Hill RI, Counterman BA, Halder G, Jiggins CD, Kronforst MR, Long AD, McMillan WO, et al.. 2012. Diversification of complex butterfly wing patterns by repeated regulatory evolution of a Wnt ligand. Proc Natl Acad Sci 109: 12632–12637 [PMC free article] [PubMed] [Google Scholar]
  • Martin SH, Dasmahapatra KK, Nadeau NJ, Salazar C, Walters JR, Simpson F, Blaxter M, Manica A, Mallet J, Jiggins CD. 2013. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Res 23: 1817–1828 [PMC free article] [PubMed] [Google Scholar]
  • Mérot C, Mavárez J, Evin A, Dasmahapatra KK, Mallet J, Lamas G, Joron M. 2013. Genetic differentiation without mimicry shift in a pair of hybridizing Heliconius species (Lepidoptera: Nymphalidae). Biol J Linn Soc Lond 109: 830–847 [Google Scholar]
  • Merrill RM, Gompert Z, Dembeck LM, Kronforst MR, McMillan WO, Jiggins CD. 2011a. Mate preference across the speciation continuum in a clade of mimetic butterflies. Evolution 65: 1489–1500 [PubMed] [Google Scholar]
  • Merrill RM, Van Schooten B, Scott JA, Jiggins CD. 2011b. Pervasive genetic associations between traits causing reproductive isolation in Heliconius butterflies. Proc Biol Sci 278: 511–518 [PMC free article] [PubMed] [Google Scholar]
  • Müller F 1879. Ituna and Thyridia; a remarkable case of mimicry in butterflies. Trans Entomol Soc Lond 1879: xx–xxix [Google Scholar]
  • Nadeau NJ, Jiggins CD. 2010. A golden age for evolutionary genetics? Genomic studies of adaptation in natural populations. Trends Genet 26: 484–492 [PubMed] [Google Scholar]
  • Nadeau NJ, Whibley A, Jones RT, Davey JW, Dasmahapatra KK, Baxter SW, Quail MA, Joron M, ffrench-Constant RH, Blaxter M, et al.. 2012. Genomic islands of divergence in hybridizing Heliconius butterflies identified by large-scale targeted sequencing. Philos Trans R Soc Lond B Biol Sci 367: 343–353 [PMC free article] [PubMed] [Google Scholar]
  • Nadeau NJ, Martin SH, Kozak KM, Salazar C, Dasmahapatra KK, Davey JW, Baxter SW, Blaxter ML, Mallet J, Jiggins CD. 2013. Genome-wide patterns of divergence and gene flow across a butterfly radiation. Mol Ecol 22: 814–826 [PubMed] [Google Scholar]
  • Orr HA 2005. The genetic theory of adaptation: a brief history. Nat Rev Genet 6: 119–127 [PubMed] [Google Scholar]
  • Papa R, Martin A, Reed RD. 2008a. Genomic hotspots of adaptation in butterfly wing pattern evolution. Curr Opin Genet Dev 18: 559–564 [PubMed] [Google Scholar]
  • Papa R, Morrison CM, Walters JR, Counterman BA, Chen R, Halder G, Ferguson L, Chamberlain N, ffrench-Constant R, Kapan DD, et al.. 2008b. Highly conserved gene order and numerous novel repetitive elements in genomic regions linked to wing pattern variation in Heliconius butterflies. BMC Genomics 9: 345. [PMC free article] [PubMed] [Google Scholar]
  • Papa R, Kapan DD, Counterman BA, Maldonado K, Lindstrom DP, Reed RD, Nijhout HF, Hrbek T, McMillan WO. 2013. Multi-allelic major effect genes interact with minor effect QTLs to control adaptive color pattern variation in Heliconius erato. PLoS ONE 8: e57033. [PMC free article] [PubMed] [Google Scholar]
  • Pardo-Diaz C, Salazar C, Baxter SW, Merot C, Figueiredo-Ready W, Joron M, McMillan WO, Jiggins CD. 2012. Adaptive introgression across species boundaries in Heliconius butterflies. PLoS Genet 8: e1002752. [PMC free article] [PubMed] [Google Scholar]
  • Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. 2006. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38: 904–909 [PubMed] [Google Scholar]
  • Pritchard JK, Stephens M, Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics 155: 945–959 [PMC free article] [PubMed] [Google Scholar]
  • Quek S-P, Counterman BA, Albuquerque de Moura P, Cardoso MZ, Marshall C, McMillan WO, Kronforst MR. 2010. Dissecting comimetic radiations in Heliconius reveals divergent histories of convergent butterflies. Proc Natl Acad Sci 107: 7365–7370 [PMC free article] [PubMed] [Google Scholar]
  • R Development Core Team 2011 R: A Language and Environment for Statistical Computing (version 2.14). R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
  • Reed RD, Papa R, Martin A, Hines HM, Counterman BA, Pardo-Diaz C, Jiggins CD, Chamberlain NL, Kronforst MR, Chen R, et al.. 2011. optix drives the repeated convergent evolution of butterfly wing pattern mimicry. Science 333: 1137–1141 [PubMed] [Google Scholar]
  • Rieseberg L, Buerkle CA. 2002. Genetic mapping in hybrid zones. Am Nat 159: S36–S50 [PubMed] [Google Scholar]
  • Rosser N, Phillimore AB, Huertas B, Willmott KR, Mallet J. 2012. Testing historical explanations for gradients in species richness in heliconiine butterflies of tropical America. Biol J Linn Soc Lond 105: 479–497 [Google Scholar]
  • Salazar PA. 2012. “Hybridization and the genetics of wing colour-pattern diversity in Heliconius butterflies.” PhD thesis, University of Cambridge, UK. [Google Scholar]
  • Sheppard PM, Turner JRG, Brown KS, Benson WW, Singer MC. 1985. Genetics and the evolution of Muellerian mimicry in Heliconius butterflies. Philos Trans R Soc Lond B Biol Sci 308: 433–610 [Google Scholar]
  • Stewart CB, Schilling JW, Wilson AC. 1987. Adaptive evolution in the stomach lysozymes of foregut fermenters. Nature 330: 401–404 [PubMed] [Google Scholar]
  • Stinchcombe JR, Hoekstra HE. 2007. Combining population genomics and quantitative genetics: finding the genes underlying ecologically important traits. Heredity 100: 158–170 [PubMed] [Google Scholar]
  • Supple MA, Hines HM, Dasmahapatra KK, Lewis JJ, Nielsen DM, Lavoie C, Ray DA, Salazar C, McMillan WO, Counterman BA. 2013. Genomic architecture of adaptive color pattern divergence and convergence in Heliconius butterflies. Genome Res 23: 1248–1257 [PMC free article] [PubMed] [Google Scholar]
  • Tobler A, Kapan D, Flanagan NS, Gonzalez C, Peterson E, Jiggins CD, Johntson JS, Heckel DG, McMillan WO. 2004. First-generation linkage map of the warningly colored butterfly Heliconius erato. Heredity 94: 408–417 [PubMed] [Google Scholar]
  • Turner JR, Johnson MS, Eanes WF. 1979. Contrasted modes of evolution in the same genome: allozymes and adaptive change in Heliconius. Proc Natl Acad Sci 76: 1924–1928 [PMC free article] [PubMed] [Google Scholar]
  • Via S 2012. Divergence hitchhiking and the spread of genomic isolation during ecological speciation-with-gene-flow. Philos Trans R Soc Lond B Biol Sci 367: 451–460 [PMC free article] [PubMed] [Google Scholar]
  • Visscher PM, Brown MA, McCarthy MI, Yang J. 2012. Five years of GWAS discovery. Am J Hum Genet 90: 7–24 [PMC free article] [PubMed] [Google Scholar]
  • Wood TE, Burke TM, Rieseberg LH. 2005. Parallel genotypic adaptation: when evolution repeats itself. Genetica 123: 157–170 [PMC free article] [PubMed] [Google Scholar]


Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press


Which of the following best explains the differences in flower color of the African violet in the greenhouse?

Which of the following best explains the differences in flower color of the African violets in the greenhouse? Warmer temperatures result in genotypic alterations, which result in flower color differences. The plants along the walls of the greenhouse are homozygous recessive and therefore have white flowers.

Which of the following claims is most consistent with the observed butterfly behavior?

Which of the following claims is most consistent with the observed butterfly behavior? Butterflies that express two variants of the enzyme two variants of the enzyme are active over a greater range of temperature.

Which of the following is most likely the primary cause of the pattern of frequency of trisomy 21 births?

The majority of full trisomy 21 is caused by chromosomal nondisjunction occurring during maternal meiotic division (∼90%). Errors occur more frequently in the first maternal meiotic division than the second (73% vs.

What is the best explanation for the pink flowered offspring?

What is the best explanation for the pink-flowered offspring? Blue flowers are incompletely dominant to pink flowers. Pink flower color is a trait recessive to blue flower color. Pink flower color is the result of somatic mutations in the flower color gene.