- Research Article
- Open access
- Published:
Positive selection on rare variants of IGF1R and BRD4 underlying the cold adaptation of wild boar
Genetics Selection Evolution volume 57, Article number: 40 (2025)
Abstract
Background
Domestic piglets often die of hypothermia, whereas Eurasian wild boar (Sus scrofa) thrives from tropical lowlands to subarctic forests. The thermoregulation of wild boar offers a natural experiment to uncover the genetic basis of cold adaptation.
Methods
We conducted whole-genome resequencing on wild populations from cold regions (northern and northeastern Asia, with six samples) and warm regions (southeastern Asia and southern China, with five samples). By integrating publicly available data, we compiled a core dataset of 48 wild boar samples and an extended dataset of 445 wild boar and domestic pig samples to identify candidate genes related to cold adaptation. To investigate the functional effects of two candidate variants under positive selection, we performed CUT&Tag and RNA-seq using the northeastern Asian Min pig breed as a proxy for a cold-adapted population.
Results
Our study identified candidate genes associated with cold adaptation, which are significantly enriched in thermogenesis, fat cell development, and adipose tissue pathways. We discovered two enhancer variants under positive selection: an intronic variant of IGF1R (rs341219502) and an exonic variant of BRD4 (rs327139795). These variants exhibited the highest differentiation between cold- and warm-region populations of both wild boar and domestic pigs. Furthermore, these rare variants were absent in outgroup species and warm-region wild boars but were nearly fixed in cold-region populations. The H3K27ac CUT&Tag profiling revealed that the rs341219502 variant of IGF1R is linked to the gain of novel binding sites for three transcription factors, suggesting a regulatory change in enhancer function. In contrast, the rs327139795 variant of BRD4 may result in the loss of a phosphorylation site due to an alteration in the amino acid sequence.
Conclusion
Our study identified candidate genes for cold adaptation in wild boar. The variant rs341219502 in the IGF1R enhancer and the variant rs327139795 in the BRD4 exon were both under positive selection and nearly fixed in populations from cold regions, suggesting that they may have originated de novo in these populations. Further analysis indicated that rs341219502 could influence enhancer function, while rs327139795 may alter the amino acid sequence. Overall, our study highlights the adaptive evolution of genomic molecules that contribute to the remarkable environmental flexibility of wild boar.
Background
One of the most fundamental questions in evolution is to understand how populations adapt to new environments [1, 2]. Peripheral or marginal populations, which inhabit the edge of a species' distribution area, provide a valuable model for exploring this question. Peripheral populations often migrate from their ancestral territories to exploit new niches through population expansion. However, they face harsher and sometimes insurmountable environmental conditions at these boundary zones, such as extremely low temperatures and a scarcity of food resources, which can hinder their further expansion. The challenges presented by these survival conditions make peripheral populations ideal natural experiments for investigating the genetic bases of novel adaptive strategies in response to environmental constraints.
Advancements in sequencing technology and population genetics are increasingly allowing for the identification of functional genetic variants under natural selection, enabling organisms to adapt to new environments [3,4,5,6,7]. Strong selective forces in derived populations can leave distinct genomic signals that differentiate them from source populations [2, 8]. Research indicates that 1 to 15% of genes in the mammalian genome may experience positive Darwinian selection [9,10,11]. For example, in humans, it is estimated that approximately 10% of genes are under positive selection, although the concordance rates among different statistical tests typically range from 8 to 27% [10, 11].
To identify selective signals in peripheral populations, studies often examine patterns across multiple genomic parameters, including allele frequencies [2, 12, 13], nucleotide diversity [14,15,16], and haplotype segregation [17, 18]. Human populations have demonstrated adaptability to various environments, from tropical regions in Africa to the cold peripheral regions of Siberia. A genome-wide scan analysing haplotype and allele frequency patterns in Siberian populations revealed selective signals associated with genes that contribute to cold adaptation [3]. One example is the identification of a globally low-frequency nonsynonymous variant in the CPT1A gene, which is thought to be the most likely causative mutation, as it has a high allele frequency in local Siberian populations [19].
In this study, we focus on the wild boar (Sus scrofa), a species known for its remarkable adaptability and wide distribution across various climatic regions in Eurasia. Wild boar occupy diverse ecological niches, ranging from the humid tropics of Southeast Asia, through temperate zones, to the extreme environments of the Qinghai-Tibet plateau and subarctic Siberia [20, 21].
Phylogenetic analyses indicate that the Eurasian wild boar diverged from a clade of closely related Sus species at the beginning of the Pliocene epoch, approximately 5.3 to 3.5 million years ago, in tropical Asia [22]. Between 1 and 2 million years ago, the species expanded beyond tropical Asia, establishing several geographically distinct populations across Eurasia [20]. Biogeographic studies have traced the migratory path of wild boars from southern to northern Asia [23, 24].
This long-range migration of wild boar suggests that the populations in tropical regions acted as source populations, while those in Siberian areas are peripheral populations. By examining genomic changes in these peripheral wild boar populations, we can investigate the genetic basis of traits that are subject to positive selection in response to novel climatic challenges under cold environments.
Wild boar have migrated from tropical Asia and established their northernmost natural habitat in Siberia, reaching latitudes as far north as 61°N [21]. Given their tropical origins, we hypothesise that the genomes of wild boar populations in cold regions may show signs of adaptation to these environments. However, this hypothesis has not yet been empirically tested [21].
In our study, we conducted whole-genome sequencing of wild boar populations from both tropical Asian regions (specifically Vietnam, with five samples) and cold Asian regions (Siberia, with six samples). We also included 445 publicly available samples from both domestic pigs and wild boars for comparison [17, 25,26,27,28,29,30,31,32,33,34].
Our research focused on identifying candidate genes and associated biological pathways that may underlie the cold adaptation of wild boar. We examined whether these candidate genes have been highlighted in previous studies involving other species. By analysing these selectively advantageous genes, we discovered the leading genetic variants that exhibit significant differentiation between warm- and cold-tolerant populations, as well as site-level signals indicating selective sweeps among all regulatory and missense variants.
Additionally, we investigated the functional implications of these leading variants through experiments, including Cleavage Under Targets and Tagmentation (CUT&Tag) and RNA-seq. Our study provides valuable insights into the positively selected genes and rare variants that may contribute to the bioclimatic adaptation of wild boar in cold regions.
Methods
Whole genome sequencing and variants calling
Genomic DNA was extracted from hair follicles of five Vietnamese wild boars (Son La province, Vietnam, ~ 20°N), three wild boars from the Novosibirsk region (Novomyhaylovka village, Kochenyovskiy district, Latitude 55°17′35ʺN, Longitude 81°48′38ʺE), one wild boar from Tyva (~ 51°N), one wild boar from Buryat (~ 51°N), and one wild boar from Zabaykalsky Krai (~ 52°N). The extraction protocol was based on a modified phenol–chloroform method [35]. The procedure is summarised as follows: more than ten hair follicles were collected from each pig and transferred into a 1.5 mL microcentrifuge tube. A lysis buffer containing SDS and Proteinase K was added, and the samples were incubated at 56 °C for 1–2 h to ensure complete tissue digestion. After lysis, an equal volume of phenol/chloroform/isoamyl alcohol (25:24:1) was added and mixed thoroughly. Following centrifugation, the clear aqueous phase was transferred to a new tube, and genomic DNA was precipitated using two volumes of pre-chilled ethanol. Ethanol was chosen instead of isopropanol to avoid interference in downstream applications due to the lower volatility of isopropanol. The DNA was pelleted through high-speed centrifugation, washed, and resuspended in TE buffer. The extracted DNA exhibited A260/A280 ratios between 1.85 and 2.00, with yields exceeding 4 μg per sample, which was sufficient for downstream PCR-based applications. Whole-genome sequencing (WGS) was performed on all samples using the DNBSEQ-T7 platform (MGI) with a paired-end library (2 × 125 bp).
Whole-genome mapping and calling processes were primarily carried out following established methodology [17, 36, 37]. In brief, whole-genome short-read data for samples representing wild boars, domestic pigs, and outgroup species were cleaned using the fastp software with its default parameters [38]. The cleaned data were then mapped to the genomic reference Sscrofa 11.1 using BWA v0.7.17 [39]. Notably, sex chromosome variants were excluded from further analyses. Despite the availability of several reference-level assembled genomes for pigs, including various domestic breeds [40, 41] and the wild boar [37], we selected Sscrofa11.1 as the reference due to its superior annotation and sequencing quality [42].
For variant calling, we utilised two software programs, SAMtools v1.15.1 [43] and GATK v4.2.6.1 (https://gatk.broadinstitute.org/hc/en-us). The primary steps included marking duplicates, recalibrating base quality scores, performing per-sample calling with HaplotypeCaller, and conducting joint-calling with GenotypeGVCFs. We filtered variants using the expression "QUAL < 100.0 || QD < 2.0 || MQ < 40.0 || FS > 200.0 || SOR > 10.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0". These procedures were applied to both newly sequenced samples and publicly available ones [17, 25,26,27,28,29,30,31,32,33,34]. To minimise the impact of samples with low sequencing depth, we applied a minor allele frequency (MAF) threshold of 0.05 and a maximum missing genotype rate of 0.2. Population-based phasing was conducted using Beagle 5.2 [44].
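The filter expression above can be read as a per-variant predicate: a site is removed as soon as any one clause is true. The following is a minimal Python sketch of that logic, assuming the annotations have already been parsed into a dictionary. The function name `fails_hard_filter` and the example values are hypothetical and are not part of GATK. Treating an absent annotation as "clause not triggered" is also an assumption of this sketch.

```python
# Thresholds copied from the filter expression in the text.
# A variant is flagged when ANY clause is true (logical OR).
LOW_IS_BAD = {"QUAL": 100.0, "QD": 2.0, "MQ": 40.0,
              "MQRankSum": -12.5, "ReadPosRankSum": -8.0}
HIGH_IS_BAD = {"FS": 200.0, "SOR": 10.0}


def fails_hard_filter(ann):
    """Return True if the annotation dict `ann` trips any filter clause.

    Assumption of this sketch: an annotation that is absent from `ann`
    does not trigger its clause.
    """
    for key, cutoff in LOW_IS_BAD.items():
        if key in ann and ann[key] < cutoff:
            return True
    for key, cutoff in HIGH_IS_BAD.items():
        if key in ann and ann[key] > cutoff:
            return True
    return False


# Hypothetical annotation values, for illustration only.
good_site = {"QUAL": 812.3, "QD": 21.4, "MQ": 60.0, "FS": 1.2, "SOR": 0.7}
poor_site = {"QUAL": 55.0, "QD": 21.4, "MQ": 60.0, "FS": 1.2, "SOR": 0.7}

print(fails_hard_filter(good_site), fails_hard_filter(poor_site))
```

In this toy case only `poor_site` is flagged, because its QUAL falls below 100.0 while every other annotation is within bounds.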
Use of other datasets and grouping strategy
To reduce computational burden while gaining a comprehensive understanding, we designed two datasets: a core dataset focusing on wild boar and an extended dataset that included both wild boar and domestic pigs. The simplified core dataset consisted of 48 wild boar samples and 15 samples from outgroup species (see Additional file 1, Table S1). This core dataset was utilised for nearly all analyses, except for the initial confirmation of population identity and the analysis of allele frequency distributions.
For the core dataset, we included 24 wild boars from northern and northeastern China to represent the cold region population, and 24 wild boars from southern China, Sumatra, and southern Europe (Italy and Greece) to represent the warm region populations. The 15 samples of outgroup species comprised four different Sus species: Sus verrucosus, Sus celebensis, Sus cebifrons, and Sus barbatus [22] (see Additional file 1 Table S1).
Incorporating European pigs into our dataset enabled us to distinguish between alleles of European origin and those that arose locally in Asia, which helped minimise the risk of misinterpreting recent gene flow as indicators of natural selection for cold adaptation. Additionally, while southern Europe differs from tropical Asia in environmental conditions, both regions share a more temperate climate and are closer in latitude compared to Siberia, which is classified as the cold region in this study. For simplicity, we categorise Italy and tropical Asia together under the non-cold region label, referring to them as the warm region in this analysis.
The second dataset was significantly larger, comprising 488 samples that included domestic pigs, wild boar, and outgroup species. This dataset was utilised to confirm the population identity of new samples and to evaluate the distribution of allele frequencies across different geographical populations (see Additional file 1, Table S2). Specifically, the dataset included 11 new samples along with 477 samples from the public databases, consisting of wild boar and domestic pig samples (see Additional file 1, Table S2).
The geographic populations were categorized as follows: European wild boar (EUW, 47 samples), East Asian northern wild boar (EANW, 30 samples), East Asian southern wild boar (EASW, 18 samples), southeastern Asian wild boar (SEAW, 8 samples), and outgroup species (OG, including African warthog, African bush hog, African red river hog, pygmy hog, Southeast Asian Sus species, 32 samples).
Additionally, domestic pig samples were classified into the following categories: European domestic pigs (EUD, 186 samples), East Asian northern domestic pigs (EAND, 84 samples), East Asian southern domestic pigs (EASD, 31 samples), and East Asian western domestic pigs (EAWD, which are also Tibetan pigs, 52 samples). After applying filters for MAF (0.05) and genotype missingness (0.2), we retained 21,845,142 variants for the extended dataset and 7,804,721 variants for the core dataset (excluding outgroup samples from both) (Table 1).
Population relationships, ancestry, and gene flow
To investigate whether sample size affects the inferred population relationships, we analysed two datasets: a core dataset consisting of 63 genomes of wild boar and outgroup species samples and an extended dataset comprising 488 genomes of wild boar, domestic pigs, and outgroup samples. We began by evaluating and comparing the phylogenetic topologies of the genomes using a distance matrix generated based on identity by state (IBS) (PLINK v1.9, "--maf 0.05 --geno 0.2 --distance square 1-ibs --thin-count 100000").
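To make the distance measure concrete, the sketch below computes a 1 − IBS distance between two samples as one minus the proportion of alleles shared identical by state across sites genotyped in both. It is a simplified pure-Python illustration, not PLINK's implementation. The genotype coding (0, 1, 2 alternate-allele counts, `None` for a missing call) and the function names are assumptions of this example.

```python
def one_minus_ibs(a, b):
    """1 - IBS distance between two genotype vectors (0/1/2, None = missing)."""
    shared = 0
    called = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue                    # skip sites missing in either sample
        shared += 2 - abs(x - y)        # alleles shared IBS at this site: 0, 1 or 2
        called += 1
    if called == 0:
        raise ValueError("no sites genotyped in both samples")
    return 1.0 - shared / (2.0 * called)


def distance_matrix(samples):
    """Square, symmetric matrix of pairwise 1 - IBS distances."""
    n = len(samples)
    return [[0.0 if i == j else one_minus_ibs(samples[i], samples[j])
             for j in range(n)] for i in range(n)]


# Hypothetical genotypes for three samples at four sites.
toy = [[0, 1, 2, 0],
       [0, 1, 2, 0],
       [2, 1, 0, None]]
d = distance_matrix(toy)
```

Here the first two samples are identical (distance 0), while the third shares 2 of 6 alleles with the first across the three jointly genotyped sites, giving a distance of 2/3. A neighbor-joining tree can then be built from such a matrix.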
For the core dataset, we also conducted principal component analysis (PCA) to analyse inter-group relationships and population clustering. To assess potential gene flow between western and eastern Siberian populations, we employed TreeMix analysis [45]. We estimated the optimal number of migration events (m) using OptM inference, applying a model threshold of 99% [46]. We also performed ancestry estimation and admixture analyses using ADMIXTURE v1.3 [47].
Genome-wide scan for natural selection signals in the North Asian wild boar populations
We conducted a genome-wide scan for signals of natural selection in the North Asian wild boar population based on the core dataset (MAF 0.05) (see Additional file 1, Table S1), using the following four complementary statistics: the fixation index (Fst) and nucleotide diversity ratio (θπ_warm/θπ_cold); the XP-CLR test and nucleotide diversity ratio (θπ_warm/θπ_cold) [48]; the extended haplotype homozygosity (EHH)-based statistic (ihh12) [49, 50]; and the Hudson–Kreitman–Aguadé (HKA)-like test [51, 52]. To focus on selection analyses, we removed samples of putative European origin based on PCA, ADMIXTURE results, and phylogenetic data. This allowed us to retain closely related samples of Asian origin, representing populations from cold and temperate regions, with a total of 15 samples from cold regions and 21 samples from temperate regions.
The fixation index (Fst) and nucleotide diversity (π) are classical parameters used to understand genetic differentiation. It is expected that recent positive selection reduces nucleotide diversity while increasing Fst between two populations that exhibit differentiated phenotypes. Therefore, we anticipated that the selected chromosomal regions in North Asian wild boars would show higher Fst values but lower π values.
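The expectation of high Fst together with reduced π in the cold-region population can be illustrated with a small windowed calculation. The sketch below is an assumption-laden example rather than the pipeline used in this study: the text does not state which Fst estimator was applied, so Hudson's ratio-of-averages estimator is swapped in here for simplicity, and all allele counts are hypothetical.

```python
import math


def site_pi(k, n):
    """Unbiased per-site heterozygosity from k alternate alleles among n chromosomes."""
    return 2.0 * k * (n - k) / (n * (n - 1))


def window_pi(counts, n, length):
    """Average pairwise differences per base pair in a window of `length` bp."""
    return sum(site_pi(k, n) for k in counts) / length


def hudson_fst(counts1, n1, counts2, n2):
    """Hudson's Fst as a ratio of sums over the SNPs in a window (assumed estimator)."""
    num = den = 0.0
    for k1, k2 in zip(counts1, counts2):
        p1, p2 = k1 / n1, k2 / n2
        num += ((p1 - p2) ** 2
                - p1 * (1 - p1) / (n1 - 1)
                - p2 * (1 - p2) / (n2 - 1))
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den


# Hypothetical alternate-allele counts at four SNPs in one 10 kb window.
cold = [30, 29, 30, 1]   # 15 diploid samples -> 30 chromosomes
warm = [2, 20, 5, 3]     # 21 diploid samples -> 42 chromosomes
fst = hudson_fst(cold, 30, warm, 42)
log2_ratio = math.log2(window_pi(warm, 42, 10_000) / window_pi(cold, 30, 10_000))
```

A window behaving like a sweep in the cold-region population would show a large `fst` together with a positive `log2_ratio`, that is, lower diversity in the cold group than in the warm group.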
The XP-CLR test, a cross-population composite likelihood ratio test based on the site frequency spectrum (SFS), is a powerful test for identifying genomic regions under positive selection within a single population [48, 53]. This test detects such regions by identifying shifts in allele frequencies and patterns of linkage disequilibrium, making it effective for pinpointing both recent and ancient selection events [48, 53].
The method known as ihh12, which is based on haplotype analysis, was developed to detect both hard and soft selective sweeps. It provides greater power than other tools, such as iHS, for identifying soft sweeps [49, 50]. Hard sweeps occur when a beneficial mutation arises and quickly reaches fixation, leading to a reduction of genetic variation in the surrounding region. In contrast, soft sweeps involve multiple beneficial mutations at the same locus or rely on pre-existing genetic variation, resulting in a more gradual and partial reduction in genetic diversity [49, 50].
Additionally, the HKA-like test is grounded in the neutral theory prediction that species divergence (fixed site differences) should correlate with population polymorphism [51]. If selection operates at the population level, we expect a faster reduction in polymorphisms compared to species divergence [52, 54].
These methods and principles have also been extensively applied in medical genetics to identify disease variants [55,56,57]. In our analysis, we used the number of fixed inter-species differences (> 60 SNPs) as a measure of divergence between S. scrofa and the four other Sus species (S. verrucosus, S. celebensis, S. cebifrons, and S. barbatus).
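The logic of comparing polymorphism with divergence can be sketched as a 2 × 2 contingency test, in which the counts of polymorphic and fixed sites in a focal gene are contrasted with those of the genomic background. This is one common way to formulate an HKA-like comparison. Whether it matches the cited implementation in detail is an assumption, and the function names and counts below are hypothetical.

```python
def poly_div_chi2(poly_gene, div_gene, poly_bg, div_bg):
    """Chi-square statistic (1 d.f.) for a 2 x 2 table of polymorphic vs fixed sites.

    Rows: focal gene, genomic background. Columns: polymorphic, fixed.
    """
    a, b, c, d = poly_gene, div_gene, poly_bg, div_bg
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denom


def is_polymorphism_deficit(poly_gene, div_gene, poly_bg, div_bg):
    """True when the gene has a lower polymorphism/divergence ratio than the background."""
    return poly_gene * div_bg < poly_bg * div_gene


# Hypothetical counts, for illustration only.
chi2 = poly_div_chi2(5, 80, 5000, 20000)
deficit = is_polymorphism_deficit(5, 80, 5000, 20000)
```

A gene under recent positive selection is expected to combine a large statistic with a polymorphism deficit, as in this toy example, whereas a gene whose ratio matches the background yields a statistic of zero.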
The results from the four methods were normalised based on ranked genes, considering only genes that fell outside 99% of the parameter distribution as significant departures from a strict neutral expectation for each method. To ensure a more rigorous selection of candidates for cold adaptation, we classified genes as positively selected only if at least three of the four methods supported them (see Additional file 1, Table S7).
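The consensus rule described above (top 1% per method, then support from at least three of four methods) can be expressed compactly. The sketch below assumes that each method yields one score per gene and that larger scores are more extreme. The gene labels, scores, and function names are hypothetical.

```python
def top_fraction(scores, fraction=0.01):
    """Genes ranked in the top `fraction` of `scores` (higher = more extreme)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, int(len(ranked) * fraction))
    return set(ranked[:n_keep])


def consensus_candidates(per_method_scores, min_support=3, fraction=0.01):
    """Genes that are outliers in at least `min_support` methods, with their support count."""
    support = {}
    for scores in per_method_scores.values():
        for gene in top_fraction(scores, fraction):
            support[gene] = support.get(gene, 0) + 1
    return {g: s for g, s in support.items() if s >= min_support}


# Toy data: 200 hypothetical genes, so the top 1% is two genes per method.
genes = [f"gene{i}" for i in range(200)]


def with_outliers(high):
    scores = {g: 0.0 for g in genes}
    scores.update(high)
    return scores


toy = {
    "fst_pi": with_outliers({"gene0": 10.0, "gene1": 9.0}),
    "xpclr": with_outliers({"gene0": 10.0, "gene2": 9.0}),
    "ihh12": with_outliers({"gene0": 10.0, "gene1": 9.0}),
    "hka": with_outliers({"gene2": 10.0, "gene3": 9.0}),
}
candidates = consensus_candidates(toy)
```

Only `gene0` is an outlier in three methods here, so it is the single gene retained. Raising `min_support` to 4 would correspond to the stricter set confirmed by all methods.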
Initially, positively selected sites were identified using a conservative approach, which involved calculating the largest allele frequency difference (ΔdAFcold-warm) between cold- and warm-region wild boar populations for both regulatory and exonic variants. We further validated the signal of recent positive selection using the Extended Haplotype Homozygosity (EHH) method applied to the focal variants [58], as implemented in rehh v2 [59]. The ancestral-derived relationships were inferred based on polarising sites from the outgroup Sus species. We estimated the local phylogeny using FastTree v2.1 [60], based on haplotype consensus sequences generated with SAMtools and BCFtools v1.15 [43]. Finally, we assessed potential convergent evolution by comparing our identified candidate genes with genes that have been reported to be under positive selection in cold regions for other species (human [3] and cattle [61]).
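For readers unfamiliar with EHH, the statistic at a given distance from a focal variant is the probability that two randomly chosen haplotypes carrying the same core allele are identical over the whole interval between the core and that distance. The sketch below illustrates this definition on phased toy haplotypes. It is not the rehh implementation, and the haplotype strings and function name are hypothetical.

```python
from collections import Counter


def ehh(haplotypes, core_index, core_allele, upto_index):
    """EHH among carriers of `core_allele`, extended from the core site to `upto_index`."""
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two carrier haplotypes")
    lo, hi = sorted((core_index, upto_index))
    # Group carriers by their allelic state over the whole interval.
    groups = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    same_pairs = sum(c * (c - 1) // 2 for c in groups.values())
    return same_pairs / (n * (n - 1) // 2)


# Hypothetical phased haplotypes; the focal variant is at index 2 ("1" = derived).
haps = ["01110", "01110", "01110", "01111",   # derived-allele carriers
        "10001", "00010", "11000"]            # ancestral-allele carriers

ehh_derived = ehh(haps, 2, "1", 4)
ehh_ancestral = ehh(haps, 2, "0", 4)
```

In this toy case the derived haplotypes remain largely identical away from the core (EHH = 0.5 at index 4) while the ancestral ones have all diverged (EHH = 0), which is the asymmetry expected under a recent sweep of the derived allele.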
Functional pathway enrichment analysis for genes supported by at least three methods and allele frequency distribution for focal variant(s)
The intricate nature of climatic adaptation involves the responsiveness of hundreds of genes, many of which are likely under positive selection [3, 62]. To better understand the overall patterns of biological pathways, we conducted a functional enrichment analysis. For this analysis, we included only genes that were supported by at least three different methods, utilising the Metascape database [63].
We examined the origin of the derived allele by analysing allele frequency distributions across different populations and species. The population distribution analysis was based on an extended dataset of 488 samples (Additional file 1, Tables S2 and S10). We defined the focal allele across species as fixed if the homozygous allele was present in all outgroup samples.
To validate the presence or absence of an orthologous mutation, we used Ensembl v105 (https://dec2021.archive.ensembl.org/index.html). This allowed us to confirm presence or absence based on a whole-genome alignment between S. scrofa (Suidae) and Catagonus wagneri, a species from the sister family Tayassuidae.
The CUT&Tag and RNA-seq data collection and processing
To investigate whether the identified variants under selection have distinct functional effects in the cold-adapted population, we performed functional assays and RNA-seq analysis. The samples for Cleavage Under Targets and Tagmentation (CUT&Tag) and RNA-seq were collected from 180-day-old Min pigs, a local breed from Northeast China known for its cold adaptation [64]. We specifically chose 180-day-old pigs to control for age-related variability in gene expression, ensuring that the observed differences reflect genetic mechanisms related to cold adaptation rather than developmental effects.
Tissue samples were snap-frozen in liquid nitrogen immediately after collection to preserve RNA integrity and histone modifications. All sample collections were approved by the Ethics Committee of Huazhong Agricultural University (2022-0031). The CUT&Tag experiment followed the original CUT&Tag guidelines for tissues [65]. The H3K27ac histone modification antibody was employed for antibody enrichment in the CUT&Tag analysis of fat and diencephalon tissues from the Min pig [66].
The RNA-seq protocols for fat and diencephalon tissues were based on rRNA-depletion and strand-specific RNA-seq methods provided by Illumina (Illumina, San Diego, CA). Sequencing for both CUT&Tag and RNA-seq was performed using the Illumina NovaSeq6000 (PE150) platform.
The sequencing data from CUT&Tag were cleaned using Cutadapt to remove the adapter sequences [67]. The clean data were then mapped to the Sus scrofa 11.1 reference genome using Bowtie2 [68]. Peak calling was performed with MACS [69]. The vertebrate motifs referenced in JASPAR2020 were utilized to match the enhancer sequence of focal genes [70]. The RNA-seq data were processed in accordance with our previous study [71].
The origin of candidate variants under positive selection
There are four candidate scenarios that could explain the evolutionary origin of the derived alleles under positive selection. In the first scenario, the two alleles emerged de novo in the cold-region population. In the second scenario, the low-frequency standing variant was transferred from warm-region populations to cold-region populations through gene flow. In the third scenario, the two alleles appeared de novo in domestic pigs within the last 10,000 years and were subsequently transferred to cold-region wild boar populations via gene flow. In the fourth scenario, low-frequency standing variation existed at the time of divergence of the northern populations and later increased in frequency due to selection or drift.
To differentiate between the first two scenarios, we performed gene flow analysis on the genomic region surrounding the focal variants. Specifically, we conducted a localised TreeMix analysis to determine the direction of gene flow [45]. Additionally, we confirmed potential gene flow by analysing the phylogenetic discordance between local and background topologies, using a method similar to that of the previous study [17]. The third scenario would be valid if the allele frequencies of the positively selected derived alleles were significantly higher in domestic pigs than in wild boar populations. The fourth scenario is challenging to assess in this study due to the lack of ancient Asian wild boar samples.
Historical population demography for cold-region populations
To explore the demographic history of tropical Asian wild boar and Siberian wild boar, we conducted the PopSizeABC analysis [72], using the same samples as used for the selection scan. Historical demographic analysis can help us ascertain whether the signals of natural selection we identified are instead due to genetic drift. This is important because genetic drift, which can occur when the effective population size (Ne) is small, might lead to the random fixation of neutral variants [73].
Specifically, we began by summarising the folded allele frequency spectrum and the average zygotic linkage disequilibrium. Next, we simulated 400,000 datasets based on random population size histories. These pseudo-observed datasets (PODs) matched the sample size and covered 100 independent regions, each 2 Mb long. We independently estimated ancestral population sizes using the ABC method for each POD. Finally, we compared the estimated values to the true values, with a tolerance rate of 0.001.
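The rejection step at the heart of ABC can be sketched in a few lines: parameters are drawn from a prior, data are simulated, and only the fraction of simulations (the tolerance) whose summary statistics lie closest to the observed ones is kept. With 400,000 simulations and a tolerance of 0.001, this would retain 400 simulations. The example below is a deliberately simple toy model with one parameter and one summary statistic. It is not PopSizeABC, and every name and value in it is hypothetical.

```python
import random


def abc_rejection(observed, simulate, prior_draw, n_sims, tolerance, rng):
    """Keep the `tolerance` fraction of simulations whose summary is closest to `observed`."""
    draws = []
    for _ in range(n_sims):
        theta = prior_draw(rng)
        distance = abs(simulate(theta, rng) - observed)
        draws.append((distance, theta))
    draws.sort(key=lambda pair: pair[0])
    n_keep = max(1, round(n_sims * tolerance))
    return [theta for _, theta in draws[:n_keep]]


rng = random.Random(1)  # fixed seed so the toy run is reproducible

# Toy model: the summary statistic equals the parameter plus Gaussian noise.
accepted = abc_rejection(
    observed=5.0,
    simulate=lambda theta, r: theta + r.gauss(0.0, 0.5),
    prior_draw=lambda r: r.uniform(0.0, 10.0),
    n_sims=20_000,
    tolerance=0.01,
    rng=rng,
)
posterior_mean = sum(accepted) / len(accepted)
```

The accepted parameter values approximate the posterior distribution. Applying the same procedure to pseudo-observed datasets with known parameters allows the estimates to be compared with the true values, as described above.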
Results
The phylogenetic origin of wild boar from cold and warm regions
We obtained a total of 821 Gb of whole-genome data from 11 new samples: six wild boars from Siberia and five from Southeast Asia (Fig. 1a). After mapping the data to the pig genome reference (Sscrofa v11.1), the average sequencing depth was estimated to be 28.32x. We compiled two datasets for different analytical purposes, each with varying sample sizes: a core dataset comprising 63 genomes and an extended dataset encompassing 488 genomes (see Table 1, Additional file 1 Table S1 and Table S2).
Fig. 1 The population distribution, phylogeny, structure, and Admixture analysis of wild boar populations. a The wild boar population distribution (map retrieved from the IUCN red list: https://www.iucnredlist.org/species/41775/44141833). The red and blue colors indicate samples from warm and cold regions, respectively. Triangles and stars indicate populations with newly sequenced data and publicly available data, respectively. The population locations are Sumatra (A), Vietnam (B), South China (C), Northeast China and Korea (D), Zabaykalsky Krai (E), Buryat (F), Tyva (G), Novosibirsk (H), and Italy and Greece (I). b The phylogenetic tree for the extended dataset of 488 samples is based on IBS distances among samples. Background shadows highlight major clades, and brackets indicate newly sequenced samples. c The principal component analysis (PCA) and phylogenetic tree based on autosomal SNPs. In the phylogenetic tree, all major clade divisions were supported by 1000 bootstrap replicates (100%). d The admixture analysis (K = 2, 3, 4) for population ancestry. The green color indicates the outgroup Sus species (S. verrucosus, S. celebensis, S. cebifrons, and S. barbatus) from the islands of Southeast Asia. The red symbols show the Southeast and East Asian population ancestry in populations from A to G. The light blue shows the European ancestry in H and I
Using a distance matrix based on IBS, along with the neighbor-joining method, we initially evaluated the phylogenetic relationships of all 488 samples (Fig. 1b). Our analysis revealed that the eastern Siberian samples (E, F, and G in Fig. 1a) clustered within the clade of wild boar populations from northern China, northeastern China, and northeastern Asia, including South Korea and Japan (D). In contrast, the western Siberian samples (H in Fig. 1a) clustered with the European wild boar, while the southeastern Asian wild boar showed close clustering with wild boar and indigenous breeds from southern China.
For the core dataset, we conducted IBS phylogenetic inference and performed PCA to validate population relationships (Fig. 1c). These analyses showed a similar population phylogeny to that obtained with the extended dataset. Specifically, our newly sequenced tropical population (B) clustered together with the temperate wild boar from southern China (C in Fig. 1a). The newly sequenced Siberian population was divided into two clusters (E, F, and G vs. H, Fig. 1c), which was consistent with the pattern observed in the extended dataset (Fig. 1b). These results demonstrated a greater divergence between populations from western (H) and eastern Siberia (E, F, and G). In contrast, the tropical population (B, Vietnam) showed a close relationship with the temperate Asian wild boar populations (southern China, C). Overall, these findings indicate a substantial genomic differentiation between the peripheral populations in cold regions and the source populations in tropical regions.
Further analysis on population structure was conducted using ADMIXTURE v1.3 [47] (Fig. 1d). A cross-validation approach identified the optimal number of distinct ancestries at K = 4 (Fig. 2a). The investigation of ancestral composition, considering ancestry counts ranging from two to four, revealed two primary clades that aligned with the established ancestries of Eurasian wild boar and domestic pigs, namely Asian and European lineages.
Fig. 2 ADMIXTURE and gene flow analyses among major populations. a The cross-validation errors of the ADMIXTURE tool for inferring population ancestry and admixture. b The optimal number of migration events (m) for TreeMix based on the inference of OptM estimation. Over 99.8% of the variance was explained when m = 3. c The δm estimation supported three migration events. d The direction of gene flow revealed by TreeMix, based on autosomal SNPs. The gene flow from eastern to western Siberia was detected (EFG to H). Three arrows show the directions from donor populations to recipient populations. Note: all the population codes are the same as in Fig. 1a
To examine gene flow between western and eastern Siberian populations, we utilised TreeMix (v1.12) [45] (Fig. 2). This analysis determined the optimal number of migration events (m) to be three, which accounted for over 99.8% of the variance in genetic relatedness among the populations (Fig. 2b, c). Notably, the inferred migration events indicated a predominant westward gene flow from eastern to western Siberian populations (Fig. 2d).
This division was further supported by PCA and phylogenetic assessments, which clarified the major population relationships (Figs. 1b and c). Interestingly, the population in western Siberia exhibited a 9.64% composition of Asian ancestry when analysed at K = 3 (Fig. 1d), suggesting limited gene flow from Asia into the western Siberian population.
Genes under selective sweeps and their functional enrichment in thermogenic and adipose-related pathways
We used four complementary analytical approaches to identify candidate genes that exhibit signs of selective pressure in cold-region wild boar, specifically those native to Siberia, northern and northeastern China, and northeastern Asia (see Additional file 1, Tables S2–S7). Our comprehensive analysis revealed that 1.54% of genes (313 out of 20,306, Ensembl v105) were consistently identified as being under selection by at least three of the four methods, reflecting a conservative estimate. Furthermore, 0.45% of the identified genes (92 out of 20,306) were confirmed by all four analytical methods (see Additional file 1, Table S7).
We analysed the functional enrichment of genes that were supported by at least three methods (Fig. 3a, Table 2, and Additional file 1, Table S8). This analysis revealed four pathways that may be related to cold resistance: the ‘thermogenesis’ pathway (ADCY9, NDUFB6, PPARG, PRKACB, SMARCC1, TSC2); the ‘regulation of cold-induced thermogenesis’ pathway (ACADL, IGF1R, JAK2, NOVA1, NOVA2); the ‘positive regulation of adipose tissue development’ pathway (PPARG, NCOA2, SIRT1); and the ‘fat cell differentiation’ pathway (PPARG, TGFB1, SIRT1, WWTR1) (p < 0.05, Fig. 3a). These findings suggest that specific pathways, particularly those regulating thermogenesis and fat cell differentiation, were crucial for the adaptation of cold-region wild boars to their harsh environmental conditions.
Functional enrichment analysis and selection signals for the candidate gene IGF1R underlying cold adaptation in the cold-region wild boar. a The enriched pathways were analyzed using the Metascape database. Only the significantly enriched pathways (p < 0.05) are listed (see Additional file 1, Table S8). b Haplotype blocks around IGF1R for the cold and warm region samples based on variants with allele frequencies higher in cold than in warm regions (ΔdAFcold-warm > 0.5). c The plot of the HKA-like test. The vertical axis shows the decreased numbers of polymorphic sites relative to the counts of interspecies divergent sites. d The distribution of the Fixation index (Fst) and nucleotide diversity ratio [log2(θπ_warm/θπ_cold)]
By analysing genetic variants with higher allele frequencies in wild boar from cold regions (Siberia, Korea, northern and northeastern China) compared to those from warmer regions (temperate and tropical Asia), we discovered a long haplotype block that exhibited strong linkage disequilibrium (LD, ΔdAFcold-warm > 0.5, D’ > 0.9) among the variants of seven genes (FAM169B, PGPEP1L, IGF1R, SYNM, TTC23, LRRC28, and MEF2A). Notably, this region was the longest gene cluster we identified showing selective signals, spanning 1.3 Mb on chromosome 1 (137.2 Mb–138.5 Mb, Fig. 3b).
Among the linked genes, the Insulin-like Growth Factor 1 Receptor (IGF1R) received support from all four evaluation methods (see Additional file 1, Table S7). This suggests a reliable indication of positive selection that acted on this gene. Using the HKA-like test [51], we identified IGF1R as the most distinguished gene, demonstrating the highest reduction of polymorphism relative to divergence, which supports a significant deviation from neutral evolution (Fig. 3c).
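To illustrate the logic of comparing polymorphism with divergence, the sketch below contrasts the counts for one gene with a genome-wide background in a 2 × 2 chi-square test. All counts are hypothetical, and the plain Pearson chi-square used here is an assumption made for illustration; it is not necessarily the exact formulation of the HKA-like test in [51].

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 d.f., no continuity correction) for the table
           polymorphic  divergent
    gene        a           b
    genome      c           d
    Returns (statistic, p-value)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 d.f., the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: a gene with few polymorphic sites relative to its
# divergent sites, compared with a hypothetical genome-wide background.
gene_poly, gene_div = 5, 60
bg_poly, bg_div = 4000, 8000

stat, p_value = chi2_2x2(gene_poly, gene_div, bg_poly, bg_div)
print(f"chi2 = {stat:.2f}, p = {p_value:.2e}")
```

A gene whose polymorphic-to-divergent ratio falls well below the background ratio yields a large statistic, which is the pattern expected after a selective sweep.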
At the population level, IGF1R, ALDH1A2, and PGPEP1L exhibited the highest inter-population divergences between wild boar from cold and warm regions among all protein-coding genes (Fst = 0.65, Fig. 3d and see Additional file 1, Table S3). Notably, among these three genes, IGF1R showed the greatest reduction in nucleotide diversity for wild boar from cold regions compared to those from warm regions (Fig. 3d), indicating strong positive selection on linked variants that are advantageous for cold adaptation.
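The two statistics plotted in Fig. 3d can be illustrated with a short Python sketch. It uses a simple heterozygosity-based per-site Fst for two equally weighted populations, which is an assumption made for clarity; the estimator and windowing used in the study may differ, and all input values are hypothetical.

```python
import math

def fst_two_pops(p1, p2):
    """Per-site Fst = (H_T - H_S) / H_T for two equally weighted populations,
    where p1 and p2 are the frequencies of the same allele in each population."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0                          # monomorphic site: no differentiation
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t

def log2_pi_ratio(pi_warm, pi_cold):
    """log2(pi_warm / pi_cold); positive values mean reduced diversity in cold."""
    return math.log2(pi_warm / pi_cold)

# Hypothetical values for a strongly differentiated window.
print(round(fst_two_pops(0.95, 0.05), 3))
print(round(log2_pi_ratio(0.004, 0.001), 3))
```

Windows that combine high Fst with a strongly positive diversity ratio are the candidates for a sweep in the cold-region population.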
The potential convergent evolution for Siberian mammals
The potential role of convergent evolution was examined by analysing genes under positive selection for cold adaptation in Siberian human populations. This analysis was conducted using the iHS and XP-EHH methods (top 1% windows) from a previous study [3] (see Additional file 1, Table S9). By concentrating on shared genes that were supported by four rigorous methods (see Additional file 1, Table S7), we identified three consensus genes, SLCO1C1, PDE3A, and TTC28, which suggests instances of convergent evolution. Additionally, we found positive selection signals by comparing nucleotide diversity between cold- and warm-region populations and by conducting the HKA test. This analysis highlighted two neighboring genes, SLCO1C1 and PDE3A (Fig. 4a), as well as another gene, TTC28 (Fig. 4b).
The most differentiated intronic and exonic variants were detected in IGF1R and BRD4, respectively
By analysing polymorphic variants across 92 candidate genes identified by four selection screening approaches (see Additional file 1, Table S7), we identified the variant with the greatest allele frequency difference (ΔdAFcold-warm) between cold- and warm-region wild boar populations for both regulatory and exonic variants. For regulatory variants, we identified the highest ΔdAFcold-warm in an intronic variant of IGF1R (NC_010443.5:g.137677482C > T, c.94 + 12830G > A, intron 1, rs341219502), with ΔdAFcold-warm = 0.896.
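The ΔdAFcold-warm ranking described above can be sketched as follows. The variant names and allele counts are hypothetical and serve only to show the calculation: the derived allele frequency in the warm-region population is subtracted from that in the cold-region population, and variants are ranked by the difference.

```python
def derived_af(derived_count, total_alleles):
    """Derived allele frequency from allele counts."""
    return derived_count / total_alleles

def delta_daf(cold, warm):
    """Delta dAF (cold minus warm); each argument is (derived, total) counts."""
    return derived_af(*cold) - derived_af(*warm)

# Hypothetical (derived, total) allele counts: (cold-region, warm-region).
toy_variants = {
    "variant_A": ((18, 20), (1, 20)),
    "variant_B": ((10, 20), (8, 20)),
    "variant_C": ((4, 20), (12, 20)),
}

scores = {name: delta_daf(cold, warm) for name, (cold, warm) in toy_variants.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
strongly_differentiated = [v for v in ranked if scores[v] > 0.5]

print(ranked, strongly_differentiated)
```

Because the statistic is signed, only alleles that are more common in the cold-region population receive high scores, which matches the direction of interest here.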
We validated selection on this variant using the method of extended haplotype homozygosity (EHH) [58, 74]. This showed a decay of haplotype homozygosity as the distance from the focal core allele increased (Fig. 5a). This evidence of positive selection was further supported by changes in nucleotide diversity (π) and the level of polymorphism relative to divergence between populations (Fig. 5a). The nucleotide diversity (π) near rs341219502 was significantly lower in cold-region wild boar populations compared to their warm-region counterparts (chi-square test, p < 1.2 × 10⁻⁴). Similarly, we found a significant deficiency of derived polymorphic variants surrounding rs341219502 in cold-region wild boar relative to polarised divergent sites at the interspecific level (chi-square test, p < 2.2 × 10⁻¹⁶, red curve, Fig. 5b). The EHH decay was much more rapid for ancestral variant haplotypes (blue curve) than for derived variant haplotypes (red curve, Fig. 5c).
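For readers unfamiliar with EHH, the following minimal Python sketch shows how the statistic is computed for haplotypes carrying a given core allele. The phased haplotypes are hypothetical toy data ('0' ancestral, '1' derived), and the function is a simplified illustration rather than the software used in the study.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core_idx, core_allele, marker_idx):
    """EHH between the core site and a marker: the probability that two
    randomly chosen haplotypes carrying `core_allele` are identical over
    the whole interval from the core to the marker (inclusive)."""
    carriers = [h for h in haplotypes if h[core_idx] == core_allele]
    n = len(carriers)
    if n < 2:
        return float("nan")
    lo, hi = sorted((core_idx, marker_idx))
    groups = Counter(h[lo:hi + 1] for h in carriers)
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)

# Hypothetical phased haplotypes; the core site is at index 2.
haps = [
    "00100", "00100", "00100", "00101",   # derived carriers, little variation
    "01000", "10001", "00010", "11000",   # ancestral carriers, more diverse
]

for marker in range(5):
    print(marker, ehh(haps, 2, "1", marker), ehh(haps, 2, "0", marker))
```

In this toy example the derived haplotypes retain higher homozygosity away from the core than the ancestral ones, the same qualitative contrast as between the red and blue curves described above.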
Site-level and haplotype-based selection signals, and allele frequency changes for IGF1R and BRD4. a The local signals of IGF1R selective sweep based on evidence of nucleotide diversity (π) for cold- and warm-region populations (left axis) and the polarized HKA test (right axis). The red arrow indicates the focal site of rs341219502 of IGF1R. b The EHH bifurcation diagram for haplotype density and breakdown around the site rs341219502 of IGF1R. Ancestral haplotypes are blue and derived ones are red. The line thickness is positively correlated to the density of haplotypes. c The EHH “hat” diagram for ancestral and derived haplotypes around IGF1R allele rs341219502 of IGF1R. d The population and evolutionary conservation for the ancestral state of rs341219502 (“C”) of IGF1R based on the whole-genome alignment. The first sequence represents the mutant sequence of cold-region wild boars. The next 13 sequences are retrieved from genomes of pig breeds in Ensembl (v105) and the last sequence shows Catagonus wagneri genome (Ensembl v105). e The allele frequency distribution for rs341219502 of IGF1R in multiple populations. f The local signals of BRD4 based on evidence of nucleotide diversity (π) for cold- and warm-region populations (left axis) and the polarized HKA test (right axis). The red arrow indicates the focal site of rs327139795. g The EHH bifurcation diagram for haplotype density and breakdown around the site rs327139795 of BRD4. Ancestral haplotypes are blue and derived ones are red. The line thickness is positively correlated to the density of haplotypes. h The EHH “hat” diagram for ancestral and derived haplotypes around BRD4 allele rs327139795. i The population and evolutionary conservation for the ancestral state of rs327139795 of BRD4 based on the whole-genome alignment. j The allele frequency distribution for rs327139795 of BRD4 in multiple populations
A cross-species orthologous alignment of Amniota vertebrates (Ensembl v105) revealed that the ‘C’ allele was highly conserved in Suidae species, ranging from Catagonus wagneri to 13 pig breeds, suggesting that ‘C’ is likely the ancestral state (Fig. 5d). The allele frequency distribution indicated that this variant was absent in tropical Asian populations but fixed (100%) in northern Asian wild boar populations (Fig. 5e). All these findings support a recent selective sweep in the region surrounding rs341219502 of IGF1R.
For exonic variants, we identified those with disruptive or protein-altering effects. Based on the ΔdAFcold-warm metric, we revealed a missense derived variant in the BRD4 gene (NC_010444.4:g.62317232G > A, c.1043G > A, p.Ser348Asn, rs327139795) that exhibited the highest cold-warm differentiation (ΔdAFcold-warm = 0.854) among all exonic variants analysed. We observed decreased nucleotide diversity in cold-region populations compared to warm-region populations, along with reduced polymorphism relative to divergence in cold-region populations (Fig. 5f). We also noted a delayed decay of derived haplotype homozygosity (Fig. 5g and h). These findings support the hypothesis that recent positive selection acted on this variant.
The nucleotide change from ‘G’ to ‘A’ results in an amino acid change from Ser to Asn in the second exon of BRD4. Multiple species alignment indicated that the ancestral state of the variant ‘G’ is highly conserved among mammals (Fig. 5i). We did not detect the derived allele ‘A’ in nine outgroup species, including African warthogs, pygmy hogs, and Sus species, which suggests that the derived allele ‘A’ likely emerged after the speciation of Sus scrofa. We also found no evidence of the derived ‘A’ allele in wild boar populations from Southeast Asia or Europe, indicating its origin in East Asia (Fig. 5i).
Within East Asian wild populations, the derived allele of rs327139795 was nearly fixed in cold-region populations (96.30%) but was a rare allele in warm-region populations (2.78%, Fig. 5j). Homozygotes for the derived allele “AA” were prevalent in cold-region populations (92.59%, 25/27) but completely absent in warm-region populations (0%, 0/18).
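The reported percentages can be cross-checked with simple arithmetic. In the Python sketch below, the homozygote counts (25/27 and 0/18) come from the text, whereas the heterozygote counts (2 and 1) are inferred here from the reported allele frequencies and should be treated as an assumption.

```python
def summarise(n_hom_derived, n_het, n_hom_ancestral):
    """Return (derived allele frequency, fraction of derived homozygotes)
    from diploid genotype counts."""
    n_samples = n_hom_derived + n_het + n_hom_ancestral
    daf = (2 * n_hom_derived + n_het) / (2 * n_samples)
    hom_fraction = n_hom_derived / n_samples
    return daf, hom_fraction

# Cold region: 25 'AA' of 27 samples; 2 heterozygotes inferred (assumption).
cold_daf, cold_hom = summarise(25, 2, 0)
# Warm region: 0 'AA' of 18 samples; 1 heterozygote inferred (assumption).
warm_daf, warm_hom = summarise(0, 1, 17)

print(round(100 * cold_daf, 2), round(100 * cold_hom, 2))
print(round(100 * warm_daf, 2), round(100 * warm_hom, 2))
```

Under these inferred genotype counts, the allele frequencies and homozygote fractions agree with the values quoted above (96.30% and 92.59% for cold regions, 2.78% and 0% for warm regions).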
Transcriptional changes in rs341219502 of IGF1R and post-translational changes in rs327139795 of BRD4
Based on public data from H3K4me1 ChIP-seq of adipose and cerebellum tissue, we identified that the rs341219502 variant is located within the enhancer region of the first intron of IGF1R (Fig. 6). This suggests that this variant plays a significant role in regulating gene expression. To further explore this, we investigated the expression of IGF1R in the fat and diencephalon tissues of the Min pig, a local breed native to the cold region of northeastern China. We collected tissue samples from individuals carrying the mutant allele of IGF1R. Results from the RNA-seq expression analysis indicated that IGF1R is highly expressed in both the fat and diencephalon of adult Min pigs (Fig. 7a).
Mapping signals based on H3K4me1 and H3K27ac ChIP-seq data. The mapping signals based on H3K4me1 and H3K27ac ChIP-seq data retrieved from UCSC Genome Browser (http://genome.ucsc.edu/s/zhypan/susScr11_15_state_14_tissues_new). The red arrow shows the coordinate to the intronic variant rs341219502 of IGF1R
Regulatory enhancer mapping and mutational effects of rs341219502 in IGF1R and of rs327139795 in BRD4. a The RNA-seq expression of IGF1R in tissues of fat and diencephalon for the Min pig. b The H3K27ac intensity around the gene IGF1R. c The H3K27ac intensity around the rs341219502 in the IGF1R intron. d The predicted TF binding sites gained for rs341219502 (C > T) at IGF1R intron. e The RNA-seq expression profile of TFs in tissues of fat and diencephalon from the Min pig. Note: CPM, or Counts Per Million, is a gene expression normalization to make the expression levels comparable across different samples by accounting for sequencing depth and library size. f The expression of BRD4 in fat and thalamus tissue of Min pig. g The H3K27ac intensity around the BRD4 gene. h The H3K27ac intensity around the rs327139795 in BRD4 exon. i The TF binding nearby the rs327139795. j The amino acid change of the rs327139795 in exon 6 of BRD4. k The absence of phosphorylation site in mutant type of rs327139795 in exon 6 of BRD4
We conducted the CUT&Tag experiment [65] and identified the H3K27ac modification around the IGF1R gene (Fig. 7b). Specifically, we observed that signals related to the rs341219502 variant are within the enhancer region of the first intron of IGF1R (Fig. 7c). This finding confirms that this variant plays a role in regulating gene expression.
Furthermore, our prediction of transcription factor (TF) motifs showed that the derived allele ‘T’ introduces three novel TF binding sites, for NFATC3, SPI1, and RFX5 (Fig. 7d). We then analysed the expression levels of these three TFs using RNA-seq and found that they are consistently expressed in adipose and diencephalon tissues (Fig. 7e). These results indicate that the rs341219502 derived allele ‘T’ acts as an enhancer mutation, potentially increasing the activity of the IGF1R enhancer by creating new binding sites for these transcription factors.
We also conducted an analysis of BRD4, a gene expressed in both adipose tissues and the diencephalon. We observed enrichment of the H3K27ac modification around BRD4 and its exonic variant, rs327139795 (Fig. 7f–h). In contrast, transcription factor (TF) motif analyses did not identify any TF binding sites at the locus of the BRD4 exonic variant rs327139795 (Fig. 7i).
Since this variant is located in an exon of BRD4, we further examined the amino acid changes associated with alleles G and A (rs327139795). The results indicated that when the genomic sequence changes from G to A, the amino acid at this position (348aa) shifts from Serine to Asparagine (Fig. 7j). Serine is typically a site for phosphorylation, and substituting it with Asparagine is predicted to eliminate this phosphorylation capability. To confirm this prediction, we performed a phosphorylation site analysis of both the wild-type BRD4 and the rs327139795 variant using NetPhos 3.1 software, which employs neural network ensembles. The analysis verified the loss of the phosphorylation site at amino acid position 348 due to the substitution of Serine with Asparagine (Fig. 7k).
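The relationship between the coding change c.1043G > A and p.Ser348Asn can be checked against the standard genetic code with a short Python sketch. The actual reference codon in pig BRD4 is not given in the text; the sketch assumes it is AGT or AGC, the only two Serine codons with G at the second position.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def locate(cds_position):
    """Map a 1-based coding position to (codon number, 0-based offset in codon)."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3

def mutate(codon, offset, alt_base):
    """Return the codon with the base at `offset` replaced by `alt_base`."""
    return codon[:offset] + alt_base + codon[offset + 1:]

codon_number, offset = locate(1043)

# Assumed reference codons: the Serine codons carrying G at the affected position.
results = {}
for ref_codon in ("AGT", "AGC"):
    alt_codon = mutate(ref_codon, offset, "A")
    results[ref_codon] = (CODON_TABLE[ref_codon], CODON_TABLE[alt_codon])

print(codon_number, offset, results)
```

Coding position 1043 falls on the second base of codon 348, and a G-to-A change at that base converts either candidate Serine codon into an Asparagine codon, consistent with p.Ser348Asn.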
Potential de novo origin and recent selective sweep of rs341219502 and rs327139795 in cold-region wild boar
The localised TreeMix analysis of the genomic regions upstream and downstream of the intronic variant rs341219502 in IGF1R (137.3 Mb–137.6 Mb) indicated a gene flow from cold-region populations to warm-region populations (Fig. 8a, EANW to EASW). Phylogenetic relationships confirmed this direction of gene flow (Fig. 8b). Specifically, within the background topology of chromosome 1, East Asian wild boar populations were divided into a warm clade and cold clade (Fig. 8b).
Derived allele frequency distribution and gene flow direction for rs341219502 and rs327139795. a The localized gene flow around variant rs341219502 (600 Kb) revealed by TreeMix. Three arrows show the directions from donor populations to recipient populations. All population codes are the same as in Fig. 1a. b The background topology of chromosome 1 with a highlight on the two clades of North-region (cold) and South-region (warm) populations, represented by blue and red branches, respectively. c The local haplotype tree of 600 Kb around the rs341219502 inferred with the Maximum likelihood method of FastTree v2.1. The red haplotypes from the warm-region population were nested into the clade of the cold-region population. The “1” and “2” represent the two haplotypes for each sample. The black asterisks indicate major clade branching with 100% support values. The triangles represent wild boar samples from temperate or tropical regions that are nested within the clade of the cold region. The abbreviations of regions are: EUW, European wild boar; EAWD, East Asian western domestic pigs (Tibetan breed); SEAW, southeastern Asian wild boar; EASW, East Asian southern wild boar; EANW, East Asian northern wild boar (including populations from northern China, Korea, eastern Siberia, and northeastern China). d The localized TreeMix migration events (m = 3) for the region from 61.5 Mb to 63 Mb around the selected variant rs327139795 of BRD4. The directions of gene flow are shown with arrows
However, the local haplotype tree around rs341219502 revealed that five haplotypes from the warm clade had dispersed into the cold clade (Fig. 8c). This suggests that some warm-clade haplotypes may have been replaced by those from the cold clade, indicating a typical process of gene flow. Consequently, the presence of the derived ‘T’ allele in two warm-region wild boar samples likely resulted from southward gene flow from the cold-region populations.
For gene flow analysis of the exonic variant rs327139795, we noted a small number of genic variants (only 21 within the gene). Therefore, we expanded our analysis to encompass broader surrounding regions of the focal variant (61.5 to 63 Mb) and estimated local gene flow events using TreeMix [45]. The results supported a gene flow direction from cold- to warm-region wild boars (Fig. 8d). This observation suggests that allele ‘A’ of rs327139795, which occurs at low frequency in the warm-region population, was likely introduced there via gene flow from cold regions.
To evaluate the likelihood of the third scenario, we analysed the allele frequency distribution among domestic pigs (353 samples, see Additional file 1, Table S10). We did not find the derived ‘T’ allele of rs341219502 in either European or southern Chinese domestic pig populations. This allele was present as a rare variant in East Asian northern and western domestic pigs, with low frequencies (2.98 and 4.81%, respectively). Only 2.55% (9/353) of the northern Chinese domestic samples carried this derived allele, specifically in Min, Meishan, and Tibetan breeds. Furthermore, the majority of the domestic genotypes (8/9) that carried the derived allele were heterozygous, with only one Tibetan domestic pig being homozygous. In sharp contrast, in cold-region wild populations from Japan, Korea, Siberia, and northern China, 93.3% (28/30) of the genotypes carrying this allele were homozygous. For rs327139795, homozygotes for allele ‘A’ were also rare in both European and Chinese domestic populations (1.07 and 6.58%, respectively). Overall, among all domestic pigs, individuals homozygous for the variant were rare, comprising just 0.28% (1/353) of the population for rs341219502.
Discussion
Positively selected genes, functional enrichment, convergent evolution, and functional studies
A key challenge in biology is to identify the functional elements within the genome and understand their roles in adaptive processes. Natural selection can produce distinctive patterns in the genomic regions surrounding positively selected genes, which differ from those predicted under neutral evolution. Several key indicators, such as variations in genetic diversity, shifts in allele frequencies, and divergences between species, are essential tools for identifying genes under positive selection that are vital for complex adaptations [15, 17, 75,76,77,78].
For instance, research on the plateau wild boar has revealed genes with selective signals that are crucial for surviving harsh environmental conditions [78,79,80,81]. Additionally, the haplotype approach has identified a significant section of the X chromosome involved in climate adaptation in both wild and domestic pig populations from northern China and Europe [17]. These studies have greatly enhanced our understanding of the dynamics of natural selection.
Northern Asia, particularly the vast expanse of Siberia, is known for its extremely cold winters, which create significant challenges for endothermic mammals in maintaining their thermal balance. The severe low temperatures act as a strong selective force, especially for species like the wild boar, requiring adaptations that enhance thermoregulation in these frigid conditions. In addition to the cold, another major environmental challenge is the scarcity of food resources during the long winters [21]. Therefore, investigating the molecular adaptations that allow peripheral or derived populations of wild boar to survive and thrive in the cold Siberian climate and its surrounding regions is a topic of great scientific interest.
In this study, we utilised whole-genome sequencing and four selective sweep scan methodologies to identify candidate genes involved in cold adaptation. Our pathway analysis highlighted several key biological processes, particularly those related to regulation of fat cell differentiation, development of adipose tissue, thermogenesis, and cold-induced thermogenesis. These findings confirm the critical role of brown adipose tissue in facilitating thermogenic adaptation to cold conditions. This process triggers a range of responses to cold, including neural, vascular, and metabolic responses, as demonstrated in research conducted on humans, mice, insects, and polar mammals [82,83,84,85].
We also explored the hypothesis that endothermic mammalian species living in cold regions, such as Siberia, may share genes that have undergone convergent selection for cold resistance. Our analysis identified three positively selected genes (SLCO1C1, PDE3A, and TTC28) that have also been reported in indigenous Siberian human populations [3]. Notably, the relationship between low temperature and SNPs near SLCO1C1 and PDE3A has also been observed in Holstein cattle [61]. Consequently, it is likely that these genes are subject to recurrent convergent evolution in mammals due to their interconnected functions related to cold resistance [62, 86].
Two genes, IGF1R and BRD4, were supported by four methods and showed the strongest site-level selection signals. IGF1R, in particular, is intriguing due to extensive research on its functions relating to thermoregulation. Studies involving transgenic mice have demonstrated that IGF1R can lower core body temperature when subjected to the combined effects of cold stress and calorie restriction [87, 88]. Additionally, IGF1R may play a role in regulating body size [89,90,91].
Natural selection, driven by emerging selective forces, can lead to an increased frequency or fixation of derived alleles, as well as a decrease in the frequency or loss of ancestral alleles within a peripheral population, due to novel adaptation [19, 58]. The variants that cause changes in genes undergoing positive selection are expected to show significant differences in allele frequencies between peripheral and source populations [19].
Among the identified variants, an intronic variant with likely regulatory impact (rs341219502, c.94 + 12830G > A, reverse strand) in IGF1R and a missense variant (rs327139795, c.1043G > A, forward strand) in BRD4 displayed the strongest differentiation between populations from cold and warm regions. Notably, BRD4 was also among the genes supported by all four methods of selective sweep scans.
Fat and diencephalon are among the tissues that contribute to an animal's ability to withstand cold temperatures [92, 93]. Our CUT&Tag experiment on these tissues indicates that the derived allele of rs341219502 is located in the enhancer region of IGF1R. We validated the enhancer signals of this variant in both fat and diencephalon tissues of the Min pig, a local breed adapted to the cold environment of northeastern China. The transition of the allele from ‘C’ to ‘T’ could create novel binding sites for TFs, including NFATC3, SPI1 and RFX5, at the enhancer region.
Interestingly, previous studies have shown that NFATC3 is essential for cardiac development and mitochondrial function [94]. Additionally, NFATC3 can enhance insulin sensitivity, influencing gene expression that affects the development and adaptation of various mammalian cell types, including adipocytes and neurons [95]. SPI1 (PU.1) has the ability to inhibit adipocyte differentiation [96, 97], while RFX5 is involved in regulating resistance to nutrient stress [98].
Moreover, the ‘A’ allele of rs327139795, which is located within the exon of BRD4, leads to a change from Serine to Asparagine. This alteration, based on the inherent characteristics of amino acids, could impact phosphorylation sites. As a result, this change may affect the interaction between the mutated Asparagine and the surrounding amino acids, potentially altering protein function.
Future research that integrates single-cell multi-omics and spatial transcriptomics may provide a clearer understanding of the functional mechanisms behind these variants, offering valuable insights into evolutionary adaptation and its practical applications in agriculture and conservation genetics.
Evolutionary origin of the candidate variants under positive selection for cold resistance
The two mutations, ‘T’ of rs341219502 in IGF1R and ‘A’ of rs327139795 in BRD4, are absent in all the outgroup species from the Suidae and Tayassuidae families. This observation supports the idea that these variants are of recent origin rather than being ancient. At the population level, these alleles were fixed only in wild boar from cold regions, while they were rare or absent in wild boars from warmer regions, which are closer to the origin of S. scrofa (Southeast Asia). This strongly suggests that these mutations likely originated de novo in the wild populations of cold regions.
Although these two alleles were detected in a small fraction of the wild boar population in southern Chinese and in Tibetan domestic pigs, it is highly probable that they were introduced from northern populations through gene flow. This conclusion is based on evidence from local TreeMix signals (or phylogeny misplacement), very low allele frequencies, and an extremely high heterozygosity rate among individuals carrying these alleles in those population groups.
Our analysis of allele frequencies for candidate variants under positive selection suggests that these two derived alleles likely did not originate from domestic pigs. The patterns observed in the origin and emergence of rs341219502 and rs327139795 align with the most straightforward interpretation that these alleles originated de novo in wild boar populations. The alternative hypothesis, that these variants stem from ancestral polymorphism, is also less convincing, as no such variants have been identified in Southeast Asian and European wild boars or in outgroup species within the Suidae family.
Considering that the divergence time between Northern and Southern Chinese wild boars dates back 25,000 to 50,000 years [99, 100], it is likely that these two variants are less than 50,000 years old. After their de novo emergence, natural selection probably facilitated their fixation in wild populations located in colder regions. However, we cannot completely rule out the possibility that these variants originated from low-frequency standing variants that were present at the time of divergence of Northern Asian populations, which then increased to near fixation. Future extensive sampling of wild boar populations in colder regions could help clarify this question.
Another open question is whether the selected alleles in cold-region populations became nearly fixed due to positive Darwinian selection or genetic drift. The strong signals of positive selection detected at these two sites (rs341219502 and rs327139795) support the first hypothesis. However, rapid genetic drift can sometimes produce patterns similar to those caused by positive selection [101]. Therefore, it is essential to further investigate the drift hypothesis. Since rapid genetic drift would necessitate a significant reduction in the historical effective population size, we formally evaluated this possibility using PopSizeABC, a simulation-based method for inferring demographic history under the framework of approximate Bayesian computation (ABC). We analysed the ancestral dynamics of effective population size for both warm- and cold-region wild populations of Asian origin. Based on population-level diploid genomes for warm- and cold-region populations, we identified distinct trends in demographic changes between 100,000 and 1000 years ago (Fig. 9).
Historical effective population sizes inferred with the approximate Bayesian computation analysis. a The time range from 50,000 to 1000 years ago has the lowest errors (< 20%, scaled) under the tolerance rate of 0.001. The red and blue lines show the warm- and cold-region populations, respectively. b The historical demography of cold-region wild boar populations. c The historical demography of warm-region wild boar populations. The dotted lines indicate the 5% and 95% quantiles of the posterior distribution. The red and blue frames show the time range from 50,000 to 1000 years ago
Numerous studies have found that the divergence time between Northern and Southern Chinese wild boar ranges from 25,000 to 50,000 years ago [99, 100]. This indicates that wild boars may have arrived in northern China prior to this period, with the specified range representing the upper and lower limits of the divergence time between the two populations. Using an approximate Bayesian computation framework, which accommodates more samples than other tools (PSMC [102] and MSMC [103]) and is robust against sequencing errors and complex population dynamics [72], we reconstructed demographic history for both warm- and cold-region samples. This analysis suggested that the emergence of these variants is unlikely to be due to random effects (such as genetic drift). The prediction errors from the PopSizeABC inference were found to be within the acceptable range [72] (Fig. 9a).
PopSizeABC indicated a consistently increasing trend in historical population size for cold-region wild boar populations during the period of ~ 25,000 to 50,000 years ago (Fig. 9b). In contrast, the population sizes of warm-region wild boar populations sharply declined during this timeframe (Fig. 9c). Therefore, the increasing trend in the historical population size of cold-region wild boars does not support the hypothesis of rapid genetic drift.
Our findings lay the groundwork for enhancing cold-adapted livestock breeds through marker-assisted selection, utilizing both computational biology and experimental validation. Future studies involving larger sample sizes of wild boar or other free-ranging species from cold regions may reveal additional positively selected variants associated with cold resistance. These insights can be used to improve the economic performance of local domestic breeds while helping farmers reduce energy costs. For instance, more precise techniques like CRISPR-based genome editing could enhance thermoregulation and metabolic traits. Furthermore, advancements in AI-driven genomic prediction models could expedite the identification of adaptive variants across different populations, thereby informing data-driven breeding programs and simulations of gene-environment interactions in response to changing climatic conditions.
Similar methodologies and principles could be employed to explore various scenarios of climatic adaptation, including heat stress, cold tolerance, and high-altitude adaptation, yielding insights into the genetic and physiological mechanisms underlying environmental resilience.
Conclusions
The wild boar has been remarkably successful in colonising Eurasia, including a rapid expansion from tropical Asia into a variety of climates. This expansion includes their movement into the extreme cold of arctic Siberia less than a million years ago. By employing whole-genome sequencing and various methods of summary statistics—such as analysing the allele frequency spectrum (Fst and the ratio of θπ), haplotype, and species divergence—we identified genes undergoing selective sweeps in populations from cold regions.
These genes were found to enhance metabolic pathways critical for cold resistance, which include thermogenesis, fat cell development, and the regulation of adipose tissue. The most significant selection signal was detected in a 1.3 Mb region on chromosome 1, characterised by linkage disequilibrium surrounding the IGF1R gene. At the variant level, a regulatory variant within IGF1R and a missense variant within BRD4 showed the most pronounced differences in allele frequency between the warm- and cold-region populations. Analysis of allele frequency distributions suggested that these variants likely originated de novo within cold-region wild populations. Demographic reconstructions indicated that genetic drift is unlikely to have contributed to the emergence of these variants.
Given the known roles of BRD4 and IGF1R in regulating bioenergy homeostasis and body temperature [87, 104, 105], our findings illuminate the molecular adaptations of wild boar populations to cold climates in Siberia and its surrounding regions. Because cold stress is a leading cause of neonatal piglet mortality [105], our study could inform breeding programs aimed at enhancing piglet cold tolerance.
Data availability
The whole-genome sequence data can be accessed under NCBI BioProject accession PRJNA859556.
References
Freeman S, Herron JC. Evolutionary analysis. 5th ed. New Jersey: Pearson Prentice Hall; 2007.
Liu YH, Wang L, Zhang Z, Otecko NO, Khederzadeh S, Dai Y, et al. Whole-genome sequencing reveals lactase persistence adaptation in European dogs. Mol Biol Evol. 2021;38:4884–90.
Cardona A, Pagani L, Antao T, Lawson DJ, Eichstaedt CA, Yngvadottir B, et al. Genome-wide analysis of cold adaptation in indigenous Siberian populations. PLoS ONE. 2014;9: e98076.
Axelsson E, Ratnakumar A, Arendt M-L, Maqbool K, Webster MT, Perloski M, et al. The genomic signature of dog domestication reveals adaptation to a starch-rich diet. Nature. 2013;495:360–4.
Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, et al. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 2015;6:10283.
Hufford MB, Xu X, van Heerwaarden J, Pyhäjärvi T, Chia J-M, Cartwright RA, et al. Comparative population genomics of maize domestication and improvement. Nat Genet. 2012;44:808–11.
Yu J, Zhao P, Zheng X, Zhou L, Wang C, Liu J-F. Genome-wide detection of selection signatures in duroc revealed candidate genes relating to growth and meat quality. G3. 2020;10:3765–73.
Chen J, Ni P, Li X, Han J, Jakovlić I, Zhang C, et al. Population size may shape the accumulation of functional mutations following domestication. BMC Evol Biol. 2018;18:4.
Harris SE, Munshi-South J. Signatures of positive selection and local adaptation to urbanization in white-footed mice (Peromyscus leucopus). Mol Ecol. 2017;26:6336–50.
Kelley JL, Swanson WJ. Positive selection in the human genome: from genome scans to biological significance. Annu Rev Genom Hum Genet. 2008;9:143–60.
Bustamante CD, Fledel-Alon A, Williamson S, Nielsen R, Hubisz MT, Glanowski S, et al. Natural selection on protein-coding genes in the human genome. Nature. 2005;437:1153–7.
Siewert KM, Voight BF. Detecting long-term balancing selection using allele frequency correlation. Mol Biol Evol. 2017;34:2996–3005.
Chen J, He X, Jakovlić I. Positive selection-driven fixation of a hominin-specific amino acid mutation related to dephosphorylation in IRF9. BMC Ecol Evol. 2022;22:132.
Guo J, Zhong J, Li L, Zhong T, Wang L, Song T, et al. Comparative genome analyses reveal the unique genetic composition and selection signals underlying the phenotypic characteristics of three Chinese domestic goat breeds. Genet Sel Evol. 2019;51:70.
Chen J, Ying L, Zeng L, Li C, Jia Y, Yang H, et al. The novel compound heterozygous rare variants may impact positively selected regions of TUBGCP6, a microcephaly associated gene. Front Ecol Evol. 2022. https://doi.org/10.3389/fevo.2022.1059477.
Zhang L, Ren Y, Yang T, Li G, Chen J, Gschwend AR, et al. Rapid evolution of protein diversity by de novo origination in Oryza. Nat Ecol Evol. 2019;3:679–90.
Ai H, Fang X, Yang B, Huang Z, Chen H, Mao L, et al. Adaptation and possible ancient interspecies introgression in pigs identified by whole-genome sequencing. Nat Genet. 2015;47:217–25.
Wallberg A, Schöning C, Webster MT, Hasselmann M. Two extended haplotype blocks are associated with adaptation to high altitude habitats in East African honey bees. PLoS Genet. 2017;13: e1006792.
Clemente FJ, Cardona A, Inchley CE, Peter BM, Jacobs G, Pagani L, et al. A selective sweep on a deleterious mutation in CPT1A in Arctic populations. Am J Hum Genet. 2014;95:584–9.
Rothschild MF, Ruvinsky A. The genetics of the pig. 2nd ed. Wallingford: CABI; 2011.
Markov N, Economov A, Hjeljord O, Rolandsen CM, Bergqvist G, Danilov P, et al. The wild boar Sus scrofa in northern Eurasia: a review of range expansion history, current distribution, factors affecting the northern distributional limit, and management strategies. Mamm Rev. 2022;52:519–37.
Frantz LAF, Schraiber JG, Madsen O, Megens H-J, Bosse M, Paudel Y, et al. Genome sequencing reveals fine scale diversification and reticulation history during speciation in Sus. Genome Biol. 2013;14:1.
Chen J, Ni P, Tran Thi TN, Kamaldinov EV, Petukhov VL, Han J, et al. Selective constraints in cold-region wild boars may defuse the effects of small effective population size on molecular evolution of mitogenomes. Ecol Evol. 2018;8:8102–14.
Wu G-S, Yao Y-G, Qu K-X, Ding Z-L, Li H, Palanichamy MG, et al. Population phylogenomic analysis of mitochondrial DNA in wild boars and domestic pigs revealed multiple domestication events in East Asia. Genome Biol. 2007;8:R245.
Bianco E. The chimerical genome of Isla del Coco feral pigs (Costa Rica), an isolated population since 1793 but with remarkable levels of diversity. Mol Ecol. 2015;24:2364–78.
Kim H, Song KD, Kim HJ, Park W, Kim J, Lee T, et al. Exploring the genetic signature of body size in Yucatan miniature pig. PLoS ONE. 2015;10: e0121732.
Fang X. The sequence and analysis of a Chinese pig genome. Gigascience. 2012;1:16.
Ramírez O, Burgos-Paz W, Casas E, Ballester M, Bianco E, Olalde I, et al. Genome data from a sixteenth century pig illuminate modern breed relationships. Heredity. 2015;114:175–84.
Li M, Tian S, Yeung CKL, Meng X, Tang Q, Niu L, et al. Whole-genome sequencing of Berkshire (European native pig) provides insights into its origin and domestication. Sci Rep. 2014;4:4678.
Li M, Tian S, Jin L, Zhou G, Li Y, Zhang Y, et al. Genomic analyses identify distinct patterns of selection in domesticated pigs and Tibetan wild boars. Nat Genet. 2013;45:1431–8.
Liu L, Bosse M, Megens H-J, Frantz LAF, Lee Y-L, Irving-Pease EK, et al. Genomic analysis on pygmy hog reveals extensive interbreeding during wild boar expansion. Nat Commun. 2019;10:1992.
Frantz LAF, Schraiber JG, Madsen O, Megens H-J, Cagan A, Bosse M, et al. Evidence of long-term gene flow and selection during domestication from analyses of Eurasian wild and domestic pig genomes. Nat Genet. 2015;47:1141–8.
Bosse M, Megens H-J, Madsen O, Crooijmans RPMA, Ryder OA, Austerlitz F, et al. Using genome-wide measures of coancestry to maintain diversity and fitness in endangered and domestic pig populations. Genome Res. 2015;25:970–81.
Groenen MAM, Archibald AL, Uenishi H, Tuggle CK, Takeuchi Y, Rothschild MF, et al. Analyses of pig genomes provide insight into porcine demography and evolution. Nature. 2012;491:393–8.
Wang J, Ying Yu, Feng L, Wang H, Zhang Q. Genomic DNA extraction from hair sacs of pigs using modified phenol-chloroform method. Hereditas. 2010;32:752–6.
Chen J, Zhang P, Chen H, Wang X, He X, Zhong J, et al. Whole-genome sequencing identifies rare missense variants of WNT16 and ERVW-1 causing the systemic lupus erythematosus. Genomics. 2022;114:110332.
Chen J, Zhong J, He X, Li X, Ni P, Safner T, et al. The de novo assembly of a European wild boar genome revealed unique patterns of chromosomal structural variations and segmental duplications. Anim Genet. 2022;53:281–92.
Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–90.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
Li M, Chen L, Tian S, Lin Y, Tang Q, Zhou X, et al. Comprehensive variation discovery and recovery of missing sequence in the pig genome using multiple de novo assemblies. Genome Res. 2017;27:865–74.
Zhang L, Huang Y, Wang M, Guo Y, Liang J, Yang X, et al. Development and genome sequencing of a laboratory-inbred miniature pig facilitates study of human diabetic disease. iScience. 2019;19:162–76.
Warr A, Affara N, Aken B, Beiki H, Bickhart DM, Billis K, et al. An improved pig reference genome sequence to enable pig genetics and genomics research. Gigascience. 2020;9:051.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.
Browning BL, Tian X, Zhou Y, Browning SR. Fast two-stage phasing of large-scale sequence data. Am J Hum Genet. 2021;108:1880–90.
Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012;8: e1002967.
Fitak RR. OptM: estimating the optimal number of migration edges on population trees using Treemix. Biol Methods Protoc. 2021;6:bpab017.
Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19:1655–64.
Chen H, Patterson N, Reich D. Population differentiation as a test for selective sweeps. Genome Res. 2010;20:393–402.
Szpiech ZA, Hernandez RD. selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol Biol Evol. 2014;31:2824–7.
Torres R, Szpiech ZA, Hernandez RD. Correction: Human demographic history has amplified the effects of background selection across the genome. PLoS Genet. 2019;15: e1007898.
VanKuren NW, Long M. Gene duplicates resolving sexual conflict rapidly evolved essential gametogenesis functions. Nat Ecol Evol. 2018;2:705–12.
Ford MJ, Aquadro CF. Selection on X-linked genes during speciation in the Drosophila athabasca complex. Genetics. 1996;144:689–703.
Racimo F. Testing for ancient selection using cross-population allele frequency differentiation. Genetics. 2015;202:733–50.
Ingvarsson PK. Population subdivision and the Hudson–Kreitman–Aguade test: testing for deviations from the neutral model in organelle genomes. Genet Res. 2004;83:31–9.
Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–23.
Jia Y, Chen J, Zhong J, He X, Zeng L, Wang Y, et al. Novel rare mutation in a conserved site of PTPRB causes human hypoplastic left heart syndrome. Clin Genet. 2023;103:79–86.
Marwaha S, Knowles JW, Ashley EA. A guide for the diagnosis of rare and undiagnosed disease: beyond the exome. Genome Med. 2022;14:23.
Sabeti PC, Reich DE, Higgins JM, Levine HZP, Richter DJ, Schaffner SF, et al. Detecting recent positive selection in the human genome from haplotype structure. Nature. 2002;419:832–7.
Gautier M, Klassmann A, Vitalis R. rehh 2.0: a reimplementation of the R package rehh to detect positive selection from haplotype structure. Mol Ecol Resour. 2017;17:78–90.
Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS ONE. 2010;5: e9490.
Dikmen S, Cole JB, Null DJ, Hansen PJ. Genome-wide association mapping for identification of quantitative trait loci for rectal temperature during heat stress in Holstein cattle. PLoS ONE. 2013;8: e69202.
Igoshin AV, Yurchenko AA, Belonogova NM, Petrovsky DV, Aitnazarov RB, Soloshenko VA, et al. Genome-wide association study and scan for signatures of selection point to candidate genes for body temperature maintenance under the cold stress in Siberian cattle populations. BMC Genet. 2019;20:26.
Zhou Y, Zhou B, Pache L, Chang M, Khodabakhshi AH, Tanaseichuk O, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10:1523.
Liu Z, Zhang D, Wang W, He X, Peng F, Wang L, et al. A comparative study of the effects of long-term cold exposure, and cold resistance in Min Pigs and Large White Pigs. Acta Agric Scand A Anim Sci. 2017;67:34–9.
Kaya-Okur HS, Wu SJ, Codomo CA, Pledger ES, Bryson TD, Henikoff JG, et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat Commun. 2019;10:1930.
Guo Y. Epigenomic features associated with body temperature stabilize tissues during cold exposure in cold-resistant pigs. J Genet Genomics. 2024;51:1252–64.
Langmead B, Wilks C, Antonescu V, Charles R. Scaling read aligners to hundreds of threads on general-purpose processors. Bioinformatics. 2019;35:421–32.
Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137.
Fornes O, Castro-Mondragon JA, Khan A, van der Lee R, Zhang X, Richmond PA, et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2020;48:D87-92.
Zhao Y, Hou Y, Xu Y, Luan Y, Zhou H, Qi X, et al. A compendium and comparative epigenomics analysis of cis-regulatory elements in the pig genome. Nat Commun. 2021;12:2217.
Boitard S, Rodríguez W, Jay F, Mona S, Austerlitz F. Inferring population size history from large samples of genome-wide molecular data—an approximate bayesian computation approach. PLoS Genet. 2016;12: e1005877.
Kimura M. The neutral theory of molecular evolution. Cambridge: Cambridge University Press; 1983.
Gautier M, Vitalis R. rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics. 2012;28:1176–7.
Charlesworth D. Balancing selection and its effects on sequences in nearby genome regions. PLoS Genet. 2006;2: e64.
Kimura M, Ohta T. Theoretical aspects of population genetics, (MPB-4), vol. 4. Princeton: Princeton University Press; 2020.
Darwin C. The origin of species by means of natural selection or the preservation of favoured races in the struggle for life. London: Books Incorporated Pub; 1859.
Li M, Tian S, Jin L, Zhou G, Li Y, Zhang Y, et al. Genomic analyses identify distinct patterns of selection in domesticated pigs and Tibetan wild boars. Nat Genet. 2013;45:1431–8.
Gan M, Shen L, Fan Y, Guo Z, Liu B, Chen L, et al. High Altitude adaptability and meat quality in Tibetan pigs: A reference for local pork processing and genetic improvement. Animals. 2019;9:1080.
Ai H, Yang B, Li J, Xie X, Chen H, Ren J. Population history and genomic signatures for high-altitude adaptation in Tibetan pigs. BMC Genom. 2014;15:834.
Ma Y-F, Han X-M, Huang C-P, Zhong L, Adeola AC, Irwin DM, et al. Population genomics analysis revealed origin and high-altitude adaptation of Tibetan pigs. Sci Rep. 2019;9:11463.
Coolbaugh CL, Damon BM, Bush EC, Welch EB, Towse TF. Cold exposure induces dynamic, heterogeneous alterations in human brown adipose tissue lipid content. Sci Rep. 2019;9:13600.
Chang JC, Durinck S, Chen MZ, Martinez-Martin N, Zhang JA, Lehoux I, et al. Adaptive adipose tissue stromal plasticity in response to cold stress and antibody-based metabolic therapy. Sci Rep. 2019;9:8833.
Grahl-Nielsen O, Andersen M, Derocher AE, Lydersen C, Wiig Ø, Kovacs KM. Fatty acid composition of the adipose tissue of polar bears and of their prey: ringed seals, bearded seals and harp seals. Mar Ecol Prog Ser. 2003;265:275–82.
da Silva CPV, Hernández-Saavedra D, White JD, Stanford KI. Cold and exercise: therapeutic tools to activate brown adipose tissue and combat obesity. Biology. 2019;8:9.
Librado P, Sarkissian CD, Ermini L, Schubert M, Jónsson H, Albrechtsen A, et al. Tracking the origins of Yakutian horses and the genetic basis for their fast adaptation to subarctic environments. Proc Natl Acad Sci U S A. 2015;112:E6889–97.
Cintron-Colon R, Sanchez-Alavez M, Nguyen W, Mori S, Gonzalez-Rivera R, Lien T, et al. Insulin-like growth factor 1 receptor regulates hypothermia during calorie restriction. Proc Natl Acad Sci U S A. 2017;114:9731–6.
Sanchez-Alavez M, Tabarean IV, Osborn O, Mitsukawa K, Schaefer J, Dubins J, et al. Insulin causes hyperthermia by direct inhibition of warm-sensitive neurons. Diabetes. 2010;59:43–50.
Hoopes BC, Rimbault M, Liebers D, Ostrander EA, Sutter NB. The insulin-like growth factor 1 receptor (IGF1R) contributes to reduced size in dogs. Mamm Genome. 2012;23:780–90.
O’Neill BT, Lauritzen HPMM, Hirshman MF, Smyth G, Goodyear LJ, Kahn CR. Differential role of insulin/IGF-1 receptor signaling in muscle growth and glucose homeostasis. Cell Rep. 2015;11:1220–35.
Grochowska E, Lisiak D, Akram MZ, Adeniyi OO, Lühken G, Borys B. Association of a polymorphism in exon 3 of the IGF1R gene with growth, body size, slaughter and meat quality traits in Colored Polish Merino sheep. Meat Sci. 2021;172:108314.
Caron A, Lee S, Elmquist JK, Gautron L. Leptin and brain-adipose crosstalks. Nat Rev Neurosci. 2018;19:153–65.
Rondeel JM, de Greef WJ, Hop WC, Rowland DL, Visser TJ. Effect of cold exposure on the hypothalamic release of thyrotropin-releasing hormone and catecholamines. Neuroendocrinology. 1991;54:477–81.
Bushdid PB, Osinska H, Waclaw RR, Molkentin JD, Yutzey KE. NFATc3 and NFATc4 are required for cardiac development and mitochondrial function. Circ Res. 2003;92:1305–13.
Yang TTC, Suk HY, Yang X, Olabisi O, Yu RYL, Durand J, et al. Role of transcription factor NFAT in glucose and insulin homeostasis. Mol Cell Biol. 2006;26:7372–87.
Rustenhoven J, Smith AM, Smyth LC, Jansson D, Scotter EL, Swanson MEV, et al. PU.1 regulates Alzheimer’s disease-associated genes in primary human microglia. Mol Neurodegener. 2018;13:44.
Chen KY, De Angulo A, Guo X, More A, Ochsner SA, Lopez E, et al. Adipocyte-specific ablation of PU.1 promotes energy expenditure and ameliorates metabolic syndrome in aging Mice. Front Aging. 2021;2:803482.
Hu Z, Zhao TV, Huang T, Ohtsuki S, Jin K, Goronzy IN, et al. The transcription factor RFX5 coordinates antigen-presenting function and resistance to nutrient stress in synovial macrophages. Nat Metab. 2022;4:759–74.
Zhang M, Yang Q, Ai H, Huang L. Revisiting the evolutionary history of pigs via De Novo mutation rate estimation in a three-generation pedigree. Genom Proteom Bioinfor. 2022;20:1040–52.
Groenen MAM, Archibald AL, Uenishi H, Tuggle CK, Takeuchi Y, Rothschild MF, et al. Analyses of pig genomes provide insight into porcine demography and evolution. Nature. 2012;491:393–8.
Barton NH. Natural selection and random genetic drift as causes of evolution on islands. Philos Trans R Soc Lond B Biol Sci. 1996;351:785–94.
Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475:493–6.
Schiffels S, Durbin R. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 2014;46:919–25.
Kim SY, Zhang X, Schiattarella GG, Altamirano F, Ramos TAR, French KM, et al. Epigenetic reader BRD4 (Bromodomain-Containing Protein 4) governs nucleus-encoded mitochondrial transcriptome to regulate cardiac function. Circulation. 2020;142:2356–70.
Barrow JJ, Balsa E, Verdeguer F, Tavares CDJ, Soustek MS, Hollingsworth LRIV, et al. Bromodomain inhibitors correct bioenergetic deficiency caused by mitochondrial disease complex I mutations. Mol Cell. 2016;64:163–75.
Kammersgaard TS, Pedersen LJ, Jørgensen E. Hypothermia in neonatal piglets: Interactions and causes of individual differences. J Anim Sci. 2011;89:2073–85.
Acknowledgements
We extend special thanks to Martin Holzenberger and Bruno Conti for their patient discussion and valuable improvements to the manuscript. We also thank Evgeniy Varisovich Kamaldinov, Valeriy Lavrentyevich Petukhov, and our reviewers for their insightful comments and discussions.
Funding
We are supported by the National Key Research and Development Program of China (2023YFF1001000), the Natural Science Foundation of China (31961143020), the Fifth Batch of Technological Innovation Research Projects in Chengdu (2021-YF05-01331-SN), the Postdoctoral Research and Development Fund of West China Hospital (2020HXBH087), and the Short-Term Expert Fund of West China Hospital (139190032). M.S. acknowledges financial support from ZIN RAS (state assignment № 125012800908-0). N.Š. acknowledges financial support from the Croatian Science Foundation, project IP 2019-04-4096 “The role of hunting related activities in the range expansion of recently established wild ungulate populations in the Mediterranean”.
Author information
Contributions
JHC, YXZ, and SHZ supervised this work. JHC, MS, ZXX, YPG, RZK, JZ, YYJ, TNTT, TS, HY, HM, NS, JLH, DL, SQX, and IJ designed the research. JHC, JZ, ZXX, YPG, RZK, JZ, and YXZ analysed data. JHC wrote the manuscript. YXZ, SHZ, and IJ revised the draft. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing financial interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
12711_2025_986_MOESM1_ESM.xlsx
Additional file 1: Table S1: Simplified core dataset. Table S2: Larger dataset. Table S3: Selective sweep method 1 based on diversity ratio and fixation index. Only the top 5% regions are shown. Table S4: Selective sweep method 2 based on diversity ratio and XP-CLR. Only the top 5% regions are shown. Table S5: Selective sweep method 3 based on iHH12, using the top 1% SNPs. Table S6: Selective sweep method 4 based on the HKA test, using the top 1% SNPs. Table S7: Genes shared by three to four methods. Table S8: Gene functional enrichment for 305 genes shared by at least three methods. Table S9: Selective signals in human Siberian populations based on the report of Cardona et al. [3]. Table S10: Genotyping and group assignment for an intronic variant in IGF1R and an exonic variant in BRD4.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Chen, J., Jakovlić, I., Sablin, M. et al. Positive selection on rare variants of IGF1R and BRD4 underlying the cold adaptation of wild boar. Genet Sel Evol 57, 40 (2025). https://doi.org/10.1186/s12711-025-00986-y
DOI: https://doi.org/10.1186/s12711-025-00986-y