The genetic structure and adaptation of Andean highlanders and Amazonians are influenced by the interplay between geography and culture


Western South America was one of the worldwide cradles of civilization. The well-known Inca Empire was the tip of the iceberg of an evolutionary process that started 11,000 to 14,000 years ago. Genetic data from 18 Peruvian populations reveal the following: 1) The between-population homogenization of the central southern Andes and its differentiation with respect to Amazonian populations of similar latitudes do not extend northward. Instead, longitudinal gene flow between the northern coast of Peru, Andes, and Amazonia accompanied cultural and socioeconomic interactions revealed by archeology. This pattern recapitulates the environmental and cultural differentiation between the fertile north, where altitudes are lower, and the arid south, where the Andes are higher, acting as a genetic barrier between the sharply different environments of the Andes and Amazonia. 2) The genetic homogenization between the populations of the arid Andes is not only due to migrations during the Inca Empire or the subsequent colonial period. It started at least during the earlier expansion of the Wari Empire (1,000 to 1,400 years before present). 3) This demographic history allowed for cases of positive natural selection in the high and arid Andes vs. the low Amazon tropical forest: in the Andes, a putative enhancer in HAND2-AS1 (heart and neural crest derivatives expressed 2 antisense RNA 1, a noncoding gene related to cardiovascular function) and rs269868-C/Ser1067 in DUOX2 (dual oxidase 2, related to thyroid function and innate immunity) and, in the Amazon, the gene encoding the CD45 protein, essential for antigen recognition by T and B lymphocytes in viral-host interaction.
Native Americans | human population genetics | natural selection | gene flow

Significance

Native Americans are neglected in human genetics studies, despite recent interest in the study of ancient DNA of their ancestors. Our findings on Andean and Amazonian populations exemplify how the current pattern of genetic diversity in human populations is influenced by the interaction of history and environment. In the present case, this pattern is influenced by 1) altitudinal and climatic differences among the northern, lower, and fertile Andes versus the southern, higher, and arid Andes and 2) the sharp differences between the Andean highlands and the Amazon lowlands, where natural selection and other evolutionary forces acted for millennia, shaping differences in the frequencies of genetic variants related to immune response, drug response, and cardiovascular and hematological functions.

Living Native Americans, the object of this study, are among the most neglected populations in human genetics studies, despite the increasing interest in the study of ancient DNA (aDNA) of their ancestors (1, 2). Western South America was one of the cradles of civilization in the Americas and the world (3). When the Spanish conqueror Francisco Pizarro arrived in 1532, the pan-Andean Inca Empire ruled in the Andean region and had achieved levels of socioeconomic development and population density unmatched in other parts of South America. The Inca Empire, which lasted for around 200 years before the conquest, with its emblematic architecture such as Machu Picchu and the city of Cuzco, was just the "tip of the iceberg" of a millenary cultural and biological evolutionary process (4, 5). This process started 11,000 to 14,000 years ago (6-8) with the peopling of this region, hereafter called western South America, that involves the entire Andean region and its adjacent and narrow Pacific coast.
Tarazona-Santos et al. (9) proposed in 2001 that cultural exchanges and gene flow over time have led to a current relative genetic, cultural, and linguistic homogeneity between the populations of western South America compared with those of eastern South America (a term that hereafter refers to the region adjacent to the eastern slope of the Andes and eastward, including Amazonia), where populations remained more isolated from each other. For instance, only two languages (Quechua and Aymara) of the Quechumaran linguistic stock predominate in the entire Andean region, whereas in eastern South America natives speak a different and broader spectrum of languages classified into at least four linguistic families (5, 9, 10). This spatial pattern of genetic diversity and its correlation with geography and environmental, linguistic, and cultural diversity was confirmed, enriched, and rediscussed by us and others (2, 4, 5, 9-15).
Fig. 1. (A) [...] (16). The map shows the path used to create the elevation and latitude plot, and a green area delimits the Amazon Yunga region. Altitude data were obtained from Google Maps (https://www.google.com.br/maps). (B) Geographic distribution and genetic structure for 18 native populations from the coast, Andes, and Amazon inferred by ADMIXTURE (K = 5, corresponding to the lowest cross-validation error). Pie charts show the average percentages over individuals for the five ADMIXTURE clusters in each population. Three clusters were related to Native American groups: one Andean (brown) and two Amazonian (green clusters). Two clusters were associated with non-Native continental ancestries (European [red] and African [dark blue]). Blue and green dashed lines delimit the groups that showed a highly significant D statistic value, indicating gene flow (gray arrow, |Z score| > 4, SI Appendix, Figs. S14-S17). Matsiguenkas 1 = Matsiguenkas-Sepahua, Matsiguenkas 2 = Matsiguenkas-Shimaa. The gray horizontal dashed line in the center of the map shows the approximate division between the fertile Andes and arid Andes (16).

There are, however, pending issues. The first is whether the current dichotomic organization of genetic variation characterized by the between-population homogeneous southern Andes vs. between-population heterogeneous central Amazon extends northward. This is important because scholars from different disciplines emphasize that western South America is not latitudinally homogeneous, differentiating a northern and, in general, lower and wetter fertile Andes and a southern, higher, and more arid Andes (16) (Fig. 1A). These environmental and latitudinal differences are correlated with demography and culture, including different histories and spectra of domesticated plants and animals.
Indeed, the development of agriculture and of the first urban centers such as Caral (3), together with the associated demographic growth, occurred earlier in the northern fertile Andes (around 5,000 years ago) than in the southern arid Andes (and their associated coast), with products such as cotton, beans, and corn domesticated in the fertile north and the potato, quinoa, and South American camelids domesticated in the arid south (16). In human population genetics studies, the region where the between-population homogeneity was ascertained by Tarazona-Santos et al. (9) was the arid Andes. Consequently, here we test whether the between-population homogenization of western South America and the dichotomy of arid Andes/Amazonia extend to the northern fertile Andes.
A second open issue is the evolutionary relationship between Andean and Amazonian populations, particularly with the culturally, linguistically, and environmentally different neighboring populations of the Amazon Yunga (the rain forest transitional region between the Andes and Lower Amazonia). Harris et al. (5) inferred that Andean and Amazonian populations diverged around 12,000 years ago. Archaeological findings of recent decades have rejected the traditional view of the Amazonian environment as incompatible with complex pre-Columbian societies and have revealed that the Amazonian basin has produced the earliest ceramics of South America, that endogenous agricultural complex societies developed there, and that population sizes were larger than previously thought (17). Population genetics studies (18) have reported episodes of gene flow in Amazonia which suggest that Amazonian populations were not necessarily isolated groups. Moreover, the ancestors of people living on the Peruvian coast, in the Andes, and in the Amazon Yunga had cultural and commercial interactions during the last millennia, sharing practices such as sweet potato and manioc cultivation, ceramic iconography and styles (e.g., Tutishcanyo, Kotosh, Valdivia, and Corrugate), and traditional coca chewing (19). Therefore, here we address whether gene flow accompanied the cultural and socioeconomic interactions between the ancestors of current Andean and Amazon Yunga populations.
Despite some controversy about definitions and chronology, archeologists identify a unique cultural process in western South America which includes three temporal horizons, Early, Middle, and Late, that correspond to periods of cultural dispersion involving a wide geographic area (20) (Fig. 2). In particular, the Middle and Late Horizons are associated with the expansions of the Wari (∼1,000 to 1,400 years before present [YBP]) and Inca (∼524 to 466 YBP) states, respectively (21-23). The between-population homogeneity currently observed in the arid Andes results from high levels of gene flow in this region, which is commonly associated with the Inca Empire (20). However, Isbell (22) has suggested that the former Wari expansion led to the spread of the Quechua language in the central Andes and that the Wari were pioneers in developing a road system in the Andes called Wari ñam, which was later used by the Incas to develop their network of roads (the Qapaq ñam) (16). A third relevant question is, therefore, when the current between-population genetic homogenization started in the context of the arid Andean chronology (Fig. 2). Particularly, is this a phenomenon restricted to the period of the Inca Empire (Late Horizon), or did it extend backward to the Middle/Wari Horizon?
Finally, Native Americans had to adapt to different and contrasting environments and stresses. The high and arid Andes are characterized by high ultraviolet radiation, cold, dryness, and hypoxia (a stress that does not allow for cultural adaptations and requires biological changes) (24,25). The Amazon has a low incidence of light, a warm and humid climate typical of the rain forest, and high biodiversity, including pathogens (26). Here we infer episodes of genetic adaptation to the arid Andes and the Amazonian tropical forest.

Results and Discussion
We used data from Harris et al. (5) for 74 indigenous individuals and additional data from 289 unpublished individuals from 18 Peruvian Native populations, genotyped for ∼2.5 million single nucleotide polymorphisms (SNPs) (Fig. 1B and Dataset S1). For population genetics analyses, we created three datasets with different SNP densities and populations (27-30) (SI Appendix, Fig. S1 and section 1.3, and Datasets S2 and S3). The institutional review boards of participants' institutions approved this research. The study was led by Peruvian institutions and investigators who have a long record of community engagement activities as an intrinsic component of their research protocols. Bioinformatics pipelines are described in (31).

[...] 2), we confirmed that populations in the arid Andes are genetically homogeneous, appearing as an almost panmictic unit, with an ancestry pattern differentiated with respect to Amazonian populations (Fig. 1B). Conversely, populations of the northern coast (Moches and Tallanes) and of the northern Amazon Yunga (i.e., Chachapoyas) share the same ancestry profile (Fig. 1B and SI Appendix, Figs. S8-S13), which differs from that of the populations from the arid Andes. Thus, the between-population homogenization of the arid Andes and its differentiation with respect to Amazonian populations of similar latitudes do not extend northward and are not characteristic of all western South America. Instead, the genetic structure of western South Amerindian populations recapitulates the environmental and cultural differentiation between the northern fertile Andes and the southern arid Andes. Nakatsuka et al. (2) (their figure 2), studying aDNA from 86 pre-Columbian individuals, showed that some level of north-south population structure predates the arrival of Spaniards to Peru in 1532. They claim that a strong north-south population structure existed in the western Andes in pre-Columbian times. However, their claim partly depends on removing from the results of their figure 2 sixteen of the 86 studied pre-Columbian individuals, whom they call "outliers" (18% of their aDNA dataset). The inclusion of these so-called outliers [see SI Appendix, figure S4 of Nakatsuka et al. (2)] shows that the north-south pre-Columbian population structure was not as strong as they claimed.
Longitudinal Gene Flow between the North Coast, Andes, and Amazonia Accompanied the Well-Documented Cultural and Socioeconomic Interactions. Haplotype-based inferences (ChromoPainter/Globetrotter methods) (33, 34) (Fig. 1B and SI Appendix, Figs. S11-S13 and section 2.1.3), statistical tests of treeness (35) (Fig. 1B and SI Appendix, Figs. S14 and S15 and section 3.2.1), and admixture graphs (35) (SI Appendix, Figs. S16-S19 and section 3.2.2) reveal genetic signatures of gene flow between coastal/Andean and Amazon Yunga populations in latitudes of the northern fertile Andes but not in the southern arid Andes. Thus, longitudinal gene flow between the north coast, Andes, and Amazonia accompanied cultural and socioeconomic interactions documented by archeology, which include ceramic styles and crops, as well as the critical role that Chachapoyas may have played (see Introduction and SI Appendix, section 3.1). This pattern of gene flow recapitulates the differentiation between the fertile north, where altitudes are lower, and the arid south, where the Andes altitudes are higher (Fig. 1A) and may have acted as a barrier to gene flow, imposing a sharper environmental differentiation between the Andes and the Amazon Yunga. Formal comparison of admixture graphs (35) (SI Appendix, Figs. S16-S19) representing different scenarios shows that gene flow was more intense from the north coast to the Amazon than in the opposite direction and that in latitudes of the fertile north, gene flow included important ethnic groups such as the current Chachapoyas of the Amazon Yunga, as well as eastward Lower Amazonian populations such as those of the Jivaro linguistic family (Awajun and Candoshi) and Lamas (Fig. 1B and SI Appendix, Figs. S16-S19). These results are consistent with those of Nakatsuka et al. (2) based on current and pre-Hispanic individuals.
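As an illustration of the treeness tests mentioned above, the following minimal sketch computes a D statistic from per-SNP allele frequencies in four populations, using the standard frequency-based form D = Σ(p1 − p2)(p3 − p4) / Σ(p1 + p2 − 2·p1·p2)(p3 + p4 − 2·p3·p4), and a Z score from a delete-one-block jackknife. It is not the pipeline used in this study: the function names, the equal-size contiguous blocks, and the sign convention are assumptions made only for illustration.

```python
import math

def d_statistic(p1, p2, p3, p4):
    """Frequency-based D statistic over SNPs (lists of allele frequencies).

    D is close to 0 when (pop1, pop2) and (pop3, pop4) form a tree without
    gene flow between the pairs. The sign convention here is an assumption.
    """
    num = sum((a - b) * (c - d) for a, b, c, d in zip(p1, p2, p3, p4))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(p1, p2, p3, p4))
    return num / den

def jackknife_z(p1, p2, p3, p4, n_blocks=10):
    """Z score of D from a delete-one-block jackknife.

    Blocks are contiguous, equal-size runs of SNPs (an illustrative choice;
    real analyses usually define blocks by genetic or physical distance).
    """
    n = len(p1)
    size = n // n_blocks
    d_full = d_statistic(p1, p2, p3, p4)
    estimates = []
    for b in range(n_blocks):
        lo = b * size
        hi = n if b == n_blocks - 1 else lo + size
        keep = [i for i in range(n) if not lo <= i < hi]
        estimates.append(d_statistic(*[[p[i] for i in keep]
                                       for p in (p1, p2, p3, p4)]))
    mean = sum(estimates) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((e - mean) ** 2 for e in estimates)
    return d_full / math.sqrt(var)
```

Under this sketch, a configuration would be flagged as in Fig. 1B when the absolute Z score exceeds 4.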
The Homogenization of the Central Arid Andes Started at least during the Wari Expansion (1,400 to 1,000 YBP). We analyzed the distribution of identity-by-descent (IBD) segment lengths between individuals of different arid Andean populations, which is informative about the dynamics of past gene flow (5, 36). We observed a signature of gene flow in the interval between 1,400 and 1,000 YBP, within the Wari expansion in the Middle Horizon (Fig. 2). Thus, the homogenization of the central arid Andes is not only due to migrations during the Inca Empire or later during the Spanish Viceroyalty of Peru, when migrations (often forced) occurred (37). The Wari expansion (1,400 to 1,000 YBP) was also accompanied by intensive gene flow whose signature is still present in the between-population genetic homogeneity of the arid central Andes region. We also observed that during the Wari/Middle Horizon the effective population size (Ne) was rising in the arid Andes (SI Appendix, Fig. S22), a trend that stopped with the European contact, when Ne started to decline, consistent with demographic records (38) and with genetic studies by Lindo et al. (39). Because IBD analysis on current individuals does not allow for inferences of gene flow that occurred more than 75 generations ago (36), ancient DNA analysis at the population level will be necessary to infer whether the between-population homogenization of the Andes started even earlier.
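The link between IBD segment length and time can be illustrated with a standard approximation: a segment inherited from a common ancestor who lived g generations ago has been exposed to about 2g meioses, so its expected length is roughly 100/(2g) centimorgans. The sketch below applies this rule of thumb. It is not the inference procedure used here; the function names and the 25-year generation time are assumptions made only for illustration.

```python
def expected_ibd_length_cm(generations):
    """Expected IBD segment length (cM) for a common ancestor
    `generations` ago, using the approximation 100 / (2 * g)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / (2.0 * generations)

def years_to_generations(years, years_per_generation=25.0):
    """Convert years to generations; the 25-year default is an
    illustrative assumption, not a value taken from this study."""
    return years / years_per_generation

# Expected segment lengths at the bounds of the Wari expansion interval
# (1,000 and 1,400 YBP), under the assumed generation time.
for ybp in (1000, 1400):
    g = years_to_generations(ybp)
    print(ybp, "YBP:", round(expected_ibd_length_cm(g), 3), "cM")
```

Under these assumptions, older gene flow leaves shorter segments, which is why segments traceable to events beyond the limit of about 75 generations become too short to detect reliably.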

Episodes of Genetic Adaptation Occurred in the Arid Andes and the Amazonian Tropical Forest. Populations from the high and arid Andes and those from the Amazon (Fig. 1B) settled in these contrasting environments more than 5,000 years ago (40) and show little evidence of gene flow between them (which would homogenize allele frequencies, potentially concealing the effect of diversifying natural selection). We performed genome-wide scans in these two groups of populations using two tests of positive natural selection: 1) population branch statistics (PBSn) comparing arid Andeans (Chopccas, Quechuas_AA, Qeros, Puno, Jaqarus, and Uros; n = 102) vs. Amazonian populations (Ashaninkas, Matsiguenkas, Matses, and Nahua; n = 75) with a Chinese population (Dai in Xishuangbanna, China; n = 100) from 1000 Genomes as an out-group (41) (SI Appendix, section 5.2.1) and 2) long-range haplotypes (xpEHH) (42) estimated for the two groups of populations (Fig. 3 and SI Appendix, Figs. S24-S27 and section 5.2.2). The complete lists of SNPs with high PBSn and xpEHH statistics for Andean and Amazonian populations are in Datasets S4-S7.
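The logic of the population branch statistic can be sketched for a single SNP as follows. Pairwise FST values are converted to branch lengths T = −log(1 − FST), the focal branch is PBS = (T_focal,sister + T_focal,out − T_sister,out)/2, and a normalized version divides by one plus the sum of the three branch statistics. The sketch is not the implementation used in this study: the simple heterozygosity-based FST estimator, the function names, and the out-group frequency in the usage note are illustrative assumptions, and the normalization follows our reading of the PBSn definition in ref. 41.

```python
import math

def fst(p1, p2):
    """Per-SNP FST = (H_T - H_S) / H_T for two equally weighted populations.
    This simple estimator is an assumption chosen for illustration."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

def branch_time(f):
    """Divergence time in drift units, T = -log(1 - FST); FST is capped
    just below 1 to avoid log(0) for fixed differences."""
    return -math.log(1 - min(f, 0.999999))

def pbs(p_focal, p_sister, p_out):
    """Length of the branch leading to the focal population."""
    t_fs = branch_time(fst(p_focal, p_sister))
    t_fo = branch_time(fst(p_focal, p_out))
    t_so = branch_time(fst(p_sister, p_out))
    return (t_fs + t_fo - t_so) / 2

def pbsn(p_focal, p_sister, p_out):
    """Normalized PBS: focal PBS divided by 1 plus the three branch PBS values."""
    b_f = pbs(p_focal, p_sister, p_out)
    b_s = pbs(p_sister, p_focal, p_out)
    b_o = pbs(p_out, p_focal, p_sister)
    return b_f / (1 + b_f + b_s + b_o)
```

For example, `pbsn(0.880, 0.453, 0.45)` uses the Andean and Amazonian frequencies reported for rs2877766-A together with a hypothetical out-group frequency of 0.45. It returns a larger value than `pbsn(0.453, 0.880, 0.45)`, reflecting a longer Andean branch at this SNP under these assumed inputs.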
The gene with the strongest consensus signal of adaptation to the Andean environment (from both PBSn and xpEHH statistics: PBSn = 0.205, P value = 0.003; xpEHH = 4.481, P value < 0.00001) (Fig. 3 and Dataset S4) is a long noncoding RNA gene called HAND2-AS1 (heart and neural crest derivatives expressed 2 antisense RNA 1, chromosome 4), which modulates cardiogenesis by regulating the expression of the nearby HAND2 gene (43, 44). This result is consistent with 1) the natural selection genome-wide scan by Crawford et al. (41), who identified three genes related to the cardiovascular system in Andeans, including TBX5, which works together with HAND2 in reprogramming fibroblasts to cardiac-like myocytes (45, 46), and 2) a pattern of adaptation of Andean populations preferentially mediated by the cardiovascular system. The derived allele rs2877766-A (frequencies: Amazonians, 0.453; Andeans, 0.880) is the core of the extended haplotype. HAND2-AS1 is located in the antisense 5′ region of HAND2, and the positively selected six-SNP core haplotype spans ∼18 kilobases and encompasses a putative human enhancer (GeneHancer identifier GH04J173536, SI Appendix, Fig. S29). Because our data come from genotyping arrays, which cover only a subset of variants, we further recovered from the sequencing data of Harris et al. (5) all nearby SNPs in linkage disequilibrium with the core SNP rs2877766 in Andean populations (r² > 0.80). We found that the positively selected haplotype includes the SNP rs3775587, mapped within the putative enhancer GH04J173536. Altogether, these results suggest (but do not demonstrate) that the HAND2-AS1 signature of natural selection is related to regulation of gene expression by an enhancer and reflects cardiovascular adaptations.
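The linkage disequilibrium threshold used above can be illustrated with the usual definition r² = D² / [pA(1 − pA) pB(1 − pB)], where D = pAB − pA·pB. The sketch below computes r² from phased two-SNP haplotypes. It assumes that phased haplotypes are available and is not a description of the software used in this study, which may estimate haplotype frequencies differently; the function name and input format are illustrative.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    `haplotypes` is a list of (a, b) pairs, where a and b are 0/1 alleles
    carried at SNP A and SNP B on the same chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom

def in_strong_ld(haplotypes, threshold=0.80):
    """True when r^2 exceeds the threshold (0.80 in the text above)."""
    return ld_r2(haplotypes) > threshold
```

Two SNPs whose alleles always travel together give r² = 1, whereas SNPs whose alleles combine at random give r² = 0.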
Andeans have cardiovascular adaptations to high altitude that differ from those of lowlanders exposed to hypoxia and from those of other highlanders, showing higher pulmonary vasoconstrictor response to hypoxia, lower resting middle cerebral flow velocity than Tibetans, and higher uterine artery blood flow than Europeans and lowlanders raised in high altitude (47).
DUOX2 (dual oxidase 2, chromosome 15) is the gene with the highest signal of adaptation to the Andean environment by PBSn analysis (PBSn = 0.22, P value = 0.002) (Fig. 3 and SI Appendix, Fig. S24). It has already been reported as a natural selection target in the Andes (48, 49). DUOX2 encodes a transmembrane component of an NADPH oxidase, which produces hydrogen peroxide (H2O2), and is essential for the synthesis of the thyroid hormone and for the production of the microbicidal hypothiocyanite anion (OSCN−) during mucosal innate immunity response against bacterial and viral infections in the airways and intestines (50, 51). Mutations in DUOX2 produce inherited hypothyroidism (52). Here we report the following: 1) The PBSn signal for DUOX2 comprises several SNPs, including two missense mutations (rs269868: C > T: Ser1067Leu, C allele frequencies: Amazon, 0.01, Andean, 0.53; rs57659670: T > C: His678Arg, C allele frequencies: Amazon, 0.01, Andean, 0.53); 2) bioinformatics analysis reveals that rs269868 is located in an A-loop (amino acids 1064-1078), which is a region of interaction of DUOX2 with its coactivator DUOXA2. Mutations in this region of the protein can affect the stability and maturation of the dimer and, consequently, the conversion of the intermediate product O2 to the final product H2O2 and their released proportions (53). If the natural selection signal is related to this effect, then the standing ancestral allele has been positively selected in the Andes. It is not clear whether the DUOX2 natural selection signal is related to thyroid function or innate immunity. Before the introduction of the public health program of supplementing manufactured salt with iodine, one of the environmental stresses of the Andes for human populations was iodine deficiency, which impairs thyroid hormone synthesis, increasing the risk of developing hypothyroidism, goiter, obstetric complications, and cognitive impairment (54, 55).
Natural selection studies in Amazon populations are scarce. Studies targeting rain forest populations in Africa and Asia have found natural selection signals in genes related to height and immune response (56). In the Amazon region, the strongest natural selection PBSn signal (PBSn = 0.302, P value = 0.002) is in a long noncoding RNA gene on chromosome 18 with unknown function (Dataset S5 and SI Appendix, Fig. S25). The second-highest signal (which also shows a significant long-range haplotype signal: PBSn = 0.265, P value = 0.004; xpEHH = −4.222, P value = 0.0003) corresponds to the gene PTPRC (Fig. 3), which encodes the protein CD45, essential in antigen recognition by T and B lymphocytes in pathogen-host interaction, in particular for viruses such as human adenovirus type 19 (57), HIV-1-induced cell apoptosis (58, 59), hepatitis C (60, 61), and herpes simplex virus 1 (62), even if we cannot exclude a role for unknown viruses endemic in the Amazon region. The core haplotype flanks the rs16843712 derived allele A (frequencies: Amazonia, 0.811; Andes, 0.324), within the putative human enhancer GH01J198660 (sensu GeneHancer; SI Appendix, Fig. S30), and includes the A (Thr193) allele of the nonsynonymous SNP rs4915154 (A > G: Thr193Ala) in exon 6 that affects alternative splicing and alters a potential O- and N-linked glycosylation site. The positively selected allele A (Thr193) has been associated (63) with a lower proportion of CD45R0+ T memory cells and an increased amount of naive phenotype T cells expressing A (exon 4), B (exon 5), and C (exon 6) isoforms. This result is consistent with the hypothesis of CD45 evolution driven by a host-virus arms race model (64).
We use DANCE [Disease Ancestry Network (69)] to present the allele frequencies of our total Native American samples for 30,270 GWAS hits and their associated complex phenotypes (sensu GWAS Catalog, https://www.ebi.ac.uk/gwas/), in comparison with African, European, and Asian allele frequencies from the 1000 Genomes Project. While this information is relevant, we recall that the allelic architecture of the complex diseases presented in the GWAS Catalog is biased by the underrepresentation of individuals with non-European ancestry in genetic studies.
In conclusion, in western South America, there is an environmental and cultural differentiation between the fertile north of the Andes, where altitudes are lower, and the arid south of the Andes, where these mountains are higher, defining sharp environmental differences between the Andes and Amazonia. This has influenced the genetic structure of western South Amerindian populations. Indeed, the between-population homogenization of the central southern Andes and its differentiation with respect to Amazonian populations of similar latitudes do not extend northward. Gene flow between the northern coast of Peru, the Andes, and Amazonia accompanied cultural and socioeconomic interactions revealed by archeology, but in the central southern Andes, these mountains have acted as a barrier to gene flow (70). We provide insights on the dynamics of the genetic homogenization between the populations of the arid Andes, which is not only due to migrations during the Inca Empire or the subsequent colonial period but started at least during the earlier expansion of the pre-Inca Wari Empire (1,000 to 1,400 YBP). Nakatsuka et al. (2), comparing ancient with modern individuals from western South America, make the general claim that the genetic structure of current populations "strongly echoed" and "are most closely related to the ancient individuals from their region" (i.e., 500 to 2,000 years ago). However, this general statement is not supported by their own results (see their SI Appendix, figure S7). Of nine ancient (500 to 2,000 years ago) vs. current comparisons of populations from the same region, this statement is true only for the five cases of the Southern Highlands of Peru and for Chile (their SI Appendix, figure S7 J and K) and not for the four comparisons from the Peruvian coast and north of Peru (their SI Appendix, figure S7 F-I).
Thus, Nakatsuka et al.'s (2) results emphasize and add a temporal perspective to the dichotomy observed by us between the current northern fertile Andes (more associated with trans-Andean gene flow) and the southern arid Andes (more homogeneous between populations and differentiated from Amazonia). The evolutionary journey of western South Amerindians was accompanied by episodes of adaptive natural selection to the high and arid Andes vs. the low Amazon tropical forest: in the Andes, the noncoding gene HAND2-AS1 (related to cardiovascular function and with the positively selected haplotype encompassing a putative human enhancer) and DUOX2 (related to thyroid function and innate immunity). In the Amazon forest, the gene encoding the protein CD45, essential for antigen recognition by T and B lymphocytes and viral-host interactions, shows a signature of positive natural selection, consistent with the host-virus arms race hypothesis. Our results and other studies (70) continue to show how Andean highlanders and Amazonian dwellers provide examples of how the interplay between geography and culture influences the genetic structure and adaptation of human populations.

Materials and Methods
The protocol for the Peruvian Genome Diversity Project was approved by the Research and Ethics Committee (OI003-11 and OI-087-13) of the Peruvian National Institute of Health, and all participants who had samples collected in this project provided informed consent. We genotyped 289 present-day Native Americans from Peru using the Human Omni array of Illumina for 2.5 million SNPs as part of the Peruvian Genome Diversity Project. Quality control was performed using PLINK (71) and Laboratório de Diversidade Genética Humana bioinformatics protocols and scripts (31). We merged our individuals with public datasets (1, 28-30) and Kaqchikel individuals from the M.D. laboratory at the National Cancer Institute. For D statistics and admixture graph analyses, we generated masked data, after phasing our datasets with SHAPEIT2 (72) and inferring the non-Native DNA segments with RFMix (73). To infer population structure, we used two approaches: 1) principal component analysis in Eigenstrat (74) and genetic clustering in ADMIXTURE (32) using a linkage disequilibrium pruned dataset and 2) fineSTRUCTURE (33), MIXTURE MODEL (34, 75), and SOURCEFIND (76) for haplotype-based analyses, after phase inference. Historical relationships were inferred using D statistics (77) and Admixture Graphs (35). IBD was inferred using refinedIBD (78) and IBDNe (79). For the genetic differentiation analyses, the pairwise genetic distances (F statistics) between Native South American groups (FST) and between populations within groups (FSC) were calculated for multilocus and individual loci using 4P software (80) and the hierfstat R package (81), respectively. Linkage disequilibrium was inferred with Haploview (82). Natural selection scans were performed using population branch statistics (41, 83) and xpEHH from the package Selscan (42, 84).
Data Availability. Data have been deposited in the European Genome-phenome Archive (EGA), https://www.ebi.ac.uk/ega/home (accession nos. EGAD00010001958, EGAD00010001990, EGAD00010001991, EGAD00010001992).

[...] processed in the Sagarana HPC cluster at the Centro de Laboratórios Multiusuários at Instituto de Ciências Biológicas-UFMG. This work is a product of the collaboration between investigators from the Peruvian Genome Project at the INS and the Genomics and Bioinformatics group of the project Epidemiologia Genômica de Coortes Brasileiras de Base Populacional (EPIGEN-Brazil, https://epigen.grude.ufmg.br/), funded by the Departamento de Ciência e Tecnologia/Ministério de Saúde (DECIT-MS, Brazil).