Inferring admixture proportions from molecular data

Bertorelle, Giorgio; Excoffier, L.

doi:10.1093/oxfordjournals.molbev.a025858

We derive here two new estimators of admixture proportions based on a coalescent approach that explicitly takes into account molecular information as well as gene frequencies. These estimators can be applied to any type of molecular data (such as DNA sequences, restriction fragment length polymorphisms [RFLPs], or microsatellite data) for which the extent of molecular diversity is related to coalescent times. Monte Carlo simulation studies are used to analyze the behavior of our estimators. We show that one of them (m(Y)) appears suitable for estimating admixture from molecular data because of its absence of bias and relatively low variance. We then compare it to two conventional estimators that are based on gene frequencies. m(Y) proves to be less biased than conventional estimators over a wide range of situations and especially for microsatellite data. However, its variance is larger than that of conventional estimators when parental populations are not very differentiated. The variance of m(Y) becomes smaller than that of conventional estimators only if parental populations have been kept separated for about N generations and if the mutation rate is high. Simulations also show that several loci should always be studied to achieve a drastic reduction of variance and that, for microsatellite data, the mean square error of m(Y) rapidly becomes smaller than that of conventional estimators if enough loci are surveyed. We apply our new estimator to the case of admixed wolflike Canid populations tested for microsatellite data.
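For orientation, here is a minimal sketch of what a gene-frequency-based admixture estimate can look like. It is not m(Y), whose formula the abstract does not give, and it is not necessarily either of the two conventional estimators the paper compares against. It assumes the simple linear model p_hybrid = m * p_parent1 + (1 - m) * p_parent2 at a single locus and fits m by least squares. The function name and the allele frequencies are hypothetical, chosen only for illustration.

```python
def least_squares_admixture(p_hybrid, p_parent1, p_parent2):
    """Least-squares estimate of m, the contribution of parent 1.

    Assumes (illustrative model, not taken from the paper):
        p_hybrid[i] = m * p_parent1[i] + (1 - m) * p_parent2[i]
    for every allele i at one locus. Ignores drift and mutation.
    """
    if not (len(p_hybrid) == len(p_parent1) == len(p_parent2)):
        raise ValueError("frequency vectors must have the same length")

    # Numerator: sum over alleles of (hybrid - parent2) * (parent1 - parent2)
    numerator = sum(
        (h - b) * (a - b) for h, a, b in zip(p_hybrid, p_parent1, p_parent2)
    )
    # Denominator: squared frequency difference between the parental populations
    denominator = sum((a - b) ** 2 for a, b in zip(p_parent1, p_parent2))

    if denominator == 0:
        raise ValueError("parental populations are identical; m is undefined")
    return numerator / denominator


# Hypothetical allele frequencies at one three-allele locus.
parent1 = [0.8, 0.1, 0.1]
parent2 = [0.2, 0.5, 0.3]
true_m = 0.3
# Build a hybrid population that follows the linear model exactly.
hybrid = [true_m * a + (1 - true_m) * b for a, b in zip(parent1, parent2)]

estimate = least_squares_admixture(hybrid, parent1, parent2)
print(round(estimate, 6))
```

The denominator shrinks as the parental frequencies converge, so an estimate of this kind becomes unstable when the parental populations are weakly differentiated. It also uses allele frequencies only and ignores the molecular distances between alleles, which is the information the coalescent-based estimators are designed to exploit.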