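Research Progress
Impact of incomplete taxon sampling on inference of gene flow using Bayesian and summary methods with genomic sequence data (Tianqi Zhu and collaborators)
Published: 2026-05-27 | Source: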

Interspecific gene flow is commonly inferred using genomic data under the multispecies coalescent model. Incomplete taxon sampling can impact inference of gene flow in multiple ways. First, unsampled ghost lineages that are sources of introgression may mislead inference of gene flow in analyses of genomic data from sampled species. Second, incomplete taxon sampling causes branches on the species phylogeny to merge and complicates the definition and estimation of the rate or magnitude of gene flow, measured by the expected proportion of immigrants in the recipient population (i.e., the introgression probability). We use mathematical analysis and computer simulation to examine the impact of incomplete taxon sampling on inference of gene flow and estimation of its rate using genomic data. We introduce a Bayesian testing approach to select models of gene flow for a species triplet (such as ghost introgression, inflow, and outflow), using the Savage-Dickey density ratio to calculate Bayes factors. We show that the approach has excellent sensitivity and specificity, whereas heuristic methods based on data summaries typically cannot distinguish among those scenarios. We find that genomic data allow reliable estimation of the proportion of immigrants (rather than the number of immigrants), even when the assumed demographic model is incorrect due to incomplete taxon sampling. When population size differs among species, assuming the same size may lead to seriously biased estimates of the rate of gene flow. The f-branch approach is effective in reducing the number of gene-flow events suggested by triplet analyses but often fails to identify the correct model of gene flow and tends to underestimate the rate of gene flow. Our results highlight the need for improving summary methods to accommodate different population sizes and to infer gene flow between sister lineages.
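As a rough illustration of how a Savage-Dickey density ratio can turn posterior samples into a Bayes factor, the sketch below compares a gene-flow model against the nested null of no gene flow. It is not the authors' implementation. The function names, the Beta(a, b) prior on the introgression probability, the null interval (0, eps), and the default eps = 0.01 are all assumptions made for this example, and the interval-based form of the ratio is only one possible way to apply the idea to MCMC output.

```python
import math


def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_mass_below(eps, a, b, steps=10_000):
    """Prior probability that phi < eps, by the midpoint rule on (0, eps)."""
    h = eps / steps
    return h * sum(beta_pdf((i + 0.5) * h, a, b) for i in range(steps))


def savage_dickey_bf10(posterior_samples, eps=0.01, a=1.0, b=1.0):
    """Approximate Bayes factor for gene flow (H1) over no gene flow (H0).

    posterior_samples: draws of the introgression probability phi obtained
    under H1 (hypothetical input, e.g. from an MCMC run).
    Returns P(phi < eps | prior) / P(phi < eps | data); values well above 1
    favour gene flow, values well below 1 favour the null.
    """
    prior_mass = beta_mass_below(eps, a, b)
    n_null = sum(1 for phi in posterior_samples if phi < eps)
    if n_null == 0:
        # No sample fell in the null interval, so the ratio is unbounded here.
        return math.inf
    post_mass = n_null / len(posterior_samples)
    return prior_mass / post_mass


if __name__ == "__main__":
    # Toy samples concentrated away from zero: support for gene flow.
    toy_samples = [0.18, 0.22, 0.20, 0.25, 0.19, 0.21]
    print(savage_dickey_bf10(toy_samples))
```

The appeal of this kind of ratio is that only the more general model needs to be fitted: the Bayes factor is read off from how much prior and posterior mass sit near the null value. With a finite sample, an infinite result only means that no draw fell inside the null interval, so it should be read as a lower bound on the support rather than as an exact value.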

Publication:

SYSTEMATIC BIOLOGY

http://dx.doi.org/10.1093/sysbio/syag023

Authors:

Sirui Cheng

Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK

State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

University of Chinese Academy of Sciences, Beijing 100049, China

Thomas Flouri

Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK

Tianqi Zhu

State Key Laboratory of Mathematical Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

Ziheng Yang

Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK

Correspondence to be sent to: Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK;

E-mail: z.yang@ucl.ac.uk


