Phylogenetics and algebraic geometry: Identifiability of two-tree mixtures under group-based models

Seminar on Mathematics in the Bio and Geo Sciences

Speaker: Sonja Petrovic, Dept. of Statistics, Penn State

Abstract: In this talk I will survey some of the general ideas in algebraic phylogenetics and then focus on phylogenetic mixture models. Phylogenetic data arising on two possibly different tree topologies might be mixed through several biological mechanisms: incomplete lineage sorting or horizontal gene transfer when the topologies differ, or simply different substitution processes on characters when the topology is the same. I will explain joint work with Allman, Rhodes and Sullivant, in which we investigate identifiability for 2-tree mixtures of the 4-state group-based models, which are relevant to DNA sequence data. Using algebraic techniques, we show that the tree parameters are identifiable for the Jukes-Cantor (JC) and Kimura 2-parameter (K2P) models. We also prove that generic substitution parameters for the JC mixture models are identifiable, and for the K2P and Kimura 3-parameter (K3P) models we obtain generic identifiability results for mixtures on the same tree. This indicates that the full phylogenetic signal remains in such mixtures. The underlying methods of this work come from algebraic statistics. Some of the open problems suggest developments on the computational frontier.

Room Number: 106 McAllister

Date: 04/04/2012

Time: 1:00pm - 2:00pm