Research
I develop Bayesian statistical methods to resolve conflicting evidence about evolutionary timescales and reconstruct evolutionary history across scales—from fossils and molecules in deep time to somatic mutations in cancer genomes.
Integrating fossils and molecules to constrain deep-time diversification
Molecular clocks often place major clade origins far earlier than fossil evidence suggests. I developed approaches integrating the Bayesian Brownian Bridge model—which translates fossil occurrence data into probabilistic calibrations—with molecular clock methods to reconcile these conflicts.
Applying this framework to flowering plants, I inferred angiosperm origins in the latest Jurassic (~150 Mya), substantially later than previous molecular estimates but consistent with recent fossil discoveries. This work, accepted at Nature Plants, demonstrates how mechanistic fossil calibration can bridge the fossil-molecular gap when both datasets are carefully integrated.
I also lead ongoing projects applying these integrative approaches to early eukaryote and animal evolution, aiming to illuminate key transitions and macroevolutionary innovations during deep-time diversification.
Morphological data present fundamental limitations. My analysis of panarthropod relationships (onychophorans, tardigrades, euarthropods), published in Biology Letters, showed that morphology alone cannot statistically resolve relationships among these major phyla, highlighting the need for rigorous uncertainty quantification in paleontological phylogenetics. I continue to investigate model adequacy in morphological evolution using posterior predictive simulations, evaluating whether standard evolutionary models capture real patterns in comparative datasets.
Inferring somatic cell evolutionary history in cancer
Phylogenetic principles apply equally to tumor evolution, yet cancer presents distinct challenges: limited clonal diversity, longitudinal sampling, and complex population dynamics. I contribute to the development of CNETA, which uses copy number alterations (CNAs) as phylogenetic markers to infer tumor tree topology, divergence times, and CNA rates from multi-sample datasets.
My current work develops multi-scale Bayesian models that integrate CNAs from single-cell to bulk-tissue levels, accommodating variable evolutionary rates and realistic tumor biology. Alongside tool development, I conduct model adequacy assessments using posterior predictive simulations to evaluate whether standard evolutionary models capture real patterns in phylogenetic and cancer genomic data.
