I'm a biologist with a computational and statistical background, and an interest in advancing clinical decision-making from patient data (precision medicine). I have a particular interest in mental illness.
Keywords: precision medicine; machine learning; genomics; epigenetics; statistics; method development; software
|2004||B.Math. Hon. Computer Science; University of Waterloo, ON, Canada
|2010||Ph.D. Biological Sciences; Watson School of Biological Sciences, Cold Spring Harbor Laboratory, NY, USA
|2010 - 2014||Postdoctoral Fellow; Centre for Addiction & Mental Health, Toronto, Canada; Epigenetics (PI: Art Petronis, MD PhD)
|2015 - now||Postdoctoral Fellow; The Donnelly Centre, University of Toronto; Computational methods for precision medicine (PI: Gary Bader, PhD)
As part of my postdoctoral research, I led the project to build netDx, an algorithm that builds a patient classifier by integrating several types of user-provided data. It outperforms most machine-learning methods in binary cancer survival prediction. It also has properties well suited to clinical research: it handles sparse and missing data, and it groups genes into pathways for interpretability.
Read our methods paper here: Pai et al. (2019). Mol Syst Biol 15, and read about the general strategy we use (patient similarity networks) in this review I co-wrote with Dr Bader. netDx is publicly available as R software. My work and collaboration on netDx was honoured with the 2019 Donnelly Centre Research Excellence award.
I am also interested in understanding how epigenetic factors, particularly DNA methylation, contribute to increased risk for mental illness (read this review I co-wrote for motivation). I co-led the first genome-wide comparison of DNA methylation in neurons isolated from post-mortem brains of individuals with psychosis vs undiagnosed individuals. This work was funded partly by a 2014 NARSAD Young Investigator award to me, and in it we discovered a new mechanism for dopamine regulation in neurons.
Read our research article here: Pai et al. (2019) Nat Commun 10. This research was covered by various media outlets (Altmetric score of 130 as of 24 Oct 2019, in the top 5% of all research outputs scored by Altmetric).
All publications are listed on Google Scholar.
I am the lead developer of the patient classifier netDx, available at the Baderlab GitHub page. I have contributed software packages for genome visualization to the R/Bioconductor project. My GitHub account contains repositories associated with research articles. As a graduate student working on animal neuroscience, I designed and implemented MATLAB code to automate adaptive multi-step animal training protocols ("SessionModel"). That code and its successors are in use in numerous cognitive neuroscience labs at Cold Spring Harbor Laboratory and Princeton University, in the US.
When I'm not thinking about precision medicine, I am a professional vocalist and lead a jazz band in Toronto. I also have two kids who keep me on my toes.