I am a data scientist, bioinformatician, and research software developer with a background in ecology and evolutionary biology. My research combines amplicon and whole-genome sequencing to unravel the complex interactions between microbial symbionts and their plant hosts—ranging from tropical trees to alpine yellow monkeyflowers, and most recently, bioenergy feedstocks.
My work centers on building reproducible data pipelines in R (with a dash of Python) to reveal microbial patterns in ecological datasets. I use a range of statistical and machine learning methods, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), distance-based Redundancy Analysis (dbRDA), and generalized linear models and generalized linear mixed models (GLMs and GLMMs), to analyze genomic and ecological data.
As a committed advocate for open, transparent, and reproducible science, I collaborate with the ESIP Data Stewardship Committee to advance best practices in data management and stewardship.
I am also interested in web development for scientific communication and reproducible research. For example, I use Quarto to build interactive, accessible web documentation and dashboards, such as those for the esipDMP project.
In addition to R and Python, I work with Rust and MongoDB to build robust, high-performance data tools—such as rateDMP, which leverages Rust for backend processing and MongoDB for scalable data storage and querying.
I develop and maintain several R packages and analysis tools for the research community, including:
- BRCore: An R package for core analysis workflows in microbial ecology.
- interbrc-core-analysis: Analysis pipelines for the Bioenergy Research Centers shared research objective in feedstock microbiomes.
- esipDMP: A primer on data management planning and stewardship.
- rateDMP: An experimental Rust tool for rating Data Management Plans.
You can find more details about my professional background in my CV.
My main areas of interest are:
- Microbial ecology & plant-microbe interactions
- Genomics & bioinformatics
- Machine learning & statistical modeling
- Reproducible research & open science
- Web development for scientific communication
I’m always interested in new collaborations, open-source projects, and discussions about data science, bioinformatics, and reproducibility. Feel free to reach out or connect via GitHub Issues or LinkedIn.