Skip to content

MurrellGroup/CodonMolecularEvolution.jl

Repository files navigation

CodonMolecularEvolution.jl

Dev Coverage Build Status

Descendant of MolecularEvolution.jl, specializing in codon models. The current dN/dS catalogue consists of

  • FUBAR
  • difFUBAR

To run our models in the cloud, see ColabMolecularEvolution.

Installation

In a julia REPL, type ]add CodonMolecularEvolution

Documentation

https://MurrellGroup.github.io/CodonMolecularEvolution.jl

Extensions

To trigger our plotting extension, one must using Plots, Phylo.

Examples

This section demonstrates how to use our dN/dS models with real biological data, showcasing both the input requirements and the type of evolutionary insights you can obtain.

using MolecularEvolution, FASTX, CodonMolecularEvolution

Note: To enable plotting, one would add using Plots, Phylo.

FUBAR

DOI

We perform FUBAR analysis on this FASTA file, and a phylogeny from this Newick tree file.

# Read data
outdir = "fubar"
seqnames, seqs = read_fasta("Ace2_with_bat.fasta")
treestring = readlines("Ace2_with_bat.tre")[1]
# Perform analysis
fgrid = alphabetagrid(seqnames, seqs, treestring);
method = DirichletFUBAR() # Standard FUBAR
fubar_df, params = FUBAR_analysis(method, fgrid, analysis_name=outdir*"/Ace2_with_bat");

The above call will write results to outdir*"/Ace2_with_bat", but here we only display what's written to sdout

Step 1: Initialization.
Step 2: Optimizing global codon model parameters.
Optimized single α,β LL=-66072.56532041529 with α=1.940881909909932 and β=0.7757100788818039.
Step 3: Calculating conditional likelihoods.
Step 4: Model fitting.
Step 5: Tabulating results.
19 sites with positive selection above threshold.
487 sites with purifying selection above threshold.

Now we show some of the above-threshold sites

above_threshold_sites = fubar_df[(fubar_df.positive_posterior .> 0.95) .| (fubar_df.purifying_posterior .> 0.95), :]
above_threshold_sites[1:10, :]
10×5 DataFrame
 Row │ site   positive_posterior  purifying_posterior  beta_posterior_mean  alpha_posterior_mean 
     │ Int64  Float64             Float64              Float64              Float64        
─────┼─────────────────────────────────────────────────────────────────────────────────────
   1 │     1         0.00564089            0.985948              0.0126755        1.42783
   2 │     4         5.12413e-5            0.998737              0.218888         1.12801
   3 │     6         0.0092943             0.981231              0.0227715        1.47458
   4 │     9         0.0118501             0.964818              0.0658507        0.392239
   5 │    11         2.29686e-8            0.999994              0.277966         2.10407
   6 │    15         1.43323e-6            0.999887              0.20953          1.42799
   7 │    17         0.000436892           0.995299              0.199044         0.917965
   8 │    19         8.04229e-9            0.999999              0.152897         2.47035
   9 │    25         0.000141827           0.996911              0.275294         1.36792
  10 │    27         0.995895              3.74494e-5            2.94698          1.05116

difFUBAR (awaiting the correct link)

DOI

We perform difFUBAR analysis on this FASTA file, and a tagged phylogeny from this NEXUS tree file.

# Read data
analysis_name = "difFUBAR/Ace2_tiny"
seqnames,seqs = read_fasta("data/Ace2_tiny_test.fasta");
treestring, tags, tag_colors = import_colored_figtree_nexus_as_tagged_tree("data/Ace2_no_background.nex")
# Perform analysis
df,results = difFUBAR(seqnames, seqs, treestring, tags, analysis_name);

difFUBAR will write its results to the export directory analysis_name, but here we show a short-hand representation of the results, which are written to stdout.

Step 1: Initialization. If exports = true, tree showing the assignment of branches to groups/colors will be exported to: output/Ace2_tagged_input_tree.svg.
Step 2: Optimizing global codon model parameters.
0.0% 29.0% 58.0% 87.0% 
Step 4: Running Gibbs sampler to infer site categories.
Step 5: Tabulating and plotting. Detected sites:
Site 3 - P(ω1 > ω2):0.0075; P(ω2 > ω1):0.9805; P(ω1 > 1):0.1205; P(ω2 > 1):0.9675
Site 13 - P(ω1 > ω2):0.0175; P(ω2 > ω1):0.9605; P(ω1 > 1):0.13; P(ω2 > 1):0.9305

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •  

Languages