URL: http://github.com/borenstein-lab/MIMOSA
Publication: Noecker, C., Eng, A., Srinivasan, S., Theriot, C.M., Young, V.B., Jansson, J.K., Fredricks, D.N., and Borenstein, E. (2016). Metabolic Model-Based Integration of Microbiome Taxonomic and Metabolomic Profiles Elucidates Mechanistic Links between Ecological and Metabolic Variation. mSystems 1, e00013-15.
Follow these steps to quickly run a MIMOSA analysis and view a summary of the results.
If you do not have R and RStudio installed, follow the installation instructions in the R section here.
First, download the MIMOSA GitHub repository and install the MIMOSA package. You can do so by clicking here and unzipping the resulting file. Alternatively, you can use git from a command line by running the following commands:
git clone https://github.com/borenstein-lab/MIMOSA.git
cd MIMOSA #Navigate into the downloaded directory
Rscript installMimosa.R
Download the example data by clicking this zip file and unzipping it. See below for more details on this dataset.
Perform a full MIMOSA analysis by running the following command in a shell window. You may need to adjust the paths to each data file, depending on how you saved the files above. This command assumes you have placed the MIMOSA data inside the MIMOSA code directory, and that you are also running this command from within the MIMOSA directory.
Rscript runMimosa.R --genefile="MIMOSA_data/bv_gg_picrust_genes.txt" --metfile="MIMOSA_data/bv_metabolites.txt" --contribs_file="MIMOSA_data/bv_gg_picrust_contributions.txt" --mapformula_file="MIMOSA_data/KEGG2010/reaction_mapformula.lst" --file_prefix="MIMOSA_data/mimosa_out" --ko_rxn_file="MIMOSA_data/KEGG2010/ko_reaction.list" --rxn_annots_file="MIMOSA_data/KEGG2010/reaction" --metadata_file="MIMOSA_data/bv_metadata.txt" --metadata_var="BV" --summary_doc_dir="" --num_permute=1000
If MIMOSA runs successfully but produces an error referring to pandoc and fails to generate the summary document, you have a couple of options to fix this issue. You can open the file summarizeMimosaResults.Rmd in RStudio, press “Knit”, then “Knit With Parameters”, and provide the text printed by the runMimosa.R script to generate the document. Alternatively, you can follow the instructions in this post to set an environment variable and then re-run the script.
MIMOSA is a tool for relating paired microbiome and metabolomic data. MIMOSA can be used to answer questions like:
A MIMOSA analysis consists of the following steps:
An overview of the MIMOSA framework
In this tutorial, you will:
First, if you do not have R and RStudio installed, follow the installation instructions in the R section here.
To install and use MIMOSA, we recommend first downloading the full GitHub repository, either by using “git clone” or by clicking here and unzipping the resulting file. MIMOSA requires several dependency packages that can be obtained from the CRAN and Bioconductor repositories (instructions for installing several of these were provided in the pre-workshop instructions, but the “installMimosa.R” script will also try to re-install them if needed). Running the following commands in a shell window will complete all of these steps:
git clone https://github.com/borenstein-lab/MIMOSA.git
cd MIMOSA #Navigate into the downloaded directory
Rscript installMimosa.R #Run the installation script
If you have a dataset pairing gene and/or OTU abundance data with measurements of identified metabolites, you may be able to use it for this analysis. You can see examples of the required file formats in the example dataset as well as in the mimosa/tests/testthat directory in the downloaded code.
Otherwise, we will use an example dataset describing the vaginal microbiome. This dataset is from the following publication:
Srinivasan, S., Morgan, M.T., Fiedler, T.L., Djukovic, D., Hoffman, N.G., Raftery, D., Marrazzo, J.M., and Fredricks, D.N. (2015). Metabolic Signatures of Bacterial Vaginosis. mBio 6, e00204-15.
Download the example data by clicking this zip file and unzipping it.
The example data provided includes the following files, describing the vaginal microbiome of 70 women with and without Bacterial Vaginosis (BV):
The example data also includes three reference files from the KEGG database, in the “KEGG2010” directory. Because access to the KEGG database requires a license, these are from the last version of the database that was publicly available, in 2010. Newer versions contain information on more genes and reactions, but core metabolism is largely unchanged. The reference files required by MIMOSA are:
In the near future, a new version of MIMOSA will not require KEGG access.
Generally, MIMOSA requires a table of KO abundances by sample, a table of identified metabolite abundances by sample, and a table of species-specific KO abundances by sample. These can be generated using any platform or processing pipeline. For example, the stratified and unstratified tables from a HUMAnN2 analysis of metagenomic data can be used instead of PICRUSt output (with KEGG annotations). The MIMOSA R package includes a function, format_humann2_contributions, that will fully re-format the stratified table for this purpose. You can also run subsets of the MIMOSA analysis (using a custom R script) without a taxon-specific contribution table.
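Because MIMOSA works on paired data, it can be worth confirming that the gene table and the metabolite table describe the same samples before starting a run. The sketch below is one way to do that from a shell. It assumes both tables are tab-delimited with sample IDs in the header row after a first feature-ID column; check the example dataset to confirm this matches your files. The function names sample_ids and unshared_samples are made up for this illustration and are not part of MIMOSA.

```shell
# Print the sample IDs from the header row of a tab-delimited table,
# one per line and sorted, skipping the first (feature ID) column.
# Assumption: sample IDs are the header fields after the first column.
sample_ids() {
  head -n 1 "$1" | tr -d '\r' | tr '\t' '\n' | tail -n +2 | sort
}

# Print the sample IDs that appear in only one of two tables.
# Prints nothing when both tables list exactly the same samples.
unshared_samples() {
  ids_a=$(mktemp)
  ids_b=$(mktemp)
  sample_ids "$1" > "$ids_a"
  sample_ids "$2" > "$ids_b"
  comm -23 "$ids_a" "$ids_b" | sed 's/^/only in first: /'
  comm -13 "$ids_a" "$ids_b" | sed 's/^/only in second: /'
  rm -f "$ids_a" "$ids_b"
}

# Example (run from inside the MIMOSA directory):
# unshared_samples MIMOSA_data/bv_gg_picrust_genes.txt MIMOSA_data/bv_metabolites.txt
```

Any sample listed in the output appears in only one of the two tables and may be worth investigating before the analysis.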
The most straightforward way to run a full MIMOSA analysis is to use the runMimosa.R script from the command line. This script will run the following steps:
From within the MIMOSA directory, run the following command in a shell window. You may need to adjust the paths to each data file, depending on your file structure. This command assumes you have placed the MIMOSA data inside the MIMOSA code directory, and that you are also running this command from within the MIMOSA directory.
Rscript runMimosa.R --genefile="MIMOSA_data/bv_gg_picrust_genes.txt" --metfile="MIMOSA_data/bv_metabolites.txt" --contribs_file="MIMOSA_data/bv_gg_picrust_contributions.txt" --mapformula_file="MIMOSA_data/KEGG2010/reaction_mapformula.lst" --file_prefix="MIMOSA_data/mimosa_out" --ko_rxn_file="MIMOSA_data/KEGG2010/ko_reaction.list" --rxn_annots_file="MIMOSA_data/KEGG2010/reaction" --metadata_file="MIMOSA_data/bv_metadata.txt" --metadata_var="BV" --summary_doc_dir="" --num_permute=1000
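Since the command above depends on several file paths being correct, a quick pre-flight check can save a failed run. The following is a minimal sketch, not part of MIMOSA: check_inputs is a hypothetical helper that reports any path that does not exist.

```shell
# Report which of the given input paths are missing.
# Prints one "missing: <path>" line per absent path and
# returns non-zero if at least one path is missing.
check_inputs() {
  missing=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Example with the input paths from the tutorial command
# (run from inside the MIMOSA directory):
# check_inputs \
#   MIMOSA_data/bv_gg_picrust_genes.txt \
#   MIMOSA_data/bv_metabolites.txt \
#   MIMOSA_data/bv_gg_picrust_contributions.txt \
#   MIMOSA_data/KEGG2010/reaction_mapformula.lst \
#   MIMOSA_data/KEGG2010/ko_reaction.list \
#   MIMOSA_data/KEGG2010/reaction \
#   MIMOSA_data/bv_metadata.txt
```

If the check prints nothing, all input files were found at the paths given.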
If MIMOSA runs successfully but produces an error referring to pandoc and fails to generate the summary document, you have a couple of options to fix this issue. You can open the file summarizeMimosaResults.Rmd in RStudio, press “Knit”, then “Knit With Parameters”, and provide the text printed by the runMimosa.R script to generate the document. Alternatively, you can follow the instructions in this post to set an environment variable and then re-run the script.
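As a rough illustration of the environment-variable route: the rmarkdown package looks for pandoc in the directory named by the RSTUDIO_PANDOC variable, so pointing that variable at the copy of pandoc bundled with RStudio is a common fix. The sketch below assumes that is the variable the linked post refers to. The candidate directories are guesses that vary by operating system and RStudio version, and find_pandoc_dir is a hypothetical helper, so adjust both to your installation.

```shell
# Print the first directory among the arguments that contains an
# executable file named "pandoc". Returns non-zero if none does.
find_pandoc_dir() {
  for dir in "$@"; do
    if [ -f "$dir/pandoc" ] && [ -x "$dir/pandoc" ]; then
      echo "$dir"
      return 0
    fi
  done
  return 1
}

# Only act if pandoc is not already on the PATH.
if ! command -v pandoc >/dev/null 2>&1; then
  # Assumed install locations (Linux, then macOS); edit as needed.
  if pandoc_dir=$(find_pandoc_dir \
      /usr/lib/rstudio/bin/pandoc \
      /Applications/RStudio.app/Contents/MacOS/pandoc); then
    export RSTUDIO_PANDOC="$pandoc_dir"
    echo "RSTUDIO_PANDOC set to $RSTUDIO_PANDOC"
  else
    echo "pandoc not found; set RSTUDIO_PANDOC manually"
  fi
fi
```

The export only affects the current shell session, so run the runMimosa.R command again from the same window.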
The runMimosa.R script will produce several output files. The most interesting one is the summary document, mimosa_out_summary.html. This document includes a series of plots, tables, and statistics describing the results. These include the following:
You can adjust and customize the display of these plots by opening and editing the code in the summarizeMIMOSAResults.Rmd document, and then re-compiling it to html.
The other output files describe the results in more detail. A full description of their contents can be found in the Readme for the MIMOSA GitHub repository.
MIMOSA uses a simplified and approximate model to relate microbial abundances to metabolite concentrations. Associations between metabolic potential and metabolite concentrations may exist for many other reasons besides the mechanism proposed by MIMOSA.
MIMOSA’s model is also incomplete, as many reactions are missing or mis-annotated in KEGG. Many metabolites cannot be successfully analyzed by MIMOSA because they are not linked to any reactions and/or genes in the KEGG database.
The potential taxonomic contributors identified by MIMOSA are simply those taxa whose own estimated metabolic potential is correlated with the whole-community scores. This means that these taxa may help explain the association with metabolite scores. While this was the approach used in published MIMOSA analyses, MIMOSA can alternatively identify taxa whose metabolic potential is most correlated directly with the metabolite concentrations, which we have found may be more informative. You can switch to this option by adding the flag --spec_method="mets" to your runMimosa.R command.
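One way to re-run the example analysis with this option, without retyping every path, is to wrap the tutorial command in a small shell function. This is only a sketch: run_mimosa is a hypothetical wrapper, its first argument is the directory where you unzipped the example data, and it assumes runMimosa.R accepts the extra flag in any position on the command line.

```shell
# Run the tutorial's runMimosa.R command against a chosen data directory.
# Usage: run_mimosa DATA_DIR [extra runMimosa.R flags...]
# Must be called from inside the MIMOSA code directory.
run_mimosa() {
  data_dir=$1
  shift
  Rscript runMimosa.R \
    --genefile="$data_dir/bv_gg_picrust_genes.txt" \
    --metfile="$data_dir/bv_metabolites.txt" \
    --contribs_file="$data_dir/bv_gg_picrust_contributions.txt" \
    --mapformula_file="$data_dir/KEGG2010/reaction_mapformula.lst" \
    --file_prefix="$data_dir/mimosa_out" \
    --ko_rxn_file="$data_dir/KEGG2010/ko_reaction.list" \
    --rxn_annots_file="$data_dir/KEGG2010/reaction" \
    --metadata_file="$data_dir/bv_metadata.txt" \
    --metadata_var="BV" \
    --summary_doc_dir="" \
    --num_permute=1000 \
    "$@"   # any extra flags are appended unchanged
}

# Example: repeat the analysis with the alternative contributor method.
# run_mimosa MIMOSA_data --spec_method="mets"
```

Calling run_mimosa MIMOSA_data with no extra flags reproduces the original tutorial command.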