Running FishTaco

The FishTaco Python module handles all calculations internally and exposes its functionality through a command-line script.


FishTaco can be run in two alternative modes, depending on whether genomic content information is available for each taxon. If such data is available (e.g., from reference genomes), run FishTaco with the -gc flag. FishTaco can also infer this data when run with the -inf flag. If you are using 16S data coupled with PICRUSt, please read "Can I run FishTaco with a PICRUSt-derived metagenomic functional profile?".

Running FishTaco with genomic content data: -ta TAXA_ABUN_FILE -fu FUNCTION_ABUN_FILE -l LABELS_FILE -gc GENOMIC_CONTENT_FILE [options]

Running FishTaco with genomic content inference: -ta TAXA_ABUN_FILE -fu FUNCTION_ABUN_FILE -l LABELS_FILE -inf [options]
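The two modes above can be sketched as a small helper that assembles the argument list. This is only an illustration: build_fishtaco_args is a hypothetical function, not part of FishTaco, and the file names in the usage note are placeholders. It uses only flags listed on this page and assumes that passing both -gc and -inf selects prior-based inference, as the -inf description below indicates.

```python
from typing import List, Optional


def build_fishtaco_args(taxa_file: str,
                        function_file: str,
                        labels_file: str,
                        genomic_content_file: Optional[str] = None,
                        infer: bool = False) -> List[str]:
    """Assemble a FishTaco argument list (hypothetical helper, not part of FishTaco)."""
    # At least one source of genomic content is needed: a file (-gc) or inference (-inf).
    if genomic_content_file is None and not infer:
        raise ValueError("provide a genomic content file (-gc), enable inference (-inf), or both")
    args = ["-ta", taxa_file, "-fu", function_file, "-l", labels_file]
    if genomic_content_file is not None:
        args += ["-gc", genomic_content_file]
    if infer:
        # With -gc also present this corresponds to prior-based inference;
        # without it, to de novo inference.
        args.append("-inf")
    return args
```

For example, build_fishtaco_args("taxa.tab", "functions.tab", "labels.tab", infer=True) returns the arguments for a de novo inference run; the list still has to be appended to the FishTaco script invocation.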

Required arguments

-ta, --taxa_abundance TAXA_ABUN_FILE

Input file of taxonomic abundance profiles (format)

-l, --labels LABELS_FILE

Input file of label assignment for the two sample sets being compared (format)

Optional arguments

-fu, --function_abundance FUNCTION_ABUN_FILE

Input file of function abundance (format)

-gc, --genomic_content_of_taxa GENOMIC_CONTENT_FILE

Input file of genomic content of each taxon (format)

-inf, --perform_inference_of_genomic_content

Specifies whether genomic content is inferred (de novo, or prior-based if genomic content is also given; default: FALSE)


Define sample set label to find enrichment in (default: 1)


Define sample set label to find enrichment against (default: 0)

-op, --output_prefix OUTPUT_PREF

Output prefix for result files (default: fishtaco_out)

-map_function_level {pathway, module, none, custom}

Map KOs to pathways, modules, none, or custom (default: pathway)

-assessment, --taxa_assessment_method {single_taxa, multi_taxa}

The method used when assessing taxa to compute individual contributions. single_taxa runs significantly faster than multi_taxa but is less accurate (see manuscript for details) (default: multi_taxa)

Advanced usage arguments

-map_function_file FUNC_LEVEL_MAP_FILE

Mapping file from KOs to pathways, modules, or custom (default: use internal KEGG database downloaded 07/15/2013)


Indicates to perform the inference on the KO level (default: use the mapped functional level, e.g., pathway)

-mult_hyp, --multiple_hypothesis_correction {Bonf, FDR-0.01, FDR-0.05, FDR-0.1, none}

Multiple hypothesis correction for functional enrichment (default: FDR-0.05)
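To make the correction options concrete, here is a minimal sketch of a Bonferroni correction and a Benjamini-Hochberg FDR procedure applied to a list of p-values. It illustrates the general techniques only. Whether FishTaco's FDR options use the Benjamini-Hochberg procedure, and which significance level its Bonf option applies, are assumptions here; the function names are hypothetical.

```python
from typing import List


def bonferroni(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject a hypothesis if its p-value is at most alpha / m (alpha is an assumed level)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values: List[float], fdr: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure at the given false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that the k-th smallest p-value is <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

With the same p-values, the FDR procedure typically rejects more hypotheses than Bonferroni, which is the usual reason to prefer it when many functions are tested.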

-max_func, --maximum_functions_to_analyze MAX_FUNCTIONS

Maximum number of enriched functions to consider (default: All)

-score, --score_to_compute {t_test, mean_diff, median_diff, wilcoxon, log_mean_ratio}

The enrichment score to compute for each function (default: wilcoxon)

-max_score, --max_score_cutoff MAX_SCORE_CUTOFF

The maximum score cutoff (for example, when dividing by zero) (default: 100)
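The interaction between an enrichment score and the maximum score cutoff can be sketched as follows. This is not FishTaco's implementation: the natural logarithm, the symmetric clipping to [-max_score, max_score], and the handling of zero means are assumptions made for illustration, and the function names are hypothetical. The t_test and wilcoxon scores are omitted because they need a statistics library.

```python
import math
from statistics import mean, median
from typing import Sequence


def clip(score: float, max_score: float = 100.0) -> float:
    """Limit a score to the range [-max_score, max_score] (assumed symmetric cutoff)."""
    return max(-max_score, min(max_score, score))


def mean_diff(case: Sequence[float], control: Sequence[float], max_score: float = 100.0) -> float:
    return clip(mean(case) - mean(control), max_score)


def median_diff(case: Sequence[float], control: Sequence[float], max_score: float = 100.0) -> float:
    return clip(median(case) - median(control), max_score)


def log_mean_ratio(case: Sequence[float], control: Sequence[float], max_score: float = 100.0) -> float:
    """Log of the ratio of means for non-negative abundances, capped when a mean is zero."""
    case_mean, control_mean = mean(case), mean(control)
    if case_mean == 0 or control_mean == 0:
        # The ratio is undefined or its log is infinite: fall back to the cutoff.
        if case_mean == control_mean:
            return 0.0
        return max_score if case_mean > control_mean else -max_score
    return clip(math.log(case_mean / control_mean), max_score)
```

The zero-mean branch shows why a cutoff is useful: a function absent from all control samples would otherwise produce a division by zero.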

-na_rep NA_REP

How to represent NAs in the output (default: NA)

-number_of_permutations NUMBER_OF_PERMUTATIONS

Number of permutations (default: 100)

-number_of_shapley_orderings_per_taxa NUMBER_OF_SHAPLEY_ORDERINGS_PER_TAXA

Number of Shapley orderings per taxon (default: 5)
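For readers unfamiliar with Shapley orderings, the sketch below estimates Shapley values by averaging marginal contributions over randomly sampled orderings of the players (here, taxa). It shows the general sampling technique, not FishTaco's code. The sampled_shapley function and its value argument are hypothetical, and the assumption that the total number of orderings equals the per-taxon setting multiplied by the number of taxa is mine.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence


def sampled_shapley(players: Sequence[str],
                    value: Callable[[FrozenSet[str]], float],
                    orderings_per_player: int = 5,
                    seed: int = 0) -> Dict[str, float]:
    """Estimate Shapley values from randomly sampled orderings of the players."""
    rng = random.Random(seed)
    # Assumption: total orderings = orderings per player x number of players.
    n_orderings = orderings_per_player * len(players)
    totals = {p: 0.0 for p in players}
    for _ in range(n_orderings):
        order = list(players)
        rng.shuffle(order)
        coalition: FrozenSet[str] = frozenset()
        previous = value(coalition)
        # Add players one at a time and credit each with its marginal contribution.
        for p in order:
            coalition = coalition | {p}
            current = value(coalition)
            totals[p] += current - previous
            previous = current
    return {p: total / n_orderings for p, total in totals.items()}
```

In every sampled ordering the marginal contributions sum to the value of the full set minus the value of the empty set, so the averaged estimates always add up to that difference. More orderings reduce the sampling noise at the cost of running time.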

-en, --enrichment_results DA_RESULT_FILE

Pre-computed functional enrichment results from the script (default: None)

-single_function_filter SINGLE_FUNCTION_FILTER

Limit analysis only to this single function (default: None)

-multi_function_filter_list MULTI_FUNCTION_FILTER

Limit analysis only to these comma-separated functions (default: None)

-h, --help

Show help message and exit


-functional_profile_already_corrected_with_musicc

Indicates that the functional profile has already been corrected with MUSiCC prior to running FishTaco (default: False)

-log, --log

Write to log file (default: False)

FishTaco Output Files

Main output files

contains the taxon-level decomposition of shift scores for the differentially abundant functions. (format)

Supporting stats output files

contains the final taxon-level contribution score for every differentially abundant (shifted) function in the input data, as calculated by FishTaco

contains statistics regarding the differential abundance for each function in the input file

contains statistics regarding the differential abundance for each taxon in the input file

contains the mean taxon-level contribution score for every differentially abundant (shifted) function in the input data (in default settings, this is equal to the final score)

contains the median taxon-level contribution score for every differentially abundant (shifted) function in the input data

contains the standard deviation of the taxon-level contribution score for every differentially abundant (shifted) function in the input data

contains the metagenome-based shift statistics value for each function in the input file

contains the taxa-based shift statistics value for each function in the input file

contains the taxa-based abundance profile for each function in each sample

contains various statistics regarding the agreement between the metagenome- and taxa-based abundance profiles for each function

contains the residual between the metagenome- and taxa-based abundance profiles for each function (in “remove-residual” mode the residual is equal to zero)

contains the random Shapley orderings used in the run (for “permuted_shapley_orderings” mode)

contains the inferred copy numbers of each function in each taxon (for FishTaco with prior-based or de novo inference)

contains various statistics regarding the agreement between the metagenome- and taxa-based abundance profiles for each function (on test data)

contains the running log of FishTaco


The fishtaco/examples directory contains the following:

  • the file contains scaled abundance measurements of 10 species in 213 samples from the HMP dataset

  • the file contains MUSiCC-corrected abundance values for the K00001 orthology group in the same samples

  • the file contains the copy numbers of the K00001 orthology group in the 10 species as above

  • the file contains class labels from the same samples (control vs. case)

Using these files as input for FishTaco results in the following output files (found in the fishtaco/examples/output directory):

Note: If you installed the FishTaco package using pip, the examples directory is located in your Python packages directory, e.g., lib/python3.3/site-packages
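If you are unsure where pip placed the package, a short script can look for the examples directory. This is a sketch under two assumptions: that the importable package is named fishtaco (inferred from the fishtaco/examples path above), and that the examples directory sits directly inside it. find_examples_dir is a hypothetical helper.

```python
import importlib.util
import os
from typing import Optional


def find_examples_dir(package: str = "fishtaco") -> Optional[str]:
    """Return the 'examples' directory inside an installed package, or None if not found."""
    spec = importlib.util.find_spec(package)
    # find_spec returns None when the package is not installed.
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = os.path.join(location, "examples")
        if os.path.isdir(candidate):
            return candidate
    return None
```

Calling find_examples_dir() returns the directory path if the package and its examples are installed, and None otherwise.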

FishTaco with no inference

Running FishTaco with no inference generates the output files found in fishtaco/examples/output/fishtaco_out_no_inf_STAT_*:

-ta fishtaco/examples/ -fu fishtaco/examples/ -l fishtaco/examples/ -gc fishtaco/examples/ -op fishtaco_out_no_inf -map_function_level none -functional_profile_already_corrected_with_musicc -assessment single_taxa -log

FishTaco with prior-based inference

Running FishTaco with prior-based inference generates the output files found in fishtaco/examples/output/fishtaco_out_prior_based_inf_STAT_*:

-ta fishtaco/examples/ -fu fishtaco/examples/ -l fishtaco/examples/ -gc fishtaco/examples/ -op fishtaco_out_prior_based_inf -map_function_level none -functional_profile_already_corrected_with_musicc -inf -assessment single_taxa -log

FishTaco with de novo inference

Running FishTaco with de novo inference generates the output files found in fishtaco/examples/output/fishtaco_out_de_novo_inf_STAT_*:

-ta fishtaco/examples/ -fu fishtaco/examples/ -l fishtaco/examples/ -op fishtaco_out_de_novo_inf -map_function_level none -functional_profile_already_corrected_with_musicc -inf -assessment single_taxa -log