
Comp_Report CLI usage

Activate virtual environment

# Activate Conda env 
conda activate pycoMeth

Getting help

pycoMeth Comp_Report --help
usage: pycoMeth Comp_Report [-h] -i METHCOMP_FN -g GFF3_FN -f REF_FASTA_FN
                            [-o OUTDIR] [-n N_TOP] [-d MAX_TSS_DISTANCE]
                            [--pvalue_threshold PVALUE_THRESHOLD]
                            [--min_diff_llr MIN_DIFF_LLR]
                            [--n_len_bin N_LEN_BIN] [--export_static_plots]
                            [--report_non_significant] [-v] [-q] [-p]

Generate an HTML report of significantly differentially methylated CpG
intervals from `Meth_Comp` text output. Significant intervals are annotated
with their closest transcript TSS.

optional arguments:
  -h, --help            show this help message and exit

Input/Output options:
  -i METHCOMP_FN, --methcomp_fn METHCOMP_FN
                        Input tsv file generated by Meth_comp (can be
                        gzipped). At the moment only data binned by intervals
                        with Interval_Aggregate are supported. (required)
                        [str]
  -g GFF3_FN, --gff3_fn GFF3_FN
                        Path to an **ensembl GFF3** file containing genomic
                        annotations. Only the transcripts details are
                        extracted. (required) [str]
  -f REF_FASTA_FN, --ref_fasta_fn REF_FASTA_FN
                        Reference file used for alignment in Fasta format
                        (ideally already indexed with samtools faidx)
                        (required) [str]
  -o OUTDIR, --outdir OUTDIR
                        Directory where to output HTML reports, By default
                        current directory (default: ./) [str]

Misc options:
  -n N_TOP, --n_top N_TOP
                        Number of top interval candidates for which to
                        generate an interval report. If there are not enough
                        significant candidates this is automatically scaled
                        down. (default: 100) [int]
  -d MAX_TSS_DISTANCE, --max_tss_distance MAX_TSS_DISTANCE
                        Maximal distance to transcription stat site to find
                        transcripts close to interval candidates (default:
                        100000) [int]
  --pvalue_threshold PVALUE_THRESHOLD
                        pValue cutoff for top interval candidates (default:
                        0.01) [float]
  --min_diff_llr MIN_DIFF_LLR
                        Minimal llr boundary for negative and positive median
                        llr. 1 is recommanded for vizualization purposes.
                        (default: 1) [float]
  --n_len_bin N_LEN_BIN
                        Number of genomic intervals for the longest chromosome
                        of the ideogram figure (default: 500) [int]
  --export_static_plots
                        Export all the plots from the reports in SVG format.
                        (default: False) [None]
  --report_non_significant
                        Report all valid CpG islands, significant or not in
                        the text report. This option also adds a non-
                        significant track to the TSS_distance plot (default:
                        False) [None]

Verbosity options:
  -v, --verbose         Increase verbosity
  -q, --quiet           Reduce verbosity
  -p, --progress        Display a progress bar
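The help text above lists three required inputs and notes that the reference FASTA is ideally already indexed with `samtools faidx`. A minimal pre-flight sketch is shown here. `check_comp_report_inputs` is a hypothetical helper, not part of pycoMeth, and it assumes the index sits next to the FASTA as `<fasta>.fai`, which is where `samtools faidx` writes it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pycoMeth): check the three required
# Comp_Report inputs and make sure the reference FASTA has an index.
check_comp_report_inputs() {
    local methcomp_fn="$1" gff3_fn="$2" ref_fasta_fn="$3"
    local f
    for f in "$methcomp_fn" "$gff3_fn" "$ref_fasta_fn"; do
        if [ ! -r "$f" ]; then
            echo "Missing or unreadable input: $f" >&2
            return 1
        fi
    done
    # Assumption: the index is stored as <fasta>.fai next to the FASTA file.
    if [ ! -s "${ref_fasta_fn}.fai" ]; then
        if command -v samtools >/dev/null 2>&1; then
            samtools faidx "$ref_fasta_fn" || return 1
        else
            # The help text says "ideally", so a missing index is only a warning.
            echo "Warning: no index for $ref_fasta_fn and samtools not on PATH" >&2
        fi
    fi
    echo "inputs ok"
}
```

The function returns a non-zero status when an input is missing, so it can be chained with `&&` in front of the `pycoMeth Comp_Report` call.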

Example usage

Example with a single significant result

pycoMeth Comp_Report \
    -i "./data/Yeast_CGI_meth_comp.tsv.gz" \
    -g "./data/yeast.gff3" \
    -f "./data/yeast.fa" \
    -o "yeast_html" \
    --pvalue_threshold 0.05 \
    --verbose
## Checking options and input files ##
    [DEBUG]: Options summary
    [DEBUG]:    Package name: pycoMeth
    [DEBUG]:    Package version: 0.4.14
    [DEBUG]:    Timestamp: 2020-07-16 18:38:02.340862
    [DEBUG]:    methcomp_fn: ./data/Yeast_CGI_meth_comp.tsv.gz
    [DEBUG]:    gff3_fn: ./data/yeast.gff3
    [DEBUG]:    ref_fasta_fn: ./data/yeast.fa
    [DEBUG]:    outdir: yeast_html
    [DEBUG]:    n_top: 100
    [DEBUG]:    max_tss_distance: 100000
    [DEBUG]:    pvalue_threshold: 0.05
    [DEBUG]:    min_diff_llr: 1
    [DEBUG]:    n_len_bin: 500
    [DEBUG]:    api_mode: False
    [DEBUG]:    export_static_plots: False
    [DEBUG]:    report_non_significant: False
    [DEBUG]:    verbose: True
    [DEBUG]:    quiet: False
    [DEBUG]:    progress: False
    [DEBUG]:    kwargs
    [DEBUG]:        subcommands: Comp_Report
    [DEBUG]:        func: <function Comp_Report at 0x7ff1533bc200>
## Loading and preparing data ##
    Loading Methcomp data from TSV file
    Loading transcripts info from GFF file
    Loading chromosome info from reference FASTA file
    Number of significant intervals found (adjusted pvalue<0.05): 1
ERROR: Low number of significant sites. The summary report will likely contain errors
ERROR: Number of significant intervals lower than number of top candidates to plot
    Finding top candidates
    Creating output directory structure
    Computing source md5
## Parsing methcomp data ##
    Iterating over significant intervals
    [DEBUG]: Ploting top candidates rank: #1
    Generating summary report
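The two ERROR lines in the log above are emitted because only one interval passed the p-value cutoff while `n_top` was left at 100. A rough way to anticipate this is to count candidate rows in the `Meth_Comp` file before generating the report. The sketch below is a hypothetical helper, not part of pycoMeth. It assumes the TSV has a header row with an adjusted p-value column named `adj_pvalue` (pass another name as the third argument if your file differs), and it uses the strict `<` comparison shown in the log. Comp_Report may apply further filters, so the count is only an estimate.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pycoMeth): count rows whose adjusted
# p-value is below a cutoff in a plain or gzipped Meth_Comp TSV file.
# ASSUMPTION: header row present, adjusted p-value column named "adj_pvalue".
count_significant() {
    local methcomp_fn="$1" threshold="${2:-0.01}" col="${3:-adj_pvalue}"
    case "$methcomp_fn" in
        *.gz) gzip -cd "$methcomp_fn" ;;
        *)    cat "$methcomp_fn" ;;
    esac | awk -F'\t' -v thr="$threshold" -v col="$col" '
        NR == 1 {
            # Locate the column by name instead of relying on its position
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        # Skip empty and NaN values, then compare numerically
        $c != "" && tolower($c) != "nan" && ($c + 0) < (thr + 0) { n++ }
        END { if (c) print n + 0 }
    '
}
```

Calling `count_significant ./data/Yeast_CGI_meth_comp.tsv.gz 0.05` prints a single integer that can be compared with the intended `--n_top` value.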

Usage with a large dataset, with static plot export and non-significant CpG islands included in the report

pycoMeth Comp_Report \
    -i "./data/Medaka_CGI_meth_comp.tsv.gz" \
    -g "./data/medaka.gff3" \
    -f "./data/medaka.fa" \
    -o "medaka_html" \
    --n_top 50 \
    --export_static_plots \
    --report_non_significant
## Checking options and input files ##
## Loading and preparing data ##
    Loading Methcomp data from TSV file
    Loading transcripts info from GFF file
    Loading chromosome info from reference FASTA file
    Number of significant intervals found (adjusted pvalue<0.01): 3532
    Finding top candidates
    Creating output directory structure
    Computing source md5
## Parsing methcomp data ##
    Iterating over significant intervals
    Generating summary report
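After a run like the one above, a quick sanity check is to count what was written under the output directory. The help text only states that HTML reports go to `--outdir` and that `--export_static_plots` exports plots as SVG, so the sketch below makes no assumption about file names or subdirectories. `count_by_ext` is a hypothetical helper, not part of pycoMeth.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pycoMeth): count files with a given
# extension anywhere under an output directory.
count_by_ext() {
    local outdir="$1" ext="$2"
    # tr strips the padding that some wc implementations add
    find "$outdir" -type f -name "*.${ext}" | wc -l | tr -d ' '
}
```

For instance, `count_by_ext medaka_html html` and `count_by_ext medaka_html svg` each print one integer. A zero for `svg` after a run with `--export_static_plots` would suggest the static export did not happen.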