CpG_Aggregate API usage
Import module
# Import main module
from pycoMeth.CpG_Aggregate import CpG_Aggregate
# Optionally import Jupyter helper functions
from pycoMeth.common import head, jhelp, stdout_print
Getting help
jhelp(CpG_Aggregate)
CpG_Aggregate (nanopolish_fn, ref_fasta_fn, output_bed_fn, output_tsv_fn, min_depth, sample_id, min_llr, verbose, quiet, progress, kwargs)
Calculate methylation frequency at genomic CpG sites from the output of nanopolish call-methylation
- nanopolish_fn (required) [list(str)]
Path to a nanopolish call-methylation tsv output file or a list of files or a regex matching several files (can be gzipped)
- ref_fasta_fn (required) [str]
Reference file used for alignment in FASTA format (ideally already indexed with samtools faidx)
- output_bed_fn (default: "") [str]
Path to write a summary result file in BED format (At least 1 output file is required) (can be gzipped)
- output_tsv_fn (default: "") [str]
Path to write a more extensive result report in TSV format (At least 1 output file is required) (can be gzipped)
- min_depth (default: 10) [int]
Minimal number of reads covering a site to be reported
- sample_id (default: "") [str]
Sample ID to be used for the BED track header
- min_llr (default: 2) [float]
Minimal log likelihood ratio to consider a site significantly methylated or unmethylated in output BED file
- verbose (default: False) [bool]
- quiet (default: False) [bool]
- progress (default: False) [bool]
- kwargs
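To make the roles of min_depth and min_llr more concrete, the sketch below shows the kind of filtering the descriptions above suggest. It is not pycoMeth's implementation: the function name `classify_site`, the use of the median of per-read log likelihood ratios as the site-level value, and the "ambiguous" label are all assumptions made for illustration.

```python
from statistics import median


def classify_site(llr_values, min_depth=10, min_llr=2.0):
    """Hypothetical helper, not part of pycoMeth.

    llr_values: per-read log likelihood ratios for a single CpG site.
    Returns None when the site has fewer than min_depth reads,
    otherwise one of "methylated", "unmethylated" or "ambiguous".
    """
    # Sites covered by too few reads are not reported at all
    if len(llr_values) < min_depth:
        return None

    # Assumption: the site is summarised by the median per-read LLR
    site_llr = median(llr_values)

    if site_llr >= min_llr:
        return "methylated"
    if site_llr <= -min_llr:
        return "unmethylated"
    return "ambiguous"


# Ten reads with a clearly positive LLR pass both default thresholds
print(classify_site([3.0] * 10))
# Five reads are below the default min_depth, so nothing is reported
print(classify_site([3.0] * 5))
```

Under these assumptions, lowering min_depth reports more sites at the cost of less read support per site, and lowering min_llr labels more sites as methylated or unmethylated rather than leaving them ambiguous.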
Example usage
Basic usage
ff = CpG_Aggregate(
    nanopolish_fn="./data/nanopolish_sample_1.tsv",
    ref_fasta_fn="./data/ref.fa",
    output_bed_fn="./results/CpG_Aggregate_sample_1.bed",
    output_tsv_fn="./results/CpG_Aggregate_sample_1.tsv.gz",
    sample_id="sample_1",
    progress=True)
head("./results/CpG_Aggregate_sample_1.tsv.gz")
head("./results/CpG_Aggregate_sample_1.bed")
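The head helper is only a convenience for notebooks. Outside Jupyter, a plain or gzipped result file can be previewed with the standard library alone. The sketch below is one way to do that; the function name `peek` is hypothetical, and the demo file and its columns are made up so the example runs without any pycoMeth output.

```python
import gzip
import os
import tempfile
from itertools import islice


def peek(path, n=5):
    """Hypothetical stand-in for head(): first n lines of a text file.

    Files whose name ends in ".gz" are opened with gzip, others as plain text.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


# Demonstration on a throwaway gzipped file with made-up content
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.tsv.gz")
with gzip.open(demo_path, "wt") as handle:
    handle.write("col_a\tcol_b\n1\t2\n3\t4\n")

for line in peek(demo_path, n=2):
    print(line)
```

The same call should work on the files written above, for example `peek("./results/CpG_Aggregate_sample_1.tsv.gz")`, once they exist.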
Example usage with a wildcard pattern and a lower depth threshold
ff = CpG_Aggregate(
    nanopolish_fn="./data/nanopolish_sample_*.tsv",
    ref_fasta_fn="./data/ref.fa",
    output_bed_fn="./results/CpG_Aggregate_sample_all.bed",
    output_tsv_fn="./results/CpG_Aggregate_sample_all.tsv",
    min_depth=5,
    sample_id="sample_all",
    progress=True)
head("./results/CpG_Aggregate_sample_all.tsv")
head("./results/CpG_Aggregate_sample_all.bed")
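The help text calls the multi-file pattern a regex, but the `*` in `nanopolish_sample_*.tsv` reads like a shell-style wildcard. Assuming it is expanded that way (how pycoMeth resolves the pattern internally is not shown here), the standard glob module can be used to check in advance which files a pattern would pick up. The file names below are created in a temporary directory purely for the demonstration.

```python
import glob
import os
import tempfile

# Create a few empty files in a throwaway directory (illustrative names only)
demo_dir = tempfile.mkdtemp()
for name in ("nanopolish_sample_1.tsv", "nanopolish_sample_2.tsv", "other_file.tsv"):
    open(os.path.join(demo_dir, name), "w").close()

# Expand a wildcard pattern the way a shell would
pattern = os.path.join(demo_dir, "nanopolish_sample_*.tsv")
matched = sorted(glob.glob(pattern))

# Only the two nanopolish_sample_* files match; other_file.tsv does not
print([os.path.basename(path) for path in matched])
```

Running the same check against `./data/nanopolish_sample_*.tsv` before calling CpG_Aggregate is a cheap way to confirm that the pattern matches the intended samples and nothing else.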
Example with multiple files
for i in range(1, 5):
    stdout_print(f"##### SAMPLE {i} #####")
    CpG_Aggregate(
        nanopolish_fn=f"./data/nanopolish_sample_{i}.tsv",
        ref_fasta_fn="./data/ref.fa",
        output_bed_fn=f"./results/CpG_Aggregate_sample_{i}.bed",
        output_tsv_fn=f"./results/CpG_Aggregate_sample_{i}.tsv",
        sample_id=f"sample_{i}",
        min_depth=3,
        min_llr=1,
        quiet=True)