Interval_Aggregate API usage
Import module
# Import main module
from pycoMeth.Interval_Aggregate import Interval_Aggregate
# Optionally import Jupyter helper functions
from pycoMeth.common import head, jhelp, stdout_print
Getting help
jhelp (Interval_Aggregate)
Interval_Aggregate (cpg_aggregate_fn, ref_fasta_fn, interval_bed_fn, output_bed_fn, output_tsv_fn, interval_size, min_cpg_per_interval, sample_id, min_llr, verbose, quiet, progress, kwargs)
Bin the output of pycoMeth CpG_Aggregate in genomic intervals, using either an annotation file containing intervals or a sliding window.
- cpg_aggregate_fn (required) [str]
Output tsv file generated by CpG_Aggregate (can be gzipped)
- ref_fasta_fn (required) [str]
Reference file used for alignment in Fasta format (ideally already indexed with samtools faidx)
- interval_bed_fn (default: None) [str]
SORTED bed file containing non-overlapping intervals to bin CpG data into (Optional) (can be gzipped)
- output_bed_fn (default: None) [str]
Path to write a summary result file in BED format (At least 1 output file is required) (can be gzipped)
- output_tsv_fn (default: None) [str]
Path to write a more extensive result report in TSV format (At least 1 output file is required) (can be gzipped)
- interval_size (default: 1000) [int]
Size of the sliding window in which to aggregate CpG site data if no BED file is provided
- min_cpg_per_interval (default: 5) [int]
Minimal number of CpG sites per interval.
- sample_id (default: "") [str]
Sample ID to be used for the BED track header
- min_llr (default: 2) [float]
Minimal log likelihood ratio to consider a site significantly methylated or unmethylated in output BED file
- verbose (default: False) [bool]
- quiet (default: False) [bool]
- progress (default: False) [bool]
- kwargs
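To make the interaction between interval_size and min_cpg_per_interval concrete, here is a minimal sketch of fixed-size window binning in plain Python. It is not pycoMeth's implementation: the function name bin_positions and the input positions are hypothetical, and the sketch assumes non-overlapping windows that start at position 0.

```python
from collections import defaultdict

def bin_positions(positions, interval_size=1000, min_cpg_per_interval=5):
    """Group 0-based positions into fixed-size windows and drop sparse windows.

    Hypothetical helper for illustration only, not part of pycoMeth.
    """
    windows = defaultdict(list)
    for pos in positions:
        # Start coordinate of the window that contains this position
        start = (pos // interval_size) * interval_size
        windows[(start, start + interval_size)].append(pos)
    # Keep only the windows holding enough CpG sites
    return {
        window: sites
        for window, sites in sorted(windows.items())
        if len(sites) >= min_cpg_per_interval
    }

# Made-up CpG positions on a single sequence
cpg_positions = [10, 120, 480, 510, 530, 990, 1500]
kept = bin_positions(cpg_positions, interval_size=500, min_cpg_per_interval=3)
print(kept)
```

With these made-up positions, the window starting at 1500 holds a single site and is dropped, while the first two windows each hold three sites and are kept.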
Example usage
Default usage with sliding windows
Interval_Aggregate (
    cpg_aggregate_fn="./data/CpG_Aggregate_sample_1.tsv",
    ref_fasta_fn="./data/ref.fa",
    output_bed_fn="./results/Interval_Aggregate_sample_1.bed",
    output_tsv_fn="./results/Interval_Aggregate_sample_1.tsv",
    interval_size=500,
    min_cpg_per_interval=3,
    sample_id="sample_1",
    progress=True)
head("./results/Interval_Aggregate_sample_1.tsv")
head("./results/Interval_Aggregate_sample_1.bed")
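Since the output files can be gzipped, it can be handy to peek at them without the Jupyter helpers. The following is a small standard-library sketch, not the pycoMeth head implementation: the function name peek, the demo file name, and its column names are placeholders, and the sketch assumes that gzipped files carry a ".gz" suffix.

```python
import gzip
import os
import tempfile
from itertools import islice

def peek(path, n=5):
    """Return the first n lines of a plain or gzipped text file.

    Hypothetical helper; compression is guessed from the ".gz" suffix.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]

# Demo on a throwaway gzipped TSV with placeholder column names
with tempfile.TemporaryDirectory() as tmp:
    demo_fn = os.path.join(tmp, "demo.tsv.gz")
    with gzip.open(demo_fn, "wt") as fh:
        fh.write("col_a\tcol_b\n1\t2\n3\t4\n")
    first_lines = peek(demo_fn, n=2)

print(first_lines)
```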
Usage with a CpG Islands annotation BED file
ff = Interval_Aggregate (
    cpg_aggregate_fn="./data/CpG_Aggregate_sample_1.tsv",
    ref_fasta_fn="./data/ref.fa",
    interval_bed_fn="./data/Yeast_CGI.bed",
    output_bed_fn="./results/CGI_Aggregate_sample_1.bed",
    output_tsv_fn="./results/CGI_Aggregate_sample_1.tsv",
    sample_id="sample_1",
    min_cpg_per_interval=1,
    progress=True)
head("./results/CGI_Aggregate_sample_1.tsv")
head("./results/CGI_Aggregate_sample_1.bed")
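The interval_bed_fn option expects a sorted BED file with non-overlapping intervals. Before running on a custom annotation, a quick sanity check along the following lines may help. This is a sketch under stated assumptions, not a pycoMeth function: check_sorted_non_overlapping is a hypothetical name, intervals are given as in-memory (chrom, start, end) tuples with BED-style half-open coordinates, and "sorted" is taken to mean that each chromosome forms one contiguous block with ascending start positions.

```python
def check_sorted_non_overlapping(intervals):
    """Return True if intervals are grouped by chromosome, sorted by start,
    and free of overlaps within each chromosome.

    Hypothetical helper; intervals is an iterable of (chrom, start, end).
    """
    seen_chroms = set()
    prev_chrom, prev_end = None, 0
    for chrom, start, end in intervals:
        if chrom != prev_chrom:
            if chrom in seen_chroms:
                return False  # chromosome block appears twice: not sorted
            seen_chroms.add(chrom)
        elif start < prev_end:
            return False  # overlaps the previous interval or is out of order
        prev_chrom, prev_end = chrom, end
    return True

# Made-up intervals for illustration
good = [("chrI", 0, 100), ("chrI", 100, 200), ("chrII", 50, 80)]
overlapping = [("chrI", 0, 100), ("chrI", 90, 200)]
print(check_sorted_non_overlapping(good))
print(check_sorted_non_overlapping(overlapping))
```

Adjacent intervals that share a boundary, such as (0, 100) and (100, 200), pass the check because half-open coordinates do not overlap at the shared position.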
Example with multiple files
for i in range (1, 5):
    stdout_print (f"##### SAMPLE {i} #####")
    Interval_Aggregate (
        cpg_aggregate_fn=f"./data/CpG_Aggregate_sample_{i}.tsv",
        ref_fasta_fn="./data/ref.fa",
        output_bed_fn=f"./results/Interval_Aggregate_sample_{i}.bed",
        output_tsv_fn=f"./results/Interval_Aggregate_sample_{i}.tsv",
        sample_id=f"sample_{i}",
        interval_size=500,
        min_cpg_per_interval=3,
        min_llr=1,
        quiet=True)
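The loop above lowers min_llr from its default of 2 to 1. As a rough illustration of what such a threshold does, the sketch below labels a log likelihood ratio as methylated, unmethylated, or ambiguous. It is not taken from the pycoMeth source: classify_llr is a hypothetical name, and the sketch assumes that positive values support methylation and that the same threshold is applied symmetrically to negative values.

```python
def classify_llr(llr, min_llr=2.0):
    """Label a log likelihood ratio using a symmetric threshold.

    Hypothetical helper; assumes positive LLR values support methylation.
    """
    if llr >= min_llr:
        return "methylated"
    if llr <= -min_llr:
        return "unmethylated"
    return "ambiguous"

# The same made-up value under the default and the relaxed threshold
print(classify_llr(1.5, min_llr=2))
print(classify_llr(1.5, min_llr=1))
```

Under these assumptions, a lower min_llr turns more intervals from ambiguous into significantly methylated or unmethylated calls in the BED output.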