CGI_Finder API usage
Import module
# Import main module
from pycoMeth.CGI_Finder import CGI_Finder
# Optionally import Jupyter helper functions
from pycoMeth.common import head, jhelp
Getting help
jhelp(CGI_Finder)
CGI_Finder (ref_fasta_fn, output_tsv_fn, output_bed_fn, merge_gap, min_win_len, min_CG_freq, min_obs_CG_ratio, verbose, quiet, progress, kwargs)
Simple method to find putative CpG islands in DNA sequences by using a sliding window and merging overlapping windows satisfying the CpG island definition. Results can be saved in BED and TSV format.
- ref_fasta_fn (required) [str]
Reference file used for alignment, in FASTA format (ideally already indexed with samtools faidx)
- output_tsv_fn (default: None) [str]
Path to write a more extensive result report in TSV format (At least 1 output file is required)
- output_bed_fn (default: None) [str]
Path to write a summary result file in BED format (At least 1 output file is required)
- merge_gap (default: 0) [int]
Merge CpG islands that lie within a given distance of each other, in bases
- min_win_len (default: 200) [int]
Length of the minimal window containing CpG. Used as the sliding window length
- min_CG_freq (default: 0.5) [float]
Minimal C+G frequency in a window to be counted as a valid CpG island
- min_obs_CG_ratio (default: 0.6) [float]
Minimal ratio of observed CG dinucleotide frequency to the expected frequency in a window to be counted as a valid CpG island
- verbose (default: False) [bool]
- quiet (default: False) [bool]
- progress (default: False) [bool]
- kwargs
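To make the two window thresholds and the merge_gap option more concrete, here is a minimal sketch of the kind of test a sliding-window CpG island finder could apply. It is not the pycoMeth implementation: the helper names (window_stats, is_cgi_window, merge_intervals) are hypothetical, the observed/expected ratio is assumed to follow the common definition (CG count x window length) / (C count x G count), and the thresholds are assumed to be inclusive.

```python
def window_stats(window):
    """Return (C+G frequency, observed/expected CG ratio) for a DNA string.

    Assumption: obs/exp = (CG count * length) / (C count * G count),
    which may differ from the exact formula used inside pycoMeth.
    """
    window = window.upper()
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")
    cg_freq = (c + g) / n if n else 0.0
    # Without both C and G the expected CG count is zero, so report 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return cg_freq, obs_exp


def is_cgi_window(window, min_CG_freq=0.5, min_obs_CG_ratio=0.6):
    """Check a window against both thresholds (assumed inclusive)."""
    cg_freq, obs_exp = window_stats(window)
    return cg_freq >= min_CG_freq and obs_exp >= min_obs_CG_ratio


def merge_intervals(intervals, merge_gap=0):
    """Merge (start, end) intervals that overlap or lie within merge_gap bases."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= merge_gap:
            # Close enough to the previous interval: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]
```

With these assumptions, raising merge_gap can only reduce the number of reported intervals, since it joins neighbours that would otherwise stay separate.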
Example usage
Basic usage with yeast genome
ff = CGI_Finder(
    ref_fasta_fn="./data/yeast.fa",
    output_bed_fn="./results/yeast_CGI.bed",
    output_tsv_fn="./results/yeast_CGI.tsv",
    progress=True)
head("./results/yeast_CGI.tsv")
head("./results/yeast_CGI.bed")
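Beyond previewing the files with head, the BED output can be summarised with a few lines of plain Python. The sketch below only assumes the standard BED layout (tab-separated, with chromosome, start and end as the first three columns); any further columns written by CGI_Finder are ignored, and the helper names read_bed_intervals and total_length are hypothetical.

```python
def read_bed_intervals(lines):
    """Parse BED lines into (chrom, start, end) tuples.

    Assumes tab-separated fields; blank lines and header lines
    starting with '#', 'track' or 'browser' are skipped.
    """
    intervals = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals


def total_length(intervals):
    """Sum interval lengths (BED coordinates are 0-based, end-exclusive)."""
    return sum(end - start for _, start, end in intervals)


# Example use on the file produced above (path taken from the example call):
# with open("./results/yeast_CGI.bed") as fh:
#     islands = read_bed_intervals(fh)
# print(len(islands), total_length(islands))
```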