methSeg {methylKit} | R Documentation |
The function uses a segmentation algorithm (fastseg
)
to segment the methylation profiles. Following that, it uses
gaussian mixture modelling to cluster the segments into k components. This process
uses mean methylation value of each segment in the modeling phase. Each
component ideally indicates quantitative classification of segments, such
as high or low methylated regions.
methSeg(obj, diagnostic.plot = TRUE, join.neighbours = FALSE, ...)
obj |
|
diagnostic.plot |
if TRUE a diagnostic plot is plotted. The plot shows methylation and length statistics per segment group. In addition, it shows diagnostics from mixture modeling: the density function estimated and BIC criterion used to decide the optimum number of components in mixture modeling. |
join.neighbours |
if TRUE neighbouring segments that cluster to the same seg.group are joined by extending the ranges, summing up num.marks and averaging over seg.means. |
... |
arguments to |
To be sure that the algorithm will work on your data, the object should have at least 5000 records
A GRanges
object with segment
classification and information.
'seg.mean' column shows the mean methylation per segment.
'seg.group' column shows the segment groups obtained by mixture modeling
Altuna Akalin, contributions by Arsene Wabo
methSeg2bed
, joinSegmentNeighbours
download.file("https://dl.dropboxusercontent.com/u/1373164/H1.chr21.chr22.rds", destfile="H1.chr21.chr22.rds",method="curl") mbw=readRDS("H1.chr21.chr22.rds") # it finds the optimal number of componets as 6 res=methSeg(mbw,diagnostic.plot=TRUE,maxInt=100,minSeg=10) # however the BIC stabilizes after 4, we can also try 4 componets res=methSeg(mbw,diagnostic.plot=TRUE,maxInt=100,minSeg=10,G=1:4) # get segments to BED file methSeg2bed(res,filename="H1.chr21.chr22.trial.seg.bed") unlink(list.files(pattern="H1.chr21.chr22",full.names=TRUE))