| reverseComplement {Biostrings} | R Documentation |
These functions can reverse a BString, DNAString or RNAString object and complement each base of a DNAString object.
reverse(x, ...)
complement(x, ...)
reverseComplement(x, ...)
x |
For reverse: a BString (or derived) object, or a BStringViews object.
For complement and reverseComplement: a DNAString object, or a BStringViews object with a DNAString subject.
|
... |
Additional arguments to be passed to or from methods. |
Given an object x of class BString, DNAString
or RNAString, reverse(x) returns an object of the same class
where the letters of x appear in reverse order.
If x is a DNAString object, complement(x) returns
an object where each base in x is "complemented" i.e.
A, C, G, T are replaced by T, G, C, A respectively.
Letters belonging to the "IUPAC extended genetic alphabet"
are also replaced by their complement (M <-> K, R <-> Y, S <-> S, V <-> B,
W <-> W, H <-> D, N <-> N) and the gap symbol (-) is unchanged.
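The substitution table above can be sketched in base R on a plain character string. This is only an illustration of the mapping, not how Biostrings implements complement(); the name complement_chr is hypothetical.

```r
## Base-R sketch of the complement table (illustration only; this is not
## the Biostrings implementation and works on character strings, not
## DNAString objects).
complement_chr <- function(s) {
  ## Positions line up pairwise: A->T, C->G, G->C, T->A, M->K, K->M,
  ## R->Y, Y->R, S->S, W->W, V->B, B->V, H->D, D->H, N->N.
  ## The gap symbol "-" is not in the table, so chartr() leaves it unchanged.
  chartr("ACGTMKRYSWVBHDN", "TGCAKMYRSWBVDHN", s)
}

comp <- complement_chr("ACGT-YN-")
print(comp)
```

Because every pair in the table is symmetric, applying complement_chr() twice returns the original string.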
reverseComplement(x) is equivalent to reverse(complement(x))
but is faster and more memory efficient.
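As a rough illustration of this equivalence, the sketch below composes a reverse step and a complement step on a plain character string in base R. The helper names (reverse_chr, complement_chr, revcomp_chr) are hypothetical, and the code says nothing about the speed or memory behaviour of the real reverseComplement().

```r
## Base-R sketch: reverse complement as the composition of two steps
## (illustration only; not the Biostrings implementation).
reverse_chr <- function(s) {
  paste(rev(strsplit(s, "", fixed = TRUE)[[1]]), collapse = "")
}
complement_chr <- function(s) {
  chartr("ACGTMKRYSWVBHDN", "TGCAKMYRSWBVDHN", s)
}
revcomp_chr <- function(s) reverse_chr(complement_chr(s))

x <- "ACGT-YN-"
rc <- revcomp_chr(x)
print(rc)

## The two steps commute, and the operation is its own inverse.
same_either_order <- identical(complement_chr(reverse_chr(x)), rc)
round_trip <- identical(revcomp_chr(rc), x)
```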
An object of the same class and length as the original object.
reverseComplement(DNAString("ACGT-YN-"))
## Applying reverseComplement() to the pattern before calling matchPattern()
## is the standard way to search for hits on the reverse strand of a chromosome:
library(BSgenome.Dmelanogaster.FlyBase.r51)
chrX <- Dmelanogaster[["X"]]
pattern <- DNAString("GAACGGTGTCT")
matchPattern(pattern, chrX) # 1 hit on strand +
m0 <- matchPattern(reverseComplement(pattern), chrX) # 2 hits on strand -
## Applying reverseComplement() to the subject instead of the pattern is not
## a good idea for 2 reasons:
## (1) Chromosome sequences are generally huge so it's going to be a lot of
## work and require a lot of memory to compute reverseComplement(subject).
## (2) Chromosome locations are generally given relative to the positive
## strand, even for features located on the negative strand, so after
## doing this:
m1 <- matchPattern(pattern, reverseComplement(chrX))
## the start/end of the matches are now relative to the negative strand.
## You need to apply reverseComplement() again on the result if you want
## them to be relative to the positive strand:
m2 <- reverseComplement(m1)
## and finally apply rev() to sort the matches from left to right
## (5' to 3' direction), as in m0:
m3 <- rev(m2) # same as m0, finally!
## Don't try the above example on human chromosome 1 since your computer
## would need to allocate about 250 MB of memory for this:
if (FALSE) {
  library(BSgenome.Hsapiens.UCSC.hg18)
  chr1 <- Hsapiens$chr1
  matchPattern(pattern, reverseComplement(chr1)) # DON'T DO THIS!
  matchPattern(reverseComplement(pattern), chr1) # DO THIS INSTEAD
}
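## The coordinate argument in reason (2) can be checked on a toy sequence
## with base R alone. This is a sketch under stated assumptions: the 15-letter
## subject is made up, revcomp_chr is a hypothetical helper handling only
## A, C, G, T, and plain character strings stand in for DNAString objects.

```r
## Toy base-R check of strand coordinates (illustration only).
revcomp_chr <- function(s) {
  chars <- rev(strsplit(s, "", fixed = TRUE)[[1]])
  chartr("ACGT", "TGCA", paste(chars, collapse = ""))
}

subject <- "TAGACACCGTTCAAA"   # made-up "chromosome", positive strand
pattern <- "GAACGGTGTCT"
L <- nchar(subject)
w <- nchar(pattern)

## Recommended: reverse-complement the pattern and search the subject.
## The hit is directly in positive-strand coordinates.
start_direct <- as.integer(regexpr(revcomp_chr(pattern), subject, fixed = TRUE))
end_direct <- start_direct + w - 1L

## Discouraged: search the reverse-complemented subject.
## The hit is now in negative-strand coordinates.
start_minus <- as.integer(regexpr(pattern, revcomp_chr(subject), fixed = TRUE))
end_minus <- start_minus + w - 1L

## Converting back: position p on the reversed sequence is L - p + 1
## on the original, and start and end swap roles.
start_plus <- L - end_minus + 1L
end_plus <- L - start_minus + 1L

print(c(start_direct, end_direct))
print(c(start_minus, end_minus))
print(c(start_plus, end_plus))
```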