dist2 {genefilter} | R Documentation |
Calculate an n-by-n matrix by applying a function to pairs of columns of an m-by-n matrix.
dist2(x, fun=function(a,b) mean(abs(a-b), na.rm=TRUE), diagonal=0)
x | A matrix, or any object x for which ncol(x) and x[,j] return appropriate results. |
fun | A symmetric function of two arguments that may be columns of x. |
diagonal | The value to be used for the diagonal elements of the resulting matrix. |
With the default value of fun, this function calculates, for each pair of columns of x, the mean of the absolute values of their differences (which is proportional to the L1-norm of their difference). This is a distance metric.
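The proportionality to the L1-norm can be checked by hand on a small case. The following is a minimal base-R sketch: the matrix m and the name default_fun are made up for illustration, and the code copies the default fun from the usage line rather than calling dist2 itself.

```r
# Hypothetical 4-by-2 matrix; each column plays the role of one array.
m <- cbind(c(1, 2, 3, 4), c(2, 2, 5, 1))

# Same expression as the default 'fun' argument shown in the usage line.
default_fun <- function(a, b) mean(abs(a - b), na.rm = TRUE)

# Mean absolute difference between the two columns.
d <- default_fun(m[, 1], m[, 2])

# L1-norm of the difference vector.
l1 <- sum(abs(m[, 1] - m[, 2]))

# With no missing values, the two differ only by the factor 1/nrow(m).
stopifnot(isTRUE(all.equal(d, l1 / nrow(m))))
```

When some entries are NA, na.rm=TRUE means the average is taken over the non-missing differences only, so the scaling factor is the number of complete pairs rather than nrow(m).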
The implementation assumes that fun is symmetric, fun(a,b)=fun(b,a). Hence, the returned matrix is symmetric. fun(a,a) is not evaluated; instead, the value of diagonal is used to fill the diagonal elements of the returned matrix.
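The behaviour described above can be pictured with a small base-R re-implementation. This is only a sketch under the stated assumptions, not the genefilter source: dist2_sketch, counting_fun and counter are hypothetical names introduced here to show that fun is called once per unordered pair of columns and never on the diagonal.

```r
# Illustrative re-implementation; 'dist2_sketch' is a made-up name and is
# not the code shipped in genefilter.
dist2_sketch <- function(x,
                         fun = function(a, b) mean(abs(a - b), na.rm = TRUE),
                         diagonal = 0) {
  n <- ncol(x)
  # Start from a matrix filled with 'diagonal'; off-diagonal cells are
  # overwritten below, so only the diagonal keeps this value.
  res <- matrix(diagonal, nrow = n, ncol = n)
  if (n >= 2) {
    for (j in 2:n) {
      for (i in seq_len(j - 1)) {
        # Evaluate 'fun' once for the pair (i, j) and mirror the result.
        res[i, j] <- res[j, i] <- fun(x[, i], x[, j])
      }
    }
  }
  colnames(res) <- rownames(res) <- colnames(x)
  res
}

# Count how often 'fun' is called for a 4-column matrix.
counter <- new.env()
counter$n <- 0
counting_fun <- function(a, b) {
  counter$n <- counter$n + 1
  mean(abs(a - b))
}

set.seed(1)
x <- matrix(rnorm(20), ncol = 4)
d <- dist2_sketch(x, fun = counting_fun, diagonal = 0)
```

For the four columns used here, counter$n ends at 6, that is choose(4, 2), because each pair is visited once. If fun were not symmetric, mirroring would silently discard fun(b,a), which is why the symmetry assumption matters.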
A use for this function is the detection of outlier arrays in a microarray experiment. Assume that each column of x can be decomposed as z+β+ε, where z is a fixed vector (the same for all columns), ε is a vector of nrow(x) i.i.d. random numbers, and β is an arbitrary vector the majority of whose entries are negligibly small (i.e. close to zero). In other words, z represents the probe effects, ε the measurement noise, and β the differential expression effects. Under this assumption, all off-diagonal entries of the resulting distance matrix should be approximately the same, namely a multiple of the standard deviation of ε. Arrays whose distance matrix entries differ markedly from the rest give cause for suspicion.
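The outlier-detection idea can be tried on simulated data. The sketch below is an illustration under assumptions that are not part of the original text: Gaussian noise, the particular sizes and standard deviations, and a deliberately noisier last array. Base R loops stand in for dist2 so that the code runs without genefilter installed.

```r
set.seed(42)
n_probes <- 2000
n_arrays <- 6
sigma <- 0.5                      # assumed s.d. of the measurement noise

# z: probe effects, identical for every array.
z <- rnorm(n_probes, mean = 8, sd = 2)

x <- sapply(seq_len(n_arrays), function(j) {
  # beta: differential expression, non-zero for only 20 probes per array.
  beta <- numeric(n_probes)
  beta[sample(n_probes, 20)] <- rnorm(20, sd = 2)
  # The last array gets three times the noise: the planted outlier.
  noise_sd <- if (j == n_arrays) 3 * sigma else sigma
  z + beta + rnorm(n_probes, sd = noise_sd)
})

# Pairwise mean absolute differences; the diagonal is left as NA.
d <- matrix(NA_real_, n_arrays, n_arrays)
for (i in seq_len(n_arrays)) {
  for (j in seq_len(n_arrays)) {
    if (i != j) d[i, j] <- mean(abs(x[, i] - x[, j]), na.rm = TRUE)
  }
}

# Typical distance of each array to the others.
typical <- apply(d, 2, median, na.rm = TRUE)
suspect <- which.max(typical)

# For Gaussian noise, E|eps1 - eps2| = 2 * sigma / sqrt(pi), so distances
# between two well-behaved arrays should lie close to this value.
expected <- 2 * sigma / sqrt(pi)
```

In this simulation the distances among the first five arrays cluster near expected, while every distance involving the last array is clearly larger, so suspect points at the planted outlier. Using the median per column keeps a single bad array from inflating the typical distance of the good ones.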
A symmetric matrix of size n x n.
z = matrix(rnorm(15693), ncol=3)
dist2(z)