Journal of Applied Mathematics and Decision Sciences
Volume 3 (1999), Issue 1, Pages 7-19
doi:10.1155/S1173912699000012

Robustness of the sample correlation - the bivariate lognormal case

C. D. Lai,1 J. C. W. Rayner,2 and T. P. Hutchinson3

1Statistics, IIST, Massey University, New Zealand
2School of Mathematics and Applied Statistics, University of Wollongong, Australia
3School of Behavioural Sciences, Macquarie University, Australia

Copyright © 1999 C. D. Lai et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The sample correlation coefficient R is almost universally used to estimate the population correlation coefficient ρ. If the pair (X,Y) has a bivariate normal distribution, this causes no difficulty. However, if the marginals are nonnormal, particularly if they have high skewness and kurtosis, the estimate obtained from a sample may differ considerably from the population correlation coefficient ρ.

The bivariate lognormal distribution is chosen as the case study for this robustness investigation. Two approaches are used: (i) simulation and (ii) numerical computation.

Our simulation analysis indicates that for the bivariate lognormal, the bias in estimating ρ can be very large if ρ ≠ 0, and it can be substantially reduced only after a large number (three to four million) of observations. This phenomenon, though unexpected at first, was found to be consistent with the findings of our numerical analysis.
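The kind of simulation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure. It assumes that (X, Y) = (exp(U), exp(V)), where (U, V) is bivariate normal with zero means, standard deviations sigma1 and sigma2, and correlation rho_n. It uses the standard closed form for the population correlation of the resulting lognormal pair. The function names, parameter values, sample size, and number of replications are all hypothetical choices made for illustration.

```python
import numpy as np


def lognormal_correlation(rho_n, sigma1, sigma2):
    """Population correlation of (exp(U), exp(V)) when (U, V) is bivariate
    normal with standard deviations sigma1, sigma2 and correlation rho_n."""
    num = np.exp(rho_n * sigma1 * sigma2) - 1.0
    den = np.sqrt((np.exp(sigma1 ** 2) - 1.0) * (np.exp(sigma2 ** 2) - 1.0))
    return num / den


def simulate_sample_correlations(rho_n, sigma1, sigma2, n, reps, seed=0):
    """Draw `reps` samples of size `n` from the bivariate lognormal and
    return the sample correlation coefficient R of each sample."""
    rng = np.random.default_rng(seed)
    cov = [
        [sigma1 ** 2, rho_n * sigma1 * sigma2],
        [rho_n * sigma1 * sigma2, sigma2 ** 2],
    ]
    r_values = np.empty(reps)
    for i in range(reps):
        z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        x, y = np.exp(z[:, 0]), np.exp(z[:, 1])  # lognormal marginals
        r_values[i] = np.corrcoef(x, y)[0, 1]
    return r_values


if __name__ == "__main__":
    # Hypothetical settings: unequal sigmas give strongly skewed marginals.
    rho_n, sigma1, sigma2 = 0.9, 1.0, 2.0
    rho = lognormal_correlation(rho_n, sigma1, sigma2)
    r = simulate_sample_correlations(rho_n, sigma1, sigma2, n=1000, reps=200)
    print(f"population rho        = {rho:.4f}")
    print(f"mean sample R         = {r.mean():.4f}")
    print(f"estimated bias (R-rho) = {r.mean() - rho:.4f}")
```

Rerunning the script with larger n shows how slowly the average of R approaches ρ under these assumed settings. The multi-million sample sizes mentioned in the abstract are likely to need a more memory-conscious implementation than this sketch.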