In the analysis of multivariate data, it is frequently desirable to employ statistical methods that are insensitive to outliers in the sample. Most statistical procedures make explicit or implicit assumptions about the distribution of the observations, often without taking the effect of outliers into account, so it is important to develop robust alternatives. The purpose of this paper is to present a novel robust version of principal components analysis which has some attractive features.
Principal components analysis (PCA) is considered to be one of the most important techniques in statistics. However, classical PCA is based on either a covariance or a correlation matrix, both of which are very sensitive to outliers. We develop a far more robust alternative to classical PCA by using a multivariate Cauchy likelihood to construct a robust principal components (PC) procedure. The procedure adapts classical PCA by replacing the Gaussian log-likelihood function with the Cauchy log-likelihood function, in a sense that will be explained in Section 2.2. We do not claim that the interpretation of standard PCA in terms of operations on a Gaussian likelihood is new (see Bolton and Krzanowski); however, this interpretation does not appear to have been exploited previously in the development of a robust PCA procedure, as we do in this paper. One important reason for using the multivariate Cauchy likelihood is that it has only one maximum point, but the single most important motivation is that it leads to a robust procedure.
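The sensitivity of covariance-based PCA to outliers can be illustrated with a small numerical sketch. The example below is not part of the proposed method: it only shows classical PCA, and the helper name first_pc, the simulated data, and the position of the single outlier are all assumptions chosen for illustration.

```python
import numpy as np

def first_pc(X):
    """Leading eigenvector of the sample covariance matrix (rows = observations)."""
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]                   # eigenvector of the largest eigenvalue

rng = np.random.default_rng(0)

# Illustrative data: 100 points with standard deviation 3 along the first
# coordinate and 1 along the second, so the first PC lies near the first axis.
clean = rng.normal(size=(100, 2)) * np.array([3.0, 1.0])

# Add one gross outlier far out along the second coordinate.
contaminated = np.vstack([clean, [[0.0, 100.0]]])

pc_clean = first_pc(clean)
pc_contaminated = first_pc(contaminated)

print("first PC, clean data:       ", pc_clean)
print("first PC, with one outlier: ", pc_contaminated)
```

In this sketch a single observation out of 101 is enough to turn the leading component from roughly the first coordinate axis to roughly the second, which is the behaviour that motivates a robust alternative.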