Clustering (unsupervised learning) is the task of discovering groups of observations. In the case of directional data, researchers have addressed this problem using hierarchical clustering (Lund, 1999), the k-means algorithm (Hornik et al., 2012), or model-based clustering with the von Mises-Fisher (Banerjee et al., 2005) or the Kent distribution (Peel et al., 2001). More recently, Amayri and Bouguila (2013) incorporated feature selection into model-based clustering using mixtures of von Mises-Fisher distributions.
With discriminant analysis (supervised learning, or classification), on the other hand, the group of each observation is known and, unlike clustering, the literature is far less populated. Both Morris and Laycock (1974) and Figueiredo (2009) performed supervised learning and conducted simulation studies using the von Mises-Fisher distribution. Examples of classification with directional data can be found in many scientific fields: classifying the wind direction according to the season of the year (Mardia and Jupp, 2000); classifying measurements of magnetic remanence in rock specimens, after each specimen had been partially thermally demagnetised to the same stage (Fisher et al., 1993); and separating the longest-axis from the shortest-axis orientations of tabular stones measured on a slope at Windy Hills, Scotland (Fisher et al., 1993).
The drawback of the aforementioned papers is that they attack the problem with a limited set of tools. In most papers, applied or not, the von Mises-Fisher distribution is used, perhaps because of its convenient form and ease of use. The von Mises-Fisher distribution, though, assumes independent variables, whereas the Kent distribution has elliptical contours, allowing for correlation between the variables. Yet there are more spherical distributions than just these two, and more algorithms than simple maximum likelihood discriminant analysis. To the best of our knowledge, no one has studied supervised learning for directional (or even spherical) data using more than one distribution or more than one algorithm.
In this paper we focus on discriminant analysis with spherical data, expanding the work of Figueiredo (2009) by including three more distributions: the Independent Angular Gaussian, or projected normal (Mardia and Jupp, 2000), the Kent distribution (Kent, 1982), and the Elliptically Symmetric Angular Gaussian distribution (Paine et al., 2018). In addition, the non-parametric k-NN algorithm (Cover and Hart, 1967), coupled with the cosine distance, is included in the comparison. Our goal is to provide evidence as to which distribution is most suitable and whether the k-NN algorithm should be employed for supervised learning with spherical data.
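To make the k-NN rule with the cosine distance concrete, a minimal sketch in Python follows. It is an illustration under stated assumptions and not the implementation used in this paper: the function name `knn_cosine_predict`, the choice of the distance as one minus the inner product of unit vectors, the majority-vote rule, and the toy data are all hypothetical.

```python
import numpy as np


def knn_cosine_predict(x_train, y_train, x_new, k=3):
    """Classify the rows of x_new by majority vote among the k nearest
    training points, using the cosine distance 1 - x'y (assumed form)."""
    # Project every observation onto the unit sphere.
    x_train = x_train / np.linalg.norm(x_train, axis=1, keepdims=True)
    x_new = x_new / np.linalg.norm(x_new, axis=1, keepdims=True)

    # For unit vectors the cosine similarity is the inner product.
    dist = 1.0 - x_new @ x_train.T

    # Indices of the k closest training points for each new point.
    nearest = np.argsort(dist, axis=1)[:, :k]

    preds = []
    for labels in y_train[nearest]:
        values, counts = np.unique(labels, return_counts=True)
        # Ties are broken in favour of the smallest label (np.argmax rule).
        preds.append(values[np.argmax(counts)])
    return np.array(preds)


# Toy data (hypothetical): group 0 lies near the north pole (0, 0, 1),
# group 1 lies near the point (1, 0, 0).
x_train = np.array([
    [0.0, 0.0, 1.0], [0.1, 0.0, 1.0], [0.0, 0.1, 1.0], [-0.1, 0.0, 1.0],
    [1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [1.0, -0.1, 0.0],
])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])
x_new = np.array([[0.05, 0.05, 1.0], [1.0, 0.05, 0.05]])

pred = knn_cosine_predict(x_train, y_train, x_new, k=3)
print(pred)  # first point assigned to group 0, second to group 1
```

Because the observations lie on the unit sphere, ranking neighbours by this cosine distance is equivalent to ranking them by the angle between the vectors, which is why no further transformation of the data is needed in the sketch.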