Data mining in high dimensionality typically faces the consequences of increasing sparsity and declining differentiation between points, while sparsity tends to increase false negatives. Here, the problem of solving high-dimensional problems using low-dimensional solutions is addressed. In clustering, we provide a new framework for finding candidate subspaces and the clusters within them using only two-dimensional clustering. It is robust to noise and handles overlapping clusters. In the field of outlier detection, several novel algorithms suited to high-dimensional data are presented.m These...
Data mining in high dimensionality typically faces the consequences of increasing sparsity and declining differentiation between points, while sparsit...