Singularity in Data Analysis


Steven Ellis


October 25, 2019

Statistical data are by their very nature indeterminate, in the sense that if one repeated the process of collecting the data, the new data set would be somewhat different from the original. Therefore, a statistical method, f, taking a data set x to a point in some space F, should be stable at x: small perturbations in x should result in only a small change in f(x). Otherwise, f is useless at x or (and this is important) near x. So one doesn’t want f to have “singularities”, that is, data sets x such that the limit of f(y) as y approaches x doesn’t exist. (Yes, the same issue arises elsewhere in applied math.)
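
To make the notion of a singularity concrete, here is a small sketch that is not taken from the talk itself. It assumes, purely for illustration, that f is the mean direction of a set of angles (the angle of the sum of the corresponding unit vectors); the function name mean_direction and the value of eps are hypothetical choices. The data set consisting of two exactly antipodal directions is a singularity of this f: data sets arbitrarily close to it are sent to nearly opposite answers.

```python
import math

def mean_direction(angles):
    """Angle (in radians) of the sum of the unit vectors at the given angles."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)

# Illustrative perturbation size (an assumption, not a value from the talk).
eps = 1e-3

# Two data sets on either side of the antipodal data set (0, pi).
# They differ from each other by only 2 * eps in one coordinate.
above = mean_direction([0.0, math.pi - eps])  # close to +pi/2
below = mean_direction([0.0, math.pi + eps])  # close to -pi/2

print(f"f(0, pi - eps) = {above:+.4f}")
print(f"f(0, pi + eps) = {below:+.4f}")
print(f"jump in f      = {abs(above - below):.4f}")
```

Shrinking eps does not shrink the jump: it stays close to pi, so f(y) has no limit as y approaches the antipodal data set.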

However, broad classes of statistical methods face topological obstructions to continuity: they must have singularities. In this talk I will show why and give lower bounds on the Hausdorff dimension, and even on the Hausdorff measure, of the set of singularities of such methods. I will give examples.