The Nonconvex Geometry of Linear Inverse Problems. (arXiv:2101.02776v1 [math.OC])

The gauge function, closely related to the atomic norm, measures the
complexity of a statistical model, and has found broad applications in machine
learning and statistical signal processing. In a high-dimensional learning
problem, the gauge function attempts to safeguard against overfitting by
promoting a sparse (concise) representation within the learning alphabet.

In this work, within the context of linear inverse problems, we pinpoint the
source of its success, but also argue that the applicability of the gauge
function is inherently limited by its convexity, and showcase several learning
problems where the classical gauge function theory fails. We then introduce a
new notion of statistical complexity, gauge$_p$ function, which overcomes the
limitations of the gauge function. The gauge$_p$ function is a simple
generalization of the gauge function that can tightly control the sparsity of a
statistical model within the learning alphabet and, perhaps surprisingly, draws
further inspiration from the Burer-Monteiro factorization in computational

We also propose a new learning machine, with the building block of gauge$_p$
function, and arm this machine with a number of statistical guarantees. The
potential of the proposed gauge$_p$ function theory is then studied for two
stylized applications. Finally, we discuss the computational aspects and, in
particular, suggest a tractable numerical algorithm for implementing the new
learning machine.



Related post