The idea that many important classes of signals can be well-represented by linear combinations of a small set of atoms selected from a given dictionary has had a dramatic impact on the theory and practice of signal processing. For practical problems in which an appropriate sparsifying dictionary is not known ahead of time, a very popular and successful heuristic is to search for a dictionary that minimizes an appropriate sparsity surrogate over a given set of sample data. While a body of empirical evidence suggests that this approach does learn very effective representations, there are few theoretical guarantees. In this paper, we show that under mild hypotheses, the dictionary learning problem is locally well-posed: the desired solution is indeed a local minimum of the ℓ1 norm. Namely, if A ∈ ℝ^{m×n} is an incoherent (and possibly overcomplete) dictionary, and the coefficient matrix X ∈ ℝ^{n×p} follows a random sparse model, then with high probability (A, X) is a local minimum of the ℓ1 norm over the manifold of factorizations (A′, X′) satisfying A′X′ = Y = AX, provided the number of samples satisfies p = Ω(n³k). For overcomplete A, this is the first result showing that the dictionary learning problem is even locally solvable via ℓ1-minimization.
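The local-minimum claim can be probed numerically. The sketch below, a toy illustration and not the paper's proof technique, plants a random k-sparse X, forms Y = AX, and walks along the manifold of exact factorizations via the parametrization A′ = AR, X′ = R⁻¹X for invertible R near the identity (for simplicity A is taken square here, not overcomplete; the dimensions n, k, p, the step size eps, and the trial count are arbitrary toy choices). For small perturbations, ‖X′‖₁ should not drop below ‖X‖₁, consistent with (A, X) being a local minimum.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k, p = 8, 2, 400   # signal dimension, sparsity level, number of samples (toy scale)
eps = 1e-4            # size of the perturbation away from the identity

# Random dictionary with unit-norm columns (square here for simplicity).
A = rng.standard_normal((n, n))
A /= np.linalg.norm(A, axis=0)

# Random sparse model: each column of X has k nonzero Gaussian entries.
X = np.zeros((n, p))
for j in range(p):
    supp = rng.choice(n, size=k, replace=False)
    X[supp, j] = rng.standard_normal(k)

Y = A @ X
base = np.abs(X).sum()   # l1 norm of the planted coefficients

# Move along the manifold {(A R, R^{-1} X) : R invertible}: every such pair
# factorizes Y exactly.  Check that random small moves never decrease the l1 norm.
trials, increases = 50, 0
for _ in range(trials):
    Delta = rng.standard_normal((n, n))
    R = np.eye(n) + eps * Delta
    Xp = np.linalg.solve(R, X)              # X' = R^{-1} X, so (A R) X' = Y
    assert np.allclose((A @ R) @ Xp, Y)     # still an exact factorization
    if np.abs(Xp).sum() >= base:
        increases += 1

print(increases, "of", trials, "perturbations did not decrease the l1 norm")
```

Intuitively, a perturbation fills in many previously zero entries of X, each contributing positively to the ℓ1 norm at rate eps, while the change on the sparse support is a mean-zero fluctuation; with enough samples the positive term dominates, which is the mechanism the sample-complexity bound quantifies.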