Technical Reports

Technical Reports

2008-001: Generalized Maximum Likelihood Estimation of Normal Mixture Densities, by Cun-Hui Zhang [3/08]

We study the generalized maximum likelihood estimator of location and location-scale mixtures of normal densities. A large deviation inequality is obtained which provides the convergence rate n-p/(4+2p)}(log n)Kp in the Hellinger distance for mixture densities when the mixing distributions have bounded finite p-th weak moment, p > 0, and the convergence rate n-1/2(log n)K when the mixing distributions have an exponential tail uniformly. Our results are applicable to the estimation of the true density of independent identically distributed observations from a normal mixture as well as the estimation of the average marginal densities of independent not identically distributed observations from different normal mixtures. The validity of our results for mixing distributions with the p-th weak moment, 0 < p < 2 and for not identically distributed observations is of special interest in the compound estimation and other problems involving sparse normal means.

2008-002: Forward & Backward Greedy Algorithm for Learning Sparse Representation, by Tong Zhang [4/11/08]

Given a large number of basis functions that can be potentially more than the number of samples, we consider the problem of learning a sparse target function that can be expressed as a linear combination of a small number of these basis functions. We are interested in two closely related themes

  • feature selection, or identifying the basis functions with non-zero coefficients;
  • estimation accuracy, or reconstructing the target function from noisy observations.
  • Two methods that are widely used to solve this problem are forward and backward greedly algorithms. First, we show that neither idea is adequate. Second, we propose a novel combination that simultaneously incorporates forward and backward steps into each greedy iteration. For least squares regression, we develop strong theoretical results for new procedure showing that it can effectively solve this problem under moderate assumptions. Experiments confirm the effectiveness of the new procedure for achieving sparsity.