Optimization without retraction on the random generalized Stiefel manifold

Abstract

Optimization over the set of matrices $X$ that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices, such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). These problems are typically solved by iterative methods, such as Riemannian approaches, which require a computationally expensive eigenvalue decomposition involving the fully formed matrix $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to a random estimate of the feasible set. Our method does not enforce the constraint exactly in every iteration; instead, it produces iterates that converge to a critical point on the generalized Stiefel manifold defined in expectation. The method has a lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian counterparts that use the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and GEVP.
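
To make the constraint $X^\top B X = I_p$ concrete, the following is a minimal NumPy sketch, not the method proposed in the paper. It contrasts two ways of reaching feasibility: an exact normalization $X (X^\top B X)^{-1/2}$ that needs the full $B$ and an eigendecomposition, and plain gradient steps on the penalty $\tfrac{1}{4}\|X^\top \hat{B} X - I_p\|_F^2$, which use only matrix multiplications with a minibatch estimate $\hat{B}$. The synthetic data, the helper names (`violation`, `normalize_exact`, `penalty_step`), the step size, and the batch size are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, num_samples, batch = 5, 2, 10_000, 256

# Hypothetical synthetic data; B is its sample covariance matrix.
scales = np.array([1.0, 1.2, 0.8, 1.5, 0.9])
A = rng.standard_normal((num_samples, n)) * scales
B = A.T @ A / num_samples


def violation(X, B):
    """Frobenius norm of X^T B X - I_p (zero on the manifold)."""
    return np.linalg.norm(X.T @ B @ X - np.eye(X.shape[1]))


def normalize_exact(X, B):
    """Exact feasibility X (X^T B X)^{-1/2}; needs full B and an eigh call."""
    w, V = np.linalg.eigh(X.T @ B @ X)
    return X @ (V / np.sqrt(w)) @ V.T


def penalty_step(X, B_est, eta):
    """One gradient step on 1/4 * ||X^T B_est X - I||_F^2 (matmuls only)."""
    G = X.T @ B_est @ X - np.eye(X.shape[1])
    return X - eta * (B_est @ X @ G)


# Infeasible starting point (assumed for illustration).
X0 = 0.5 * np.eye(n)[:, :p]

# Stochastic loop: each step sees only a minibatch estimate of B.
X = X0.copy()
for _ in range(500):
    idx = rng.integers(0, num_samples, size=batch)
    A_batch = A[idx]
    B_est = A_batch.T @ A_batch / batch
    X = penalty_step(X, B_est, eta=0.05)

initial_violation = violation(X0, B)
final_violation = violation(X, B)
exact_violation = violation(normalize_exact(X0, B), B)
print(initial_violation, final_violation, exact_violation)
```

The sketch only drives the iterate toward the constraint set and ignores the objective function. It is meant to show what "matrix multiplications only" and "random estimate of the feasible set" can look like in code, under the stated assumptions.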

Publication
arXiv:2405.01702