Lei-Hong Zhang, Xijun Ma, Chungen Shen

SIAM Journal on Scientific Computing, Volume 43, Issue 4, Page A2685-A2713, January 2021.

Data points from many recent real-world applications often have a multiview structure in the sense that they are drawn from a multivariate random variable $\mathtt{v}\in \mathbb{R}^n$ that can be partitioned into multiple, say, $m$, subvariables (i.e., views) $\mathtt{v}_i\in \mathbb{R}^{n_i}$ for $i=1,\ldots,m$, with $\sum_{i=1}^m n_i=n$. Multiview canonical correlation analysis is a statistical approach that fuses each subvariable $\mathtt{v}_i$ into a reduced scalar variable $s_i={x}_i^{T}\mathtt{v}_i$ through a linear combination with coefficient vector ${x}_i$, so that the $m$ fused random variables achieve the maximum of a certain type of correlation. Among the many criteria that measure the correlation, one of the earliest is to maximize the sum of all pairwise correlations subject to an ellipsoidal constraint on each ${x}_i$. This model is commonly referred to as the maximal correlation problem (MCP), and the associated KKT system is called the multivariate eigenvalue problem (MEP). Existing methods for MCP and/or MEP may converge slowly or be inapplicable in applications with high-dimensional features. This paper proposes a Krylov subspace-type method that exploits the special structure of the ellipsoidal constraint on ${x}_i$. Both the global convergence and the local convergence rate are studied, and the efficiency is verified numerically on both synthetic examples and applications of unsupervised feature fusion with real data.
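
For concreteness, the MCP and MEP described above can be written out as follows. This is a sketch in notation that is assumed here and not taken from the abstract: $A_{ij}$ denotes the covariance block between $\mathtt{v}_i$ and $\mathtt{v}_j$ (so that $A_{ji}=A_{ij}^{T}$), each $A_{ii}$ is assumed positive definite, and $\lambda_i$ is the Lagrange multiplier of the $i$th constraint. Under the normalization $x_i^{T}A_{ii}x_i=1$, each $s_i$ has unit variance and the correlation between $s_i$ and $s_j$ is $x_i^{T}A_{ij}x_j$, which suggests the formulation

$$
\max_{x_1,\ldots,x_m}\ \sum_{i=1}^{m}\sum_{j=1}^{m} x_i^{T}A_{ij}x_j
\quad\text{subject to}\quad x_i^{T}A_{ii}x_i=1,\quad i=1,\ldots,m.
$$

The diagonal terms contribute only the constant $m$ under the constraints, so including them does not change the maximizer. Setting the gradient of the Lagrangian to zero gives first-order conditions of the form

$$
\sum_{j=1}^{m}A_{ij}x_j=\lambda_i A_{ii}x_i,\qquad x_i^{T}A_{ii}x_i=1,\qquad i=1,\ldots,m,
$$

which reduces to a standard generalized eigenvalue problem when the multipliers $\lambda_i$ are all equal; in general they differ across blocks, which is what makes the problem "multivariate."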