# TD4

The TD4 module performs fourth-order temporal decorrelation using time-delayed cumulant matrices.

To resolve spatial and temporal anharmonic dependencies in molecular simulation trajectories, we designed the TD4 module, which performs joint diagonalization of time-delayed cumulant matrices (a tensor of fourth-order time-delayed statistics related to kurtosis). TD4 is the counterpart of SD4, which minimizes fourth-order spatial correlations, that is, the zero-time-lag case.

Conceptually, we assume that a molecular simulation trajectory is a linear combination of independent, anharmonically fluctuating protein motions. To discover these anharmonic motions, we borrow a technique from the signal processing literature called Blind Source Separation (BSS), which attempts to extract, or unmix, independent non-Gaussian sources from signal mixtures with Gaussian noise.

To facilitate the extraction of fourth-order anharmonic modes of motion, the trajectory data $$X_{orig} \in \mathbb{R}^{3N \times t}$$, where 3N is the number of (x, y, z) coordinates of the selected atoms and t is the number of conformations, is first decorrelated for second-order dependencies, both spatially and temporally, by passing it through the SD2 and TD2 modules. The SD2 module removes dominant second-order spatial correlations by computing a spatial covariance matrix and performing principal component analysis (PCA). It diagonalizes the covariance matrix to obtain the projected data $$Y = B^TX_{orig} (m \times t)$$, where m is the subspace dimensionality and $$B (3N \times m)$$ holds the dominant eigenvectors. The TD2 module then removes dominant second-order temporal correlations by computing a time-delayed covariance matrix (with the delay specified by a lag time $$\tau$$) and performing PCA. A matrix Z is obtained by projecting the spatially resolved data matrix Y onto the dominant eigenvectors $$B_{TD2}$$. The matrix Z is then transformed to retrieve mutually independent signals by obtaining a separating matrix $$W \in \mathbb{R}^{m \times 3N}$$.
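The SD2-style projection described above can be sketched with NumPy. This is only an illustration under stated assumptions: the function name `pca_project` is hypothetical, the data are mean-centered before the covariance is formed (the text does not say whether the module does this), and no variance rescaling (whitening) is applied.

```python
import numpy as np

def pca_project(X, m):
    """Project X (3N x t) onto its m dominant covariance eigenvectors.

    Hypothetical helper for illustration; not the module's own function.
    """
    Xc = X - X.mean(axis=1, keepdims=True)   # assumption: center each coordinate
    C = (Xc @ Xc.T) / Xc.shape[1]            # spatial covariance, 3N x 3N
    evals, evecs = np.linalg.eigh(C)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:m]      # indices of the m largest eigenvalues
    B = evecs[:, order]                      # dominant eigenvectors, 3N x m
    Y = B.T @ Xc                             # projected data, m x t
    return Y, B, evals[order]

# Toy data standing in for a trajectory: 6 coordinates, 500 frames.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 500))
Y, B, evals = pca_project(X, 3)
```

After the projection, the covariance of `Y` is diagonal, which is what "removing second-order spatial correlations" means in practice. The cumulative sum of the sorted eigenvalues is the quantity one would inspect when choosing m.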

Algorithmically, unmixing temporally correlated signals to fourth order can be viewed as a symmetric eigenvalue problem for a generalized cumulant matrix $$Q_{ij}$$. As a measure of statistical independence, we use the ‘diagonality’ of a set of cumulant matrices. The cumulant matrices are generated in a low-dimensional subspace of dimension m, which is the best guess for the most compact summary of the fourth-order statistics. The subspace dimensionality can be adjusted by examining the inflection points in the cumulative variance plots generated by the SD2 module.

To generate the cumulant matrices, a time-lagged covariance matrix is first defined as:

$R_z{(\tau)} = E\left\{ZZ_{\tau}^T\right\},$

where $$Z \in \mathbb{R}^{m \times t}$$ is the molecular simulation data after second-order spatial and temporal decorrelation, $$\tau$$ is the time delay, and $$Z_{\tau} = Z(t-\tau)$$ is the time-lagged version of Z. A fourth-order cumulant matrix $$Q_{ij}$$ of the data matrix Z is defined by:
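As a minimal sketch of the time-lagged covariance $R_z(\tau)$, the expectation can be estimated as a sample mean over the frames for which both $Z(t)$ and $Z(t-\tau)$ exist. The function name `lagged_covariance` and the choice to average over the overlapping frames only are assumptions made for illustration.

```python
import numpy as np

def lagged_covariance(Z, lag):
    """Estimate R_z(lag) = E{Z Z_lag^T} for Z of shape (m, T).

    Hypothetical helper; `lag` is a non-negative integer number of frames.
    """
    m, T = Z.shape
    if not 0 <= lag < T:
        raise ValueError("lag must satisfy 0 <= lag < T")
    Z_now = Z[:, lag:]          # Z(t) for t = lag, ..., T-1
    Z_lag = Z[:, :T - lag]      # Z(t - lag) for the same t
    return (Z_now @ Z_lag.T) / (T - lag)

rng = np.random.default_rng(1)
Z = rng.standard_normal((3, 200))
R0 = lagged_covariance(Z, 0)    # reduces to the ordinary covariance estimate
R5 = lagged_covariance(Z, 5)    # generally not symmetric for lag > 0
```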

$Q_{ij} = E\left\{ZZ^TZ_{\tau}^TZ_{\tau}\right\} - E\left\{ZZ^T\right\} \textrm{tr}\, E\left\{Z_{\tau}Z_{\tau}^T\right\} -2E\left\{ZZ_{\tau}^T\right\}E\left\{Z_{\tau}Z^T\right\},$

where $$Q_{ij} \in \mathbb{R}^{m \times m}$$ is a time-lagged cumulant matrix. Computational errors, such as round-off errors, can destroy the symmetry of the cumulant matrix, which is restored by computing:

$Q_{ij} = \frac{1}{2} \left[Q_{ij} + Q_{ij}^T\right].$
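The two formulas above can be sketched in NumPy as follows. This is an illustration under assumptions, not the module's code: the name `lagged_cumulant_matrix` is hypothetical, columns of Z are treated as samples, expectations are replaced by sample means over the overlapping frames, and the expression is evaluated exactly as displayed. How the module builds the full set of matrices stored in the cumulant tensor is not shown.

```python
import numpy as np

def lagged_cumulant_matrix(Z, lag):
    """Evaluate the displayed time-lagged cumulant matrix for Z of shape (m, T).

    Hypothetical helper for illustration only.
    """
    m, T = Z.shape
    n = T - lag
    Z_now = Z[:, lag:]                        # Z(t)
    Z_lag = Z[:, :n]                          # Z(t - lag)
    w = np.sum(Z_lag * Z_lag, axis=0)         # scalar Z_lag^T Z_lag for each frame
    fourth = (Z_now * w) @ Z_now.T / n        # E{Z Z^T Z_lag^T Z_lag}
    R_now = Z_now @ Z_now.T / n               # E{Z Z^T}
    R_lag = Z_lag @ Z_lag.T / n               # E{Z_lag Z_lag^T}
    R_tau = Z_now @ Z_lag.T / n               # E{Z Z_lag^T}; its transpose is E{Z_lag Z^T}
    Q = fourth - R_now * np.trace(R_lag) - 2.0 * R_tau @ R_tau.T
    return 0.5 * (Q + Q.T)                    # restore symmetry lost to round-off

rng = np.random.default_rng(2)
Z = rng.standard_normal((3, 50000))
Q_gauss = lagged_cumulant_matrix(Z, 0)        # close to zero for Gaussian data
Q_lagged = lagged_cumulant_matrix(Z, 10)
```

For unit-variance Gaussian data at zero lag the three terms cancel in expectation, so the result is close to the zero matrix. Non-zero entries therefore signal non-Gaussian (anharmonic) structure.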

A time-lagged cumulant tensor $$\mathbb{Q} \in \mathbb{R}^{m \times (m \times k)}$$, where $$k = [m \times (m+1)]/2$$, is defined to store the symmetrized cumulant matrices $$Q_{ij}$$. Joint diagonalization of these time-lagged cumulant matrices reduces fourth-order temporal dependencies and yields the anharmonic modes of motion of the trajectory data. This is done with a Jacobi-type iterative method: successive plane rotations are applied to all cumulant matrices at once, and each iteration reduces their off-diagonal elements, so that the matrices become as diagonal as possible. The fourth-order spatio-temporally decorrelated matrix is computed by:
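To make the joint diagonalization step concrete, here is a minimal sketch of a Jacobi-rotation joint diagonalizer for a set of real symmetric matrices. It uses the closed-form rotation angle commonly found in JADE-style implementations; whether the TD4 module uses exactly this update is an assumption. The function name `joint_diagonalize`, the (K, m, m) storage layout, and the stopping threshold are all hypothetical choices.

```python
import numpy as np

def joint_diagonalize(mats, tol=1e-8, max_sweeps=100):
    """Approximately diagonalize symmetric matrices with one orthogonal V.

    Returns V and the rotated matrices V.T @ M_k @ V. Illustrative sketch only.
    """
    Ms = np.array(mats, dtype=float)          # working copy, shape (K, m, m)
    K, m, _ = Ms.shape
    V = np.eye(m)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(m - 1):
            for q in range(p + 1, m):
                # Collect the (p, q) statistics of every matrix: 2 x K
                g = np.vstack([Ms[:, p, p] - Ms[:, q, q],
                               Ms[:, p, q] + Ms[:, q, p]])
                G = g @ g.T
                ton = G[0, 0] - G[1, 1]
                toff = G[0, 1] + G[1, 0]
                theta = 0.5 * np.arctan2(toff, ton + np.hypot(ton, toff))
                if abs(theta) > tol:
                    rotated = True
                    c, s = np.cos(theta), np.sin(theta)
                    R = np.array([[c, -s], [s, c]])
                    # Apply M_k <- R^T M_k R in the (p, q) plane for all k
                    Ms[:, [p, q], :] = R.T @ Ms[:, [p, q], :]
                    Ms[:, :, [p, q]] = Ms[:, :, [p, q]] @ R
                    V[:, [p, q]] = V[:, [p, q]] @ R
        if not rotated:                       # a full sweep without rotations: converged
            break
    return V, Ms

# Toy check: matrices that share eigenvectors are diagonalized by a single V.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
mats = [U @ np.diag(rng.standard_normal(4)) @ U.T for _ in range(3)]
V, D = joint_diagonalize(mats)
```

On real cumulant matrices an exact common diagonalizer generally does not exist, so the rotations only minimize the total off-diagonal energy. How the module combines the resulting rotation with the SD2 and TD2 projections to form W is not reproduced here.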

${Z_{TD4}} = W X_{orig},$

where W attempts to separate the sources from the signal mixture $$X_{orig}$$ by finding directions such that the projections onto them are maximally statistically independent. The resulting $$Z_{TD4}$$ is the fourth-order spatially and temporally resolved matrix.

Parameters

Z       - an m x T matrix that is spatially and temporally uncorrelated to second order (m subspace dimensions, T samples). May be a NumPy array or matrix.

m       - dimensionality of the subspace we are interested in. Defaults to None, in which case m=n.

T       - number of snapshots of MD trajectory

V       - separating matrix obtained after performing PCA on m components of the real data, followed by temporal decorrelation of the spatially whitened data

lag     - lag time, given as an integer number of time steps

verbose - print information on progress. Default is True.


Returns

W - a separating matrix obtained from resolving fourth order temporal correlations


Reference

1. Georgiev, P., & Cichocki, A. (2003). Robust independent component analysis via time-delayed cumulant functions. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 86(3), 573-579.