*TD4*

**The TD4 module performs fourth-order temporal decorrelation on time-delayed cumulant matrices**

To resolve spatial and temporal anharmonic dependencies in molecular simulation trajectories, we designed the TD4 module, which performs joint diagonalization of time-delayed cumulant matrices (a tensor of fourth-order time-delayed statistics related to kurtosis). TD4 is the counterpart of SD4, which minimizes fourth-order spatial correlations at zero time lag.

Conceptually, we assume that a molecular simulation trajectory is a linear combination of independent, anharmonically fluctuating protein motions. To discover these anharmonic motions, we borrow a technique from the signal processing literature called Blind Source Separation (BSS), which attempts to extract, or unmix, independent non-Gaussian sources from signal mixtures containing Gaussian noise. To facilitate the extraction of fourth-order anharmonic modes of motion, the trajectory data \(X_{orig} \in \mathbb{R}^{3N \times t}\), where *3N* is the number of *(x,y,z)* coordinates of the selected atoms and *t* is the number of conformations, is first decorrelated to second order, both spatially and temporally, by passing it through the SD2 and TD2 modules. The SD2 module removes dominant second-order spatial correlations by computing a spatial covariance matrix and performing principal component analysis (PCA). It diagonalizes the covariance matrix to obtain the projected data \(Y = B^TX_{orig}\) of size \(m \times t\), where *m* is the subspace dimensionality and the columns of \(B\) (\(3N \times m\)) are the dominant eigenvectors. Subsequently, the TD2 module removes dominant second-order temporal correlations by computing a time-delayed covariance matrix (at a lag time \(\tau\)) and performing PCA. A matrix *Z* is obtained by projecting the spatially decorrelated data matrix *Y* onto the dominant eigenvectors \(B_{TD2}\). The matrix *Z* is then transformed to retrieve mutually independent signals by computing a separating matrix \(W \in \mathbb{R}^{m \times 3N}\).
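
The second-order preprocessing described above can be sketched in NumPy as follows. This is a minimal illustration, not the SD2 or TD2 implementation: the function names `sd2_sketch` and `td2_sketch` are hypothetical, and the mean-centering, the normalization by the number of samples, and the symmetrization of the lagged covariance are assumptions made so that the example is self-contained.

```python
import numpy as np

def sd2_sketch(X, m):
    """Project X (3N x t) onto the m dominant eigenvectors of its spatial covariance.

    Hypothetical helper; centering and 1/t normalization are assumptions.
    """
    Xc = X - X.mean(axis=1, keepdims=True)   # remove the mean of each coordinate
    C = Xc @ Xc.T / Xc.shape[1]              # spatial covariance, 3N x 3N
    evals, evecs = np.linalg.eigh(C)         # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]          # sort descending by variance
    B = evecs[:, order[:m]]                  # dominant eigenvectors, 3N x m
    Y = B.T @ Xc                             # projected data, m x t
    return Y, B, evals[order]

def td2_sketch(Y, lag):
    """Diagonalize the symmetrized time-lagged covariance of Y (m x t).

    Hypothetical helper; the symmetrization is an assumption that keeps
    the eigenvalue problem symmetric.
    """
    T = Y.shape[1]
    C_tau = Y[:, lag:] @ Y[:, :T - lag].T / (T - lag)  # pairs Y(t) with Y(t - lag)
    C_tau = 0.5 * (C_tau + C_tau.T)
    evals, evecs = np.linalg.eigh(C_tau)
    order = np.argsort(np.abs(evals))[::-1]  # strongest temporal correlation first
    B_td2 = evecs[:, order]
    Z = B_td2.T @ Y                          # temporally decorrelated data, m x t
    return Z, B_td2
```

Calling `sd2_sketch(X, m)` followed by `td2_sketch(Y, lag)` gives a matrix `Z` of shape `m x t` whose zero-lag covariance (after SD2) and symmetrized lagged covariance (after TD2) are diagonal, which is the kind of input the TD4 step expects.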

Algorithmically, unmixing signals with fourth-order temporal correlations can be viewed as a symmetric eigenvalue problem for a generalized cumulant matrix \(Q_{ij}\). As a measure of statistical independence, we consider the 'diagonality' of a set of cumulant matrices. The cumulant matrices are generated in a low-dimensional subspace of dimension *m*, which is the best guess for the most compact summary of the fourth-order statistics.
The subspace dimensionality can be adjusted by examining the inflection points in the cumulative variance plots generated by the SD2 module.
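
As a rough aid for picking *m*, the cumulative variance curve can be computed directly from the PCA eigenvalues. The sketch below is an illustration only: the helper names are hypothetical, and it swaps the visual inspection of inflection points described above for a simple variance-fraction threshold, which is a different (cruder) criterion.

```python
import numpy as np

def cumulative_variance(eigenvalues):
    """Fraction of total variance captured by the leading 1, 2, ... components."""
    ev = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    return np.cumsum(ev) / ev.sum()

def smallest_m_reaching(eigenvalues, fraction=0.9):
    """Smallest m whose leading components reach the requested variance fraction.

    Hypothetical helper: a threshold rule, not the inflection-point inspection.
    """
    cum = cumulative_variance(eigenvalues)
    m = int(np.searchsorted(cum, fraction)) + 1  # first index with cum >= fraction
    return min(m, cum.size)                      # never exceed the number of components
```

Plotting `cumulative_variance(eigenvalues)` against the component index reproduces the kind of curve whose inflection points are examined; the threshold helper only gives a starting guess.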

To generate the cumulant matrices, a time-lagged covariance matrix is defined by:

\[C_{\tau} = E\left[Z\,Z_{\tau}^{T}\right]\]

where \(Z \in \mathbb{R}^{m \times t}\) is the molecular simulation data after second-order spatial and temporal decorrelation, \(\tau\) is the time delay, and \(Z_{\tau} = Z(t-\tau)\) is the time-lagged version of *Z*. A fourth-order cumulant matrix \(Q_{ij}\) of this data matrix *Z* is defined by:

where \(Q_{ij} \in \mathbb{R}^{m \times m}\) is a time-lagged cumulant matrix. Computational errors, such as round-off errors, can break the symmetry of the cumulant matrix, which is restored by performing:

\[Q_{ij} \leftarrow \frac{1}{2}\left(Q_{ij} + Q_{ij}^{T}\right)\]

A time-lagged cumulant tensor \(\mathbb{Q} \in \mathbb{R}^{m \times (m \times k)}\), where \(k = m(m+1)/2\), is defined to store the symmetrized cumulant matrices \(Q_{ij}\).
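
The symmetrization and storage layout can be sketched as follows. This is an illustration under assumptions, not the module's code: the helper names are hypothetical, and the reading that one \(m \times m\) matrix is stored per index pair \(i \le j\) (which gives \(k = m(m+1)/2\) blocks placed side by side) is inferred from the stated dimensions.

```python
import numpy as np

def symmetrize(Q):
    """Restore symmetry lost to round-off by averaging Q with its transpose."""
    Q = np.asarray(Q, dtype=float)
    return 0.5 * (Q + Q.T)

def index_pairs(m):
    """All index pairs (i, j) with i <= j; there are m * (m + 1) / 2 of them."""
    return [(i, j) for i in range(m) for j in range(i, m)]

def pack_matrices(matrices):
    """Place k symmetrized m x m matrices side by side in an m x (m * k) array.

    Assumed layout: block number b occupies columns b*m .. (b+1)*m - 1.
    """
    return np.hstack([symmetrize(Q) for Q in matrices])
```

For `m = 4` there are 10 index pairs, so the packed array has shape `4 x 40`.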
Joint diagonalization of these time-lagged cumulant matrices reduces fourth-order temporal dependencies and yields the anharmonic modes of motion of the trajectory data. This is done with a Jacobi-type iterative method: successive plane rotations are applied to all matrices in the cumulant tensor, shrinking their off-diagonal elements with each iteration until only the diagonal elements remain significant.
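
A minimal sketch of Jacobi-type joint diagonalization of a set of real symmetric matrices is given below. It follows the Jacobi-angle update commonly used in JADE-style algorithms. Whether TD4 uses exactly this update, this stopping rule, or this array layout is an assumption, and the function name `joint_diagonalize` and the `(K, m, m)` input shape are hypothetical choices for the example.

```python
import numpy as np

def joint_diagonalize(matrices, tol=1e-10, max_sweeps=100):
    """Find an orthogonal V such that V.T @ M @ V is as diagonal as possible for every M.

    matrices: array-like of shape (K, m, m) holding real symmetric matrices.
    Returns (V, D), where D[k] = V.T @ matrices[k] @ V.
    """
    A = np.array(matrices, dtype=float)          # working copy, shape (K, m, m)
    m = A.shape[1]
    V = np.eye(m)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(m - 1):
            for q in range(p + 1, m):
                # Collect the 2x2 sub-problem statistics over all K matrices.
                g = np.vstack((A[:, p, p] - A[:, q, q],
                               A[:, p, q] + A[:, q, p]))
                G = g @ g.T
                ton = G[0, 0] - G[1, 1]
                toff = G[0, 1] + G[1, 0]
                theta = 0.5 * np.arctan2(toff, ton + np.hypot(ton, toff))
                c, s = np.cos(theta), np.sin(theta)
                if abs(s) <= tol:
                    continue                     # rotation is negligible
                rotated = True
                # Apply the plane rotation to columns p, q of every matrix ...
                col_p, col_q = A[:, :, p].copy(), A[:, :, q].copy()
                A[:, :, p] = c * col_p + s * col_q
                A[:, :, q] = c * col_q - s * col_p
                # ... then to rows p, q, completing the similarity transform.
                row_p, row_q = A[:, p, :].copy(), A[:, q, :].copy()
                A[:, p, :] = c * row_p + s * row_q
                A[:, q, :] = c * row_q - s * row_p
                # Accumulate the rotation in V.
                v_p, v_q = V[:, p].copy(), V[:, q].copy()
                V[:, p] = c * v_p + s * v_q
                V[:, q] = c * v_q - s * v_p
        if not rotated:
            break                                # a full sweep changed nothing
    return V, A
```

For matrices that share an exact common eigenbasis, the off-diagonal entries of every `D[k]` shrink to round-off level. For estimated cumulant matrices, which are only approximately jointly diagonalizable, the rotations reduce the off-diagonal entries as far as a single orthogonal `V` allows.
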
The fourth-order spatio-temporally decorrelated matrix is computed by:

\[Z_{TD4} = W X_{orig}\]

where *W* attempts to separate the sources from the signal mixture \(X_{orig}\) by finding directions such that the projections onto these directions have maximum statistical independence. The resulting \(Z_{TD4}\) is the data matrix after fourth-order spatial and temporal decorrelation.

**Parameters**

```
Z - an m x T matrix that is spatially and temporally uncorrelated to second order (m subspace dimensions, T samples). May be a NumPy array or matrix, where
m - dimensionality of the subspace we are interested in. Defaults to None, in which case m=n.
T - number of snapshots of the MD trajectory
V - separating matrix obtained after PCA on m components of the real data followed by temporal decorrelation of the spatially whitened data
lag - lag time, given as an integer number of time steps
verbose - print information on progress. Default is True.
```

**Returns**

```
W - a separating matrix obtained from resolving fourth order temporal correlations
```

**Reference**

- Georgiev, P., & Cichocki, A. (2003). Robust independent component analysis via time-delayed cumulant functions. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 86(3), 573-579.