SD2

The SD2 module performs second-order spatial decorrelation of real-valued signals.

The SD2 module removes dominant second-order spatial correlations by computing a spatial covariance matrix and performing principal component analysis (PCA).

PCA is based on the eigenvector-eigenvalue decomposition of the covariance matrix. It decorrelates the factors at a time lag of zero, i.e. it diagonalizes the instantaneous covariance.
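
As a minimal sketch of this zero-lag decorrelation (toy data and variable names are illustrative assumptions, not the ANCA implementation), projecting mean-free data onto the eigenvectors of its covariance matrix yields a diagonal covariance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 3 correlated variables (rows), 500 samples (columns).
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.3],
                   [0.0, 0.3, 0.5]])
X = mixing @ rng.standard_normal((3, 500))
X = X - X.mean(axis=1, keepdims=True)      # remove the mean of each row

C = (X @ X.T) / X.shape[1]                 # zero-lag covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)       # eigenvalues in ascending order

Z = eigvecs.T @ X                          # project onto the eigenvectors
C_z = (Z @ Z.T) / Z.shape[1]               # covariance of the projections

# Off-diagonal entries vanish up to round-off: the projections are uncorrelated.
off_diag = C_z - np.diag(np.diag(C_z))
```

The diagonal of `C_z` holds the eigenvalues, so the projections are uncorrelated but do not yet have unit variance.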

Sphering/Whitening

A linear transformation that maps a vector of random variables with a known covariance matrix into a set of new variables whose covariance is the identity matrix, i.e. the new variables are uncorrelated and each has unit variance. (Source: Wikipedia)
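
Written out for the PCA-based case, as a sketch: let X be the mean-free data with T columns, and let S (as a diagonal matrix) and B hold the eigenvalues and orthonormal eigenvectors of its covariance matrix C. The symbols X and C are notation introduced here for illustration, and this is one common choice of whitening matrix among several.

```latex
C = \frac{1}{T}\, X X^{\top} = B\, S\, B^{\top}, \qquad
U = S^{-1/2} B^{\top}, \qquad
Y = U X
\;\Longrightarrow\;
\frac{1}{T}\, Y Y^{\top} = U C U^{\top}
  = S^{-1/2} B^{\top} B\, S\, B^{\top} B\, S^{-1/2} = I .
```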

Problem

The data may contain non-orthogonal correlations. PCA cannot decorrelate such factors, because its basis vectors are constrained to be mutually orthogonal.

Progressive Solution

Spatially decorrelated data is further decorrelated in the temporal domain [TD2].
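
A toy illustration of why a temporal step can still be needed (the sources, the mixing matrix A and the lag of one sample are all assumptions chosen for the example; this does not reproduce TD2): after PCA-based whitening the zero-lag covariance is the identity, yet the lag-1 covariance keeps a clear off-diagonal term.

```python
import numpy as np

T = 400
t = np.arange(T)

# Two uncorrelated, unit-variance toy sources with different time structure.
slow = np.sqrt(2) * np.sin(2 * np.pi * t / 100)   # period of 100 samples
fast = np.sqrt(2) * np.sin(2 * np.pi * t / 4)     # period of 4 samples
sources = np.vstack([slow, fast])

# Non-orthogonal mixing (the columns of A are not perpendicular).
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = A @ sources
X = X - X.mean(axis=1, keepdims=True)

# PCA-based whitening at lag zero.
C0 = (X @ X.T) / T
S, B = np.linalg.eigh(C0)
U = np.diag(1.0 / np.sqrt(S)) @ B.T
Y = U @ X

cov_lag0 = (Y @ Y.T) / T                          # identity up to round-off
lagged = (Y[:, :-1] @ Y[:, 1:].T) / (T - 1)
cov_lag1 = 0.5 * (lagged + lagged.T)              # symmetrised lag-1 covariance
```

Here `cov_lag0` is diagonal while `cov_lag1[0, 1]` is far from zero, so the whitened components are still mixtures of the two sources.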

Parameters

data    - a 3n x T data matrix (the factor 3 comes from the x, y, z coordinates of each atom). May be a NumPy
              array or a matrix, where

n       - size of the protein (number of atoms considered)

T       - number of snapshots in the MD trajectory

m       - dimensionality of the subspace of interest. Default is None, in which case all components are kept (m = 3n, the number of rows of data)

verbose - print progress information. Default is True.
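
A minimal sketch of assembling such a 3n x T matrix, assuming the trajectory is available as an array of shape (T, n, 3). That layout, the name `coords` and the row ordering x1, y1, z1, x2, ... are assumptions made for illustration; the ordering the module expects is not stated here.

```python
import numpy as np

T, n = 5, 4                                   # toy sizes: 5 snapshots, 4 atoms
rng = np.random.default_rng(3)
coords = rng.standard_normal((T, n, 3))       # hypothetical layout: (snapshot, atom, xyz)

# Flatten each snapshot to length 3n (x1, y1, z1, x2, ...) and
# place the snapshots in columns.
data = coords.reshape(T, 3 * n).T             # shape (3n, T)
```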

Returns

A 3n x m matrix U (NumPy matrix type) such that Y = U * data contains the 2nd-order spatially whitened
coordinates extracted from the 3n x T data matrix. If m is omitted, U is a square 3n x 3n matrix.

Ds  - eigenvalues sorted by increasing variance

PCs - indices of the m most significant principal components, ordered by decreasing variance, so that S = Ds[PCs]

S   - eigenvalues of the ‘data’ covariance matrix

B   - eigenvectors of the ‘data’ covariance matrix. The eigenvectors are orthogonal.
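
The relationship between Ds, PCs, S and B can be sketched as follows. This is an illustration on random toy data; the use of `np.argsort` to build PCs is an assumption, not necessarily how the module computes it.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal((6, 200))           # stand-in for a 3n x T matrix (n = 2)
data = data - data.mean(axis=1, keepdims=True)
m = 3                                          # number of components to keep

cov = (data @ data.T) / data.shape[1]
Ds, vecs = np.linalg.eigh(cov)                 # Ds: eigenvalues, increasing
PCs = np.argsort(Ds)[::-1][:m]                 # indices of the m largest eigenvalues
S = Ds[PCs]                                    # eigenvalues, decreasing
B = vecs[:, PCs]                               # matching eigenvectors as columns
```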

Note

  • First, the data is rigid-body aligned using the IterativeMeansAlign module from the ANCA package, and the mean is then removed to center the data before the eigenvalue decomposition of the covariance matrix. Alignment is not required if you are using dihedral/angular coordinates
  • Eigenvalue decomposition is then performed on the mean-free data, and the eigenvalues are sorted by decreasing variance to obtain the m most energetically significant components
  • Sphering is then carried out on the m retained components of the real data to obtain the spatially whitened matrix Y
  • U is the whitening matrix
  • Y is a matrix of spatially uncorrelated components
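
The centering, decomposition and sphering steps can be sketched end to end as below. `sd2_sketch` is a hypothetical stand-in, not the ANCA source: the alignment step is omitted because the IterativeMeansAlign interface is not shown here, and U is stored with shape m x 3n (an assumption about orientation) so that the product U @ data is defined.

```python
import numpy as np

def sd2_sketch(data, m=None):
    """Hypothetical illustration of the steps above; not the ANCA source."""
    data = np.asarray(data, dtype=float)
    n_rows, T = data.shape
    if m is None:
        m = n_rows                                       # keep every component

    centered = data - data.mean(axis=1, keepdims=True)   # remove the mean
    cov = (centered @ centered.T) / T                    # spatial covariance

    Ds, vecs = np.linalg.eigh(cov)                       # eigenvalues, increasing
    PCs = np.argsort(Ds)[::-1][:m]                       # m largest, decreasing
    S = Ds[PCs]
    B = vecs[:, PCs]

    U = np.diag(1.0 / np.sqrt(S)) @ B.T                  # whitening matrix, m x n_rows
    Y = U @ centered                                     # spatially whitened data
    return Y, U, S, B

rng = np.random.default_rng(2)
toy = rng.standard_normal((9, 9)) @ rng.standard_normal((9, 300))  # 3n = 9, T = 300
Y, U, S, B = sd2_sketch(toy, m=4)
```

For this toy input, `(Y @ Y.T) / 300` equals the 4 x 4 identity up to round-off, which is the defining property of the whitened components.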