
Molecular Dynamics and the Wavelet Transform

Elijah J. Gregory
Dept. of Pharmaceutics and Pharmaceutical Chemistry, University of Utah
Summer Undergraduate Research Fellowship
August 2004

Introduction

A useful technique for understanding the fast time scale motions of biomolecular systems, in their native environments or under interesting experimental conditions, is molecular dynamics simulation. These methods employ a model potential to take the effects of bonded and nonbonded interactions into account while solving Newton's equations of motion for the atomic system and its surroundings. In practice the calculations must be made at a time step of 1–2 femtoseconds; attempts at less frequent calculation lead to a deterioration in the numerical stability of the simulation. Simulations of nanoseconds and microseconds are possible, opening the door to viewing more complex motions of large biomolecular systems and even of homomeric and heteromeric molecular complexes. The chaotic nature of the data, however, increases the complexity of extracting information from the resulting molecular dynamics trajectory.
Analysis of numerical sequences is a ubiquitous problem in the physical sciences, and modern experimental data require, indeed encourage, the use of more advanced techniques. A wide variety of approaches have been employed, ranging from statistical analyses to machine learning techniques to time and frequency analysis. We have recently become interested in the wavelet transform. It has been used for a wide range of applications including, but not limited to, image and movie compression, speech recognition and synthesis, efficient calculation of differential and integral operator problems, statistical analysis of chaotic data, and preconditioning of data for neural-network-like algorithms. These applications speak to the potential of wavelets for analyzing molecular dynamics data.
The wavelet transform is a member of the general class of time-frequency transforms. One familiar with the Fourier transform [1] may also be familiar with the so-called "Short-Time Fourier Transform," another time-frequency analysis technique. The underlying principle is to use an analyzing function that has a finite number of nonzero elements. This allows frequency information to be collected which is relevant only to a small portion of the signal. In this sense we understand time-frequency analyses to be local in nature: they allow one to focus attention on specific portions of data, and they avoid the problem of information being obscured when the properties of very different data are mixed into one result, as can happen in global analyses of nonstationary data.
Unfortunately, this idea may not be extended ad infinitum. A result known as the Balian-Low theorem [2] shows that it is impossible to construct a function which is compactly supported in both time and frequency. In fact, the Heisenberg uncertainty principle gives definite bounds on the localization any function can achieve in both frequency and time [5]. These results discouraged work in this direction for many years. Over time, bits and pieces of what would eventually become the underlying framework of wavelet theory were developed in other contexts, including coherent states in quantum mechanics [3,4]. Eventually in 1986 a continuous function with negligible values outside a compact support in time and frequency was constructed [6]. A flurry of research followed this discovery over the next few years, and the limits of what wavelets may reveal are still being explored today.
In this paper a class of wavelets constructed by I. Daubechies in 1988 is used to aid the analysis of nonstationary molecular dynamics trajectories. This nonstationarity stems from two main sources. First, and most evident, is the extreme sensitivity to initial conditions displayed by even moderately small biomolecules. Since the biomolecular system is so complex, yet constrained to the energy surface of the molecule, the resulting dynamics are chaotic and difficult to analyze clearly [7]. Second are the apparently stochastic transitions observed within molecular dynamics trajectories [8]. These make it difficult to draw relationships between conformational shifts in the molecule separated in time: even though a molecule in a certain conformation may be more likely to exhibit a proposed transition, the transition is by no means guaranteed.
Early work compared simulation to the predictions of a Langevin model [9] of a system of damped harmonic oscillators approximating the molecular system [10]. These experiments suggested harmonic motions were the rule over short (sub-picosecond) timescales. Across larger timescales, however, this harmonic approximation begins to break down. The resulting difference between the dynamics predicted by the Langevin theory and by the molecular dynamics simulation, as measured by differences in the predicted atomic autocorrelation functions, has been termed the 'anharmonic' component of the molecular dynamics [10]. Further research revealed this anharmonic component could be associated with collective low-frequency motions involving much of the molecule [11,12]. These motions may be elucidated by such techniques as Normal Mode Analysis or, in the case of MD trajectories, Principal Component Analysis. This phenomenon has also been understood in terms of "multiply-hierarchical" energy minima [13], a way of understanding the motions apparent across many timescales. The simplified conceptual model presented was that of self-similar energy minima, with larger minima having smaller minima nested inside of them. Recent work has also shown molecular dynamics trajectories could be classed with a combination of ARIMA, ARMA, AR and MA nonstationary time series models [14].
The class of wavelets used in the current work is known in the literature as the symmlet wavelet, or by the more descriptive name, Daubechies' least-asymmetric wavelets [2]. This class of functions has many useful properties, first and foremost of which is the existence of an exact inverse transform. Owing to their general mathematical construction they have many interpretations in different contexts. For instance, they may be viewed as the basis vectors of a Multiresolution Approximation (MRA) [15]. As with the standard Fourier transform, each square integrable real function has a unique wavelet representation. We may then view the MRA as a means to project molecular dynamics data onto a smooth approximation space, affording the opportunity to view the simulated collective dynamics of a molecule without distracting random fluctuations.
The wavelet filters may also be interpreted as finite differencing filters, with smoothing properties in the frequency domain and decorrelating properties in the time domain [2,16,17]. This has been used to analyze chaotic data by decorrelating it so that it approximates white noise, in order to apply statistical tests such as the Iterated Cumulative Sum of Squares test for variance change points, due to Inclán and Tiao [18,19]. This test is used to partition a time series into sets which show reduced within-set variability. In the context of this work a change in the observed variance may be taken to indicate that a new and different region of the energy surface is being sampled, though this may not always be the case. In this way it is hoped that more reliable estimates of molecular properties may be made.
This idea of partitioning MD trajectories has been recognized before. Recently Maragliano et al. used a modified F-statistic to argue for a static partitioning of 200 ps for molecular dynamics data [20]. Other researchers have used large sudden changes of dihedral angles as a criterion for partitioning the trajectory [21]. Not only is partitioning in time important for the analysis of molecular dynamics properties, but so is partitioning in space [32].

Wavelet Theory
A. Multiresolution Analysis
The idea of a multiresolution analysis (MRA) is an interesting one: how should one represent an object at varying resolutions? In 1989 S.G. Mallat published a paper laying the groundwork for the MRA [15]. A multiresolution analysis is essentially a ladder of nested subspaces
· · · ⊂ V_j ⊂ V_{j−1} ⊂ V_{j−2} ⊂ · · · ⊂ V_1 ⊂ V_0 ⊂ V_{−1} ⊂ · · ·   (1)

such that the subspaces satisfy the following properties:

∪_{j∈Z} V_j = L²(ℝ)   (2)

∩_{j=−∞}^{+∞} V_j = {0}   (3)

f(·) ∈ V_j ⟺ f(2^j ·) ∈ V_0   (4)

A more precise account of these properties and their implications may be found in a later work by S.G. Mallat [22]. The wavelet filters used here satisfy these conditions, and so the output of a wavelet transform corresponds to a multiresolution analysis. Since the resulting projections span disjoint subspaces, we can move down a level (to a higher resolution) by adding to a subspace V_j the difference between it and the subspace V_{j−1}. This difference is denoted by W_j, and the reconstruction of the data from these subspaces can be written as V_j ⊕ W_j = V_{j−1}, where ⊕ represents a direct orthogonal sum. All compactly supported wavelets are associated with an MRA, but not all MRAs generate a compactly supported wavelet basis. In a basic wavelet analysis one deals with two functions, a scaling function (φ) and a wavelet function (ψ). The scaling function generates the approximation subspaces V_j, while the wavelet function generates the difference between an approximation subspace and the subspace at the next higher level of resolution, W_j. This provides a very simple framework for the decomposition and reconstruction of data, as the following sketch illustrates.
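As a concrete sketch of this decomposition and reconstruction, the fragment below uses the PyWavelets package and the 'sym4' symmlet (both assumptions for illustration; this work's own implementation is the MATLAB library described in the Conclusion) to split a signal into its projections onto V_1 and W_1 and verify that they sum back to the original:

```python
import numpy as np
import pywt

# A toy "trajectory": a slow oscillation plus fast fluctuations.
t = np.linspace(0.0, 1.0, 1024)
x = np.sin(2 * np.pi * 3 * t) + 0.2 * np.random.randn(t.size)

# One filtering step: cA spans the approximation subspace V_1,
# cD spans the detail subspace W_1.
cA, cD = pywt.dwt(x, 'sym4', mode='periodization')

# Project onto each subspace separately by zeroing the complementary
# coefficients before inverting the transform.
approx = pywt.idwt(cA, None, 'sym4', mode='periodization')  # in V_1
detail = pywt.idwt(None, cD, 'sym4', mode='periodization')  # in W_1

# V_1 (+) W_1 = V_0: the two projections reconstruct the data exactly.
assert np.allclose(approx + detail, x)
```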

B. Filter Properties
The wavelet and scaling filters as constructed by Daubechies possess many special properties. A table of filter coefficients such as used here may be found on p. 198 of Daubechies (1992) [2]. In the signal processing literature they are most often known as Quadrature Mirror Filters (QMFs) [23]. One immediate property is an exact inverse transformation, which is effected by simply using time-reversed versions of the filters used to perform the decomposition. Another convenient property of the QMFs is that all of the filters used are generated from one scaling function through the following 'two-scale' relations:
φ_j = ∑_n h_n φ_{j−1,n}   (5)

ψ_j = ∑_n g_n φ_{j−1,n}   (6)

Going back to some a priori known scaling function φ_0 and noting that translation is a trivial transformation, all of the wavelet and scaling functions making up the wavelet basis are generated from the original scaling function. Alternatively, if one knows the scaling coefficients h_n at a level j = 0, the wavelet coefficients can be found by:

g_n = (−1)^n h_{L−n},  for n = 0, . . . , L   (7)

where L = 2N and N is the number of vanishing moments the wavelet possesses. Recall that the k-th vanishing moment is defined by ∑_{n=0}^{L} n^k h_n = 0.
Another interesting property of these filters is that the wavelet transform implementing them may be thought of as an orthogonal transformation, which may be made orthonormal by applying a normalizing constant 1/√2. Prior to normalization the filter coefficients satisfy ∑_{n=0}^{L} h_n² = 1 and ∑_{n=0}^{L} g_n² = 1; with the normalizing constant these sums evaluate to 1/2. For a real-valued discrete sequence this sum of squares is equal to the area under the curve of its Fourier representation. Since the filters are applied by a convolution with the data, the filtering may be done as a multiplication in the frequency domain. This can be represented as W_h = H · F, where W_h is the set of coefficients obtained from filtering with the scaling filter h, and H and F are the Fourier representations of the filter and the data. It is then easily seen that filtering with each filter preserves the sum of squares of the original data. This is a statement of the numerical stability of the transform; it also ensures features in the wavelet domain are proportional to what they represent in the time domain.
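These properties are easy to check numerically. The sketch below (PyWavelets again assumed as a stand-in for the filter tables in Daubechies (1992)) verifies the unit sum of squares, the orthogonality of the two filters, and the quadrature-mirror relation of Eq. (7) in that library's sign and indexing convention:

```python
import numpy as np
import pywt

# Least-asymmetric ("symmlet") filters with N = 4 vanishing moments.
w = pywt.Wavelet('sym4')
h = np.asarray(w.rec_lo)   # scaling filter
g = np.asarray(w.rec_hi)   # wavelet filter
L = len(h)                 # filter length, 2N = 8

# Unit energy of each filter (PyWavelets stores them normalized).
print(np.sum(h**2), np.sum(g**2))   # both ~1.0

# Orthogonality of the scaling and wavelet filters.
print(np.dot(h, g))                 # ~0.0

# Quadrature-mirror relation, cf. Eq. (7); this library's convention
# is g_n = (-1)^n h_{L-1-n}.
g_qmf = np.array([(-1) ** n * h[L - 1 - n] for n in range(L)])
assert np.allclose(g_qmf, g)
```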

C. Discrete Wavelet Transform Algorithms

i) Discrete Wavelet Transform (DWT)


The DWT is the simplest of the discrete wavelet transform variants. In this scheme the discretely sampled trajectory is filtered with two filters, each represented by a compact (finite and bounded) set of coefficients. This filtering corresponds to a convolution operation between the trajectory and the filter. Convolution with the scaling filter produces the low-frequency approximation to the trajectory, and the wavelet filter produces the detail describing the high-frequency information missing from the approximation. Since the filters are almost entirely concentrated over complementary intervals of width π [2,24], these two convolutions may be seen as partitioning the data into high-frequency and low-frequency subbands of equal width. This is, of course, an approximation, but a good one for understanding how the transform functions. Since the wavelet transform preserves energy and the scaling and wavelet subspaces are orthogonal, only half of the coefficients in each subspace need be retained. The frequency content has not been changed by the transform, and by the Shannon sampling theorem the same number of samples as in the original data should suffice to describe the frequency content completely. Indeed, this is taken advantage of in the DWT algorithm: after each filtering step, the wavelet and scaling coefficients are downsampled by two. Because the filters are compact and the data is halved at each stage, the full pyramid algorithm has complexity O(N).
After this downsampling, the filters are applied to the approximation coefficients to obtain a still coarser approximation to the data, along with another set of intermediate-frequency wavelet coefficients. Downsampling is performed again, leaving a set of N/4 approximation coefficients covering the lowest quarter of frequencies, a set of N/4 wavelet coefficients covering the next quarter of frequencies, and a set of N/2 wavelet coefficients describing the upper half of frequencies, for a total of N samples.
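A minimal sketch of this pyramid scheme (same PyWavelets assumption): two filtering-and-downsampling passes on N = 1024 samples leave N/4 approximation coefficients and N/4 + N/2 detail coefficients, with the energy and the data itself exactly recoverable:

```python
import numpy as np
import pywt

x = np.random.randn(1024)                     # N = 1024 samples

# Two-level DWT pyramid; 'periodization' halves lengths exactly.
cA2, cD2, cD1 = pywt.wavedec(x, 'sym4', mode='periodization', level=2)
print(len(cA2), len(cD2), len(cD1))           # 256 256 512 = N/4 N/4 N/2

# Energy is preserved by the orthonormal transform ...
energy = sum(np.sum(c**2) for c in (cA2, cD2, cD1))
print(np.isclose(energy, np.sum(x**2)))       # True

# ... and the inverse transform is exact.
x_rec = pywt.waverec([cA2, cD2, cD1], 'sym4', mode='periodization')
assert np.allclose(x_rec, x)
```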
As is also seen in standard Fourier analysis, the issue of finite sampling time rears its head. The problem stems from the fact that at the edges of the data, where nothing is defined, a discontinuity from the first or last observation to zero may result. The effect on the analysis is the presence of a spurious discontinuity in the data, which contains energy at all frequencies. This can cloud the results of a global Fourier analysis and introduces artifacts near the borders of the data in a wavelet analysis. A number of methods have been proposed to deal with this phenomenon. One simple method is to periodize the signal. This, however, may introduce artifactual results, since the assumption that the data repeats may not be accurate; additionally, the observed values at the start and end of the experiment may not be close to one another, so a spurious singularity may still be present. Another method is to reflect the ends of the data back across the borders, which has the effect of overemphasizing the presence of the borders in the data. A third method, used in wavelet analysis, is to pad the edges of the signal with an appropriate exponentially decreasing tail; this results in a better approximation of the data, including the edges [25]. In this work the periodization approach is taken, and analyses which would be adversely affected by the borders treated in this way are performed by excluding these border regions.
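In the assumed PyWavelets terms, these treatments correspond to signal extension modes; the toy comparison below shows the spurious edge discontinuity that periodization introduces when the two ends of the data differ:

```python
import numpy as np
import pywt

x = np.linspace(0.0, 5.0, 64)   # a ramp: the two ends differ strongly

# Periodization assumes the data wraps around end-to-start ...
_, cD_per = pywt.dwt(x, 'sym4', mode='periodization')
# ... while symmetric extension reflects the data across each border.
_, cD_sym = pywt.dwt(x, 'sym4', mode='symmetric')

# The wrap-around jump shows up as large border detail coefficients
# under periodization but not under reflection.
print(np.abs(cD_per[[0, -1]]))
print(np.abs(cD_sym[[0, -1]]))
```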
Recall that at each step a set of coarse approximation coefficients is obtained along with a set of detail coefficients. To reconstruct the data, take the coarsest approximation and its corresponding detail coefficients and insert a zero between each of their elements (i.e. perform an 'upsampling'). These upsampled coefficients are then filtered with a time-reversed version of the filter from which they were generated; this corresponds to a correlation operation. The outputs from these correlations are then summed to obtain the next more detailed approximation, and the process is repeated until the original data is reconstructed. This is illustrated by the right-hand side of fig. 1 below.
It should be noted that the use of downsampling restricts the length of the data considered to a power of two. For long datasets this is not too great a restriction, as the data may be truncated to the next lowest power of two, but for smaller trajectories this could pose a problem; to work around it, more frequent sampling of the trajectory should provide sufficient data. There is also the question of which wavelet function is "the best." Obviously the results of the wavelet analysis will depend to some degree on the shape of the wavelet function chosen. Here we have chosen a well-known Quadrature Mirror Filter-type function, but many other choices for filters exist. This issue is compounded when the standard DWT algorithm is employed. Downsampling results in what has been termed "energy leakage," where the wavelet representation shows frequency content that does not actually exist in the data [26]. This appears as spurious frequencies as well as poor localization of these frequencies in time. Fortunately, this does not affect properties such as the exact inverse transformation, but it can lead to misleading results in a statistical analysis of the wavelet coefficients. The DWT is fast and efficient, but these limitations restrict its real-world effectiveness, and it is not used here.

Fig. 1 - Block Diagram of Discrete Wavelet Transform

ii) Discrete Wavelet Packet Transform (DWPT)


The DWPT shares many properties with its cousin, the Discrete Wavelet Transform. The two make use of the same filters and run into the same problems when dealing with the edges of data. In fact, the only difference between them is the filtering scheme employed. Whereas the filters are applied only to the approximation coefficients in the DWT algorithm, in the DWPT algorithm they are applied to both the approximation and detail coefficients at each step. Downsampling is performed after each filtering operation, so the number of samples retained after each step remains N. Confusion can result from the terminology of 'approximation' and 'detail' coefficients in this context, so the filtered outputs will be referred to as wavelet packets for the rest of the paper.
The discrete wavelet packet transform is also restricted to data whose length is a power of two, for the same reasons as the discrete wavelet transform. The advantage gained by this variant is a finer division of the frequency axis. In the discrete wavelet transform, most of the high-frequency content is contained in a single set of coefficients; effectively, the high-frequency content of the signal is smeared across the subband and detailed information is lost. The DWPT circumvents this problem by more frequent application of the filters, so the high-frequency content is smeared across fewer subbands and is better resolved, at a slightly increased computational cost. The algorithm still suffers from the energy 'leakage' problem noted above. The misalignment of the features of the data and the wavelet coefficients is known as shift variance. Obviously, for applications seeking to use the wavelet transform to help describe the data, shift invariance is desired. For this reason the DWPT is not used here either.
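A brief packet-transform sketch under the same assumptions: at level 3 the DWPT splits the frequency axis into 2³ = 8 equal-width subbands of N/8 coefficients each, rather than the octave bands of the DWT.

```python
import numpy as np
import pywt

x = np.random.randn(1024)

# Full packet tree: both branches are filtered and downsampled
# at every step, so each level still holds N coefficients in total.
wp = pywt.WaveletPacket(data=x, wavelet='sym4',
                        mode='periodization', maxlevel=3)

# order='freq' returns the 8 level-3 packets sorted by increasing
# frequency (undoing the Gray-code-style scrambling discussed below).
packets = wp.get_level(3, order='freq')
print([len(node.data) for node in packets])   # eight packets of 128
```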
iii) Maximal Overlap Discrete Wavelet Packet Transform (MODWPT)
The maximal overlap discrete wavelet packet transform is a third variant of the DWT. It is exactly the DWPT except for the method by which dilation of the filters is achieved. Instead of downsampling the filtered output at each step, the scaling and wavelet filters are upsampled [35]. This corresponds to inserting a zero between each coefficient of the filters at each step. In the frequency domain the filters are contracted so that their periodic images appear within the frequency interval [0, 2π]. The area under the curve is preserved, and we see how the frequency axis is partitioned (see fig. 2). Because no downsampling of the data occurs, each set of coefficients retained contains N elements. The upshot of this expansion of the data is that the energy leakage problem is alleviated and shift invariance is achievable. First, however, care must be taken with the interpretation of the resulting wavelet packets. It might seem intuitive that the wavelet coefficients produced by the scaling filter should always correspond to lower frequencies than those produced by the wavelet filter. This is not true, however: after one round of upsampling one sees that the scaling filter consists of two mounds at the extremes of frequency, while the wavelet filter is centered at middle frequencies. As the transform is taken further and further, the wavelet packets become even more scrambled in terms of frequency content. Fortunately there is a simple description for the ordering observed, known as a Gray code, which will be described below.
It was mentioned above that the substitution of filter upsampling for data downsampling would alleviate the problem of shift variance that plagues the previous DWT variants. If one performs the MODWPT algorithm as described thus far, one will notice that the features of the wavelet coefficients do not quite line up with the features in the data! The problem is that, while the features of data reconstructed from the wavelet coefficients are shift invariant, the wavelet coefficients themselves are not quite. To obtain shift invariance in the wavelet domain itself would require a filter with linear phase, which implies a symmetry of the filter; this property is impossible to achieve with the Daubechies class of wavelets. Fortunately, one may calculate a circular shift to apply to the data to obtain approximate shift invariance to better than one sample, which is good enough for the discrete case. If the scaling filter is denoted h and the wavelet filter g, then the calculation of the appropriate shift is given by [36]:

(1/‖h‖²) [ (2L − 1) ∑_{n=0}^{L} n|h_n|² + ( ∑_{n=0}^{L} n|g_n|² − ∑_{n=0}^{L} n|h_n|² ) b′ ]   (8)

where L is the length of the filter and b′ is the Gray code encoding of the wavelet packet.
iv) Algorithm
The MODWPT can be defined succinctly by the expression:
W_{j,n,t} = ∑_{l=0}^{L_j − 1} f̃_{j,n,l} X_{(t−l) mod N}   (9)

Here W represents a set of wavelet coefficients, j denotes the number of filtering steps taken to arrive at the set of coefficients, n is a frequency index corresponding to the Gray code, and f̃ is a filter derived from a combination of dilated wavelet and scaling filters. The Gray code encodes this derivation with a special binary sequence. First consider that the 1-bit sequence 0 denotes filtering by the scaling filter and the sequence 1 denotes filtering by the wavelet filter. To generate the 2-bit Gray codes, first take the 1-bit sequences and prepend a 0, then reverse the 1-bit sequences to get 1 0 and prepend a 1. There are thus four total 2-bit Gray codes, one for each of the wavelet packets after two rounds of filtering. Going from left to right, the Gray codes [00 01 11 10] correspond to increasing frequency content of the wavelet packets. This construction can be carried out to j bits. The j-th bit in the sequence defines which filter, upsampled j times, to convolve with the rest to give an overall filter. Not only does this allow an easy frequency indexing of the wavelet packets, but the overall filter may be applied to the raw data to obtain the desired wavelet packet directly, reducing the computational cost greatly, as the sketch below illustrates.
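A sketch of this construction, following the description above (illustrative only; the filter conventions are the assumed PyWavelets ones): generate the Gray codes, compose the overall filter for one packet from upsampled scaling and wavelet filters, and apply it to the raw data with a single circular convolution as in Eq. (9).

```python
import numpy as np
import pywt

def gray_codes(bits):
    """Reflected binary construction: prepend 0 to the list, then
    prepend 1 to the reversed list, once per additional bit."""
    codes = ['0', '1']
    for _ in range(bits - 1):
        codes = (['0' + c for c in codes] +
                 ['1' + c for c in reversed(codes)])
    return codes

def upsample(f, steps):
    """Insert 2**steps - 1 zeros between adjacent filter coefficients."""
    out = np.zeros((len(f) - 1) * 2**steps + 1)
    out[::2**steps] = f
    return out

def packet_filter(code, wavelet='sym4'):
    """Compose the overall filter for one packet: the j-th bit picks
    the scaling (0) or wavelet (1) filter, upsampled j times, and the
    chosen filters are convolved together."""
    w = pywt.Wavelet(wavelet)
    h, g = np.asarray(w.dec_lo), np.asarray(w.dec_hi)
    f = np.ones(1)
    for j, bit in enumerate(code):
        f = np.convolve(f, upsample(h if bit == '0' else g, j))
    return f

print(gray_codes(2))             # ['00', '01', '11', '10']

# One circular (mod N) convolution, Eq. (9), replaces two passes.
x = np.random.randn(256)
f = packet_filter('01')
N, L = len(x), len(f)
W = np.array([np.dot(f, x[(t - np.arange(L)) % N]) for t in range(N)])
print(W.shape)                   # (256,) -- no downsampling
```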

ICSS Algorithm
The Iterated Cumulative Sum of Squares algorithm used in this work is due to Inclán and Tiao [18]. The algorithm is used to effect a statistical test of whether the variance of the process generating a time series changes. The detection of a change point depends on the magnitude of a test statistic defined as:
D_k⁺ = C_k/C_T − k/T,  k = 1, · · · , T,  with D_0 = D_T = 0   (10)

where C_k = ∑_{t=1}^{k} a_t² is the cumulative sum of squares of a sequence {a_t} ⊂ ℝ. With this statistic we can see that if the variance is constant, the cumulative sum of squares C_k will grow linearly with the length considered, so that D_k will be centered around zero. This test, however, is biased towards values near the center of the sequence, and even more so towards values nearer the end: since the statistic may be viewed as a 'distance' from the start of the sequence, values near the start will almost certainly not be large enough to pass the critical value test. To work around this problem, a similar statistic defined as a 'distance' from the end of the sequence is used:
sequence is used:
D_k⁻ = k/T − C_k/C_T,  k = 1, · · · , T,  with D_0 = D_T = 0   (11)
and the larger of the two is taken as the statistic for signalling a putative variance change point, D_k ≡ max(D_k⁺, D_k⁻).
Any sequence of real numbers may be used as input to this algorithm. The statistic D_k is calculated for the entire sequence, and the maximum value is tested against a critical value to determine whether or not a bona fide variance change has been detected. This point is used to define the end and start of two new segments of the original data. Each of these segments is then input into the algorithm, and the process is repeated until no change points are detected. In order to achieve a greater localization of change points and to reduce the number of false positives, we apply a refinement scheme to the detected change points: for each change point a new test is made using the neighboring two change points as the limits of the input. The beginning and end of the sequence may be used in lieu of change points in the cases of the first and last detected change points, respectively. The new set of detected change points is compared with the old set, and the algorithm stops when the new change points are sufficiently close to the old ones.
The critical value for the test statistic D_k in this work is the asymptotic value of 1.358 given by Inclán and Tiao for a large number of iid random variables [18]. Various studies have shown 128 samples to be sufficient for the use of this test statistic without overshadowing omnipresent sampling error [27]. Unfortunately, atomic MD trajectories, and indeed most any molecular property measured over time, will also be highly correlated in time. This should not be surprising: the system has momentum as well as a tendency to move within minima of the molecular energy surface. As a result, the asymptotic critical value for iid random variables is invalidated for the MD data, and this is where the wavelets' decorrelating properties come in handy. By acting as an L-th order finite differencing and smoothing filter, the wavelet transform decorrelates the data so that it approximates an independent, identically distributed random sequence [17].
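A minimal numpy sketch of the basic iterated search follows; the refinement pass described above is omitted, and the Inclán–Tiao normalization √(T/2)·max|D_k| against the asymptotic critical value 1.358 is assumed:

```python
import numpy as np

CRIT = 1.358   # asymptotic 95% critical value of sqrt(T/2) * max|D_k|

def max_dk(a):
    """Return (max |D_k|, argmax index) for a sequence a_1, ..., a_T."""
    a = np.asarray(a, dtype=float)
    T = a.size
    C = np.cumsum(a**2)                 # C_k, cumulative sum of squares
    k = np.arange(1, T + 1)
    D = np.abs(C / C[-1] - k / T)       # |D_k| = max(D_k+, D_k-)
    j = int(np.argmax(D))
    return D[j], j

def icss(a, lo=0, hi=None, points=None):
    """Recursively split the sequence at significant variance changes."""
    if points is None:
        points = []
    if hi is None:
        hi = len(a)
    seg = np.asarray(a[lo:hi], dtype=float)
    if seg.size < 2:
        return points
    d, j = max_dk(seg)
    if np.sqrt(seg.size / 2.0) * d > CRIT:   # significant change point
        cp = lo + j
        points.append(cp)
        icss(a, lo, cp + 1, points)          # search the left segment
        icss(a, cp + 1, hi, points)          # and the right segment
    return sorted(points)

# Toy input: white noise whose variance doubles halfway through.
x = np.concatenate([np.random.randn(500), 2.0 * np.random.randn(500)])
print(icss(x))   # typically one change point near index 500
```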

Results
A. Smooth Approximation Space
The multiresolution analysis (MRA) suggests it may be possible to find a projection of the molecular dynamics trajectory onto a smooth wavelet approximation subspace. This raises the possibility of viewing the simulated dynamics with the collective low-frequency motions mostly free of high-frequency harmonic fluctuations; indeed, wavelets are used in many denoising applications [28,29]. Here the smoothed trajectory is reconstructed from the low-frequency trend wavelet packet at level j = 10. Unfortunately, due to unforeseen setbacks, a movie of this smoothed molecular trajectory could not be produced in time for this printing. In the near future two movies will be available at http://www.chpc.utah.edu/~dwee/MAwW/smoothGG6. One will show the full 15.3 nanosecond smoothed trajectory; the second will show a downsampled version of the first. The smoothness of the trajectory allows for a downsampling of the data for a faster, more meaningful view of the dynamics. This is important for the elucidation of those large-scale motions which may be obfuscated by jittery high-frequency dynamics. Much greater downsampling is possible, especially with the use of interpolating schemes, and this suggests a way of generating and storing large numbers of molecular motions efficiently in a database.
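A sketch of such a smoothing step under the earlier PyWavelets assumption; for brevity the level-10 MODWPT trend packet used in this work is replaced by the level-10 DWT approximation, but the idea, keeping only the low-frequency trend coefficients and inverting, is the same:

```python
import numpy as np
import pywt

def smooth_trajectory(x, wavelet='sym4', level=10):
    """Project x onto the level-`level` approximation subspace by
    zeroing every detail band before the inverse transform."""
    coeffs = pywt.wavedec(x, wavelet, mode='periodization', level=level)
    coeffs = [coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet, mode='periodization')

# Example: one atomic coordinate over 2^14 frames (toy random walk).
x = np.cumsum(np.random.randn(16384))
x_smooth = smooth_trajectory(x)
print(x.shape, x_smooth.shape)   # same length, far smoother signal
```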
B. Partitions with Variance Change Points
As discussed before, partitioning of the time series into locally stationary segments is a desirable goal. In this work the Iterated Cumulative Sum of Squares algorithm is employed to identify these locally stationary segments in molecular dynamics data of nucleic acids. In theory, applying this method to the atomic trajectories would locate subsets of atoms exhibiting variance changes in concert, suggesting a natural spatial partitioning of the molecule, as in domain motions or perhaps other less obvious relationships. Currently, however, the technique has only been applied to a limited amount of data. As an example of the method's utility, the algorithm was applied to an all-atom root mean square deviation over ~15 ns of simulation time. First the data is filtered three times, and the highest-frequency packet which well approximates white noise is used as input to the ICSS algorithm. The suitability of the approximation is currently judged by visual inspection of the autocorrelation function of a potential input. The algorithm finds six changes of variance within the trajectory, generating seven data segments, as seen in fig. 3.
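Under the same assumptions as the earlier sketches, that pipeline might look like the following (a toy series stands in for the RMSd data, and `icss` refers to the change-point sketch from the previous section):

```python
import numpy as np
import pywt

# Toy stand-in for the all-atom RMSd series.
rmsd = 1.0 + 0.01 * np.abs(np.cumsum(np.random.randn(4096)))

# Three levels of packet filtering; take the highest-frequency packet.
wp = pywt.WaveletPacket(rmsd, 'sym4', mode='periodization', maxlevel=3)
packet = wp.get_level(3, order='freq')[-1].data

# Visual white-noise check: low-lag autocorrelations should be small.
c = packet - packet.mean()
ac = np.correlate(c, c, 'full')[c.size - 1:] / np.dot(c, c)
print(np.round(ac[1:11], 2))

# change_points = icss(packet)   # feed the packet to the ICSS sketch
```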

Fig. 3 — (y) All-atom RMSd in Angstroms; (x) time in picoseconds.
Fig. 4 — (y) γ dihedral angle on the second guanine, counting 5′→3′; this angle involves the ether linkage to the first guanine and the next three carbons along the DNA backbone; (x) time in picoseconds.

Figure 4 shows the observation of a DNA backbone dihedral angle over the course of the simulation. There are some readily apparent 'switches' where the dihedral angle seems constrained to sample different ranges. It is interesting to note that some, but not all, of these switches occur when a variance change is detected in the all-atom root-mean-square deviation. Since the all-atom RMSd is an average property of the molecular system, it is reasonable that a conformational shift described by a single dihedral angle would not always be apparent in it, as is the case here. Below is a table giving the all-atom atomic fluctuations for the entire sequence and for each of the segments found by the ICSS algorithm. The segments are numbered in the order they occur in fig. 3 above.
Segment             Start (ps)   End (ps)   Variance (Angstroms)
Full Simulation          1        15303       1.2029
First Nanosecond         1         1000       1.1233
Segment #1            1001         3037       1.0355
Segment #2            3038         7805       1.0858
Segment #3            7806         8484       1.1735
Segment #4            8485         8953       1.2178
Segment #5            8954        11148       1.1833
Segment #6           11149        11862       1.0193
Segment #7           11863        15303       1.1232
Table 1 — Intra-segment variability versus global variability
The atomic fluctuations in the DNA oligomer are seen to increase slowly at first and then more rapidly, followed by a rapid decrease and then a further increase by the end of the simulation. Looking at a plot of the RMSd by atom reveals unique patterns of variability amongst the atoms. For example, the data from segment #1, exhibiting low variability, still shows increased fluctuations near the center of the helix and near the terminal ends, but decreased fluctuations throughout the rest of the helix. The data from segment #4, on the other hand, shows increased fluctuations amongst most of the main-chain atoms. It appears the ICSS algorithm is able to discern when a macromolecule moves from one area of its conformational space to another. These results are encouraging, but do not reflect the real power of the proposed method of partitioning.

Fig. 5 — Solid line indicates results from the full-length simulation, dots indicate results from segment #4, and x's indicate results from segment #1.

Conclusion
The wavelet transform is a very powerful and relatively new technique for analyzing and understanding nonstationary data such as that collected in molecular dynamics simulations. The chaotic nature of the molecular motions means that the molecular conformations sampled depend to a high degree on the exact initial coordinates of the atoms and their surrounding environment, making finely detailed predictions of molecular motions impossible, except possibly with some Markov-type models [30]. Additionally, many analyses of molecular motions, including normal mode analysis, have been shown to be sensitive to the specific subset of atoms chosen for the analysis [8,32]. The Iterated Cumulative Sum of Squares algorithm was considered here in this respect. The chaotic MD data does not conform to the assumptions placed on the data by the ICSS algorithm, but the wavelet transform can be used to obtain an approximation of a white noise process from the data. While this method has not been widely employed, its effectiveness has been hinted at here. By applying this test to the atomic trajectories in Cartesian space, dihedral space, or perhaps some other descriptive measurement space, spatial subsets of the molecule, protein or nucleic acid, may be identified as being related. A smoothing application was also considered as a means of obtaining meaningful visualizations of low-frequency motions sampled during the simulation.
Some further applications have become apparent to the author, but they remain undeveloped. One is to speed up the estimation of eigenvectors and eigenvalues for the macromolecular system by diagonalization of the covariance matrix. This technique is known as normal mode analysis when performed with an experimental structure and potential function [9]. The same technique is known generally as principal component analysis [31] when performed on data sets. In molecular dynamics data the potential function is implicit in the observations; in the computational chemistry literature this is known as "essential dynamics," among other things [32]. Recall the wavelets' decorrelating properties, which were taken advantage of in the ICSS algorithm. This decorrelation results in data that is approximately uncorrelated, and as such the covariance matrix is approximately diagonalized. Since the wavelet transform is an orthonormal linear transform, the eigenvalues remain invariant and the eigenvectors can be obtained by means of the inverse wavelet transform. There is remarkably little information on the use of wavelets for this purpose in the literature, though there is no reason the approach should not work. One problem may arise because it is the low-frequency modes which are of interest, and it is the low-frequency wavelet packets which retain the bulk of the autocorrelation; it is the wavelet filter which decorrelates the data. The trend and the correlation are not abolished, but rather 'pushed' into the lowest-frequency packets. With a long enough simulation, on the order of 30–40 ns, it would be possible to extend the wavelet analysis until even the low-frequency modes of interest become decorrelated. If such long simulations were required for the method to be effective, this would be a sore limitation.
Finally, a few remarks should be made regarding the calculation of the filtering operations and the wavelet transform itself. Using the Gray encoding scheme it is possible to forgo many slow convolutions with a large dataset and instead perform many smaller convolutions among the wavelet and scaling filters, with only one convolution with the dataset itself. This speeds up calculations considerably, but for very low-frequency analyses the calculation may still take one to two minutes. As mentioned before, and as is well known, convolution in time corresponds to multiplication in frequency. Thus we may speed up calculations by using a fast Fourier transform followed by a multiplication and a fast inverse transform. This method of calculation will soon be incorporated into a MATLAB library of functions for performing 1D wavelet analysis, along with an ICSS algorithm and subroutines for performing the analyses described in this work. In the future a technique known as the lifting scheme [33] will be implemented. This is an exciting new technique for factoring many types of transforms into a series of algebraic steps. It allows in-place calculation and runs faster than the fast Fourier transform, especially as the problem scales up [33]. It also opens the door to such constructs as 'spatial wavelets,' which can be used to implement 3D multiresolution algorithms for solving differential and integral operator problems [34]. Possibly there are applications of wavelets for molecular dynamics not yet imagined, and it is a terrifically exciting prospect.
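The FFT route mentioned above is short in numpy; a hedged sketch of circular filtering by pointwise multiplication in the frequency domain, which reproduces the direct mod-N convolution of Eq. (9):

```python
import numpy as np

def circular_filter(x, f):
    """Circular convolution via the FFT: equivalent to the direct
    mod-N convolution of Eq. (9) but O(N log N) instead of O(N*L)."""
    N = len(x)
    f_pad = np.zeros(N)
    f_pad[:len(f)] = f          # zero-pad the filter to length N
    return np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(f_pad), n=N)
```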

Acknowledgements
The author would like to thank the Center for High Performance Computing for their support and for the use of computer time for simulations and calculations. Thanks also to Thomas Cheatham and the Dept. of Pharmaceutics and Pharmaceutical Chemistry for their support and sponsorship.

References
[1] Bracewell, R. N.; The Fourier Transform and its Applications, 3rd Ed. McGraw-Hill International Editions; 2000.
[2] Daubechies, I.; Ten Lectures on Wavelets, CBMS-NSF Regional Conferences in Applied Mathematics, SIAM
Publishing, Philadelphia; 1992.
[3] Klauder, J.R., Skagerstam, B.-S.; Coherent States, World Scientific, Singapore; 1985.
[4] Meyer, Y., Ryan, R.D.; Wavelets: Algorithms and Applications, SIAM Publishing, Philadelphia; 1993.
[5] Mallat, S.G.; A Wavelet Tour of Signal Processing, 2nd Ed. Academic Press; 1999.
[6] Kronland-Martinet, R., Morlet, J., Grossmann, A.; Int. J. Pattern Recognition and Artificial Intelligence, v. 1, pp. 273–301; 1987.
[7] Braxenthaler, M., Unger, R., Auerbach, D., Given, J.A., Moult, J.; Proteins: Struct., Func. and Gen., v. 29, pp. 417–425; 1997.
[8] van Aalten, D.M.F., de Groot, B.L., Findlay, J.B.C., Berendsen, H.J.C., Amadei, A.; J. Comput. Chem., v. 18(2),
pp. 169–181; 1997.
[9] Lamm, G., Szabo, A.; J. Chem. Phys., v. 85(12), pp. 7334–7348; 1986.
[10] Swaminathan, S., Ichiye, T., van Gunsteren, W., Karplus, M.; Biochemistry, v. 21, pp. 5230–5241; 1982.
[11] Levy, R.M., Karplus, M., Kushick, J., Perahia, D.; Macromolecules, v. 17, pp. 1370-1374; 1984.
[12] Brooks, B., Karplus, M.; Proc. Natl. Acad. Sci., Pt. I Biological Sciences, v. 80(21), pp. 6571–6575; 1983.
[13] Kitao, A., Hayward, S., Go, N.; Proteins: Struct., Func. and Gen., v. 33, pp. 496–517; 1998.
[14] Alakent, B., Doruker, P.; J. Chem. Phys., v. 120(2), pp. 1072–1088; 2004.
[15] Mallat, S.G.; Tran. Am. Math. Soc., v. 315(1), pp. 69–87; 1989.
[16] Wornell, G.W.; IEEE Trans. Info. Theory, v. 36(4), pp. 859–861; 1990.
[17] Percival, D.B., Walden, A.T.; Wavelet Methods for Time Series Analysis, Secn. 9.1, Cambridge University Press;
2000.
[18] Inclán, C., Tiao, G.C.; J. Am. Stat. Assoc., v. 89(427), pp. 913–923; 1994.
[19] Whitcher, B., Byers, S.D., Guttorp, P., Percival, D.B.; Technical Report, Nat’l Research Center for Statistics and
the Environment, University of Washington, Seattle; 2000.
[20] Maragliano, L., Cottone, G., Cordone, L., Ciccotti, G.; Biophysics J., v. 86, pp. 2765–2772; 2004.
[21] Kitao, A., Go, N.; Curr. Opin. Struct. Biol., v. 9, pp. 164–169; 1999.
[22] Mallat, S.G.; A Wavelet Tour of Signal Processing, Ch. 7, 2nd Ed. Academic Press; 1999.
[23] Vetterli, M., Herley, C.; IEEE Trans. Signal Proc., v. 40(9), pp. 2207–2232; 1992.

[24] Daubechies, I.; Comm. Pure Applied Math., v. 41, pp. 909-996; 1988.
[25] Meyer, S.D.; Monthly Weather Rev., v. 121, pp. 2858–2866; 1993.
[26] Qiu, J., Tha Paw U, K., Shaw, R.H.; J. Geophys. Research, v. 100(D12), pp. 25 769–25 779; 1995.
[27] Gabbanini, F., Vannucci, M., Bartoli, G., Moro, A.; J. Comput. Graphical Stat., accepted for publication; 2004.
[28] Donoho, D.; IEEE Trans. Info. Theory, v. 43, pp. 613–627; 1995.
[29] Coifman, R.R., Wickerhauser, M.V.; Time-Frequency and Wavelets in Biomedical Engineering, ed. Akay, M., IEEE Press, Piscataway, N.J., pp. 323–346; 1998.
[30] de Groot, B.L., Daura, X., Mark, A.E., Grubmüller, H.; J. Mol. Biol., v. 309, pp. 299–313; 2001.
[31] Jolliffe, I.T.; Principal Component Analysis, Springer-Verlag, Berlin; 2002.
[32] Abseher, R., Nilges, M.; J. Mol. Biol., v. 297, pp. 911–920; 1998.
[33] Daubechies, I., Sweldens, W.; "Factoring Wavelet Transforms into Lifting Steps"; and Sweldens, W.; "The Lifting Scheme: A New Philosophy in Biorthogonal Wavelet Construction"; both available on the world wide web at http://cm.bell-labs.com/who/wim/.
[34] Amaratunga, K., Castrillon-Candas, J.E.; Int. J. Numerical Methods in Eng., v. 52, pp. 239-271; 2001.
[35] Walden, A.T., Contreras-Cristan, A.; Proc. Math. Phys. Engng. Sci., v. 454(1976), pp. 2243–2266; 1998.
[36] Hess-Nielsen, N., Wickerhauser, M.V.; Proc. IEEE, v. 84(4), pp. 523–540; 1996.
