
Frontiers in Applied Dynamical Systems: Reviews and Tutorials, Volume 5

Andrew J. Majda

Introduction to Turbulent Dynamical Systems in Complex Systems

The Frontiers in Applied Dynamical Systems (FIADS) covers emerging topics and
significant developments in the field of applied dynamical systems. It is a collection
of invited review articles by leading researchers in dynamical systems, their
applications and related areas. Contributions in this series should be seen as a portal
for a broad audience of researchers in dynamical systems at all levels and can serve
as advanced teaching aids for graduate students. Each contribution provides an
informal outline of a specific area, an interesting application, a recent technique, or
a “how-to” for analytical methods and for computational algorithms, and a list of
key references. All articles will be refereed.

Editors-in-Chief
Christopher K.R.T. Jones, University of North Carolina, Chapel Hill, USA
Björn Sandstede, Brown University, Providence, USA
Lai-Sang Young, New York University, New York, USA

Series Editors
Margaret Beck, Boston University, Boston, USA
Henk A. Dijkstra, Utrecht University, Utrecht, The Netherlands
Martin Hairer, University of Warwick, Coventry, UK
Vadim Kaloshin, University of Maryland, College Park, USA
Hiroshi Kokubu, Kyoto University, Kyoto, Japan
Rafael de la Llave, Georgia Institute of Technology, Atlanta, USA
Peter Mucha, University of North Carolina, Chapel Hill, USA
Clarence Rowley, Princeton University, Princeton, USA
Jonathan Rubin, University of Pittsburgh, Pittsburgh, USA
Tim Sauer, George Mason University, Fairfax, USA
James Sneyd, University of Auckland, Auckland, New Zealand
Andrew Stuart, University of Warwick, Coventry, UK
Edriss Titi, Texas A&M University, College Station, USA,
Weizmann Institute of Science, Rehovot, Israel
Thomas Wanner, George Mason University, Fairfax, USA
Martin Wechselberger, University of Sydney, Sydney, Australia
Ruth Williams, University of California, San Diego, USA

More information about this series at http://www.springer.com/series/13763


Andrew J. Majda
New York University
New York, NY
USA

ISSN 2364-4532 ISSN 2364-4931 (electronic)


Frontiers in Applied Dynamical Systems: Reviews and Tutorials
ISBN 978-3-319-32215-5 ISBN 978-3-319-32217-9 (eBook)
DOI 10.1007/978-3-319-32217-9
Library of Congress Control Number: 2016947470

Mathematics Subject Classification (2010): 62M20, 76F55, 86-08, 86A22, 86A10, 82C31, 82C80,
37A60

© Springer International Publishing Switzerland 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG Switzerland
Preface

Turbulent dynamical systems are ubiquitous complex systems in geoscience and engineering and are characterized by a large dimensional phase space and a large dimension of strong instabilities which transfer energy throughout the system. They
also occur in neural and material sciences. Key mathematical issues are their basic
mathematical structural properties and qualitative features, their statistical predic-
tion and uncertainty quantification (UQ), their data assimilation, and coping with
the inevitable model errors that arise in approximating such complex systems.
These model errors arise through both the curse of small ensemble size for large
systems and the lack of physical understanding. This is a research expository article
on the applied mathematics of turbulent dynamical systems through the paradigm of
modern applied mathematics involving the blending of rigorous mathematical
theory, qualitative and quantitative modeling, and novel numerical procedures
driven by the goal of understanding physical phenomena which are of central
importance. The contents include the general mathematical framework and theory,
instructive qualitative models, and concrete models from climate, atmosphere, and ocean
science. New statistical energy principles for general turbulent dynamical systems
are discussed with applications, linear statistical response theory combined with
information theory to cope with model errors, reduced low order models, and recent
mathematical strategies for UQ in turbulent dynamical systems. Also recent
mathematical strategies for online data assimilation of turbulent dynamical systems
as well as rigorous results are briefly surveyed. Accessible open problems are often
mentioned. This research expository book is the first of its kind to discuss these
important issues from a modern applied mathematics perspective.
Audience: The book should be interesting for graduate students, postdocs, and
senior researchers in pure and applied mathematics, physics, engineering, and cli-
mate, atmosphere, ocean science interested in turbulent dynamical systems as well
as other complex systems.

New York, USA
May 2016

Andrew J. Majda

Acknowledgements

The author thanks Prof. Xiaoming Wang and his Ph.D. students Di Qi and
Nan Chen for many helpful discussions and comments on the material presented
here. This research of the author is partially supported by the Office of Naval
Research through MURI N00014-16-1-2161 and DARPA through W911NF-
15-1-0636. The author hopes the discussion of the research topics in this book
inspires mathematicians, scientists, and engineers to study the exciting topics in
turbulent dynamical systems.

Contents

1 Introduction
  1.1 Turbulent Dynamical Systems for Complex Systems: Basic Issues for Prediction, Uncertainty Quantification, and State Estimation
  1.2 Detailed Structure and Energy Conservation Principles

2 Prototype Examples of Complex Turbulent Dynamical Systems
  2.1 Turbulent Dynamical Systems for Complex Geophysical Flows: One-Layer Model
  2.2 The L-96 Model as a Turbulent Dynamical System
  2.3 Statistical Triad Models, the Building Blocks of Complex Turbulent Dynamical Systems
  2.4 More Rich Examples of Complex Turbulent Dynamical Systems
    2.4.1 Quantitative Models
    2.4.2 Qualitative Models

3 The Mathematical Theory of Turbulent Dynamical Systems
  3.1 Nontrivial Turbulent Dynamical Systems with a Gaussian Invariant Measure
  3.2 Exact Equations for the Mean and Covariance of the Fluctuations
    3.2.1 Turbulent Dynamical Systems with Non-Gaussian Statistical Steady States and Nontrivial Third-Order Moments
    3.2.2 Statistical Dynamics in the L-96 Model and Statistical Energy Conservation
    3.2.3 One-Layer Geophysical Model as a Turbulent Dynamical System
  3.3 A Statistical Energy Conservation Principle for Turbulent Dynamical Systems
    3.3.1 Details About Deterministic Triad Energy Conservation Symmetry
    3.3.2 A Generalized Statistical Energy Identity
    3.3.3 Enhanced Dissipation of the Statistical Mean Energy, the Statistical Energy Principle, and “Eddy Viscosity”
    3.3.4 Stochastic Lyapunov Functions for One-Layer Turbulent Geophysical Flows
  3.4 Geometric Ergodicity for Turbulent Dynamical Systems

4 Statistical Prediction and UQ for Turbulent Dynamical Systems
  4.1 A Brief Introduction
    4.1.1 Low-Order Truncation Methods for UQ and Their Limitations
    4.1.2 The Gaussian Closure Method for Statistical Prediction
    4.1.3 A Fundamental Limitation of the Gaussian Closure Method
  4.2 A Mathematical Strategy for Imperfect Model Selection, Calibration, and Accurate Prediction: Blending Information Theory and Statistical Response Theory
    4.2.1 Imperfect Model Selection, Empirical Information Theory, and Information Barriers
    4.2.2 Linear Statistical Response and Fluctuation-Dissipation Theorem for Turbulent Dynamical Systems
    4.2.3 The Calibration and Training Phase Combining Information Theory and Kicked Statistical Response Theory
    4.2.4 Low-Order Models Illustrating Model Selection, Calibration, and Prediction with UQ
  4.3 Improving Statistical Prediction and UQ in Complex Turbulent Dynamical Systems by Blending Information Theory and Kicked Statistical Response Theory
    4.3.1 Models with Consistent Equilibrium Single Point Statistics and Information Barriers
    4.3.2 Models with Consistent Unperturbed Equilibrium Statistics for Each Mode
    4.3.3 Calibration and Training Phase
    4.3.4 Testing Imperfect Model Prediction Skill and UQ with Different Forced Perturbations
    4.3.5 Reduced-Order Modeling for Complex Turbulent Dynamical Systems

5 State Estimation, Data Assimilation, or Filtering for Complex Turbulent Dynamical Systems
  5.1 Filtering Noisy Lagrangian Tracers for Random Fluid Flows
  5.2 State Estimation for Nonlinear Turbulent Dynamical Systems Through Hidden Conditional Gaussian Statistics
    5.2.1 Examples and Applications of Filtering Turbulent Dynamical Systems as Conditional Gaussian Systems
  5.3 Finite Ensemble Kalman Filters (EnKF): Applied Practice, Mathematical Theory, and New Phenomena
    5.3.1 EnKF and ESRF Formulation
    5.3.2 Catastrophic Filter Divergence
    5.3.3 Rigorous Examples of Catastrophic Filter Divergence
    5.3.4 Rigorous Nonlinear Stability and Geometric Ergodicity for Finite Ensemble Kalman Filters
  5.4 Mathematical Strategies and Algorithms for Multi-scale Data Assimilation
    5.4.1 Conceptual Dynamical Models for Turbulence and Superparameterization
    5.4.2 Blended Particle Methods with Adaptive Subspaces for Filtering Turbulent Dynamical Systems
    5.4.3 Extremely Efficient Multi-scale Filtering Algorithms: SPEKF and Dynamic Stochastic Superresolution (DSS)

References
Chapter 1
Introduction

Consider a general dynamical system, perhaps with noise, written in the Itô sense in physicist’s notation as given by

    du/dt = F(u, t) + σ(u, t)Ẇ(t)   (1.1)

for u ∈ R^N, where σ is an N × K noise matrix and W ∈ R^K is K-dimensional white noise. The noise often represents degrees of freedom that are not explicitly modelled, such as the small scale surface wind on the ocean. Typically one thinks about the evolution of the smooth probability density p(u, t) associated with (1.1) as the statistical solution which satisfies the Fokker–Planck equation [48]

    p_t = −∇_u · (F(u, t) p) + ½ ∇_u · ∇_u (Q p) ≡ L_FP(p),   (1.2)
    p|_{t=t₀} = p₀(u),

with Q(t) = σσ^T. While (1.1) is a nonlinear system, the statistical equation in (1.2) is a linear equation for functions on R^N. The equation in (1.2) when there is no noise, σ ≡ 0, is called the Liouville equation.
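A minimal numerical sketch of the correspondence between (1.1) and (1.2), assuming the simplest scalar example F(u) = −γu with constant noise (an illustrative choice, not a model from the text): the Fokker–Planck equation then has a Gaussian stationary solution with variance σ²/(2γ), which an Euler–Maruyama ensemble should reproduce.

```python
import numpy as np

# Scalar Ornstein-Uhlenbeck illustration of (1.1)-(1.2):
# du = -gamma*u dt + sigma dW has the Gaussian invariant measure
# p(u) ~ N(0, sigma^2/(2*gamma)).  Parameter values are illustrative.

def euler_maruyama(gamma, sigma, u0, dt, n_steps, rng):
    """Integrate du = -gamma*u dt + sigma dW with the Euler-Maruyama scheme."""
    u = np.array(u0, dtype=float)
    for _ in range(n_steps):
        u = u + (-gamma * u) * dt + sigma * np.sqrt(dt) * rng.standard_normal(u.shape)
    return u

rng = np.random.default_rng(0)
gamma, sigma = 1.0, 0.5
# A long integration of many independent trajectories samples the invariant measure.
u_final = euler_maruyama(gamma, sigma, np.zeros(20000), dt=0.01, n_steps=2000, rng=rng)
print(np.var(u_final))   # should approach sigma^2/(2*gamma) = 0.125
```

For σ ≡ 0 the same scheme degenerates to a deterministic integrator and the ensemble instead evolves by the Liouville equation.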

1.1 Turbulent Dynamical Systems for Complex Systems: Basic Issues for Prediction, Uncertainty Quantification, and State Estimation

For simplicity, consider the case without noise for (1.1), i.e., σ = 0. Turbulent dynamical systems for complex systems are chaotic dynamical systems characterized

by a large dimensional phase space, u ∈ R^N with N ≫ 1, and a large dimension of unstable directions in that phase space, as measured by the number of positive Lyapunov exponents or by a non-normal transient growth subspace, which strongly interact and exchange energy. They are ubiquitous in many complex systems with fluid flow such as, for example, the atmosphere, the ocean, the coupled climate system [130, 153, 171], confined plasmas [140], and turbulence at high Reynolds numbers [143]. In such systems, all these linear instabilities are mitigated by energy-conserving nonlinear interactions that transfer energy to the linearly stable modes where it is dissipated, resulting in a statistical steady state for the complex turbulent system.
Prediction and uncertainty quantification (UQ) for complex turbulent dynamical
systems is a grand challenge where the goal is to obtain accurate statistical estimates
such as the change in mean and variance for key statistical quantities in the nonlinear
response to changes in external forcing parameters or uncertain initial data. These
efforts are hampered by the inevitable model errors and the curse of ensemble size
for complex turbulent dynamical systems. In the simplest set-up, model errors occur
when (1.1) is approximated by a different dynamical system for u_M ∈ R^M, M ≪ N, satisfying a similar equation as (1.1),

    du_M/dt = F_M(u_M, t) + σ_M(u_M, t)Ẇ(t)   (1.3)

where F_M maps R^M to R^M and σ_M is a noise matrix as in (1.1). Practical complex turbulent models often have a huge phase space with N = 10^6 to N = 10^10. Model errors as in (1.3) typically arise from lack of resolution compared with the original perfect model, which is too expensive to simulate directly, and also from the lack of physical understanding of certain physical effects such as, for example, the interactions of ice crystals or dust with clouds in the atmosphere. The noise in (1.3) is often non-zero and judiciously chosen to mitigate model errors [104].
For chaotic turbulent dynamical systems, single predictions often have little statistical information and Monte-Carlo ensemble predictions of (1.1) are utilized which, for σ = 0, are equivalent to solving the Fokker–Planck equation in (1.2) through particles as

    p^L(u, t) = Σ_{j=1}^{L} p_{0,j} δ(u − u_j(t)), where du_j/dt = F(u_j, t),   (1.4)

with initial data p₀(u) ≅ Σ_{j=1}^{L} p_{0,j} δ(u − u_{0,j}), u_j|_{t=0} = u_{0,j}.

The “curse of ensemble size” arises for practical predictions of complex turbulent models: since N is huge, computational limitations force L = O(50)–O(100), so very few realizations are available; on the other hand, with model errors using less resolution in (1.3) so that N decreases significantly, L can be increased, but model errors can swamp this gain in statistical accuracy. Thus, it is a grand challenge

to devise methods that make judicious model errors in (1.3) that lead to accurate
predictions and UQ [144, 158]. Data assimilation, also called state estimation or
filtering, uses the available observations to improve prediction and UQ and thus is
also a grand challenge for complex turbulent dynamical systems [112]. Due to the
curse of ensemble size, these issues of prediction, state estimation, and UQ will all
be discussed in more detail in the later sections of this book.
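The curse of ensemble size can be made concrete with a toy sampling experiment. The Gaussian statistic below is a hypothetical stand-in for a statistic of a turbulent model; the point is only that the Monte Carlo error of estimators built from (1.4) decays like L^{−1/2}, so the ensemble sizes L = O(50), O(100) affordable for huge N give crude statistics.

```python
import numpy as np

# Illustration of the "curse of ensemble size": an L-member Monte Carlo
# estimate of a variance carries O(L^{-1/2}) sampling error.  The Gaussian
# toy statistic is an illustrative stand-in for a real model statistic.

rng = np.random.default_rng(1)
true_var = 2.0

def mc_variance_error(L, n_trials=400):
    """RMS error of the L-member ensemble variance estimate over many trials."""
    errs = []
    for _ in range(n_trials):
        sample = rng.normal(0.0, np.sqrt(true_var), size=L)
        errs.append(np.var(sample, ddof=1) - true_var)
    return np.sqrt(np.mean(np.square(errs)))

err_small = mc_variance_error(L=50)      # ensemble size typical in practice
err_large = mc_variance_error(L=5000)    # 100x larger reference ensemble
print(err_small, err_large)              # error shrinks roughly like 1/sqrt(L)
```

A 100-fold increase in ensemble size buys only a roughly 10-fold reduction in sampling error, which is why judicious model reduction as in (1.3) is so tempting despite the model errors it introduces.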

1.2 Detailed Structure and Energy Conservation Principles

In the above discussion, we have emphasized the important role of energy-conserving nonlinear interactions for complex turbulent dynamical systems in transferring energy from the unstable modes to stable modes where it is dissipated, resulting in a statistical steady state. Many turbulent dynamical systems are quadratic and have the following abstract structure, for u ∈ R^N:

    du/dt = L u + B(u, u) + F + σẆ,   (1.5)

where

(A) L is a linear operator typically representing dissipation and dispersion.
(B) The bilinear term B(u, u) is energy conserving, so that

    u · B(u, u) = 0,   (1.6)

where the dot denotes the standard Euclidean inner product.
(C) The noise matrix, σ, is a constant noise matrix.

All the coefficients in (1.5) and (1.6) can have smooth dependence in time repre-
senting important effects such as the seasonal cycle or the time dependent change in
external forcing. The use of the Euclidean inner product in (B) from (1.6) is made
for simplicity in exposition here as well as the state independent noise assumed in
(C) from (1.6). In many practical applications the linear operator L is a sum

    L = A + D,

where A is skew-symmetric, representing dispersion,

    A* = −A,   (1.7)

and D is a symmetric operator which represents strict dissipation, so that

    u · Du ≤ −d|u|² with d > 0.   (1.8)



Under the assumptions in (1.6)–(1.8) the energy E = ½|u|² satisfies

    dE/dt = d/dt (½|u|²) = (Du · u) + F · u
          ≤ −(d/2)|u|² + (1/(2d))|F|²   (1.9)
          = −dE + (1/(2d))|F|²,

where the elementary inequality a · b ≤ (d/2)|a|² + (1/(2d))|b|² has been used. The Gronwall inequality in (1.9) guarantees the global existence of bounded smooth solutions, the existence of an absorbing ball (i.e., that the vector field for (1.5) with σ ≡ 0 points inwards for |u| large enough), and the existence of an attractor for σ = 0 and time-independent F [34, 149, 165].
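A sketch checking the Gronwall consequence of (1.9) along a trajectory, E(t) ≤ E(0)e^{−dt} + (1 − e^{−dt})|F|²/(2d²). It assumes a triad-type energy-conserving quadratic term with B₁ + B₂ + B₃ = 0 and diagonal dissipation D = −dI; all coefficient values are illustrative choices, not taken from the text.

```python
import numpy as np

# Numerical check of the Gronwall energy bound implied by (1.9) for (1.5) with
# sigma = 0: E(t) <= E(0) e^{-d t} + (1 - e^{-d t}) |F|^2 / (2 d^2).
# The triad-type B and all parameter values are illustrative.

d = 1.0
F = np.array([1.0, 0.5, -0.5])

def rhs(u):
    # B(u,u) with B1 + B2 + B3 = 0 so that u . B(u,u) = 0 (energy conserving).
    B = np.array([2.0 * u[1] * u[2], -1.0 * u[2] * u[0], -1.0 * u[0] * u[1]])
    return -d * u + B + F

def rk4_step(u, dt):
    k1 = rhs(u); k2 = rhs(u + 0.5 * dt * k1)
    k3 = rhs(u + 0.5 * dt * k2); k4 = rhs(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

u = np.array([3.0, -2.0, 1.0])
E0, Fsq = 0.5 * np.dot(u, u), np.dot(F, F)
dt, ok = 0.001, True
for n in range(1, 5001):
    u = rk4_step(u, dt)
    t = n * dt
    bound = E0 * np.exp(-d * t) + (1 - np.exp(-d * t)) * Fsq / (2 * d**2)
    ok = ok and (0.5 * np.dot(u, u) <= bound + 1e-8)
print(ok)   # the Gronwall bound holds along the whole trajectory
```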
The main goals of the remainder of this article are to develop a mathematical
framework and illustrate emerging applications of turbulent dynamical systems to
the subtle statistical issues of prediction, UQ, and state estimation which can cope
with model error and the curse of ensemble size.
Many complex turbulent dynamical systems have the abstract mathematical struc-
ture in (1.5)–(1.7) including (truncated) Navier–Stokes as well as basic geophysical
models for the atmosphere, ocean and the climate systems with rotation, stratification
and topography [130, 153, 171]. Three prototype examples are discussed next and
other models are listed and briefly discussed there.
Chapter 2
Prototype Examples of Complex Turbulent Dynamical Systems

Here we introduce three different prototype models of complex turbulent dynamical systems with the structure in (1.5)–(1.7). The first is the basic one-layer
geophysical model for the atmosphere or ocean with the effects of rotation, stratifica-
tion, topography, and both deterministic and random forcing plus various dissipative
mechanisms [124, 130, 171]; without geophysical effects this model reduces to the 2-
dimensional Navier–Stokes equation but all these geophysical effects are a very rich
source of new and important phenomena in the statistical dynamics far beyond ordi-
nary 2-D flow [130, 171]. The second model is a 40-dimensional turbulent dynam-
ical system due to Lorenz [98] which mimics weather waves of the mid-latitude
atmosphere called the L-96 model. This qualitative model is an important test model
for new strategies and algorithms for prediction, UQ, and state estimation, and is
widely used for these purposes in the geoscience community [73, 99, 117, 118, 144,
159]. The third class of models discussed in some detail here comprises the stochastic triad models
[121–123] which are the elementary building blocks of complex turbulent systems
with energy conserving nonlinear interactions like those in (1.5)–(1.7). All three
examples will be used throughout the article. The chapter concludes with a list and
brief discussion of some other important examples of complex turbulent dynamical
systems.

2.1 Turbulent Dynamical Systems for Complex Geophysical Flows: One-Layer Model

Turbulence in idealized geophysical flows is a very rich and important topic with
numerous phenomenological predictions and idealized numerical experiments. The
anisotropic effects of explicit deterministic forcing, the β-effect due to the earth’s
curvature, and topography together with random forcing all combine to produce a

remarkable number of realistic phenomena; see the basic textbooks [130, 153, 171].
These include the formation of coherent jets and vortices, and direct and inverse
turbulent cascades as parameters are varied [130, 153, 171]. It is well known that
careful numerical experiments indicate interesting statistical bifurcations between
jets and vortices as parameters vary [133, 135, 172, 161, 163, 167], and it is a
contemporary challenge to explain these with approximate statistical theories [13,
45, 46, 163]. However, careful numerical experiments and statistical approximations are only possible or valid for large finite times, so the ultimate statistical steady state of these turbulent geophysical flows remains elusive. Recently Majda and Tong [126] contributed to these issues by proving with full mathematical rigor that for any values
of the deterministic forcing, the β-plane effect, and topography and with precise
minimal stochastic forcing for any finite Galerkin truncation of the geophysical
equations, there is a unique smooth invariant measure which attracts all statistical
initial data at an exponential rate, that is, geometric ergodicity. The rate constant
depends on the geophysical parameters and could involve a large pre-constant.
Next we introduce the equations for geophysical flows which we consider in this
article. Here we investigate geophysical flow on a periodic domain T2 = [−π, π]2 ,
with general dissipation, β-plane effect, stratification effect, topography, determin-
istic forcing and a minimal stochastic forcing. The model [130] is given by

    dq/dt + ∇⊥ψ · ∇q = D(Δ)ψ + f(x) + Ẇ_t,   (2.1)
    q = Δψ − F²ψ + h(x) + βy.

In the equation above:

• q is the potential vorticity and ψ is the streamfunction. It determines the vorticity by ω = Δψ and the flow by u = ∇⊥ψ = (−∂_y ψ, ∂_x ψ). Here x = (x, y) denotes the spatial coordinate.
• The operator D(Δ)ψ = Σ_{j=0}^{l} (−1)^j γ_j Δ^j ψ stands for a general dissipation operator. We assume here γ_j ≥ 0 and at least one γ_j > 0. This term can include: (1) Newtonian (eddy) viscosity, νΔ²ψ; (2) Ekman drag dissipation, −dΔψ; (3) radiative damping, dψ; (4) hyperviscosity dissipation, a higher-order power of Δ; and any positive combination of these. All versions are often utilized in these models in the above references.
• f(x) is the external deterministic forcing. The random forcing W_t is a Gaussian random field; its spectral formulation will be given explicitly later.
• βy is the β-plane approximation of the Coriolis effect and h(x) is the periodic topography.
• The constant F = L_R^{−1}, where L_R = √(g H₀)/f₀ is the Rossby deformation radius which measures the relative strength of rotation to stratification [124].
Note that if one considers, for example, the atmospheric wind stress on the ocean, the
equation in (2.1) naturally has both deterministic and stochastic components to the
forcing. The remarkable effects of topography and the β-effect on dynamics are
discussed in detail in [130, 171]. The general mathematical framework of turbulent

dynamical systems will be shown later to apply to this model. If we ignore geophysical
effects with F, β, h ≡ 0 and use viscosity, (2.1) becomes the 2-D Navier–Stokes
equations.
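A pseudo-spectral sketch of the general dissipation operator D(Δ)ψ = Σ_j (−1)^j γ_j Δ^j ψ from (2.1), assuming only that Δ has Fourier symbol −|k|² on the periodic domain, so that D(Δ) has symbol Σ_j γ_j |k|^{2j}; the grid size and the γ_j values are illustrative choices.

```python
import numpy as np

# Apply D(Delta) psi = sum_j (-1)^j gamma_j Delta^j psi spectrally on the
# doubly periodic domain [-pi, pi]^2.  Since Delta has Fourier symbol -|k|^2,
# the operator has the nonnegative symbol sum_j gamma_j |k|^{2j}.

def apply_dissipation(psi, gammas):
    """Apply D(Delta) to a doubly periodic field psi given coefficients gamma_j."""
    n = psi.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)             # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2                           # |k|^2
    symbol = sum(g * k2**j for j, g in enumerate(gammas))
    return np.real(np.fft.ifft2(symbol * np.fft.fft2(psi)))

# Check against the analytic answer for a single Fourier mode psi = sin(x + 2y):
# Delta psi = -(1 + 4) psi, so radiative damping gamma_0 (term gamma_0 psi) plus
# Ekman drag gamma_1 (term -gamma_1 Delta psi) should give (g0 + 5 g1) psi.
n = 64
x = np.linspace(-np.pi, np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
psi = np.sin(X + 2 * Y)
out = apply_dissipation(psi, gammas=[0.1, 0.3])  # gamma_0 = 0.1, gamma_1 = 0.3
print(np.max(np.abs(out - (0.1 + 5 * 0.3) * psi)))   # ~ machine precision
```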

2.2 The L-96 Model as a Turbulent Dynamical System

The large dimensional turbulent dynamical systems studied here have fundamen-
tally different statistical character than in more familiar low dimensional chaotic
dynamical systems. The most well known low dimensional chaotic dynamical sys-
tem is Lorenz’s famous three-equation model [97] which is weakly mixing with
one unstable direction on an attractor with high symmetry. In contrast, as discussed
earlier, realistic turbulent dynamical systems have a large phase space dimension, a
large dimensional unstable manifold on the attractor, and are strongly mixing with
exponential decay of correlations. The simplest prototype example of a turbulent
dynamical system is also due to Lorenz and is called the L-96 model [98, 99]. It
is widely used as a test model for algorithms for prediction, filtering, and low fre-
quency climate response [102, 130], as well as algorithms for UQ [117, 159]. The
L-96 model is a discrete periodic model given by the following system

    du_j/dt = (u_{j+1} − u_{j−2}) u_{j−1} − u_j + F,   j = 0, …, J − 1,   (2.2)
with J = 40 and with F the forcing parameter. The model is designed to mimic
baroclinic turbulence in the midlatitude atmosphere with the effects of energy con-
serving nonlinear advection and dissipation represented by the first two terms in
(2.2). For sufficiently strong forcing values such as F = 6, 8, 16, the L-96 model is a
prototype turbulent dynamical system which exhibits features of weakly chaotic tur-
bulence (F = 6), strong chaotic turbulence (F = 8), and strong turbulence (F = 16)
[102] as the strength of forcing, F, is increased. In order to quantify and compare the different types of turbulent chaotic dynamics in the L-96 model as F is varied, it is convenient to rescale the system to have unit energy for statistical fluctuations around the constant mean statistical state, ū [102]; thus, the transformation u_j = ū + E_p^{1/2} ũ_j, t = t̃ E_p^{−1/2} is utilized, where E_p is the energy fluctuation [102]. After this normalization, the mean state becomes zero and the energy fluctuations are unity for all values of F. The dynamical equation in terms of the new variables, ũ_j, becomes

    dũ_j/dt̃ = (ũ_{j+1} − ũ_{j−2}) ũ_{j−1} + E_p^{−1/2} [(ũ_{j+1} − ũ_{j−2}) ū − ũ_j] + E_p^{−1} (F − ū).   (2.3)
Table 2.1 lists, in the non-dimensional coordinates, the leading Lyapunov exponent λ₁, the dimension of the unstable manifold N⁺, the sum of the positive Lyapunov exponents (the KS entropy), and the correlation time T_corr of any ũ_j variable with itself as F is varied through F = 6, 8, 16. Note that λ₁, N⁺, and KS increase

Table 2.1 Dynamical properties of the L-96 model in the weakly chaotic (F = 6), strongly chaotic (F = 8), and fully turbulent (F = 16) regimes

Regime            | F  | λ₁    | N⁺ | KS    | T_corr
Weakly chaotic    | 6  | 1.02  | 12 | 5.547 | 8.23
Strongly chaotic  | 8  | 1.74  | 13 | 10.94 | 6.704
Fully turbulent   | 16 | 3.945 | 16 | 27.94 | 5.594

Here, λ₁ denotes the largest Lyapunov exponent, N⁺ the dimension of the expanding subspace of the attractor, KS the Kolmogorov–Sinai entropy, and T_corr the decorrelation time of the energy-rescaled time correlation function.

[Figure 2.1: Space-time plots of numerical solutions of the L-96 model in the weakly chaotic (F = 6), strongly chaotic (F = 8), and fully turbulent (F = 16) regimes; each panel shows time (vertical) versus space (horizontal, j = 1, …, 40).]

significantly as F increases while T_corr decreases in these non-dimensional units; furthermore, the weakly turbulent case with F = 6 already has a twelve-dimensional unstable manifold in the forty-dimensional phase space. Snapshots of the time series for (2.2) with F = 6, 8, 16, as depicted in Figure 2.1, qualitatively confirm the above quantitative intuition, with weakly turbulent patterns for F = 6, strongly chaotic wave turbulence for F = 8, and fully developed wave turbulence for F = 16. It is worth remarking here that smaller values of F around F = 4 exhibit the more familiar low-dimensional weakly chaotic behavior associated with the transition to turbulence.
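A minimal integration sketch of the L-96 model (2.2) in the strongly chaotic regime F = 8, together with a direct check that the advection term (u_{j+1} − u_{j−2})u_{j−1} conserves energy, as the general structure (1.6) requires; the time step and initial perturbation are illustrative choices.

```python
import numpy as np

# The L-96 model (2.2) with an RK4 integrator, plus a check that the advection
# term (u_{j+1} - u_{j-2}) u_{j-1} conserves energy exactly: the sum of
# u_j * (u_{j+1} - u_{j-2}) u_{j-1} over j telescopes to zero.
# J = 40 and F = 8 are from the text; dt is an illustrative choice.

J = 40

def advection(u):
    return (np.roll(u, -1) - np.roll(u, 2)) * np.roll(u, 1)

def l96_rhs(u, F):
    return advection(u) - u + F

def rk4_step(u, F, dt):
    k1 = l96_rhs(u, F); k2 = l96_rhs(u + 0.5 * dt * k1, F)
    k3 = l96_rhs(u + 0.5 * dt * k2, F); k4 = l96_rhs(u + dt * k3, F)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

rng = np.random.default_rng(2)
u = 8.0 + 0.1 * rng.standard_normal(J)     # perturbed rest state u_j = F
for _ in range(2000):                      # integrate to t = 10 in the chaotic regime
    u = rk4_step(u, F=8.0, dt=0.005)

# The nonlinear term transfers energy between modes but neither creates nor
# destroys it, exactly as required by (1.6):
print(abs(np.dot(u, advection(u))))        # ~ machine precision
```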

2.3 Statistical Triad Models, the Building Blocks of Complex Turbulent Dynamical Systems

Statistical triad models are special three-dimensional turbulent dynamical systems with quadratic nonlinear interactions that conserve energy. For u = (u₁, u₂, u₃)^T ∈ R³, these equations can be written in the form of (1.5)–(1.7), with a slight abuse of notation, as

    du/dt = L × u + Du + B(u, u) + F + σẆ_t,   (2.4)

where ‘×’ is the cross-product, L ∈ R³, and the nonlinear term

    B(u, u) = (B₁ u₂ u₃, B₂ u₃ u₁, B₃ u₁ u₂)^T,

with B₁ + B₂ + B₃ = 0, so that u · B(u, u) = 0. They are the building blocks of


complex turbulent dynamical systems since a three-dimensional Galerkin truncation
of many complex turbulent dynamics in (1.5)–(1.7) have the form in (2.4), in par-
ticular the models from Sections 2.1 and 2.2. A nice paper illustrating this fact for many examples in the geosciences is [53]; the famous three-equation chaotic model
of Lorenz is a special case of this procedure. The random forcing together with
some damping represents the effect of the interaction with other modes in a turbulent
dynamical system that are not resolved in the three dimensional subspace [121–
123]. Stochastic triad models are qualitative models for a wide variety of turbulent
phenomena regarding energy exchange and cascades and supply important intuition
for such effects. They also provide elementary test models with subtle features for
prediction, UQ, and state estimation [49, 51, 105, 156, 157].
Elementary intuition about energy transfer in such models can be gained by looking at the special situation with L = D = F = σ ≡ 0, so that there are only the nonlinear interactions in (2.4). We examine the linear stability of the fixed point ū = (ū₁, 0, 0)^T. Elementary calculations show that the perturbation δu₁ satisfies dδu₁/dt = 0, while the perturbations δu₂, δu₃ satisfy the second-order equations

    d²δu₂/dt² = B₂B₃ ū₁² δu₂,   d²δu₃/dt² = B₂B₃ ū₁² δu₃,
so that

there is instability with B2 B3 > 0 and


the energy of δu 2 , δu 3 grows provided B1 has (2.5)
the opposite sign of B2 and B3 with B1 + B2 + B3 = 0.

The elementary analysis in (2.5) suggests that we can expect a flow or cascade of
energy from u 1 to u 2 and u 3 where it is dissipated provided the interaction coefficient
B1 has the opposite sign from B2 and B3 .
We illustrate this intuition in a simple numerical experiment in a nonlinear regime
with a statistical cascade. For the nonlinear coupling we set B_1 = 2, B_2 = B_3 = −1,
so that (2.5) is satisfied, and L ≡ 0, F ≡ 0 for simplicity. We randomly force
u_1 with a large variance σ_1² = 10 and only weakly force u_2, u_3 with variances
σ_2² = σ_3² = 0.01, while we use diagonal dissipation D with d_1 = −1 but the stronger
Fig. 2.2 Triad model simulation in a strongly nonlinear regime with an energy cascade: full-system
statistics predicted by direct Monte Carlo using the triad system (2.4). The time evolutions of the
mean, variance, and third-order interaction are shown on the left; the right panels show the steady-
state probability density functions of u_1, u_2, u_3 as well as 2D scatter plots.

damping d_2 = d_3 = −2 for the other two modes. A large Monte Carlo simulation
with N = 1 × 10^5 ensemble members is used to generate the variance of the statistical solution
and the probability distribution function (PDF) along the coordinates in Figure 2.2. These
results show a statistical steady state with much more variance in u_1 than in u_2 and u_3,
reflecting the intuition below (2.5) on energy cascades. Intuitively, the transfer
of energy in this triad system in each component separately is reflected by the third
moment, ⟨u_1 u_2 u_3⟩ := M_123, and this is negative and non-zero, reflecting the non-
Gaussian energy transfer in this system from u_1 to u_2 and u_3 (see Proposition 3.2 and
Theorem 3.1). This illustrates the use of the triad model for gaining intuition about
complex turbulent dynamics.
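The experiment just described can be reproduced in a few lines. The following Euler–Maruyama sketch (an illustration, not the authors' code; ensemble size and time step chosen for speed) recovers the variance ordering and the negative third moment M_123:

```python
import numpy as np

# Illustrative Euler-Maruyama Monte Carlo for the triad system (2.4) with
# L = F = 0, B1 = 2, B2 = B3 = -1, damping d = (-1, -2, -2), and forcing
# variances sigma^2 = (10, 0.01, 0.01), as in the experiment in the text.
rng = np.random.default_rng(0)
B = np.array([2.0, -1.0, -1.0])          # B1 + B2 + B3 = 0
d = np.array([-1.0, -2.0, -2.0])         # diagonal damping, d1 = -1, d2 = d3 = -2
sigma = np.sqrt(np.array([10.0, 0.01, 0.01]))

N, dt, steps = 20_000, 2e-3, 2_500       # ensemble size, time step, final time t = 5
u = np.zeros((N, 3))
for _ in range(steps):
    nl = np.column_stack([B[0]*u[:, 1]*u[:, 2],
                          B[1]*u[:, 2]*u[:, 0],
                          B[2]*u[:, 0]*u[:, 1]])
    u += (d*u + nl)*dt + sigma*np.sqrt(dt)*rng.standard_normal((N, 3))

var = u.var(axis=0)
M123 = np.mean((u - u.mean(axis=0)).prod(axis=1))   # third central moment
print(var, M123)   # var(u1) dominates the other variances; M123 < 0
```

The steady-state balance for the first mode, 0 = 2B_1 M_123 − 2⟨u_1²⟩ + σ_1², forces M_123 < 0 here, consistent with the negative third moment in Figure 2.2.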
It is worth remarking that the degenerate stochastic triad model in (2.4) with
B_1 ≡ 0, B_2 = −B_3, and L ≠ 0 is statistically exactly solvable, has non-Gaussian
features, mimics a number of central issues for geophysical flows, and is an
important unambiguous test model for prediction and state estimation [49–51, 112].

2.4 More Rich Examples of Complex Turbulent Dynamical Systems

We briefly list and mention other important examples where the subsequent theory,
techniques, and ideas in this article can be applied currently or in the near future. We
begin with quantitative models and end with a list of judicious qualitative models.
We also mention recent applications for prediction, UQ, and state estimation.

2.4.1 Quantitative Models

(A) The truncated turbulent Navier–Stokes equations in two or three space dimen-
sions with shear and periodic or channel geometry [143].
(B) Two-layer or even multi-layer stratified flows with topography and shears in
periodic geometry, channel geometry, or on the sphere [94, 130, 171]. These models
include more physics, like baroclinic instability for the transfer of heat, and generalize
the one-layer model discussed in Section 2.1. There are promising novel multi-
scale methods in two-layer models for the ocean, called stochastic superparameter-
ization, which overcome the curse of ensemble size for statistical dynamics and
state estimation. See [110] for a survey and [63–68] for applications to state
estimation and filtering. The numerical dynamics of these stochastic
algorithms is a fruitful and important research topic. The end of Chapter 1 of
[130] contains the formal relationship of these more complex models to the
one-layer model in Section 2.1.
(C) The rotating and stratified Boussinesq equations with both gravity waves and
vortices [94, 124, 171].

There are even more models with clouds and moisture which could be listed. Next is
the list of qualitative models with insight on the central issues for complex turbulent
dynamical systems.

2.4.2 Qualitative Models

(A) The truncated Burgers–Hopf (TBH) model: a Galerkin truncation of the inviscid
Burgers equation with remarkable turbulent dynamics whose features are predicted by
simple statistical theory [119, 120, 125]. The models mimic stochastic backscat-
ter in a deterministic chaotic system [2].
(B) The MMT models of dispersive wave turbulence: One-dimensional models of
wave turbulence with coherent structures, wave radiation, and direct and inverse
turbulent cascades [23, 116]. Recent applications to multi-scale stochastic super-
parameterization [66], a novel multi-scale algorithm for state estimation [62],
and extreme event prediction [35] have been developed.
(C) Conceptual dynamical models for turbulence: Low-dimensional models
capturing key features of complex turbulent systems, such as non-Gaussian
intermittency through energy-conserving dyad interactions between the mean
and fluctuations, are developed in a short self-contained paper [115]. Applications
as a test model for non-Gaussian multi-scale filtering algorithms for state estimation
and prediction [91] will be discussed in Section 5.4.

It is very interesting and accessible to develop a rigorous analysis of these models
and also of the above algorithms.

Chapter 3
The Mathematical Theory of Turbulent Dynamical Systems

With the motivation from Chapters 1 and 2, here we build the mathematical theory
of turbulent dynamical systems. First, in Section 3.1, we show that many turbulent
dynamical systems have non-trivial turbulent dynamics with Gaussian invariant mea-
sures [102, 130]. As mentioned earlier, understanding the complexity of anisotropic
turbulent processes over a wide range of spatiotemporal scales in climate, atmosphere,
and ocean science and in engineering shear turbulence is a grand challenge of contem-
porary science with important societal impacts. In such anisotropic turbulent dynamical
systems the large scale ensemble mean and the turbulent fluctuations strongly exchange
energy and strongly influence each other. These complex features strongly impact
practical prediction and UQ. The goal in Section 3.2 is to develop the exact equa-
tions for the turbulent mean and fluctuations in any turbulent dynamical system
[156]. These equations are not closed but involve the third moments with special
statistical symmetries as a consequence of conservation of energy. Section 3.2.1 presents
a simple general result, applying Section 3.2, which shows that typically statistical steady
states of turbulent dynamical systems are non-Gaussian with non-trivial third moments.
Section 3.2.2 applies Sections 3.2 and 3.2.1 to the statistical dynamics of the L-96 model in
a simple and instructive fashion, while Section 3.2.3 shows how this framework applies to
the statistical dynamics of the complex one-layer geophysical models described in
Section 2.1. Sections 3.3, 3.3.1, and 3.3.2 contain a detailed exposition and a gen-
eralization of a recent general statistical energy conservation principle [100] for the
total energy in the statistical mean and the trace of the covariance of the fluctuations.
In Section 3.3.3, as a consequence of the energy conservation principle in Section 3.3, it is
shown that the energy of the statistical mean has additional damping due to turbulent
dissipation in the statistical steady state; this motivates formal "eddy viscosity" clo-
sures [143] as discussed there. The energy conservation principle is applied in Section 3.3.4
to the complex one-layer geophysical models from Sections 2.1 and 3.2.3 in full gen-
erality to yield stochastic Lyapunov functions. Geometric ergodicity of a turbulent
dynamical system guarantees a unique invariant measure or statistical steady state.
Section 3.4 briefly discusses the mathematical framework, recent results, and open
problems.

© Springer International Publishing Switzerland 2016
A.J. Majda, Introduction to Turbulent Dynamical Systems in Complex Systems,
Frontiers in Applied Dynamical Systems: Reviews and Tutorials 5,
DOI 10.1007/978-3-319-32217-9_3

3.1 Nontrivial Turbulent Dynamical Systems with a Gaussian Invariant Measure

Consider the stochastic differential equation (SDE)

$$\frac{du}{dt} = B(u, u) + Lu - \Lambda d\, u + \Lambda^{1/2} \sigma\, \dot{W}, \qquad u \in \mathbb{R}^N, \qquad (3.1)$$

with the structure of a turbulent dynamical system:

$$\begin{aligned}
u \cdot Lu &= 0, && \text{Skew Symmetry for } L, \\
u \cdot B(u, u) &= 0, && \text{Energy Conservation}, \qquad (3.2) \\
\mathrm{div}_u\,(B(u, u)) &= 0, && \text{Liouville Property}.
\end{aligned}$$

The scalars d and σ satisfy σ_eq² = σ²/(2d), Λ ≥ 0 is a fixed non-negative definite matrix,
and Λ^{1/2} is its square root.

Proposition 3.1 (Gaussian invariant measure) The SDE in (3.1) with the structural
properties in (3.2) has a Gaussian invariant measure,

$$p_{eq} = C_N \exp\left( -\frac{1}{2}\, \sigma_{eq}^{-2}\, u \cdot u \right).$$

When Λ ≡ 0, σ_eq² can be arbitrary.

Proof Using (3.2), the Fokker–Planck equation from (1.2) is

$$\frac{\partial p}{\partial t} = -\mathrm{div}_u \left[ (B(u, u) + Lu)\, p \right] + \mathrm{div}_u\,(\Lambda d\, u\, p) + \mathrm{div}_u \left( \frac{\Lambda \sigma^2}{2} \nabla p \right).$$

Insert p_eq and use (3.2) to get

$$\Lambda d\, u\, p_{eq} + \frac{\Lambda \sigma^2}{2} \nabla_u\, p_{eq} \equiv 0,$$

as required.

Examples show that the Liouville property in (3.2) is essential [130]. Many examples
of nontrivial dynamics in geophysical flows and qualitative models like TBH with
Gaussian invariant measures can be found in [102, 130].
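Proposition 3.1 is easy to test numerically. The sketch below (an illustration under stated assumptions, not from the text) uses the triad nonlinearity of Section 2.3, which satisfies all three properties in (3.2), with L = 0 and Λ = I, and checks that the empirical variances match σ_eq² = σ²/(2d):

```python
import numpy as np

# Numerical sketch of Proposition 3.1: the triad nonlinearity with
# B1 + B2 + B3 = 0 conserves energy and is divergence-free (Liouville), so
# with L = 0 and Lambda = I, (3.1) has a Gaussian invariant measure with
# component variances sigma_eq^2 = sigma^2/(2d).
rng = np.random.default_rng(1)
B = np.array([1.0, 1.0, -2.0])
d, sigma = 1.0, 1.0
sigma_eq2 = sigma**2/(2*d)                          # predicted variance 0.5

N, dt, steps = 50_000, 2e-3, 4_000
u = rng.standard_normal((N, 3))*np.sqrt(sigma_eq2)  # start at equilibrium
for _ in range(steps):
    nl = np.column_stack([B[0]*u[:, 1]*u[:, 2],
                          B[1]*u[:, 2]*u[:, 0],
                          B[2]*u[:, 0]*u[:, 1]])
    u += (nl - d*u)*dt + sigma*np.sqrt(dt)*rng.standard_normal((N, 3))

var = u.var(axis=0)
print(var)   # each component stays near sigma_eq^2 = 0.5
```

Despite the strongly nonlinear dynamics, the empirical one-point statistics remain Gaussian with the predicted variance, up to sampling and time-discretization error.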

3.2 Exact Equations for the Mean and Covariance of the Fluctuations

Consider the turbulent dynamical system from (1.5)–(1.7),

$$\frac{du}{dt} = (L + D)\, u + B(u, u) + F(t) + \sigma_k(t)\, \dot{W}_k(t; \omega), \qquad (3.3)$$

acting on u ∈ R^N. In the above equation and in what follows, repeated indices
indicate summation. In some cases the limits of summation will be given explicitly
to emphasize the range of the index. In the equation, L is skew-symmetric while D
is negative definite, and the quadratic operator B(u, u) conserves energy by itself, so
that it satisfies

$$u \cdot B(u, u) = 0.$$

We use a finite-dimensional representation of the stochastic field consisting of a
fixed-in-time, N-dimensional, orthonormal basis {v_i}_{i=1}^N:

$$u(t) = \bar{u}(t) + Z_i(t; \omega)\, v_i, \qquad (3.4)$$

where ū(t) = ⟨u(t)⟩ represents the ensemble average of the response, i.e. the mean
field, and Z_i(t; ω) are stochastic processes.
By taking the average of (3.3) and using (3.4), the mean equation is given by

$$\frac{d\bar{u}}{dt} = (L + D)\,\bar{u} + B(\bar{u}, \bar{u}) + R_{ij}\, B(v_i, v_j) + F, \qquad (3.5)$$

with R = ⟨ZZ*⟩ the covariance matrix. Moreover, the random component of the
solution, u' = Z_i(t; ω) v_i, satisfies

$$\frac{du'}{dt} = (L + D)\, u' + B(\bar{u}, u') + B(u', \bar{u}) + B(u', u') + \sigma_k(t)\, \dot{W}_k(t; \omega).$$

By projecting the above equation onto each basis element v_i we obtain

$$\frac{dZ_i}{dt} = Z_j \left[ (L + D)\, v_j + B(\bar{u}, v_j) + B(v_j, \bar{u}) \right] \cdot v_i + B(u', u') \cdot v_i + \sigma_k(t)\, \dot{W}_k(t; \omega) \cdot v_i.$$

From the last equation we directly obtain the exact evolution equation for the covariance
matrix R = ⟨ZZ*⟩:

$$\frac{dR}{dt} = L_v R + R L_v^* + Q_F + Q_\sigma, \qquad (3.6)$$
where we have:

(i) the linear dynamical operator expressing energy transfers between the mean
field and the stochastic modes (effect due to B), as well as energy dissipation
(effect due to D) and non-normal dynamics (effect due to L),

$$\{L_v\}_{ij} = \left[ (L + D)\, v_j + B(\bar{u}, v_j) + B(v_j, \bar{u}) \right] \cdot v_i; \qquad (3.7)$$

(ii) the positive definite operator expressing energy transfer due to the external
stochastic forcing,

$$\{Q_\sigma\}_{ij} = (v_i \cdot \sigma_k)\left(\sigma_k \cdot v_j\right); \qquad (3.8)$$

(iii) as well as the energy flux between different modes due to non-Gaussian statistics
(or nonlinear terms), modeled through third-order moments,

$$\{Q_F\}_{ij} = \langle Z_m Z_n Z_j \rangle\, B(v_m, v_n) \cdot v_i + \langle Z_m Z_n Z_i \rangle\, B(v_m, v_n) \cdot v_j. \qquad (3.9)$$

Note that the energy conservation property of the quadratic operator B is inherited
by the matrix Q_F, since

$$\mathrm{tr}\,(Q_F) = 2\, \langle Z_m Z_n Z_i \rangle\, B(v_m, v_n) \cdot v_i = 2\, \langle B(u', u') \cdot u' \rangle = 0. \qquad (3.10)$$

The above exact statistical equations will be the starting point for the developments
in this chapter and for subsequent material on UQ methods in Chapter 4 [117, 158, 159].
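The symmetry (3.10) can be confirmed directly from an ensemble: since u′ · B(u′, u′) = 0 holds realization by realization, Q_F assembled from sample third moments via (3.9) has zero trace to round-off. A sketch for the triad nonlinearity of Section 2.3 with the canonical basis v_i (an illustration, not the authors' code):

```python
import numpy as np

# Sample-based check of the statistical symmetry (3.10): assemble Q_F from
# the third moments in (3.9) for the triad nonlinearity of Section 2.3 and
# confirm tr(Q_F) = 0, which holds because u'.B(u',u') = 0 for every sample.
B = np.array([2.0, -1.0, -1.0])                     # B1 + B2 + B3 = 0

def Bop(u, v):
    # bilinear form whose diagonal B(u, u) is the triad nonlinearity
    return np.array([B[0]*u[1]*v[2], B[1]*u[2]*v[0], B[2]*u[0]*v[1]])

rng = np.random.default_rng(2)
Z = rng.standard_normal((10_000, 3))*[2.0, 0.5, 0.5]
Z = Z - Z.mean(axis=0)                              # fluctuations

v = np.eye(3)                                       # canonical basis v_i
QF = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        for m in range(3):
            for n in range(3):
                Bmn = Bop(v[m], v[n])
                QF[i, j] += np.mean(Z[:, m]*Z[:, n]*Z[:, j])*(Bmn @ v[i]) \
                          + np.mean(Z[:, m]*Z[:, n]*Z[:, i])*(Bmn @ v[j])
print(np.trace(QF))   # zero up to floating-point round-off
```

The off-diagonal entries of Q_F are generically non-zero; only the trace is constrained by energy conservation.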

3.2.1 Turbulent Dynamical Systems with Non-Gaussian Statistical Steady States and Nontrivial Third-Order Moments

Consider a turbulent dynamical system without noise, σ ≡ 0, and assume it has a
statistical steady state so that ū_eq and R_eq are time independent. Since dR_eq/dt = 0, R_eq
necessarily satisfies the steady covariance equation (3.6),

$$L_{\bar{u}_{eq}} R_{eq} + R_{eq} L_{\bar{u}_{eq}}^* = -Q_{F,eq}, \qquad (3.11)$$

where Q_{F,eq} includes the third moments from (3.9) evaluated at the statistical steady
state. Thus a necessary and sufficient condition for a non-Gaussian statistical steady
state is that the first and second moments satisfy the obvious requirement

$$L_{\bar{u}_{eq}} R_{eq} + R_{eq} L_{\bar{u}_{eq}}^* \neq 0, \qquad (3.12)$$

i.e. the above matrix has some non-zero entries. This elementary remark can be
viewed as a sweeping generalization for turbulent dynamical systems of the Kármán–
Howarth equation for the Navier–Stokes equations (see Chapter 6 of [47]). The

non-trivial third moments play a crucial dynamical role in the L-96 model and for
two-layer ocean turbulence [157, 158], as discussed in later chapters.
In Section 3.1, we constructed turbulent dynamical systems with a Gaussian
invariant measure and non-zero noise when Λ ≠ 0. There ū_eq = 0, R_eq = σ_eq² I,
and L_{ū_eq} = L with L skew-symmetric, so L R_eq + R_eq L* ≡ 0 and the damping, with
matrix D = −dΛ, exactly balances the stochastic forcing variances; these facts also
apply to the case in Section 3.1 with Λ ≡ 0 and no dissipation and random forcing,
where σ_eq² can be any non-zero number. This helps illustrate and clarify the source of
non-Gaussianity through the nontrivial interaction of the linear operator L_{ū_eq} and the
covariance R_eq at a statistical steady state. In fact, for a strictly positive covariance
matrix R_eq there is a "whitening" linear transformation T with T R_eq T^{−1} = I, so
the condition in (3.12) for nontrivial third moments is satisfied if the symmetric part
of the matrix T L_{ū_eq} T^{−1} is non-zero.

3.2.2 Statistical Dynamics in the L-96 Model and Statistical Energy Conservation

The L-96 model from (2.2) in Section 2.2 is translation invariant: if u_j(t) is a solution,
so is u_{j+k}(t) for any shift k. This implies a statistical symmetry of homogeneous
statistics [47, 143]. This means that we can restrict the statistical equations for the
mean and second moment so that the statistical mean, ū(t), is a scalar function and the
covariance matrix is diagonal,

$$R_{ij} = r_i\, \delta_{ij}, \qquad r_i > 0,$$

provided that we pick the orthonormal discrete Fourier basis to expand the random
field as in Section 3.2. Here we simply record the evolution equations for the mean and
covariance from (3.5), (3.6) of Section 3.2 for a slight translation invariant generalization
of the L-96 model. The details can be found in [117, 157].
Consider the L-96 model with homogeneous, time dependent, translation invariant
coefficients,

$$\frac{du_j}{dt} = \left( u_{j+1} - u_{j-2} \right) u_{j-1} - d(t)\, u_j + F(t), \qquad j = 0, 1, \ldots, J-1, \quad J = 40. \qquad (3.13)$$

Periodic boundary conditions u_J = u_0 are applied. To compare with the abstract
form in (1.5)–(1.7) we can write the linear operator for the L-96 system as

$$L(t) = -d(t)\, I,$$

and define the quadratic form as

$$B(u, v) = \left\{ u_{j-1}\left( v_{j+1} - v_{j-2} \right) \right\}_{j=0}^{J-1}.$$

Choose the orthonormal discrete Fourier basis as {v_k}_{k=−J/2+1}^{J/2} with

$$v_k = \frac{1}{\sqrt{J}} \left\{ e^{2\pi i k j / J} \right\}_{j=0}^{J-1}.$$

We use a Fourier basis because it diagonalizes translation invariant systems with
spatial homogeneity. Here are the statistical dynamics for the L-96 model:

$$\begin{aligned}
\frac{d\bar{u}(t)}{dt} &= -d(t)\,\bar{u}(t) + \frac{1}{J} \sum_{k=-J/2+1}^{J/2} r_k(t)\,\Gamma_k + F(t), \\
\frac{dr_k(t)}{dt} &= 2\left[ -\Gamma_k\,\bar{u}(t) - d(t) \right] r_k(t) + Q_{F,kk}, \qquad k = 0, 1, \ldots, J/2.
\end{aligned} \qquad (3.14)$$

Here we denote Γ_k = cos(4πk/J) − cos(2πk/J), r_{−k} = ⟨Z_{−k} Z_{−k}^*⟩ = ⟨Z_k Z_k^*⟩ = r_k, and the
nonlinear flux Q_F for the third moments becomes diagonal,

$$Q_{F,kk'} = \frac{2}{\sqrt{J}} \sum_m \mathrm{Re}\,\langle Z_m Z_{-m-k} Z_k \rangle \left( e^{-2\pi i \frac{2m+k}{J}} - e^{2\pi i \frac{m+2k}{J}} \right) \delta_{kk'},$$

with energy conservation tr Q_F = 0. We explore and approximate these dynamics for
prediction and UQ later in this article [117, 157, 158]. Next we give an important
application of these statistical equations.
Statistical Energy Conservation Principle for the L-96 Model
The claim can be seen by simple manipulations of Equations (3.14). By multiplying
both sides of the mean equation in (3.14) by ū, we get

$$\frac{d\bar{u}^2}{dt} = -2d\,\bar{u}^2 + 2\bar{u}F + \frac{2}{J} \sum_k \Gamma_k\, r_k\, \bar{u}.$$

And by summing over all the modes in the variance equation in (3.14),

$$\frac{d\,\mathrm{tr}R}{dt} = 2 \sum_k \left( -\Gamma_k\, r_k\, \bar{u} \right) - 2d\,\mathrm{tr}R + \mathrm{tr}Q_F.$$

It is convenient to define the statistical energy, including both the mean and the total
variance, as

$$E(t) = \frac{J}{2}\,\bar{u}^2 + \frac{1}{2}\,\mathrm{tr}R. \qquad (3.15)$$

With this definition the corresponding dynamical equation for the statistical energy
E of the true system can be easily derived as

$$\frac{dE}{dt} = -2dE + JF\bar{u} + \frac{1}{2}\,\mathrm{tr}Q_F = -2dE + JF\bar{u}, \qquad (3.16)$$

with the statistical symmetry of nonlinear energy conservation, tr Q_F = 0, assumed.
This important fact implies that controlling model errors in the mean guarantees
that they can be controlled for the variance at single locations too [117]. A general
statistical energy principle is discussed in Section 3.3.
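The statistical symmetry tr Q_F = 0 used in (3.16) reflects the pointwise identity u · B(u, u) = 0 for the L-96 advection term. A quick numerical sketch (not from the text) confirms that the unforced, undamped truncated dynamics conserve the energy (1/2)Σ_j u_j²:

```python
import numpy as np

# Sketch: the L-96 advection term (u_{j+1} - u_{j-2}) u_{j-1} conserves
# energy pointwise, u.B(u,u) = 0, which underlies tr Q_F = 0 in (3.16).
# With d = 0 and F = 0 the dynamics conserve E = (1/2) sum_j u_j^2.
J = 40

def l96_rhs(u):
    # periodic indices via np.roll: roll(u, 1)[j] = u[j-1], etc.
    return (np.roll(u, -1) - np.roll(u, 2)) * np.roll(u, 1)

rng = np.random.default_rng(3)
u = rng.standard_normal(J)
E0 = 0.5*np.sum(u**2)

dt = 1e-3
for _ in range(2_000):               # classical RK4 up to t = 2
    k1 = l96_rhs(u)
    k2 = l96_rhs(u + 0.5*dt*k1)
    k3 = l96_rhs(u + 0.5*dt*k2)
    k4 = l96_rhs(u + dt*k3)
    u += (dt/6)*(k1 + 2*k2 + 2*k3 + k4)

E1 = 0.5*np.sum(u**2)
print(E0, E1)   # equal to high accuracy
```

Adding constant damping d and forcing F to this trajectory identity and averaging over an ensemble recovers the statistical budget (3.16).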
Nontrivial Third Moments for the Statistical Steady State of the L-96 Model
Consider the original L-96 model with constant forcing F and damping d = 1, with
statistical steady states ū_eq, R_eq, where R_eq = (r_{eq,i} δ_{ij}) is diagonal. For the L-96
model with homogeneous statistics, the linear operator is also block diagonal, with
symmetric part L^s_{ū_eq} = (L^s_{ū_eq,i} δ_{ij}) and

$$L^s_{\bar{u}_{eq},i} = -\Gamma_i\,\bar{u}_{eq} - 1, \qquad i = 0, \cdots, J/2.$$

Now, according to Section 3.2.1, the third moments are non-zero and the statistical steady
state is non-Gaussian provided

$$L^s_{\bar{u}_{eq},i}\, r_{eq,i} \neq 0 \quad \text{for some } i \text{ with } 0 \leq i \leq \frac{J}{2},$$

in which case, from (3.11) of Section 3.2.1,

$$-2\, L^s_{\bar{u}_{eq},i}\, r_{eq,i} = Q_{F,eq,i}, \qquad 0 \leq i \leq \frac{J}{2}.$$

Simple numerical experiments show these third moments are not negligible for forc-
ing values F ≥ 5 in the L-96 model, and accounting for them in some approxi-
mate fashion is crucial for prediction and UQ [117, 156–159]. See the discussion in
Section 4.3.

3.2.3 One-Layer Geophysical Model as a Turbulent Dynamical System

Here the complex one-layer geophysical models with topography, forcing, and dissi-
pation are shown to be amenable to the structure and statistical analysis of Section 3.2,
provided one chooses the coordinates for the dynamics carefully.
In order to implement the one-layer geophysical models from Section 2.1
numerically, we need a finite-dimensional Galerkin truncation. One way to achieve
this is letting q = q_Λ + βy, where q_Λ has Fourier modes only in a finite index set
I ⊂ Z²\{(0,0)} with symmetry:

$$q_\Lambda = \sum_{k \in \mathcal{I}} q_k\, e_k(x), \qquad e_k(x) = \frac{e^{ik \cdot x}}{2\pi}.$$

We say I is symmetric if k ∈ I implies −k ∈ I. One practical choice of I can be
of the form

$$\mathcal{I} = \left\{ k \in \mathbb{Z}^2 \setminus \{(0,0)\} : |k| \leq N \right\} \quad \text{or} \quad \mathcal{I} = \left\{ k \in \mathbb{Z}^2 \setminus \{(0,0)\} : |k_1| \leq N,\ |k_2| \leq N \right\},$$

with N a large number. Let P_Λ be the projection of L²(T²) onto the finite
subspace spanned by {e_k(x), k ∈ I}. We can then project (2.1) onto the modes in I
using a truncation operator. The truncated model is then

$$\begin{aligned}
dq_\Lambda &= -P_\Lambda\!\left( \nabla^\perp \psi_\Lambda \cdot \nabla q_\Lambda \right) dt - \beta\,(\psi_\Lambda)_x\, dt + \mathcal{D}(\Delta)\, q_\Lambda\, dt + f_\Lambda(x)\, dt + dW_\Lambda(t), \\
q_\Lambda &= \Delta \psi_\Lambda - F^2 \psi_\Lambda + h_\Lambda(x).
\end{aligned} \qquad (3.17)$$

Here ψ_Λ = Σ_{k∈I} ψ_k e_k(x) is the truncation of ψ, and likewise we have spectral
formulations of the truncated relative vorticity ω_Λ, the external forcing f_Λ, and the
topography h_Λ. In particular, we model the Gaussian random field as

$$W_\Lambda(t) = \sum_{k \in \mathcal{I}} \sigma_k\, W_k(t).$$

The W_k(t) above are independent complex Wiener processes except for conjugate
pairs, where σ_k = σ_{−k}^*, W_k = W_{−k}^*. One simple way to achieve this is letting
B_k^r(t), B_k^i(t) be independent real Wiener processes and

$$W_k(t) = \frac{1}{\sqrt{2}}\left( B_k^r(t) + i B_k^i(t) \right), \qquad W_{-k}(t) = \frac{1}{\sqrt{2}}\left( B_k^r(t) - i B_k^i(t) \right),$$

for k ∈ I_+ = {k ∈ I : k_2 > 0} ∪ {k ∈ I : k_2 = 0, k_1 > 0}. The corresponding
incompressible flow field is u_Λ = ∇^⊥ ψ_Λ, while its underlying basis will be ẽ_k = (ik^⊥/|k|) e_k.

Spectral Formulation
Another way to obtain and study (3.17) is to project (2.1) onto each Fourier mode.
In fact, it suffices to derive equations for any one of q_k, ψ_k, u_k, or ω_k, since the others
can then be determined quite easily through the following linear relations:

$$\omega_k = \frac{|k|^2 (q_k - h_k)}{F^2 + |k|^2}, \qquad \psi_k = \frac{-q_k + h_k}{F^2 + |k|^2}, \qquad u_k = -\frac{|k|\,(q_k - h_k)}{F^2 + |k|^2}.$$

We choose to project (2.1) onto the Fourier modes of q_Λ. The resulting formula for
q_k is

$$dq_k(t) = \frac{-d_k + i\beta k_1}{F^2 + |k|^2}\,\left( q_k(t) - h_k \right) dt + \sum_{\substack{m+n=k \\ m,n \in \mathcal{I}}} \left( a_{m,n}\, q_m q_n - b_{m,n}\, h_n q_m \right) dt + f_k\, dt + \sigma_k\, dW_k(t), \qquad (3.18)$$

with the three-wave interaction coefficients a_{m,n}, b_{m,n} and the general damping d_k
given by

$$b_{m,n} = \frac{\langle n^\perp, m \rangle}{2\pi \left( |n|^2 + F^2 \right)}, \qquad a_{m,n} = -\frac{\langle n^\perp, m \rangle}{4\pi} \left( \frac{1}{|m|^2 + F^2} - \frac{1}{|n|^2 + F^2} \right), \qquad d_k = \sum_j \gamma_j\, |k|^{2j}.$$
It is easy to see that

$$\langle n^\perp, m \rangle = \langle (m+n)^\perp, m \rangle = -\langle m^\perp, n \rangle,$$

so a_{m,m} = 0 and a_{m,n} = a_{n,m} = a_{n,m+n} = −a_{−m,n}; moreover, the triad conservation
property a_{m,n} + a_{n,−m−n} + a_{−m−n,m} = 0 holds, since the sum is

$$-\frac{\langle n^\perp, m \rangle}{4\pi} \left( \frac{1}{|m|^2 + F^2} - \frac{1}{|n|^2 + F^2} + \frac{1}{|n|^2 + F^2} - \frac{1}{|n+m|^2 + F^2} + \frac{1}{|m+n|^2 + F^2} - \frac{1}{|m|^2 + F^2} \right) = 0.$$

Also note that the damping satisfies d_k ≥ d_0 = Σ_j γ_j > 0.
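The triad conservation property can also be confirmed numerically. The sketch below (an illustration, not from the text; the value F2 standing for F² is an arbitrary choice) evaluates a_{m,n} from the formula above and checks the cyclic sum over random integer wavevectors:

```python
import numpy as np

# Direct check of the triad conservation property below (3.18):
# a_{m,n} + a_{n,-m-n} + a_{-m-n,m} = 0 for the spectral interaction
# coefficients of the truncated one-layer model. F2 denotes F^2.
F2 = 4.0

def a(m, n):
    cross = n[0]*m[1] - n[1]*m[0]          # <n^perp, m> with n^perp = (-n2, n1)
    return -(cross/(4*np.pi)) * (1.0/(m @ m + F2) - 1.0/(n @ n + F2))

rng = np.random.default_rng(4)
for _ in range(100):
    m = rng.integers(-5, 6, size=2)
    n = rng.integers(-5, 6, size=2)
    s = a(m, n) + a(n, -(m + n)) + a(-(m + n), m)
    assert abs(s) < 1e-12
print("triad conservation property verified")
```

The cancellation is exact pair by pair in the six-term sum above, so the check passes to machine precision for every admissible triad.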
Now we can rewrite (3.18) in the following form as a turbulent dynamical system
in (1.5)–(1.7):

$$\frac{dq}{dt} = (L + D)\, q + B(q, q) + F + \Sigma\, \dot{W}(t). \qquad (3.19)$$

Here q, F, and W are |I|-dimensional complex valued vectors with components
q_k, F_k, W_k. The operators above are given by:

• L is a skew-symmetric matrix. Its diagonal entries are L_{kk} = iβk_1/(|k|² + F²), and its
off-diagonal entries are L_{km} = −b_{k−m,m} h_{k−m}. Note that L_{mk} = −b_{m−k,m} h_{m−k} = −L_{km}.
• D is a diagonal negative-definite matrix. Its diagonal entries are D_{kk} = −d_k/(|k|² + F²).
• B is a quadratic form. Its kth component is (B(p, q))_k = Σ_{m+n=k} a_{m,n} p_m q_n. It satisfies
the relation
$$\begin{aligned}
\langle B(q, q), q \rangle &= \sum_{k \in \mathcal{I}} q_k^* \sum_{m+n=k} a_{m,n}\, q_m q_n = \sum_{m,n} a_{m,n}\, q_m q_n\, q_{-m-n} \\
&= \frac{1}{3} \sum_{m,n} q_m q_n\, q_{-m-n} \left( a_{m,n} + a_{n,-m-n} + a_{-m-n,m} \right) = 0, \qquad (3.20)
\end{aligned}$$

due to the triad conservation property listed below (3.18).


• F is a constant vector; its components are

$$F_k = -\frac{-d_k + i\beta k_1}{|k|^2 + F^2}\, h_k + f_k.$$

• Σ is a diagonal matrix with entries Σ_{kk} = σ_k.
Here are the concrete equations for the ensemble mean and the covariance of the fluctu-
ations from (3.5) and (3.6). We denote the ensemble mean field by q̄ = Eq; the
potential vorticity field then has the Reynolds decomposition q = q̄ + Σ_{k∈I} Z_k(t) e_k.
Here e_k is the canonical unit vector with 1 in its kth component, corresponding
to e_k(x) in the Fourier decomposition. The exact equation for the mean is the following:

$$\frac{d\bar{q}}{dt} = (L + D)\,\bar{q} + B(\bar{q}, \bar{q}) + \sum_{m,n} R_{mn}\, B(e_m, e_n) + F.$$

Here R_{mn} is the covariance matrix, R_{mn} = E Z_m Z_n^*. This matrix satisfies the ODE
derived in (3.6),

$$\frac{dR}{dt} = L_v R + R L_v^* + Q_F + Q_\sigma.$$

The matrix L_v is given by

$$\{L_v\}_{mn} = \left\langle (L + D)\, e_m + B(\bar{q}, e_m) + B(e_m, \bar{q}),\ e_n \right\rangle.$$

The matrix Q_σ expresses energy transfer due to the external stochastic forcing, so it is a
diagonal matrix with entries {Q_σ}_{kk} = |σ_k|². The energy flux is represented by Q_F with

$$\{Q_F\}_{mn} = \left\langle Z_i Z_j Z_n \right\rangle \left\langle B(e_i, e_j), e_m \right\rangle + \left\langle Z_i Z_j Z_m \right\rangle \left\langle B(e_i, e_j), e_n \right\rangle,$$

with tr Q_F ≡ 0.

3.3 A Statistical Energy Conservation Principle for Turbulent Dynamical Systems

As mentioned earlier, a fundamental issue in prediction and UQ is that complex
turbulent dynamical systems are highly anisotropic with inhomogeneous forcing, and
the statistical mean can exchange energy with the fluctuations. Despite the fact that
the exact equations for the statistical mean (3.5) and the covariance of the fluctuations (3.6)
are not closed equations, there is suitable statistical symmetry so that the energy of the
mean plus the trace of the covariance matrix satisfies an energy conservation principle,
even with general deterministic and random forcing [100]. Here the exposition of this
brief paper is expanded, generalized, and applied directly to complex geophysical
flows with the framework of Sections 3.2.3 and 2.1. This conservation principle has
other important implications for prediction and UQ [117].
The system of interest is a quadratic system with conservative nonlinear dynamics,

$$\frac{du}{dt} = (L + D)\, u + B(u, u) + F(t) + \sigma_k(t)\, \dot{W}_k(t; \omega), \qquad (3.21)$$

acting on u ∈ R^N. The exact mean statistical field equation and the covariance
equation can be calculated from Section 3.2 as

$$\begin{aligned}
\frac{d\bar{u}}{dt} &= (L + D)\,\bar{u} + B(\bar{u}, \bar{u}) + R_{ij}\, B(e_i, e_j) + F, \\
\frac{dR}{dt} &= L_v R + R L_v^* + Q_F + Q_\sigma,
\end{aligned} \qquad (3.22)$$

where the covariance matrix is given by R_{ij} = ⟨Z_i Z_j^*⟩ and ⟨·⟩ denotes averaging
over the ensemble members. The components of the mean and covariance equations
are summarized as follows, which we repeat here for convenience:
(i) the linear dynamics operator expressing energy transfers between the mean field
and the stochastic modes (effect due to B), as well as energy dissipation (effect
due to D) and non-normal dynamics (effect due to L, D, ū),

$$\{L_v\}_{ij} = \left[ (L + D)\, e_j + B(\bar{u}, e_j) + B(e_j, \bar{u}) \right] \cdot e_i; \qquad (3.23)$$

(ii) the positive definite operator expressing energy transfer due to external stochas-
tic forcing,

$$\{Q_\sigma\}_{ij} = (e_i \cdot \sigma_k)\left( \sigma_k \cdot e_j \right); \qquad (3.24)$$

(iii) as well as the energy flux between different modes due to non-Gaussian statistics
(or nonlinear terms), given exactly through third-order moments,

$$\{Q_F\}_{ij} = \langle Z_m Z_n Z_j \rangle\, B(e_m, e_n) \cdot e_i + \langle Z_m Z_n Z_i \rangle\, B(e_m, e_n) \cdot e_j. \qquad (3.25)$$

With energy conservation, the nonlinear terms satisfy the statistical symmetry
requirement

$$\mathrm{tr}\, Q_F \equiv 0, \qquad (3.26)$$

since with u' = Z_i e_i, tr Q_F = 2⟨B(u', u') · u'⟩ = 0 by energy conservation.



3.3.1 Details About Deterministic Triad Energy Conservation Symmetry

Proposition 3.2 Consider the three-dimensional Galerkin projected dynamics
spanned by the triad (e_i, e_j, e_k), 1 ≤ i, j, k ≤ N, for the pure nonlinear model

$$(u_\Lambda)_t = P_\Lambda\, B(u_\Lambda, u_\Lambda). \qquad (3.27)$$

Assume the following:

(A) The self interactions vanish,

$$B(e_i, e_i) \equiv 0, \qquad 1 \leq i \leq N; \qquad (3.28)$$

(B) The dyad interaction coefficients vanish through the symmetry,

$$e_i \cdot \left[ B(e_l, e_i) + B(e_i, e_l) \right] = 0 \quad \text{for any } i, l. \qquad (3.29)$$

Then the three-dimensional Galerkin truncation becomes the triad interaction equa-
tions for u = (u_i, u_j, u_k) = (u_Λ · e_i, u_Λ · e_j, u_Λ · e_k):

$$\frac{du_i}{dt} = A_{ijk}\, u_j u_k, \qquad \frac{du_j}{dt} = A_{jki}\, u_k u_i, \qquad \frac{du_k}{dt} = A_{kij}\, u_i u_j, \qquad (3.30)$$

with coefficients satisfying

$$A_{ijk} + A_{jki} + A_{kij} = 0, \qquad (3.31)$$

which is the detailed triad energy conservation symmetry, since

$$A_{ijk} + A_{jki} + A_{kij} \equiv e_i \cdot \left[ B(e_j, e_k) + B(e_k, e_j) \right] + e_j \cdot \left[ B(e_k, e_i) + B(e_i, e_k) \right] + e_k \cdot \left[ B(e_i, e_j) + B(e_j, e_i) \right] = 0. \qquad (3.32)$$

Here we display the details about the results in Proposition 3.2 with the triad energy
conservation for the variables u = (u_i, u_j, u_k). For clarification, we give the explicit form
of the projection operator P_Λ. Define Λ as the index set of the resolved modes
under the orthonormal basis {e_i}_{i=1}^N of full dimensionality N in the truncation model.
The truncated expression for the state variable u then becomes

$$u_\Lambda = P_\Lambda u = \sum_{i \in \Lambda} u_i\, e_i, \qquad (3.33)$$

where u_i are the coefficients under the corresponding basis e_i. In particular, note that
under the Reynolds decomposition of the state variable, u(t) = ū(t) + Σ Z_i(t; ω) e_i,
used in this article, we have u_k = ū_M + Z_k if k = M is the base mode for the mean
state ū = ū_M e_M, and u_k = Z_k if k ≠ M. The projected energy conservation law
for the truncated energy E_Λ = ½ u_Λ · u_Λ is satisfied, depending on the proper conserved
quantity and the induced inner product:

$$\frac{dE_\Lambda}{dt} = u_\Lambda \cdot P_\Lambda\, B(u_\Lambda, u_\Lambda) = 0. \qquad (3.34)$$
The second equality holds since

$$u_\Lambda \cdot P_\Lambda\, B(u_\Lambda, u_\Lambda) = u_\Lambda \cdot \sum_{i \in \Lambda} \left[ e_i \cdot B(u_\Lambda, u_\Lambda) \right] e_i = \left( \sum_{j \in \Lambda} u_j\, e_j \right) \cdot \sum_{i=1}^{N} \left[ e_i \cdot B(u_\Lambda, u_\Lambda) \right] e_i = u_\Lambda \cdot B(u_\Lambda, u_\Lambda),$$

through the truncated expansion (3.33). We can include the other modes outside the
resolved set Λ in the second equality above due to the orthogonality of e_i and
e_j for i ≠ j.
Now consider the triad truncated system for the state variable u_Λ in a three-dimen-
sional subspace. We take the index set of resolved modes as Λ = {i, j, k}. The
associated truncated model (3.27) becomes three dimensional. For the right hand
side of the system, the explicit expressions can be calculated along each mode
e_m, m = i, j, k, by applying assumptions (A) and (B) in Proposition 3.2:

$$\begin{aligned}
\frac{du_m}{dt} &= e_m \cdot P_\Lambda\, B(u_\Lambda, u_\Lambda) = e_m \cdot B\!\left( \sum_{n \in \Lambda} u_n e_n,\ \sum_{l \in \Lambda} u_l e_l \right) \\
&= \sum_{n,l \in \Lambda} u_n u_l\, e_m \cdot B(e_n, e_l) = \sum_{n \neq l \in \Lambda - \{m\}} u_n u_l\, e_m \cdot B(e_n, e_l) \\
&= u_n u_l\, e_m \cdot \left[ B(e_n, e_l) + B(e_l, e_n) \right], \qquad n \neq l \neq m.
\end{aligned}$$

The third equality above applies the two assumptions (3.28) and (3.29); thus the terms
including the mode e_m in the nonlinear interaction B all cancel. The interaction
coefficients can therefore be defined as

$$A_{ijk} = e_i \cdot \left[ B(e_j, e_k) + B(e_k, e_j) \right], \qquad (3.35)$$

with the symmetry

$$A_{ijk} = A_{ikj}, \qquad (3.36)$$

and the vanishing property

$$A_{ijk} = 0 \quad \text{if two of the indices } i, j, k \text{ coincide}. \qquad (3.37)$$

With this explicit definition of the coefficients A_{ijk}, the detailed triad energy conser-
vation symmetry in (3.32) is just a direct application of the above formulas; that is, for
any u_i, u_j, u_k with i ≠ j ≠ k,

$$\frac{dE_\Lambda}{dt} = u_i \frac{du_i}{dt} + u_j \frac{du_j}{dt} + u_k \frac{du_k}{dt} = \left( A_{ijk} + A_{jki} + A_{kij} \right) u_i u_j u_k \equiv 0.$$

Remark The triad interaction conditions in (A) and (B) are satisfied for 2-D flows
in periodic geometry and on the sphere [87, 130]. In Section 3.2.3 we have already
verified this property directly for geophysical flow with rotation and topography in
periodic geometry.
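The detailed symmetry (3.31) can be checked concretely. The sketch below (an illustration, not from the text) uses the L-96 quadratic form of Section 3.2.2, which satisfies assumptions (A) and (B) with the canonical basis, and verifies the cyclic sum for all triples in a small truncation:

```python
import numpy as np

# Numerical verification of the detailed triad energy conservation symmetry
# (3.31), A_ijk + A_jki + A_kij = 0, using the L-96 quadratic form
# B(u, v)_j = u_{j-1}(v_{j+1} - v_{j-2}) of Section 3.2.2, which has
# vanishing self interactions (3.28) and dyad interactions (3.29).
J = 8                                  # small dimension: check all triples

def Bop(u, v):
    return np.roll(u, 1) * (np.roll(v, -1) - np.roll(v, 2))

e = np.eye(J)

def A(i, j, k):                        # A_ijk as in (3.35)
    return e[i] @ (Bop(e[j], e[k]) + Bop(e[k], e[j]))

for i in range(J):
    for j in range(J):
        for k in range(J):
            assert abs(A(i, j, k) + A(j, k, i) + A(k, i, j)) < 1e-12
print("detailed triad symmetry (3.31) verified")
```

The check is exhaustive over all triples, including repeated indices, where the vanishing property (3.37) makes each term zero individually.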

3.3.1.1 Dynamics for the Mean and Fluctuation Energy

Consider the statistical mean energy Ē = ½|ū|² = ½ ū · ū. Proposition 3.3 calculates
the dynamics of the mean energy.

Proposition 3.3 The change of the mean energy Ē = ½(ū · ū) satisfies

$$\frac{d}{dt}\left( \frac{1}{2} |\bar{u}|^2 \right) = \bar{u} \cdot D\bar{u} + \bar{u} \cdot F + \frac{1}{2}\, R_{ij}\, \bar{u} \cdot \left[ B(e_i, e_j) + B(e_j, e_i) \right]. \qquad (3.38)$$

The last term represents the effect of the fluctuations on the mean, ū.

Next consider the fluctuating energy E' = ½ tr(R_{ij}). Proposition 3.4 describes the
dynamics of the total fluctuation part.

Proposition 3.4 Under the structural assumptions (3.28) and (3.29) on the basis
e_i, the fluctuating energy E' = ½ tr R for any turbulent dynamical system satisfies

$$\frac{dE'}{dt} = \frac{1}{2}\,\mathrm{tr}\left( \tilde{D}R + R\tilde{D}^* \right) + \frac{1}{2}\,\mathrm{tr}\, Q_\sigma - \frac{1}{2}\, R_{ij}\, \bar{u} \cdot \left[ B(e_i, e_j) + B(e_j, e_i) \right], \qquad (3.39)$$

where R satisfies the exact covariance equation in (3.22).
The mean energy equation in (3.38) is a direct result of the statistical mean dynam-
ics (3.22), obtained by taking the inner product with ū on both sides of the equations.
For further simplification of the formula, we have:
Simplification in the Mean Energy
Comparing the mean energy equation (3.38) with the dynamics for the mean in (3.22),
two terms vanish. First, the interaction between the means, ū · B(ū, ū) = 0, vanishes
naturally due to the energy conservation property; and second, the quadratic form
ū · Lū = 0 due to L being skew-symmetric. In fact,

$$(\bar{u} \cdot L\bar{u})^* = \bar{u} \cdot L^* \bar{u} = -\bar{u} \cdot L\bar{u},$$

and notice that ū · Lū is real; therefore the skew-symmetric quadratic form also
vanishes.
The fluctuation energy equation in (3.39) is reached first by taking the trace on both
sides of the covariance dynamical equation in (3.22). Then we need to carry out the
following simplifications.
The Linear Interaction Part in the Fluctuating Energy
In the linear interaction part in (3.23), we need to use the representation of linear
operators L, D under the basis ei . Directly from the definition of Lv , the explicit form
for these two transformed operators can be found

L̃ij = ei · Lej , D̃ij = ei · Dej .

Since this transform above can be viewed as a change of basis, the skew-symmetric
property of L̃ and negative definite symmetric property of D̃ are both maintained.
Furthermore the linear interaction part in (3.22) can be further simplified as
⎛   ⎞
L̃ + D̃ R + R L̃ ∗ + D̃∗       
tr ⎝ ⎠ = 1 tr D̃R + RD̃ = tr D̃R = tr DR̂ = ei · Dej Rji .
2 2
i,j

The skew-symmetric part vanishes since


     
tr L̃R + RL̃ ∗ = tr L̃R − tr RL̃ = 0.
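Both vanishing mechanisms used in the simplifications above (the skew-symmetric quadratic form ū · Lū = 0 and the trace identity tr(L̃R + RL̃*) = 0) are easy to confirm numerically. The following minimal pure-Python sketch is our illustration, not from the text; all names and matrix sizes are arbitrary choices. It builds a random skew-symmetric matrix and a random covariance-like matrix and checks both identities to round-off.

```python
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

random.seed(0)
N = 5
G = [[random.gauss(0, 1) for _ in range(N)] for _ in range(N)]

# Skew-symmetric L = (G - G^T)/2, so L^T = -L, and symmetric R = G G^T >= 0.
L = [[0.5 * (G[i][j] - G[j][i]) for j in range(N)] for i in range(N)]
R = matmul(G, transpose(G))

u = [random.gauss(0, 1) for _ in range(N)]
# The quadratic form u . L u vanishes for skew-symmetric L.
quad = sum(u[i] * L[i][j] * u[j] for i in range(N) for j in range(N))
# tr(L R + R L^T) = tr(L R) - tr(R L) = 0 for the skew-symmetric part.
skew_trace = trace(matmul(L, R)) + trace(matmul(R, transpose(L)))

assert abs(quad) < 1e-12 and abs(skew_trace) < 1e-10
```

Only the skew-symmetry Lᵀ = −L and the symmetry of R enter the check, so it is independent of the particular turbulent system.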

The Nonlinear Interaction Part Between ū and e_j in the Fluctuating Energy

In the nonlinear interaction part between the mean and the basis in (3.22), we only need to show the following equality in one component,

\[
\sum_{i,j} e_i\cdot B(\bar u, e_j)\, R_{ji} = \operatorname{tr} B\bigl(\bar u, \hat R\bigr); \tag{3.40}
\]

then the remaining parts can be obtained by symmetry and taking the transpose. Here the matrix B(ū, R̂) is defined by the componentwise interaction with each column of R̂, that is,

\[
B\bigl(\bar u, \hat R\bigr) \equiv \Bigl[B\bigl(\bar u, \hat R^{(1)}\bigr),\ B\bigl(\bar u, \hat R^{(2)}\bigr),\ \cdots,\ B\bigl(\bar u, \hat R^{(N)}\bigr)\Bigr], \tag{3.41}
\]

where R̂^(l) denotes the l-th column of R̂. Under the above definition (3.41), first note that componentwise we have

\[
\bigl[B\bigl(e_M,\, e_i\otimes e_j\bigr)\bigr]_{kl} = B_k\bigl(e_M,\, e_i\, e_j^{(l)*}\bigr) = B_k(e_M, e_i)\, e_j^{(l)*} = \bigl[B(e_M, e_i)\otimes e_j\bigr]_{kl},
\]

where the second equality uses the bilinearity of the form B. Therefore, to show (3.40), again applying the bilinearity of the quadratic form B, we have

\[
\begin{aligned}
\sum_{i,j} e_i\cdot B(\bar u, e_j)\, R_{ji}
&= \sum_i e_i\cdot B\Bigl(\bar u,\ \sum_j e_j R_{ji}\Bigr)
 = \sum_{l,i} e_i^{(l)*}\, B_l\Bigl(\bar u,\ \sum_j e_j R_{ji}\Bigr) \\
&= \sum_l B_l\Bigl(\bar u,\ \sum_{i,j} R_{ij}\, e_i\, e_j^{(l)*}\Bigr)
 = \sum_l B_l\Bigl(\bar u,\ \sum_{i,j} R_{ij}\,\bigl[e_i\otimes e_j\bigr]^{(l)}\Bigr) \\
&= \sum_l B_l\bigl(\bar u,\ \hat R^{(l)}\bigr)
 = \operatorname{tr} B\bigl(\bar u, \hat R\bigr).
\end{aligned}
\]

Above we adopt the discrete version of the basis e_i for simplicity in the expressions; for clarity, subscripts are used to denote row components and superscripts to denote column components. The other parts of the nonlinear interaction terms in (3.23) can be handled in a similar fashion.
Finally, adding the results in Propositions 3.3 and 3.4, we have the main result in [100].

Theorem 3.1 (Statistical Energy Conservation Principle) Under the structural assumptions (3.28) and (3.29) on the basis e_i, for any turbulent dynamical system in (3.21), the total statistical energy, E = Ē + E′ = ½ ū · ū + ½ tr R, satisfies

\[
\frac{dE}{dt} = \bar u\cdot D\bar u + \bar u\cdot F + \operatorname{tr}\bigl[\tilde D R\bigr] + \frac{1}{2}\operatorname{tr}Q_\sigma, \tag{3.42}
\]

where R satisfies the exact covariance equation in (3.22).

3.3.1.2 Illustrative General Examples and Applications

We have the following interesting immediate corollary of Theorem 3.1.

Corollary 3.1 Under the assumptions of Theorem 3.1, assume D = −dI with d > 0. Then the turbulent dynamical system satisfies the closed statistical energy equation for E = ½ ū · ū + ½ tr R,

\[
\frac{dE}{dt} = -2dE + \bar u\cdot F + \frac{1}{2}\operatorname{tr}Q_\sigma. \tag{3.43}
\]

In particular, if the external forcing vanishes, so that F ≡ 0 and Q_σ ≡ 0, then for random initial conditions the statistical energy decays exponentially in time, E(t) = exp(−2dt) E₀.
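Corollary 3.1 can be illustrated on a concrete low-dimensional example. The sketch below is our construction, not a model from the text: it takes a standard energy-conserving triad (interaction coefficients summing to zero, so u · B(u, u) = 0), uniform damping D = −dI, no forcing, and verifies on a Monte Carlo ensemble that the statistical energy E = ⟨|u|²⟩/2 decays like exp(−2dt).

```python
import math, random

# An energy-conserving triad: coefficients sum to zero, so u . B(u,u) = 0
# (our illustrative choice, not a model from the text).
A = (1.0, -0.7, -0.3)
d = 0.4                          # uniform damping D = -d I; F = 0, Q_sigma = 0

def rhs(u):
    u1, u2, u3 = u
    return (-d * u1 + A[0] * u2 * u3,
            -d * u2 + A[1] * u3 * u1,
            -d * u3 + A[2] * u1 * u2)

def rk4_step(u, dt):
    def add(u0, k, h):
        return tuple(x + h * y for x, y in zip(u0, k))
    k1 = rhs(u)
    k2 = rhs(add(u, k1, dt / 2))
    k3 = rhs(add(u, k2, dt / 2))
    k4 = rhs(add(u, k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + e)
                 for x, a, b, c, e in zip(u, k1, k2, k3, k4))

random.seed(1)
ensemble = [tuple(random.gauss(0, 1) for _ in range(3)) for _ in range(200)]
stat_energy = lambda ens: sum(0.5 * sum(x * x for x in u) for u in ens) / len(ens)

E0 = stat_energy(ensemble)
dt, steps = 0.002, 500           # integrate to T = 1
for _ in range(steps):
    ensemble = [rk4_step(u, dt) for u in ensemble]

# Corollary 3.1 with vanishing forcing predicts E(T) = exp(-2 d T) E0.
assert abs(stat_energy(ensemble) - math.exp(-2 * d * dt * steps) * E0) < 1e-6 * E0
```

Since every trajectory here dissipates |u|²/2 at the exact rate 2d, the ensemble estimate matches exp(−2dT) up to time-stepping error only, with no Monte Carlo sampling error in the ratio.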

Assume the symmetric dissipation matrix, D, satisfies the upper and lower bounds

\[
-d_+|u|^2 \ge u\cdot Du \ge -d_-|u|^2, \tag{3.44}
\]

with d₋, d₊ > 0. Typical general dissipation matrices D̃ are diagonal in a basis of Fourier modes or spherical harmonics [130]. Now for any such diagonal matrix D̃ and any positive semi-definite symmetric matrix R ≥ 0 we have the a priori bounds

\[
-d_+\operatorname{tr}R \ge \operatorname{tr}\!\left[\frac{\tilde D R + R\tilde D^*}{2}\right] \ge -d_-\operatorname{tr}R. \tag{3.45}
\]

Thus, with Theorem 3.1 and Corollary 3.1, we immediately have:

Corollary 3.2 Assume D̃ is diagonal and satisfies the upper and lower bounds in (3.44). Then the statistical energy E(t) in (3.42) of Theorem 3.1 satisfies the upper and lower bounds E₊(t) ≥ E(t) ≥ E₋(t), where E±(t) satisfy the differential equality in Corollary 3.1 with d ≡ d±. In particular, the statistical energy is a statistical Lyapunov function for the turbulent dynamical system in (3.21). Also, if the external forcings F, Q_σ vanish, the statistical energy decays exponentially within these upper and lower bounds.

In standard fashion, if some bound is known on the statistical mean energy in (3.43), then this also provides control of the total variance and in particular tr R ≡ Σ_k ⟨|Z_k|²⟩.
Consider the Gaussian approximation to the one-point statistics; recall that u = ū + Σ_i Z_i e_i, so at the location x the mean and variance are given by

\[
\bar u(x) = \sum_M \bar u_M\, e_M(x), \qquad \operatorname{var}\bigl(u(x)\bigr) = \sum_{j,k}\bigl\langle Z_j Z_k^*\bigr\rangle\, e_j(x)\otimes e_k(x).
\]

We have control over the variance averaged over the domain, denoted by E_x, because E_x[e_j(x) ⊗ e_k(x)] = δ_{jk} I; thus, the average of the single-point variance is bounded by tr R, which is controlled by E. See [117] for applications to UQ.

3.3.2 A Generalized Statistical Energy Identity

In this subsection, we take a more detailed look at the assumptions given in (3.28) and (3.29) for the derivation of the statistical energy principle. These conditions turn out not to be necessary for the conclusion of the main Theorem 3.1. It is also useful to examine in further detail the energy principle for (spatially) inhomogeneous systems, typical of many realistic applications, where the conservation does not occur with the Euclidean inner product.

3.3.2.1 Generalized Condition for Dyad and Triad Energy Conservation Symmetry

Here we investigate further the triad energy conservation symmetry A_{ijk} + A_{jki} + A_{kij} = 0 from Proposition 3.1 in the previous section. The triad coefficient A_{ijk} is defined through a proper energy-conserving inner product and is crucial for the statistical energy conservation law in the central theorem. To guarantee this triad symmetry, we assumed (3.28) and (3.29). Simple examples show that these two assumptions are satisfied in many applications. Still, there exist examples with the statistical energy conservation property that violate the assumptions in (3.28) and (3.29). This suggests that the previous two assumptions can be generalized under weaker constraints. Here we check the possibility of generalizing the statistical energy identity by looking further at the dyad interactions, and two typical examples illustrate the results.
Generalized Triad Symmetry Assumption
We define the triad interaction coefficient from a properly defined inner product ⟨·, ·⟩ as

\[
A_{ijk} = e_i\cdot B(e_j, e_k) + e_i\cdot B(e_k, e_j), \tag{3.46}
\]

with the symmetry A_{ijk} = A_{ikj}. The first obvious observation about this coefficient is that it vanishes for the interaction of a mode with itself, A_{iii} ≡ 0, due to the conservation property of the quadratic form, u · B(u, u) = 0. Now consider the two-dimensional Galerkin projection model spanned by the two modes e_i and e_j. We then have the dyad interaction equations for the state variables (u_i, u_j) = (e_i · u_Λ, e_j · u_Λ),

\[
\frac{du_i}{dt} = A_{iij}\, u_i u_j + A_{ijj}\, u_j^2, \qquad
\frac{du_j}{dt} = A_{jji}\, u_j u_i + A_{jii}\, u_i^2. \tag{3.47}
\]

Note that the energy conservation property d(u_i² + u_j²)/dt ≡ 0 still holds for this dyad model. Therefore we get the dyad energy conservation symmetry

\[
\bigl(A_{iij} + A_{jii}\bigr)\, u_i^2 u_j + \bigl(A_{jji} + A_{ijj}\bigr)\, u_i u_j^2 = 0,
\]

for arbitrary values of u_i, u_j. Thus, we have the following proposition for the dyad system:

Proposition 3.5 Consider the dyad interaction system (3.47). If the dyad symmetry assumption

\[
A_{iij} + A_{jii} = 0
\]

is satisfied for some wavenumbers 1 ≤ i, j ≤ N, then we also have the dyad symmetry

\[
A_{jji} + A_{ijj} = 0.
\]

Proposition 3.5 offers a more general version of the assumption for the statistical energy conservation equation. Indeed, the earlier conditions (3.28) and (3.29) require the stronger assumptions that the single-mode and dyad interactions vanish,

\[
A_{iij} \equiv 0, \qquad A_{jii} \equiv 0, \qquad \forall\, i, j \le N.
\]

The new assumption loosens this requirement to a weaker one: the two coefficients only need to cancel each other rather than vanish individually.
To check the validity of the assumption, we return to the three-mode system (3.30) between the Galerkin projection modes u_i, u_j, u_k. In general form, the component u_k of the three-mode interaction equations becomes

\[
\frac{du_k}{dt} = A_{kij}\, u_i u_j + A_{kik}\, u_i u_k + A_{kjk}\, u_j u_k + A_{kii}\, u_i^2 + A_{kjj}\, u_j^2.
\]

Correspondingly, we also have the dynamics for u_i and u_j,

\[
\frac{du_i}{dt} = A_{ijk}\, u_j u_k + A_{iji}\, u_j u_i + A_{iki}\, u_k u_i + A_{ijj}\, u_j^2 + A_{ikk}\, u_k^2,
\]
\[
\frac{du_j}{dt} = A_{jki}\, u_k u_i + A_{jij}\, u_i u_j + A_{jkj}\, u_k u_j + A_{jkk}\, u_k^2 + A_{jii}\, u_i^2.
\]

Again applying the energy conservation principle, the energy in the triad system satisfies

\[
\begin{aligned}
0 = \frac{1}{2}\frac{d}{dt}\bigl(u_k^2 + u_i^2 + u_j^2\bigr)
&= (A_{kik} + A_{ikk})\, u_i u_k^2 + (A_{kjk} + A_{jkk})\, u_j u_k^2 \\
&\quad + (A_{kii} + A_{iki})\, u_i^2 u_k + (A_{kjj} + A_{jkj})\, u_j^2 u_k \\
&\quad + (A_{jii} + A_{iji})\, u_i^2 u_j + (A_{ijj} + A_{jij})\, u_i u_j^2 \\
&\quad + (A_{kij} + A_{ijk} + A_{jki})\, u_i u_j u_k.
\end{aligned}
\]

Therefore we get the following proposition, developed by Di Qi in discussions with the author.

Proposition 3.6 (Qi and Majda) (Generalized Energy Conservation Principle) Assume that for any index pair (i, j) with 1 ≤ i, j ≤ N we have the dyad interaction balance

\[
A_{iij} + A_{jii} = e_i\cdot B(e_i, e_j) + e_i\cdot B(e_j, e_i) + e_j\cdot B(e_i, e_i) = 0, \tag{3.48}
\]

or equivalently

\[
A_{jji} + A_{ijj} = e_j\cdot B(e_j, e_i) + e_j\cdot B(e_i, e_j) + e_i\cdot B(e_j, e_j) = 0. \tag{3.49}
\]

Then we have the same detailed triad energy conservation symmetry as in (3.26),

\[
A_{kij} + A_{ijk} + A_{jki} = 0.
\]

Therefore the statistical energy conservation principle (3.42) still holds under the generalized assumption (3.48) or (3.49).

Examples with a dyad model and the TBH model
To check the generalized energy principle, we begin with the simplest dyad interac-
tion equation

\[
\frac{\partial u_1}{\partial t} = \gamma_1 u_1 u_2 + \gamma_2 u_2^2, \qquad
\frac{\partial u_2}{\partial t} = -\gamma_1 u_1^2 - \gamma_2 u_1 u_2,
\]

mentioned in [100]. Take the two-dimensional natural basis e₁ = (1, 0)ᵀ and e₂ = (0, 1)ᵀ, with the inner product defined as the standard Euclidean inner product. Then the conservative quadratic interaction becomes

\[
B(e_1, e_1) = (0, -\gamma_1)^T, \qquad
B(e_1, e_2) = B(e_2, e_1) = \Bigl(\tfrac{1}{2}\gamma_1,\ -\tfrac{1}{2}\gamma_2\Bigr)^T, \qquad
B(e_2, e_2) = (\gamma_2, 0)^T.
\]

First, the energy conservation u · B(u, u) = 0 is satisfied in this model, but the quantities in the assumptions (3.28) and (3.29) are obviously non-zero. On the other hand, if we check the assumption in (3.48) or (3.49), the dyad interaction balance is satisfied:

\[
e_1\cdot B(e_1, e_2) + e_1\cdot B(e_2, e_1) + e_2\cdot B(e_1, e_1) = \gamma_1 - \gamma_1 = 0.
\]

Therefore the statistical energy conservation law is still valid for this dyad system:

\[
\frac{d}{dt}\left[\frac{1}{2}\bigl(\bar u_1^2 + \bar u_2^2\bigr) + \frac{1}{2}\bigl\langle u_1'^2 + u_2'^2\bigr\rangle\right] = 0.
\]

This can be checked easily by direct calculation from the original dyad system.
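That direct calculation can be sketched numerically as follows (γ₁, γ₂ are arbitrary illustrative values): the script checks the dyad balance (3.48) for the quadratic form computed above, and that u₁² + u₂² is conserved along trajectories of the dyad model.

```python
# The dyad model above, with arbitrary illustrative coefficients.
g1, g2 = 0.8, -0.5

def rhs(u):
    u1, u2 = u
    return (g1 * u1 * u2 + g2 * u2 * u2,
            -g1 * u1 * u1 - g2 * u1 * u2)

def rk4_step(u, dt):
    def add(u0, k, h):
        return tuple(x + h * y for x, y in zip(u0, k))
    k1 = rhs(u)
    k2 = rhs(add(u, k1, dt / 2))
    k3 = rhs(add(u, k2, dt / 2))
    k4 = rhs(add(u, k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + e)
                 for x, a, b, c, e in zip(u, k1, k2, k3, k4))

# Dyad balance (3.48): e1.[B(e1,e2) + B(e2,e1)] + e2.B(e1,e1) = g1 - g1 = 0.
B11 = (0.0, -g1)                # B(e1, e1)
B12_plus_B21 = (g1, -g2)        # B(e1, e2) + B(e2, e1)
balance = B12_plus_B21[0] + B11[1]
assert balance == 0.0

# The energy u1^2 + u2^2 is conserved along trajectories of the dyad model.
u = (0.7, -1.1)
E0 = u[0] ** 2 + u[1] ** 2
for _ in range(2000):
    u = rk4_step(u, 0.001)
assert abs(u[0] ** 2 + u[1] ** 2 - E0) < 1e-9
```

Because the energy is exactly conserved by the continuous dynamics, the only drift observed comes from the time integrator.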
As another example, we check the truncated Burgers–Hopf (TBH) equation [130],

\[
(u_\Lambda)_t + \frac{1}{2} P_\Lambda\bigl(u_\Lambda^2\bigr)_x = 0,
\]

with u_Λ(x) = Σ_{|k|≤Λ} û_k e_k(x). The TBH equation is another very nice example with both dyad and triad interactions. Periodic boundary conditions are imposed, so the natural choice of basis is the standard Fourier basis e_k(x) = e^{ikx}. The quadratic interaction operator can be defined as

\[
B(u, v) = -\frac{1}{2}(uv)_x,
\]

with the inner product defined as the standard inner product in Hilbert space,

\[
\langle u, v\rangle = \overline{uv} \equiv \frac{1}{2\pi}\int_{-\pi}^{\pi} uv\,dx.
\]

First, the energy conservation law is satisfied due to the periodic boundary conditions:

\[
\langle u, B(u, u)\rangle = -\frac{1}{2}\,\overline{u\,(u^2)_x} = -\frac{1}{6}\,\overline{(u^3)_x} \equiv 0.
\]

The assumptions in (3.28) and (3.29) are still not satisfied, since

\[
B(e_m, e_m) = -im\, e^{2imx},
\]

and

\[
B(e_m, e_n) = B(e_n, e_m) = -\frac{i}{2}(m + n)\, e^{i(m+n)x}.
\]

However, we still have the combined dyad interaction balance (3.48) and (3.49) of Proposition 3.6:

\[
\langle e_m,\, B(e_m, e_n) + B(e_n, e_m)\rangle + \langle e_n,\, B(e_m, e_m)\rangle
= \overline{-i(m+n)\, e^{i(2m+n)x} - im\, e^{i(2m+n)x}}
= -i(2m+n)\,\overline{e^{i(2m+n)x}} \equiv 0,
\]

since the average vanishes unless 2m + n = 0, in which case the coefficient −i(2m + n) vanishes. Therefore the statistical energy conservation is still satisfied for the TBH equation, that is,

\[
\frac{d}{dt}\left[\frac{1}{2}\,\overline{\bar u_\Lambda^2} + \frac{1}{2}\sum_{|k|\le\Lambda}\bigl\langle |u_k'|^2\bigr\rangle\right] = 0.
\]
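The energy conservation of the Galerkin-truncated TBH system can also be confirmed numerically. The sketch below is our minimal implementation, with a small truncation Λ = 4 and arbitrary random initial data: it evolves dû_k/dt = −(ik/2) Σ_{m+n=k, |m|,|n|≤Λ} û_m û_n with the reality condition û_{−k} = û_k*, and checks that ½ Σ |û_k|² is conserved up to time-stepping error.

```python
import random

Lam = 4   # truncation wavenumber Lambda (kept small for a quick check)

def rhs(u):
    # du_k/dt = -(i k / 2) * sum_{m+n=k, |m|,|n| <= Lam} u_m u_n  (Galerkin truncation)
    du = {}
    for k in range(-Lam, Lam + 1):
        s = 0j
        for m in range(-Lam, Lam + 1):
            n = k - m
            if -Lam <= n <= Lam:
                s += u[m] * u[n]
        du[k] = -0.5j * k * s
    return du

def rk4_step(u, dt):
    def add(u0, k, h):
        return {q: u0[q] + h * k[q] for q in u0}
    k1 = rhs(u)
    k2 = rhs(add(u, k1, dt / 2))
    k3 = rhs(add(u, k2, dt / 2))
    k4 = rhs(add(u, k3, dt))
    return {q: u[q] + dt / 6 * (k1[q] + 2 * k2[q] + 2 * k3[q] + k4[q]) for q in u}

random.seed(2)
u = {0: 0j}
for k in range(1, Lam + 1):
    u[k] = complex(random.gauss(0, 0.3), random.gauss(0, 0.3))
    u[-k] = u[k].conjugate()        # reality condition u_{-k} = conj(u_k)

energy = lambda v: 0.5 * sum(abs(v[k]) ** 2 for k in v)
E0 = energy(u)
for _ in range(1000):
    u = rk4_step(u, 0.002)
assert abs(energy(u) - E0) < 1e-6 * E0
```

Even though the truncated dynamics are chaotic, the spectral energy is an exact invariant of the Galerkin system, which bounds every mode amplitude for all time.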

3.3.2.2 Generalized Statistical Energy Principle for Inhomogeneous Dynamics

In many applications the kinetic energy is not simply the Euclidean inner product of the state variables; here we show how to develop an energy principle for this situation. We consider the general case of the statistical energy principle developed in [100] with inhomogeneous dynamics. For simplicity of exposition, in this subsection we write everything in discrete matrix form. The set of proper orthonormal basis vectors forms the transform matrix E = [e₁, e₂, ⋯, e_N], and the general inner product is defined through a symmetric positive definite metric matrix M = Mᵀ > 0, so that

\[
\langle u, v\rangle = u^T M v.
\]

The orthonormality of the basis is then defined under the weighted inner product,

\[
\langle e_i, e_j\rangle = e_i^T M e_j = \delta_{ij} \;\Longleftrightarrow\; E^T M E = I, \qquad M^{1/2} E E^T M^{1/2} = I.
\]


For example, let C = R̂ = ⟨u′ ⊗ u′⟩ be the covariance matrix in the physical domain, and R_{kl} = ⟨û_k û_l^*⟩ the covariance matrix in the spectral domain under the spectral basis {e_k}. Then the transform relation between the covariance matrices C and R under the different bases is defined through the metric matrix M and the basis E:

\[
C M e_j = \sum_i R_{ij}\, e_i \;\Longleftrightarrow\; CME = ER \;\Longleftrightarrow\; C = E R E^T.
\]

Such a formula is true in general. Combining the above relations, we get the corresponding part of the statistical energy equation,

\[
\operatorname{tr}\tilde D R = \sum_{i,j}\langle e_i, D e_j\rangle R_{ji}
= \operatorname{tr}\bigl(E^T M D E R\bigr)
= \operatorname{tr}\bigl(E^T M D C M E\bigr)
= \operatorname{tr}\bigl(M E E^T M D C\bigr)
= \operatorname{tr}(MDC). \tag{3.50}
\]
We also have

\[
\langle\bar u, D\bar u\rangle = \operatorname{tr}\bigl(\bar u^T D M\bar u\bigr) = \operatorname{tr}\bigl(D M(\bar u\otimes\bar u)\bigr). \tag{3.51}
\]

To express the statistical energy equation in a uniform representation, we can also write the statistical energy E = ½⟨ū, ū⟩ + ½ tr R in matrix form as

\[
E = \frac{1}{2}\operatorname{tr}\bigl(M\,\bar u\otimes\bar u + CM\bigr). \tag{3.52}
\]

Substituting (3.50), (3.51), and (3.52) into the original statistical energy equation (3.42), we get the energy dynamics for inhomogeneous dynamical systems,

\[
\frac{d}{dt}\,\frac{1}{2}\operatorname{tr}\bigl(M\,\bar u\otimes\bar u + CM\bigr)
= \operatorname{tr}\bigl(D\,(M\,\bar u\otimes\bar u + CM)\bigr) + \bar u^T M F + \frac{1}{2}\operatorname{tr}Q_\sigma. \tag{3.53}
\]
In fact, the inclusion of the metric matrix M is awkward in the formulations above. Therefore, we introduce a transformed basis that includes the metric,

\[
\tilde E = M^{1/2} E = M^{1/2}[e_1, e_2, \cdots, e_N] = \bigl[\tilde e_1, \tilde e_2, \cdots, \tilde e_N\bigr].
\]

With this transformed basis and the positive-definite metric matrix M = Mᵀ, the inner product returns to the standard form in Euclidean space with the original orthonormality properties,

\[
\langle\tilde e_i, \tilde e_j\rangle = \tilde e_i^T\tilde e_j = \delta_{ij} \;\Longleftrightarrow\; \tilde E^T\tilde E = I, \qquad \tilde E\tilde E^T = I.
\]

Accordingly, we define the statistical mean and covariance matrix together with the external forcing F under this basis as

\[
\tilde{\bar u} = M^{1/2}\bar u, \qquad
\tilde C = \bigl\langle M^{1/2} u' \otimes M^{1/2} u'\bigr\rangle = M^{1/2} C M^{1/2}, \qquad
\tilde F = M^{1/2} F.
\]

Finally, the transformed statistical energy also returns to the original form,

\[
\tilde E = E = \frac{1}{2}\operatorname{tr}\bigl(\tilde{\bar u}\otimes\tilde{\bar u} + \tilde C\bigr). \tag{3.54}
\]

Under all these notations, we reach the following proposition for the statistical energy dynamics of inhomogeneous dynamical systems:

Proposition 3.7 (Statistical energy inequality for inhomogeneous systems) For the inhomogeneous case of the dynamical system (3.21) with negative definite matrix D = Dᵀ as the general inhomogeneous damping and positive definite metric matrix M = Mᵀ, the statistical energy equation in (3.42) can be rewritten as

\[
\frac{d}{dt}\,\frac{1}{2}\operatorname{tr}\bigl(\tilde{\bar u}\otimes\tilde{\bar u} + \tilde C\bigr)
= \operatorname{tr}\bigl(D\,(\tilde{\bar u}\otimes\tilde{\bar u} + \tilde C)\bigr) + \tilde{\bar u}^T\tilde F + \frac{1}{2}\operatorname{tr}Q_\sigma. \tag{3.55}
\]

In particular, we can estimate bounds on the statistical energy from

\[
-\lambda_{\min}(D)\,2\tilde E + \tilde{\bar u}^T\tilde F + \frac{1}{2}\operatorname{tr}Q_\sigma
\;\le\; \frac{d\tilde E}{dt} \;\le\;
-\lambda_{\max}(D)\,2\tilde E + \tilde{\bar u}^T\tilde F + \frac{1}{2}\operatorname{tr}Q_\sigma, \tag{3.56}
\]

with −λ_max and −λ_min the maximum and minimum eigenvalues of D, respectively.

In the estimate for the energy bounds above, we apply the inequality

\[
\lambda_{\min}(A)\operatorname{tr}B \le \operatorname{tr}(AB) \le \lambda_{\max}(A)\operatorname{tr}B,
\]

for positive definite matrices A, B. Therefore, (3.56) gives proper bounds for the statistical energy and reduces to the original case if M and D are diagonal matrices (or even of the form aI) in the homogeneous case.
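The matrix identities used in this subsection (the weighted orthonormality EᵀME = I, the covariance transform C = EREᵀ, and the trace identity tr D̃R = tr(MDC) from (3.50)) can be checked with random matrices. The following pure-Python sketch is our construction; sizes and seeds are arbitrary. It builds an M-orthonormal basis by Gram–Schmidt in the weighted inner product and verifies both the orthonormality and the trace identity.

```python
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def m_dot(x, y, M):
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

random.seed(3)
N = 4
G = [[random.gauss(0, 1) for _ in range(N)] for _ in range(N)]
M = matmul(transpose(G), G)
for i in range(N):
    M[i][i] += 1.0                    # M = G^T G + I: symmetric positive definite metric

# Gram-Schmidt in the M-weighted inner product: basis with e_i^T M e_j = delta_ij.
basis = []
for i in range(N):
    v = [float(j == i) for j in range(N)]
    for e in basis:
        c = m_dot(e, v, M)
        v = [v[j] - c * e[j] for j in range(N)]
    nrm = m_dot(v, v, M) ** 0.5
    basis.append([vj / nrm for vj in v])
E = transpose(basis)                   # columns of E are the basis vectors e_i

EtME = matmul(matmul(transpose(E), M), E)
assert all(abs(EtME[i][j] - (1.0 if i == j else 0.0)) < 1e-10
           for i in range(N) for j in range(N))

H = [[random.gauss(0, 1) for _ in range(N)] for _ in range(N)]
D = [[0.5 * (H[i][j] + H[j][i]) for j in range(N)] for i in range(N)]  # symmetric D
R = matmul(H, transpose(H))            # spectral covariance R >= 0
C = matmul(matmul(E, R), transpose(E)) # physical covariance, C = E R E^T

Dt = matmul(matmul(transpose(E), matmul(M, D)), E)   # D~_ij = e_i^T M D e_j
lhs_tr = trace(matmul(Dt, R))
rhs_tr = trace(matmul(matmul(M, D), C))
assert abs(lhs_tr - rhs_tr) < 1e-8 * (1.0 + abs(lhs_tr))
```

The trace identity holds for any symmetric D and covariance R by cyclic invariance of the trace, so the check exercises only the transform relations themselves.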

3.3.3 Enhanced Dissipation of the Statistical Mean Energy, the Statistical Energy Principle, and “Eddy Viscosity”

The goal here is to use the statistical energy principle to demonstrate that there is enhanced dissipation of the mean statistical energy in a statistical steady state. This simple but important result was suggested by Xiaoming Wang in informal discussions with the author. Here is the derivation. From Proposition 3.3 of Section 3.3, the change in the energy in the mean is given by

\[
\frac{d}{dt}\left(\frac{1}{2}|\bar u|^2\right) = \bar u\cdot D\bar u + \bar u\cdot F + R_{\bar u}, \tag{3.57}
\]

with R_ū ≡ ū · ½[B(e_i, e_j) + B(e_j, e_i)] R_{ij} and the summation convention. By the statistical energy Theorem 3.1 of Section 3.3, with E = ½|ū|² + ½ tr R,

\[
\frac{dE}{dt} = \bar u\cdot D\bar u + \bar u\cdot F + \operatorname{tr}(DR), \tag{3.58}
\]

provided the random forcing vanishes. Subtracting the first identity from the second one, we have

\[
\frac{d}{dt}\left(\frac{1}{2}\operatorname{tr}R\right) = -R_{\bar u} + \operatorname{tr}(DR), \tag{3.59}
\]

so in a statistical steady state we have

\[
R_{\bar u} \equiv \operatorname{tr}(DR). \tag{3.60}
\]

Now make the assumption above Corollary 3.2 from Section 3.3 that D is a diagonal dissipation matrix; then

\[
-d_+\operatorname{tr}R \ge \operatorname{tr}(DR) \ge -d_-\operatorname{tr}R.
\]

Thus, under the above hypotheses we have the proposition:

Proposition 3.8 At a statistical steady state,

\[
-d_-\operatorname{tr}R \le R_{\bar u} = \operatorname{tr}(DR) \le -d_+\operatorname{tr}R,
\]

so that R_ū damps the energy in the mean flow in (3.57) at a rate proportional to the turbulent dissipation, which is precisely tr(DR), the dissipation of energy of the fluctuations.

Conventional “eddy viscosity” methods [143] postulate a simple closure such as

\[
R = -D^*(\bar u)\, D(\bar u)\,\bar u, \tag{3.61}
\]

for some nonlinear matrix function D(ū), and the formally closed equation for the mean, ū,

\[
\frac{d\bar u}{dt} = (L + D)\bar u + B(\bar u, \bar u) - D^*(\bar u)\, D(\bar u)\,\bar u + F. \tag{3.62}
\]

The rate of change of mean energy in such a closed equation is given by

\[
\frac{d}{dt}\left(\frac{1}{2}|\bar u|^2\right) = \bar u\cdot D\bar u + \bar u\cdot F - |D(\bar u)\,\bar u|^2,
\]

and such an ad hoc damping of the mean kinetic energy is at least broadly consistent with the more rigorous content of the above proposition.

3.3.4 Stochastic Lyapunov Functions for One-Layer Turbulent Geophysical Flows

The statistical energy conservation form for the one-layer geophysical flows in Section 3.2.3, combined with the statistical energy principle in Section 3.3, gives us a straightforward Lyapunov function: the total potential enstrophy. Define

\[
\mathcal{E} = \frac{1}{2}\int |q_\Lambda|^2\,dx = \frac{1}{2}\,\overline{|q|^2} = \frac{1}{2}\sum_{k\in I}|q_k|^2;
\]

then

\[
\mathbb{E}\mathcal{E} = \frac{1}{2}\int |\bar q|^2\,dx + \frac{1}{2}\operatorname{tr}R.
\]

Applying Theorem 3.1 of Section 3.3 [100], the time derivative of 𝔼𝓔 is given by

\[
\frac{d}{dt}\,\mathbb{E}\mathcal{E} = \bar q\cdot D\bar q + \bar q\cdot F + \operatorname{tr}(DR) + \frac{1}{2}\operatorname{tr}Q_\sigma.
\]

Using the spectral decomposition,

\[
\bar q\cdot D\bar q + \operatorname{tr}(DR) = -\sum_k d_k\bigl(|\bar q_k|^2 + R_{kk}\bigr) \le -2d_0\,\mathbb{E}\mathcal{E},
\]

where we recall d₀ = γ_l ≤ d_k for all k. As a consequence, 𝓔 is a Lyapunov function because

\[
\frac{d}{dt}\,\mathbb{E}\mathcal{E} \le -2d_0\,\mathbb{E}\mathcal{E} + \operatorname{Re}(\bar q\cdot F) + \frac{1}{2}\operatorname{tr}Q_\sigma
\le -d_0\,\mathbb{E}\mathcal{E} + \frac{1}{2d_0}|F|^2 + \frac{1}{2}\operatorname{tr}Q_\sigma. \tag{3.63}
\]

In the derivation above we used 2𝔼𝓔 ≥ |q̄|² together with Young's inequality, Re(q̄ · F) ≤ (d₀/2)|q̄|² + (1/(2d₀))|F|². From (3.63), it then suffices to apply Grönwall's inequality to see that 𝓔 is a Lyapunov function.
When there is no topography, h ≡ 0, the total energy is also a Lyapunov function. The total energy is given by

\[
E = \frac{1}{2}\int \bigl(|\nabla\psi_\Lambda|^2 + F^2|\psi_\Lambda|^2\bigr)\,dx = \frac{1}{2}\sum_{k\in I}|v_k|^2,
\]

where v_k = C_k^{-1} q_k with C_k = |k|² + F². The dynamics of v_k can be derived from (3.18) by a linear transformation:

\[
dv_k(t) = \Bigl(-d_k + \frac{i\beta k_1}{F^2 + |k|^2}\Bigr) v_k\,dt
+ \sum_{\substack{m+n=k\\ m,n\in I}} a_{m,n}\, C_m C_n C_k^{-1}\, v_m v_n\,dt
+ C_k^{-1} f_k\,dt + C_k^{-1}\sigma_k\, dW_k(t).
\]

We can likewise rewrite this dynamical equation in a statistical energy conservation form like (3.19), because for all m + n = k,

\[
a_{m,n}\, C_m C_n C_k^{-1} + a_{n,k}\, C_n C_k C_m^{-1} + a_{k,m}\, C_k C_m C_n^{-1} = 0.
\]

The remaining derivation of the dissipation of E is identical to that for 𝓔. This illustrates a concrete use of the energy principle in a new inner product, as discussed at the end of Section 3.2. If we need to include topography, we need another equation for the large-scale flow in the original dynamics, as discussed extensively in [130]. This is an amusing exercise.

3.4 Geometric Ergodicity for Turbulent Dynamical Systems

Geometric ergodicity is an important property for a stochastic turbulent dynamical system: it means that there is a unique invariant measure (statistical steady state) which attracts statistical initial data at an exponential rate. Geometric ergodicity for finite dimensional Galerkin truncations of the two- or three-dimensional Navier–Stokes equations with minimal stochastic forcing (but without deterministic forcing) is an important research topic [41, 150]. A useful general framework for geometric ergodicity for finite dimensional diffusions has been developed and applied [136, 137]. There is a very recent proof of geometric ergodicity for the Galerkin truncation of complex one-layer geophysical flows from Section 2.1 with general topography, dispersion, and inhomogeneous deterministic plus minimal stochastic forcing [126]; the statistical Lyapunov functional from Sections 3.3 and 3.3.4 plays a crucial role. There are many future applications of geometric ergodicity for turbulent dynamical systems using the statistical energy principle developed in Section 3.3, so it is useful next to briefly summarize the abstract framework that has been developed [136]. Here we change to notation more natural for abstract probabilistic problems.
Theorem 3.2 Let X_n be a Markov chain in a space E such that:

1. There is a Lyapunov function 𝓔 : E → ℝ₊ for the Markov process X_n with compact sub-level sets, where 𝔼𝓔(X_t) ≤ e^{−γt} 𝔼𝓔(X₀) + K for certain γ, K > 0.
2. Minorization: for any compact set B, there is a compact set C ⊃ B such that the minorization condition holds for C. That is, there is a probability measure ν with ν(C) = 1 and an η > 0 such that, for any given set A,

\[
\mathbb{P}(X_n\in A \mid X_{n-1} = x) \ge \eta\,\nu(A)
\]

for all x ∈ C.

Then there is a unique invariant measure π and constants r ∈ (0, 1), κ > 0 such that

\[
\bigl\|\mathbb{P}^{\mu}(X_n\in\cdot) - \pi\bigr\|_{tv} \le \kappa\, r^{n}\left(1 + \int \mathcal{E}(x)\,\mu(dx)\right).
\]

Here ℙ^μ(X_n ∈ ·) is the law of X_n given X₀ ∼ μ, and ‖·‖_tv denotes the total variation distance, ‖μ − ν‖_tv = ∫ |p(x) − q(x)| dx, assuming μ and ν have densities p and q.
For diffusion processes in ℝ^d, the minorization condition can be obtained from the following proposition, which combines Lemma 2.7 of [136], [69], Theorem 4.20 of [164], and Lemma 3.4 of [137].

Proposition 3.9 Let X_t be a diffusion process in ℝ^d that follows

\[
dX_t = Y(X_t)\,dt + \sum_{k=1}^{n}\Sigma_k(X_t)\circ dB_k. \tag{3.64}
\]

Here the B_k are independent one-dimensional Wiener processes and ◦ denotes the Stratonovich integral. Y and the Σ_k are smooth vector fields with at most polynomial growth for all derivatives. Assume moreover that for any T > 0, k > 0, p > 0 and initial condition, the following growth conditions hold:

\[
\mathbb{E}\sup_{t\le T}|X_t|^p < \infty, \qquad
\mathbb{E}\sup_{t\le T}\bigl\|J_{0,t}^{(k)}\bigr\|^p < \infty, \qquad
\mathbb{E}\sup_{t\le T}\bigl\|J_{0,t}^{-1}\bigr\|^p < \infty. \tag{3.65}
\]

Here J_{0,t} is the Fréchet derivative flow, J_{0,t}v = lim_{ε→0} ε^{-1}(X_t^{x_0+εv} − X_t^{x_0}), and the J_{0,t}^{(k)} are the higher-order derivatives. Then X_t satisfies the minorization assumption if the following two conditions hold:

• Hypoellipticity: Let 𝓛 be the Lie algebra generated by {Y, Σ₁, ⋯, Σ_n}. Let 𝓛₀ be the ideal of {Σ₁, ⋯, Σ_n} inside 𝓛, which is essentially the linear space spanned by

\[
\Sigma_i,\ \bigl[\Sigma_i, \Sigma_j\bigr],\ \bigl[\Sigma_i, Y\bigr],\ \bigl[\Sigma_i, \bigl[\Sigma_j, \Sigma_k\bigr]\bigr],\ \bigl[\bigl[\Sigma_i, Y\bigr], \Sigma_k\bigr],\ \cdots.
\]

The diffusion process is hypoelliptic if 𝓛₀ = ℝ^d at each point.

• Reachability: there is a point x* ∈ ℝ^d such that for any ε > 0 there is a T > 0 such that, from any point x₀ ∈ ℝ^d, we can find càdlàg control processes b_k such that the solution of the following ODE initialized at x₀,

\[
dx_t = Y(x_t)\,dt + \sum_{k=1}^{n}\Sigma_k(x_t)\, b_k\,dt,
\]

satisfies |x_T − x*| < ε. Here by a càdlàg control we mean that b_k(t) is continuous from the right, has left limits, and is locally bounded.

Moreover, with an arbitrary initial condition, X_t has a smooth density with respect to the Lebesgue measure. So if π is an invariant measure, it has a smooth density.
There is a gap in previous verifications and in some applications of the minorization principle, in the sense that the a priori estimates in (3.65) were not verified a priori for non-Lipschitz vector fields. This gap is closed in the proof of [126] by using the fact that not only is the statistical energy in Section 3.3.4 a Lyapunov function, but so are its higher powers and suitable exponentials. Also, to illustrate the importance of the reachability condition for turbulent dynamical systems, an example of a two-dimensional stochastic dynamical system is presented in [126] which has the square of the Euclidean norm as its Lyapunov function and is hypoelliptic with nonzero noise forcing, yet fails to be reachable or ergodic.
An important mathematical problem is to extend the energy principle as a statistical Lyapunov function from Section 3.3 [100] to the infinite dimensional setting, in order to help prove geometric ergodicity for turbulent dynamical systems with deterministic plus minimal stochastic forcing. At the present time, there is the celebrated proof of geometric ergodicity of the 2-D Navier–Stokes equations under hypotheses of minimal stochastic forcing but with vanishing mean flow [71]. The only rigorous result with a non-zero mean flow interacting with random fluctuations involves the random bombardment of the Navier–Stokes equations by coherent vortices [129]. Some important finite dimensional problems for geometric ergodicity for complex geophysical flows via the recent approach of [126] include the following: geophysical models on the sphere [130], where forcing by two stochastic modes is not enough, and two-layer models with baroclinic instability [171] with deterministic and stochastic wind stress forcing.

There is a recent novel application of geometric ergodicity to stochastic lattice models for tropical convection [127]. These models involve Markov jump processes on an infinite state space with both unbounded and degenerate transition rates. This is another rich source of problems with new phenomena for turbulent dynamical systems.
Chapter 4
Statistical Prediction and UQ for Turbulent
Dynamical Systems

4.1 A Brief Introduction

As discussed in Chapter 1, a grand challenge with great practical impact is to devise new methods for large dimensional turbulent dynamical systems with statistically accurate prediction and UQ which overcome the curse of ensemble size. This is especially important for accurate assessment with uncertain initial data and for the response to changes in forcing, where it is impossible to run Monte Carlo simulations for all possible uncertain forcing scenarios in order to do attribution studies. The key physically significant quantities are often characterized by the degrees of freedom which carry the largest energy or variance, and reduced order models (ROMs) are needed on such a low dimensional subspace. This chapter is about recent mathematical strategies that can potentially overcome these obstacles by blending information theory, the statistical energy conservation principles from Chapter 3, and statistical response theory [106, 107, 109, 117, 147, 158]. In Section 4.2 we provide a discussion of these recent blended strategies and list many quantitative and qualitative low-order models. Section 4.3 provides an introduction to their use in more complex applications, with the L-96 model used for illustration; future directions for applications are also discussed.

4.1.1 Low-Order Truncation Methods for UQ and Their Limitations

Next we briefly discuss some popular low-order truncation methods for UQ and their limitations. Low-order truncation models for UQ include projection of the dynamics onto leading-order empirical orthogonal functions (EOFs) [75], truncated polynomial chaos (PC) expansions [77, 86, 138], and dynamically orthogonal (DO) truncations [154, 155].

© Springer International Publishing Switzerland 2016
A.J. Majda, Introduction to Turbulent Dynamical Systems in Complex Systems,
Frontiers in Applied Dynamical Systems: Reviews and Tutorials 5,
DOI 10.1007/978-3-319-32217-9_4

Despite some success of these methods in weakly chaotic dynamical regimes, concise mathematical models and analysis reveal fundamental limitations in truncated EOF expansions [11, 37], PC expansions [20, 103], and DO truncations [156, 157], due to different manifestations of the fact that in many turbulent dynamical systems, modes that carry small variance on average can have important, highly intermittent dynamical effects on the large-variance modes. Furthermore, the large dimension of the active variables in turbulent dynamical systems makes direct UQ by large-ensemble Monte Carlo simulations impossible in the foreseeable future, while, once again, concise mathematical models [103] point to the limitations of using moderately large yet statistically too small ensemble sizes. Other important methods for UQ involve the linear statistical response to changes in external forcing or initial data through the fluctuation–dissipation theorem (FDT), which only requires the measurement of suitable time correlations in the unperturbed system [3, 5, 59, 60, 70, 109]. Despite some significant success with this approach for turbulent dynamical systems [3, 5, 59, 60, 70, 109], the method is hampered by the need to measure suitable approximations to the exact correlations over long time series, as well as by the fundamental limitation to parameter regimes with a linear statistical response.
We end this brief introduction by illustrating a pioneering statistical prediction strategy [42] which can overcome the curse of ensemble size for moderate-size turbulent dynamical systems, and by discussing its mathematical limitations as motivation for the more subtle methods developed in Sections 4.2 and 4.3.

4.1.2 The Gaussian Closure Method for Statistical Prediction

This method [42] starts with the exact equations for the mean and covariance of any turbulent dynamical system derived in Section 3.2. Here we assume the random forcing vanishes, so that σ_j ≡ 0. Recall that the equation for the covariance is not closed in general and involves the third moments through Q_F. The Gaussian closure method simply neglects these third-order moments by setting Q_F ≡ 0 in the covariance equations, resulting in the approximate model statistical equations for the mean, ū_M, and the covariance, R_M, given by the equations

\[
\frac{d\bar u_M}{dt} = (L + D)\bar u_M + B(\bar u_M, \bar u_M) + R_{M,ij}\, B\bigl(v_i, v_j\bigr) + F,
\qquad
\frac{dR_M}{dt} = L_v R_M + R_M L_v^*. \tag{4.1}
\]
These coupled equations in (4.1) are deterministic equations for the mean and covariance, and they define the Gaussian closure prediction method, since Gaussian distributions are uniquely specified by these two statistics. This method completely avoids the curse of ensemble size since no finite ensembles are introduced; however, the expense of integrating the deterministic matrix covariance equation restricts applications to roughly N = O(10³). The method in (4.1) has been applied to short-time statistical prediction for truncated geophysical models, like the one-layer geophysical models in Section 2.1, with moderate success [43]. This closure is popular among some groups in the geoscience community (see [163] and references therein) despite the serious limitations discussed next.
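For concreteness, the closure (4.1) can be sketched on the two-mode dyad example from Section 3.3.2. This is our choice of test problem: parameter values are arbitrary, and we use the symmetrized interaction tensor so that B(u, u) reproduces the dyad model. With D = −dI and no forcing, this energy-conserving example inherits the statistical energy principle under the closure, so E = ½|ū|² + ½ tr R should decay like exp(−2dt); the code integrates the coupled mean–covariance equations and verifies this.

```python
import math

g1, g2, d = 0.8, -0.5, 0.4

# Symmetrized interaction tensor Bt[k][i][j] reproducing the dyad model:
# B(u,u) = (g1*u1*u2 + g2*u2^2, -g1*u1^2 - g2*u1*u2).
Bt = [[[0.0, g1 / 2], [g1 / 2, g2]],
      [[-g1, -g2 / 2], [-g2 / 2, 0.0]]]

def rhs(ubar, R):
    # Mean equation of (4.1): d ubar/dt = D ubar + B(ubar, ubar) + R_ij B(e_i, e_j)
    du = [-d * ubar[k] + sum(Bt[k][i][j] * (ubar[i] * ubar[j] + R[i][j])
                             for i in range(2) for j in range(2))
          for k in range(2)]
    # Linearization about the mean: Lv = D + 2 B(ubar, .)
    Lv = [[-d * (k == i) + 2 * sum(Bt[k][i][j] * ubar[j] for j in range(2))
           for i in range(2)] for k in range(2)]
    # Covariance equation of (4.1): dR/dt = Lv R + R Lv^T  (third moments dropped)
    dR = [[sum(Lv[i][m] * R[m][j] + R[i][m] * Lv[j][m] for m in range(2))
           for j in range(2)] for i in range(2)]
    return du, dR

def rk4(ubar, R, dt):
    def shift(h, du, dR):
        return ([ubar[k] + h * du[k] for k in range(2)],
                [[R[i][j] + h * dR[i][j] for j in range(2)] for i in range(2)])
    k1 = rhs(ubar, R)
    k2 = rhs(*shift(dt / 2, *k1))
    k3 = rhs(*shift(dt / 2, *k2))
    k4 = rhs(*shift(dt, *k3))
    u_new = [ubar[k] + dt / 6 * (k1[0][k] + 2 * k2[0][k] + 2 * k3[0][k] + k4[0][k])
             for k in range(2)]
    R_new = [[R[i][j] + dt / 6 * (k1[1][i][j] + 2 * k2[1][i][j]
                                  + 2 * k3[1][i][j] + k4[1][i][j])
              for j in range(2)] for i in range(2)]
    return u_new, R_new

ubar, R = [0.9, -0.4], [[0.5, 0.1], [0.1, 0.3]]
stat_E = lambda u, R: 0.5 * (u[0] ** 2 + u[1] ** 2) + 0.5 * (R[0][0] + R[1][1])
E0 = stat_E(ubar, R)
dt, steps = 0.002, 500
for _ in range(steps):
    ubar, R = rk4(ubar, R, dt)
assert abs(stat_E(ubar, R) - math.exp(-2 * d * dt * steps) * E0) < 1e-6 * E0
```

Integrating (4.1) costs only one N × N covariance update per step, which is the sense in which the closure avoids the curse of ensemble size; its failure modes appear instead in the steady-state statistics discussed next.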

4.1.3 A Fundamental Limitation of the Gaussian Closure Method

The above closure method neglects the third-order moments by setting Q F ≡ 0, while
most turbulent dynamical systems involve the important role of external forcing. Here
it is shown that this closure method fails to reproduce non-trivial statistical steady
states in the L-96 model; thus, neglecting the third-order moments leads to this
highly undesirable feature. To see this consider the homogeneous statistical steady
state dynamics for the L-96 model discussed in Section 3.2.2 and apply the same
manipulations there to the Gaussian closure approximation at a statistical steady state.
With this closure approximation the steady state equation in (3.14) from Section 3.2.2
becomes
$$\big(-\Gamma_i \bar{u}_{M,eq} - 1\big)\, r_{M,eq,i} = 0, \qquad \text{for } 0 \le i \le \frac{J}{2}. \tag{4.2}$$

The equations in (4.2) are highly overdetermined and have a non-trivial steady state
solution provided ū is restricted to ū*_M with −Γ_i ū*_{M,eq} − 1 = 0 for some i, yielding
the trivial statistical steady states given by

$$\bar{u}^{*}_{M,eq} = -\frac{1}{\Gamma_i}, \qquad r^{*}_{M,eq,j} = \delta_{ij}\, r^{*}_{M,eq,i}. \tag{4.3}$$

Furthermore, the dynamical equation for the covariance restricts the stability of this
steady state so that −Γ_j ū_{M,eq} − 1 < 0 for j ≠ i, resulting in the unique marginally
stable statistical state, ū*_M, with r*_{M,i} given explicitly by

$$r^{*}_{M,i} = \frac{\bar{u}^{*}_M - F}{\Gamma_i}.$$

Thus, the Gaussian closure method fails to recover general non-trivial statistical
steady states for the L-96 model. One needs to develop more sophisticated but effi-
cient methods to include the impact of the third-moments on prediction and UQ with
non-trivial forcing [117, 158]. Such methods are discussed in Sections 4.2 and 4.3.

4.2 A Mathematical Strategy for Imperfect Model Selection, Calibration, and Accurate Prediction: Blending Information Theory and Statistical Response Theory

Here the framework for imperfect model selection using empirical information
theory [105] is summarized in Section 4.2.1. The theory for linear statistical response
for perturbed turbulent dynamical systems is summarized in Section 4.2.2 [102].
Information theory for imperfect model selection is often used in statistical
science in purely data-driven strategies [22]. However, these methods account neither
for the physical nature of the nonlinear dynamics and its impact on prediction, nor
for the intrinsic dynamical information barriers that exist in the specific class of
imperfect models chosen [103]. Section 4.2.3 shows how to develop a recent cali-
bration strategy for accurate UQ in statistical prediction which blends information
theory and linear statistical response theory [106, 107]. Instructive applications of
this strategy to quantitative and qualitative low-order models are briefly discussed
in Section 4.2.4 while applications to complex turbulent dynamical systems are dis-
cussed in Section 4.3.

4.2.1 Imperfect Model Selection, Empirical Information Theory, and Information Barriers

With a subset of variables u ∈ R^N and a family of measurement functionals
E_L(u) = {E_j(u)}, 1 ≤ j ≤ L, for the perfect system, empirical information
theory [78, 130] builds the least biased probability measure π_L(u) consistent with
the L measurements of the present climate, Ē_L. There is a unique functional on
probability densities [78, 130] to measure this, given by the entropy

$$\mathcal{S} = -\int \pi \log \pi, \tag{4.4}$$

and π_L(u) is the unique probability density so that S(π_L(u)) has the largest value among
those probability densities consistent with the measured information, Ē L . All inte-
grals as in (4.4) are over the phase space R N unless otherwise noted. For example,
measurements of the mean and second moments of the perfect system necessarily
lead to a Gaussian approximation [114, 130] to the perfect system from measure-
ments, π L (u) = πG (u). Any model of the perfect system produces a probability
density π M (u). The natural way [88, 130] to measure the lack of information in one
probability density q (u) compared with the true probability density p (u) is through
the relative entropy P ( p, q) given by
4.2 A Mathematical Strategy for Imperfect Model Selection … 47
  
p
P ( p, q) = p log . (4.5)
q

This asymmetric functional on probability densities P(p, q) has two attractive
features [88, 114, 130] as a metric for model fidelity: (1) P(p, q) ≥ 0 with equality
if and only if p = q, and (2) P(p, q) is invariant under general nonlinear changes
of variables.
The first issue to contend with is the fact that π L (u) is not the actual perfect
model density but only reflects the best unbiased estimate of the perfect model given
the L measurements Ē L . Let π (u) denote the probability density of the perfect
model, which is not actually known. Nevertheless, P (π, π L ) precisely quantifies
the intrinsic error in using the L measurements of the perfect model Ē L . Consider
an imperfect model with its associated probability density π^M(u); then the intrinsic
model error in the climate statistics is given by P(π, π^M). In practice, π^M(u) is
determined by no more information than that available in the perfect model.
Consider a class of imperfect models M. The best imperfect model for the coarse-
grained variable u is the M* ∈ M so that the perfect model has the smallest additional
information beyond the imperfect model distribution π^{M*}(u), i.e.,

$$\mathcal{P}\big(\pi, \pi^{M^*}\big) = \min_{M \in \mathcal{M}} \mathcal{P}\big(\pi, \pi^{M}\big). \tag{4.6}$$

Also, actual improvements in a given imperfect model with distribution π^M(u),
resulting in a new π^M_post(u), should result in improved information for the perfect
model, so that P(π, π^M_post) ≤ P(π, π^M); otherwise, objectively, the model has
not been improved compared with the original imperfect model. The following general
principle [102, 105] facilitates the practical calculation of (4.6):

$$\mathcal{P}\big(\pi, \pi_{L'}^{M}\big) = \mathcal{P}(\pi, \pi_L) + \mathcal{P}\big(\pi_L, \pi_{L'}^{M}\big)
= \big[\mathcal{S}(\pi_L) - \mathcal{S}(\pi)\big] + \mathcal{P}\big(\pi_L, \pi_{L'}^{M}\big), \quad \text{for } L' \le L. \tag{4.7}$$

The entropy difference, S(π_L) − S(π) in (4.7), precisely measures an intrinsic
error from the L measurements of the perfect system, and this is a simple example
of an information barrier for any imperfect model based on L measurements for
calibration. With (4.7) and a fixed family of L measurements of the actual climate, the
optimization principle in (4.6) can be computed explicitly by replacing the unknown
density π by the hypothetically known π_L in these formulas so that, for example,
π^{M*} is calculated by

$$\mathcal{P}\big(\pi_L, \pi_L^{M^*}\big) = \min_{M \in \mathcal{M}} \mathcal{P}\big(\pi_L, \pi_L^{M}\big). \tag{4.8}$$

The most practical setup for applying the framework of empirical information
theory developed above arises when both the perfect system measurements and the
model measurements involve only the mean and covariance of the variables u so
that π_L is Gaussian with climate mean ū and covariance R while π^M is Gaussian
with model mean ū_M and covariance R_M. In this case, P(π_L, π^M) has the explicit
formula [85, 130]

$$\mathcal{P}\big(\pi_L, \pi^M\big) = \left[\frac{1}{2}\,(\bar{u} - \bar{u}_M)^{*} R_M^{-1} (\bar{u} - \bar{u}_M)\right]
+ \left[-\frac{1}{2}\log\det\big(R R_M^{-1}\big) + \frac{1}{2}\big(\mathrm{tr}\big(R R_M^{-1}\big) - N\big)\right]. \tag{4.9}$$

Note that the first term in brackets in (4.9) is the signal, reflecting the model error in
the mean but weighted by the inverse of the model covariance, R_M^{-1}, while the second
term in brackets, the dispersion, involves only the model error covariance ratio R R_M^{-1}.
The intrinsic metric in (4.9) is invariant under any (linear) change of variables that
maps Gaussian distributions to Gaussians, and the signal and dispersion terms are
individually invariant under these transformations; this property is very important.
Many examples of dynamic information barriers for imperfect (even linear) turbulent
dynamical systems are discussed elsewhere [103, 106, 107].
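A direct numerical transcription of (4.9) makes the signal/dispersion split and its invariance easy to check; the means and covariances below are arbitrary illustrative inputs, not climate statistics.

```python
import numpy as np

# Signal/dispersion decomposition (4.9) for Gaussians N(ubar, R) (truth)
# and N(ubar_M, R_M) (model); all numbers are illustrative inputs.

def gaussian_relative_entropy(ubar, R, ubar_M, R_M):
    """Return the (signal, dispersion) terms of P(pi_L, pi^M) in (4.9)."""
    du = ubar - ubar_M
    R_M_inv = np.linalg.inv(R_M)
    signal = 0.5 * du @ R_M_inv @ du
    C = R @ R_M_inv                       # model error covariance ratio
    N = len(ubar)
    dispersion = 0.5 * (-np.log(np.linalg.det(C)) + np.trace(C) - N)
    return signal, dispersion

ubar, R = np.array([1.0, -0.5]), np.diag([2.0, 1.0])
ubar_M, R_M = np.array([0.8, -0.5]), np.diag([2.5, 1.0])
s, disp = gaussian_relative_entropy(ubar, R, ubar_M, R_M)
print("signal:", s, "dispersion:", disp, "total:", s + disp)
```

Both terms vanish exactly when model and truth agree, and each term is separately unchanged under a joint linear change of variables ū → Aū, R → ARAᵀ, matching the invariance property stated above.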

4.2.2 Linear Statistical Response and Fluctuation-Dissipation Theorem for Turbulent Dynamical Systems

The fluctuation-dissipation theorem (FDT) is one of the cornerstones of the statistical
physics of identical molecules of gases and liquids [134]. In a very brief seminal
article from 1975, Leith [93] suggested that if FDT can be established for suitable
coarse-grained functionals in climate science, then climate change assessments can
be performed simply by gathering suitable statistics in the present climate. Here is a
brief summary of FDT for the stochastic dynamical system [102, 131].
Recall the turbulent dynamical system in (1.1)–(1.2) from Chapter 1 with time
independent coefficients given by

$$\frac{du}{dt} = F(u) + \sigma(u)\,\dot{W}. \tag{4.10}$$

The ideal equilibrium state associated with (4.10) is the probability density π_eq(u)
that satisfies L_FP π_eq = 0, and the equilibrium statistics of some functional A(u) are
determined by

$$\langle A(u) \rangle = \int A(u)\, \pi_{eq}(u)\, du. \tag{4.11}$$

Next, perturb the system in (4.10) by the change δ w(u) f(t); that is, consider the
perturbed equation

$$\frac{du^{\delta}}{dt} = F\big(u^{\delta}\big) + \delta\, w\big(u^{\delta}\big) f(t) + \sigma\big(u^{\delta}\big)\, \dot{W}. \tag{4.12}$$

Calculate perturbed statistics by utilizing the Fokker–Planck equation associated with
(4.12) with initial data given by the unperturbed statistical equilibrium. Then FDT
[102] states that if δ is small enough, the leading-order correction to the statistics in
(4.11) becomes

$$\delta \langle A(u) \rangle (t) = \int_0^t \mathcal{R}(t - s)\, \delta f(s)\, ds, \tag{4.13}$$

where R(t) is the linear response operator that is calculated through correlation
functions in the unperturbed statistical equilibrium,

$$\mathcal{R}(t) = \big\langle A(u(t))\, B(u(0)) \big\rangle, \qquad B(u) = -\frac{\mathrm{div}_u\big(w\, \pi_{eq}\big)}{\pi_{eq}}. \tag{4.14}$$

The noise in (4.10) is not needed for FDT to be valid, but in this form the equilibrium
measure needs to be smooth. Such a FDT response is known to be valid rigorously
for a wide range of dynamical systems under minimal hypotheses [70].
There are important practical and computational advantages for climate change
science when a skillful FDT algorithm is established. The FDT response opera-
tor can be utilized directly for multiple climate change scenarios, multiple changes
in forcing, and other parameters, such as damping and inverse modeling directly
[59, 60], without the need for running the complex climate model in each individual
case. Note that FDT is a type of dynamic statistical linearization and does not involve
linearizing the underlying nonlinear dynamics. The direct application of FDT to the
natural perfect model in (4.10) is hampered by the fact that the dynamics in (4.10),
the equilibrium measure in (4.11), and even the dimension of the phase space in
(4.10) and (4.11) are unknown. Recently an important link [107] was established
through empirical information theory and FDT between the skill of specific predic-
tion experiments in the training phase for the imperfect model when the climate is
observed and the skill of the model for long-range perturbed climate sensitivity.
There is a growing literature in developing theory [51, 101, 102, 109, 131] and
algorithms for FDT [1, 3–6, 14, 24, 59–61, 93] for forced dissipative turbulent sys-
tems far from equilibrium. In fact, the earliest algorithms that tested the original
suggestion of Leith [93] utilized kicked perturbations without model error to evalu-
ate the response operator [14, 24], and these algorithms have been improved recently
[1, 4]; their main limitation is that they can diverge at finite times when there are
positive Lyapunov exponents [1, 4, 24]. Alternative algorithms utilize the quasi-
Gaussian approximation [102] in the formulas in (4.14); these algorithms have been
demonstrated to have high skill in both mean and variance response in the mid-
latitude upper troposphere to tropical forcing [59, 60] as well as for a variety of
other large-dimensional turbulent dynamical systems that are strongly mixing [3,
5, 102]. There are recent blended response algorithms that combine the attractive
features of both approaches and give very high skill for both the mean and variance
response for the L-96 model [3, 98] as well as suitable large-dimensional models
of the atmosphere [5] and ocean [6] in a variety of weakly and strongly chaotic
regimes. Finally, there are linear regression models [142] that try to calculate the
mean and variance response directly from data; these linear regression models can
have very good skill in the mean response but necessarily have no skill [109] in the
variance response; they necessarily have an intrinsic information barrier [105–107]
for skill in model response when the perfect model has a large variance response.
In fact, one can regard all of the above approximations as defining various systems
with model error in calculating the ideal response of a perfect model [102]; this is
a useful exercise for understanding the information theoretic framework for model
error and response proposed recently [107], and examples are presented there. There
are important generalizations and applications of linear statistical response theory to
systems with time dependent coefficients [51, 131].

4.2.2.1 Kicked Statistical Response

One strategy to approximate the linear response operator which avoids direct evalua-
tion of πeq through the FDT formula is through the kicked response of an unperturbed
system to a perturbation δu of the initial state from the equilibrium measure, that is,

$$\pi\big|_{t=0} = \pi_{eq}(u - \delta u) = \pi_{eq} - \delta u \cdot \nabla \pi_{eq} + O\big(\delta^2\big). \tag{4.15}$$

One important advantage of adopting this kicked response strategy is that higher order
statistics due to nonlinear dynamics will not be ignored (compared with the other
linearized strategy using only Gaussian statistics [131]). Then the kicked response
theory gives the following fact [102, 107] for calculating the linear response operator:
Fact: For δ small enough, the linear response operator R (t) can be calculated by
solving the unperturbed system (4.10) with a perturbed initial distribution in (4.15).
Therefore, the linear response operator can be obtained through

$$\delta \mathcal{R}(t) \equiv \delta u \cdot \mathcal{R}(t) = \int A(u)\, \delta \pi'(u, t)\, du + O\big(\delta^2\big). \tag{4.16}$$

Here δπ′ is the leading-order expansion of the transient probability density function
of the unperturbed dynamics with the perturbed initial value. The straightforward
Monte Carlo algorithm to approximate (4.16) is sketched elsewhere [102, 117].
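The Monte Carlo idea behind (4.15)–(4.16) can be sketched on a test problem where the answer is known in closed form: for the scalar Ornstein–Uhlenbeck process du = −γu dt + σ dW (not a model from the text), the exact mean response operator is R(t) = e^{−γt}. All parameters below are illustrative; the same noise realizations are reused for the kicked and unkicked ensembles to suppress sampling error.

```python
import numpy as np

# Kicked-response Monte Carlo sketch: shift an equilibrium ensemble by delta*u,
# evolve the unperturbed dynamics, and read off the mean response (4.16).

rng = np.random.default_rng(0)
gamma, sigma, delta = 1.0, 0.5, 1e-3
dt, steps, M = 1e-2, 300, 2000

u_eq = rng.normal(0.0, sigma / np.sqrt(2 * gamma), size=M)  # equilibrium draw
u0, u1 = u_eq.copy(), u_eq + delta                          # unkicked / kicked
R_hat = np.empty(steps)
for n in range(steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=M)
    u0 += -gamma * u0 * dt + sigma * dW                     # Euler-Maruyama
    u1 += -gamma * u1 * dt + sigma * dW
    R_hat[n] = (u1.mean() - u0.mean()) / delta              # (4.16) with A = u

t = dt * np.arange(1, steps + 1)
print("max error vs exp(-gamma*t):", np.abs(R_hat - np.exp(-gamma * t)).max())
```

For this linear test problem the kicked estimate agrees with the exact response up to time-discretization error; for nonlinear turbulent systems the same recipe retains the non-Gaussian transient statistics, which is the advantage noted above.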

4.2.3 The Calibration and Training Phase Combining Information Theory and Kicked Statistical Response Theory

We are interested in UQ in predicting the response to general changes in forcing in
turbulent dynamical systems. Consider the perfect model probability density on a
subset of variables u ∈ R^N, π_δ(u, t), compared with the imperfect model density
π^M_δ(u, t), where δ denotes a specific external perturbation of the system. The
important question arises: how should the imperfect model be calibrated so that it
predicts the response to all forcings within the given class with accurate UQ as
regards the mean and variance of the response? One necessary condition is statistical
equilibrium fidelity [105–107]. In other words, if π_G(u), π^M_G(u) denote the Gaussian
distributions which match the first and second moments of the unperturbed perfect
distribution π(u) and the imperfect model distribution π^M(u), respectively, then
statistical equilibrium fidelity means that the Gaussian relative entropy in (4.9) satisfies

$$\mathcal{P}\big(\pi_G(u), \pi_G^M(u)\big) = 0. \tag{4.17}$$

Statistical equilibrium fidelity is a natural necessary condition to tune the mean
and variance of the imperfect model to match those of the perfect model; it is far
from a sufficient condition. To see this, recall from Section 3.1 that there are many
completely different dynamical systems with completely different dynamics which all
have the same Gaussian invariant measure, so statistical equilibrium fidelity among
the models is obviously satisfied (see [102] for several concrete examples). Thus
the condition in (4.17) should be regarded as an important necessary condition.
UQ requires an accurate assessment of both the mean and variance, and at least
(4.17) guarantees calibration of this on a subspace, u ∈ R^M, for the unperturbed
model. Climate scientists often tune only the means (see [105] and references
therein).
As hinted by Section 4.2.2, the prediction skill of imperfect models can be
improved by comparing the information distance with the true model through the
linear response operator. The following fact offers a convenient way to measure the
lack of information in the perturbed imperfect model, requiring only knowledge of
the linear responses for the mean and variance, δū ≡ δR_u and δR ≡ δR_{(u−ū)²}. For
this result, it is important to tune the imperfect model to satisfy equilibrium model
fidelity ([106, 107]), P(π_G, π^M_G) = 0. Here is some important theory:

Under simplifying assumptions with diagonal covariance matrices R = diag(R_k)
and equilibrium model fidelity P(π_G, π^M_G) = 0, the relative entropy in (4.9)
between the perturbed model density π^M_δ and the true perturbed density π_δ with small
perturbation δ can be expanded componentwise as
$$\begin{aligned}
\mathcal{P}\big(\pi_\delta, \pi_\delta^M\big) = {} & \mathcal{S}\big(\pi_{G,\delta}\big) - \mathcal{S}(\pi_\delta) \\
& + \frac{1}{2} \sum_k \big(\delta \bar{u}_k - \delta \bar{u}_{M,k}\big)\, R_k^{-1} \big(\delta \bar{u}_k - \delta \bar{u}_{M,k}\big) \\
& + \frac{1}{4} \sum_k R_k^{-2} \big(\delta R_k - \delta R_{M,k}\big)^2 + O\big(\delta^3\big).
\end{aligned} \tag{4.18}$$

Here S(π_{G,δ}) − S(π_δ) in the first line is the intrinsic error from the Gaussian approx-
imation of the system, R_k is the equilibrium variance in the kth component, and δū_k and
δR_k are the linear response operators for the mean and variance in the kth component.
Proof of this result can be found in [107, 114].
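The leading terms of this expansion are easy to verify numerically in one dimension. In the sketch below both the perturbed truth and the perturbed model are taken Gaussian, so the intrinsic term S(π_{G,δ}) − S(π_δ) vanishes; all response patterns are synthetic illustrative numbers.

```python
import numpy as np

# Check that the exact Gaussian relative entropy (4.9) matches the signal +
# dispersion terms of (4.18) up to O(delta^3) in one dimension.

def kl_gauss_1d(m1, v1, m2, v2):
    """Relative entropy (4.9) between N(m1, v1) and N(m2, v2) in 1D."""
    return 0.5 * (m1 - m2) ** 2 / v2 + 0.5 * (-np.log(v1 / v2) + v1 / v2 - 1.0)

ubar, R = 0.0, 2.0          # unperturbed climate, shared by truth and model
da, dRr = 0.3, 0.5          # truth's mean/variance response pattern
da_M, dR_M = 0.2, 0.8       # imperfect model's response pattern

errs = []
for delta in (1e-1, 1e-2):
    exact = kl_gauss_1d(ubar + delta * da, R + delta * dRr,
                        ubar + delta * da_M, R + delta * dR_M)
    approx = (0.5 * (delta * da - delta * da_M) ** 2 / R
              + 0.25 * (delta * dRr - delta * dR_M) ** 2 / R ** 2)
    errs.append(abs(exact - approx))
    print(delta, exact, approx)
```

Shrinking δ by a factor of 10 should shrink the discrepancy by roughly a factor of 1000, consistent with the O(δ³) remainder.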
The above facts about empirical information theory and linear response theory
together provide a convenient and unambiguous way of improving the performance
of imperfect models in terms of increasing their model sensitivity regardless of the
specific form of the external perturbation δf. The formulas (4.13) and (4.14) in
Section 4.2.2 illustrate that the skill of an imperfect model in predicting forced
changes under general external forcing is directly linked to the model's skill in
estimating the linear response operators R for the mean and variance (that is, using
the functionals A = u and A = (u − ū)²) in a suitably weighted fashion as dictated
by the information theory in (4.18). This offers useful hints for training imperfect
models for optimal responses of the mean and variance in a universal sense. The
linear response theory in Section 4.2.2 shows that the system's responses to various
external perturbations can be approximated by a convolution with the linear response
operator R (which involves only statistics of the unperturbed equilibrium). It is
reasonable to claim that an imperfect model with a precise prediction of this linear
response operator should possess uniformly good sensitivity to different kinds of
perturbations. On the other hand, the response operator can be calculated easily from
the transient probability density function using the kicked response formula (4.16).
Considering all these good features of the linear response operator, the information
barrier due to model sensitivity to perturbations can be overcome by minimizing the
information error in the imperfect model kicked response distribution relative to the
true response [107].
To summarize, consider a class of imperfect models, M. The optimal model
M* ∈ M that ensures the best information-consistent responses to various kinds of
perturbations is characterized by the smallest additional information in the linear
response operator R among all the imperfect models, such that

$$\Big\| \mathcal{P}\big(\pi_\delta, \pi_\delta^{M^*}\big) \Big\|_{L^1([0,T])} = \min_{M \in \mathcal{M}} \Big\| \mathcal{P}\big(\pi_\delta, \pi_\delta^{M}\big) \Big\|_{L^1([0,T])}, \tag{4.19}$$

where π^M_δ can be achieved through the kicked response procedure (4.16) in the training
phase compared with the actual observed data π_δ in nature, and the information dis-
tance between perturbed responses P(π_δ, π^M_δ) can be calculated with ease through
the expansion formula (4.18). The information distance P(π_δ(t), π^M_δ(t)) is mea-
sured at each time instant, so the entire error is averaged under the L¹-norm over
a proper time window [0, T]. Some low dimensional examples of this procedure for
turbulent systems can be found in [17, 18, 103], and more is said below.
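To make the selection principle (4.19) concrete, here is a small sketch of the training-phase loop: each candidate model contributes its kicked mean/variance response curves, which are scored against the truth with the expansion (4.18) and averaged in time. All response curves and the candidate class below are synthetic stand-ins, not output of any real turbulent model.

```python
import numpy as np

# Model selection sketch: pick the candidate minimizing the L^1-in-time
# information distance (4.19), scored with the Gaussian expansion (4.18).

t = np.linspace(0.0, 5.0, 200)
dt = t[1] - t[0]
R_eq = 1.5                                           # equilibrium variance
du_true, dR_true = np.exp(-t), 0.5 * np.exp(-2 * t)  # "truth" responses

candidates = {                                       # hypothetical class M
    "overdamped":  (np.exp(-2.0 * t), 0.5 * np.exp(-4.0 * t)),
    "well-tuned":  (np.exp(-1.1 * t), 0.5 * np.exp(-2.2 * t)),
    "underdamped": (np.exp(-0.5 * t), 0.5 * np.exp(-1.0 * t)),
}

def info_distance(du_M, dR_M):
    """Time-integrated signal + dispersion score from (4.18)."""
    signal = 0.5 * (du_true - du_M) ** 2 / R_eq
    dispersion = 0.25 * (dR_true - dR_M) ** 2 / R_eq ** 2
    return np.sum(signal + dispersion) * dt          # Riemann sum over [0, T]

scores = {name: info_distance(*resp) for name, resp in candidates.items()}
best = min(scores, key=scores.get)
print("selected model:", best)
```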

4.2.4 Low-Order Models Illustrating Model Selection, Calibration, and Prediction with UQ

Here is a brief discussion of some instructive quantitative and qualitative low-order
models where the calibration strategy for improved prediction and UQ developed
above in Section 4.2.3 is tested. The test models, as in nature, often exhibit intermit-
tency [47, 143], where some components of a turbulent dynamical system have low-
amplitude phases followed by irregular large-amplitude bursts of extreme events.
Intermittency is an important physical phenomenon. Exactly solvable test models as
a test bed for the prediction and UQ strategy in Section 4.2.3, including information
barriers, are discussed extensively, in models ranging from linear stochastic models to
nonlinear models with intermittency, in the research expository article [103] as well
as in [17, 18]. Some more sophisticated applications are mentioned next.
Turbulent diffusion in exactly solvable models is a rich source of highly nontriv-
ial spatiotemporal multi-scale models to test the strategies in Sections 4.2.1 and 4.2.3
in a more complex setting [52, 106–108]. Even though these models have no posi-
tive Lyapunov exponents, they have been shown rigorously to exhibit intermittency
and extreme events [128]. Calibration strategies for imperfect models using infor-
mation theory have been developed recently to yield statistically accurate prediction
of these extreme events by imperfect, inexpensive linear stochastic models for the
velocity field [417]. This topic merits much more attention by other modern applied
mathematicians.

4.2.4.1 Physics Constrained Nonlinear Regression Models for Time Series

A central issue in contemporary science is the development of data-driven statistical
dynamical models for the time series of a partial set of observed variables which
arise from suitable observations from nature ([36] and references therein); examples
are multi-level linear autoregressive models as well as ad hoc quadratic nonlinear
regression models. It has been established recently [132] that ad hoc quadratic multi-
level regression models can have finite time blow-up of statistical solutions and
pathological behavior of their invariant measure even though they match the data with
high precision. A new class of physics-constrained multi-level nonlinear regression
models was developed which involves both memory effects in time as well as physics-
constrained energy-conserving nonlinear interactions [72, 113] and completely
avoids the above pathological behavior with full mathematical rigor.
A striking application of these ideas combined with information calibration to
the predictability limits of tropical intraseasonal variability, such as the monsoon, has
been developed in a series of papers [25, 26, 30]. They yield an interesting class of
low-order turbulent dynamical systems with extreme events and intermittency.
Denote by u₁ and u₂ the two observed large-scale components of tropical intrasea-
sonal variability. The PDFs for u₁ and u₂ are highly non-Gaussian with fat tails
indicative of the temporal intermittency in the large-scale cloud patterns. To describe
the variability of the time series u₁ and u₂, we propose the following family of
low-order stochastic models:
$$\begin{aligned}
\frac{du_1}{dt} &= \big[-d_u + \gamma\big(v + v_f(t)\big)\big]\, u_1 - (a + \omega_u)\, u_2 + \sigma_u \dot{W}_{u_1},\\
\frac{du_2}{dt} &= \big[-d_u + \gamma\big(v + v_f(t)\big)\big]\, u_2 + (a + \omega_u)\, u_1 + \sigma_u \dot{W}_{u_2},\\
\frac{dv}{dt} &= -d_v v - \gamma\big(u_1^2 + u_2^2\big) + \sigma_v \dot{W}_v,\\
\frac{d\omega_u}{dt} &= -d_\omega \omega_u + \sigma_\omega \dot{W}_\omega,
\end{aligned} \tag{4.20}$$

where

$$v_f(t) = f_0 + f_t \sin\big(\omega_f t + \phi\big). \tag{4.21}$$

Besides the two observed variables u₁ and u₂, the other two variables v and ω_u are
hidden and unobserved, representing the stochastic damping and stochastic phase,
respectively. In (4.20), Ẇ_{u₁}, Ẇ_{u₂}, Ẇ_v, Ẇ_ω are independent white noises. The con-
stant coefficients d_u, d_v, d_ω represent damping for each stochastic process, and the
non-dimensional constant γ is the coefficient of the nonlinear interaction. The time-
periodic damping v_f(t) in equations (4.20) is utilized to crudely model the active
season and the quiescent season in the seasonal cycle. The constant coefficients ω_f
and φ in (4.21) are the frequency and phase of the damping, respectively. All of the
model variables are real.
The hidden variables v, ωu interact with the observed variables u 1 , u 2 through
energy-conserving nonlinear interactions following the systematic
physics-constrained nonlinear regression strategies for time series [72, 113]. The
energy conserving nonlinear interactions between u 1 , u 2 and v, ωu are seen in the
following way. First, by dropping the linear and external forcing terms in (4.20), the
remaining equations involving only the nonlinear parts of (4.20) read,

$$\begin{aligned}
\frac{du_1}{dt} &= \gamma v u_1 - \omega_u u_2,\\
\frac{du_2}{dt} &= \gamma v u_2 + \omega_u u_1,\\
\frac{dv}{dt} &= -\gamma\big(u_1^2 + u_2^2\big),\\
\frac{d\omega_u}{dt} &= 0.
\end{aligned} \tag{4.22}$$
To form the evolution equation of the energy from nonlinear interactions, Ẽ =
(u₁² + u₂² + v² + ω_u²)/2, we multiply the four equations in (4.22) by u₁, u₂, v, ω_u,
respectively, and then sum them up. The resulting equation yields

$$\frac{d\tilde{E}}{dt} = 0. \tag{4.23}$$

The vanishing of the right hand side in (4.23) is due to the opposite signs of the
nonlinear terms involving v multiplying u₁ and u₂ in (4.22) and those in (4.22)
multiplied by v, as well as the trivial cancellation of the skew-symmetric terms
involving ω_u.
The stochastic damping v and stochastic phase ω_u, as well as their energy-conserv-
ing nonlinear interaction with u₁ and u₂, distinguish the models in (4.20) from the
classic damped harmonic oscillator with only constant damping d_u and phase a. It is
evident that a negative value of γ(v + v_f) serves to strengthen the total damping of
the oscillator. On the other hand, when γ(v + v_f) becomes positive and overwhelms
d_u, exponential growth of u₁ and u₂ occurs over a random interval of time, which
corresponds to intermittent instability.
The nonlinear low-order stochastic model (4.20) has been shown to have signif-
icant skill for determining the predictability limits of the large-scale cloud patterns
of the boreal winter MJO [30] and the summer monsoon [25]. In addition, incor-
porating a new information-theoretic strategy in the calibration or training phase
[21], a simplified version of (4.20) without the time-periodic damping v_f(t) has been
adopted to improve the predictability of the real-time multivariate MJO indices [26].
It is an interesting open problem to rigorously describe the intermittency and other
mathematical features in the turbulent dynamical systems in (4.20).
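The energy identity (4.23) can be checked numerically on the nonlinear core (4.22): integrating only the energy-conserving interactions and monitoring Ẽ shows no drift beyond time-stepping error. The value of γ and the initial state below are arbitrary test values, and a classical RK4 step is used only to keep the discretization error negligible.

```python
import numpy as np

# Verify that the nonlinear interactions (4.22) conserve
# E = (u1^2 + u2^2 + v^2 + omega^2)/2, as stated in (4.23).

gamma = 1.7

def nonlinear_rhs(x):
    """Right-hand side of (4.22): only the energy-conserving nonlinear terms."""
    u1, u2, v, w = x
    return np.array([gamma * v * u1 - w * u2,
                     gamma * v * u2 + w * u1,
                     -gamma * (u1**2 + u2**2),
                     0.0])

def rk4_step(x, dt):
    k1 = nonlinear_rhs(x)
    k2 = nonlinear_rhs(x + 0.5 * dt * k1)
    k3 = nonlinear_rhs(x + 0.5 * dt * k2)
    k4 = nonlinear_rhs(x + dt * k3)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

x = np.array([1.0, -0.5, 0.3, 0.8])
E0 = 0.5 * np.sum(x**2)
for _ in range(5000):
    x = rk4_step(x, 1e-3)
print("relative energy drift:", abs(0.5 * np.sum(x**2) - E0) / E0)
```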

4.3 Improving Statistical Prediction and UQ in Complex Turbulent Dynamical Systems by Blending Information Theory and Kicked Statistical Response Theory

Here we illustrate how the strategy developed in Section 4.2.3 can be utilized for
complex turbulent dynamical systems through an application to the L-96 model intro-
duced in Section 2.2 and the statistical dynamics and statistical energy conservation
principle discussed extensively in Section 3.2.2. The systematic imperfect closure
models and the calibration strategies for UQ for the L-96 model serve as a template
for similar strategies for UQ with model error in vastly more complex realistic tur-
bulent dynamical systems. Some further progress is briefly discussed at the end of
this section. The discussion below closely follows the recent paper [117] which the
interested reader can consult for more details beyond the brief sketch below.
A reasonable goal for statistical prediction in any model is to produce the mean and
variance of the statistical solution at every spatial point for the response to a general
change in forcing. For example in climate science, this is the mean and variance of
the temperature at spatial locations on the surface of the earth. For the statistically
homogeneous L-96 model from (3.13) in Section 3.2.2, the single-point statistics for
the mean and variance are obtained by averaging over the Fourier modes so that

$$\bar{u}_{1pt} = \frac{1}{J} \sum_{j=0}^{J-1} \langle u_j \rangle = \bar{u}, \qquad
R_{1pt} = \frac{1}{J} \sum_{k=-J/2+1}^{J/2} r_k = \frac{1}{J}\,\mathrm{tr} R. \tag{4.24}$$

Both the discussion of the fundamental limitation of the Gaussian closure in
Section 4.1.3 and the statistical dynamics for the L-96 model in Section 3.2.4, and
more generally in Section 3.2.2, point to the significance of accounting for the
interaction of the mean and covariance with the effect of the third moments.
Next we consider imperfect closure models which allow for this interaction and
satisfy statistical equilibrium fidelity (Section 4.2.3) for the one-point statistics in
(4.24) in calibration at the unperturbed statistical equilibrium in the L-96 model.
We consider a hierarchy of statistical dynamical closure models for the L-96 model
defined by closed equations for the mean and covariance. These imperfect models
satisfy the statistical dynamical equations

$$\frac{d\bar{u}_M(t)}{dt} = -d(t)\, \bar{u}_M(t) + \frac{1}{J} \sum_{k=-J/2+1}^{J/2} r_{M,k}(t)\, \Gamma_k + F(t), \tag{4.25a}$$

$$\frac{dr_{M,k}(t)}{dt} = 2\big[-\Gamma_k \bar{u}_M(t) - d(t)\big]\, r_{M,k}(t) + Q^M_{F,kk}, \qquad k = 0, 1, \ldots, J/2, \tag{4.25b}$$

with the nonlinear flux Q_F for the third moments replaced by

$$Q^M_{F,kk} = Q^M_{F-,kk} + Q^M_{F+,kk} = -2\, d_{M,k}(R)\, r_{M,k} + \sigma^2_{M,k}(R). \tag{4.26}$$

Here Q^M_{F-} = −2 d_{M,k}(R) r_{M,k} represents the additional damping to stabilize the
unstable modes with positive Lyapunov coefficients, while Q^M_{F+} = σ²_{M,k}(R) is the
additional noise to compensate for the overdamped modes. Now the problem is
converted to finding expressions for d_{M,k} and σ²_{M,k} consistent with the calibration
strategy in Section 4.2.3 in order to lead to accurate statistical prediction. By gradually
adding more detailed characterization of the imperfect statistical dynamical model
we display the general procedure of constructing a hierarchy of closure methods step
by step. We denote the equilibrium states for the mean and variance with unperturbed
uniform forcing as ū_∞ ≡ ⟨ū⟩, r_{j,∞} ≡ ⟨r_j⟩, and, with a little abuse of notation, let
d = ⟨d(t)⟩ and F = ⟨F(t)⟩. Step by step, we bring in more and more considerations
in characterizing the uncertainties in each mode. Finally, three different sets of closure
methods with increasing complexity and accuracy in prediction skill will be proposed,
illustrating one important statistical feature in each category.

4.3.1 Models with Consistent Equilibrium Single Point Statistics and Information Barriers

Here we want to construct the simplest closure models with consistent equilibrium
single point statistics (4.24). The direct way is to choose a constant damping and a
noise term scaled at most with the total variance. We propose two possible choices
for (4.26).

• Gaussian closure 1 (GC1-1pt): let

$$d_{M,k}(R) = d_M \equiv \text{const.}, \qquad \sigma^2_{M,k}(R) = \sigma_M^2 \equiv \text{const.},$$
$$Q_F^{GC1} = -(d_M R + R\, d_M) + \sigma_M^2 I; \tag{4.27}$$

• Gaussian closure 2 (GC2-1pt): let

$$d_{M,k}(R) = \epsilon_M \frac{J}{2} \frac{(\mathrm{tr} R)^{1/2}}{(\mathrm{tr} R_\infty)^{3/2}} \equiv \epsilon_M \bar{d}, \qquad
\sigma^2_{M,k}(R) = \epsilon_M \frac{(\mathrm{tr} R)^{3/2}}{(\mathrm{tr} R_\infty)^{3/2}},$$
$$Q_F^{GC2} = -\epsilon_M \big(\bar{d} R + R\, \bar{d}\big) + \epsilon_M \frac{(\mathrm{tr} R)^{3/2}}{(\mathrm{tr} R_\infty)^{3/2}}\, I. \tag{4.28}$$

GC1-1pt is the familiar strategy of adding constant damping and white noise forcing
to represent the nonlinear interactions [102]. In GC2-1pt, the term multiplying the
dissipation scales with (trR)^{1/2} while the term multiplying the noise scales with
(trR)^{3/2}; these are dimensionally correct surrogates for the quadratic nonlinear terms.
Note that GC1-1pt includes the parameters (d_M, σ_M²), and the nonlinear energy
trQ_F^{GC1} = −2 d_M trR_M + J σ_M² may not be conserved, while GC2-1pt has one
parameter ε_M and nonlinear energy conservation is enforced by construction,
trQ_F^{GC2} = 0. Single point statistics consistency can be fulfilled through tuning the
control parameters. But these models are calibrated by ignoring spatial correlations,
and a natural information barrier is present which cannot be overcome by these
imperfect models (see Proposition 2 and Figure 2 of [117]).
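A minimal sketch of integrating (4.25) with the GC1-1pt flux (4.27) follows: constant damping d_M and constant noise σ_M² stand in for the third-moment flux Q_F. The Γ_k values, damping d, forcing F, and closure parameters are illustrative choices, not values tuned to the L-96 climate.

```python
import numpy as np

# One-point closure sketch: (4.25) with the GC1-1pt flux (4.27),
# Q_F^M = -2*d_M*r + sigma_M^2, stepped to a statistical steady state.

Gamma = np.array([0.6, 0.2, -0.3, -0.7])   # hypothetical mode coefficients
d, F = 1.0, 2.0
d_M, sigma_M2 = 1.5, 2.0                   # GC1-1pt closure parameters

def rhs(ubar, r):
    """Closed mean/variance dynamics (4.25) with the GC1-1pt flux."""
    dubar = -d * ubar + np.mean(Gamma * r) + F
    dr = 2.0 * (-Gamma * ubar - d) * r - 2.0 * d_M * r + sigma_M2
    return dubar, dr

ubar, r = 0.0, np.ones_like(Gamma)
dt = 1e-3
for _ in range(50000):
    du, dr = rhs(ubar, r)
    ubar, r = ubar + dt * du, r + dt * dr
print("steady mean:", ubar, "steady variances:", r)
```

Unlike the bare Gaussian closure of Section 4.1, the added damping and noise support a non-trivial statistical steady state with positive variances in every mode, which is exactly the role of the surrogate flux Q_F^M.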

4.3.2 Models with Consistent Unperturbed Equilibrium Statistics for Each Mode

Next we improve the previous closure methods to ensure equilibrium statistical con-
sistency in each mode. This can be achieved simply by changing the damping rate
for each mode according to the stationary state statistics. Specifically, for GC1 and
GC2 in (4.27) and (4.28) above, the models can be improved by a slight modification
of the damping rates along each mode.
58 4 Statistical Prediction and UQ for Turbulent Dynamical Systems

• GC1:

$$d_{M,k}(R) = d_{M,k}, \qquad \sigma_{M,k}^2(R) = \sigma_M^2, \qquad d_{M,k} = \left[-\Gamma_k \bar{u}_\infty - d\right] + \frac{\sigma_M^2}{2 r_{k,\infty}}; \tag{4.29}$$

• GC2:

$$d_{M,k}(R) = \epsilon_{1,k}\,\bar{d}, \qquad \sigma_{M,k}^2(R) = \epsilon_M \frac{(\operatorname{tr}R)^{3/2}}{(\operatorname{tr}R_\infty)^{3/2}},$$
$$\epsilon_{1,k} = \frac{2\left[-\Gamma_k \bar{u}_\infty - d\right] r_{k,\infty} + \epsilon_M}{J r_{k,\infty}/\operatorname{tr}R_\infty}, \qquad \bar{d} = \frac{J(\operatorname{tr}R)^{1/2}}{2(\operatorname{tr}R_\infty)^{3/2}}. \tag{4.30}$$

Above, $d_{M,k}$ or $\epsilon_{1,k}$ is chosen so that the system in (4.25) has the same equilibrium
mean $\bar{u}_\infty$ and variance $r_{k,\infty}$ as the true model, thereby ensuring equilibrium
consistency; this follows from the steady state solutions of (4.25) through simple algebraic
manipulations. Still, in (4.29) and (4.30) the damping and noise are chosen empirically
(depending on the one additional parameter $\sigma_M^2$ or $\epsilon_M$) without consideration
of the true dynamical features in each mode. A more sophisticated strategy,
with slightly more computational complexity, is to introduce the damping and
noise judiciously according to the linearized dynamics; then climate consistency
for each mode is satisfied automatically. That is the modified Gaussian closure
model (MQG) introduced in [159]. We can also include this model in our
category as
• MQG:

$$d_M(R) = \frac{f(R)}{f(R_\infty)}\, N_\infty, \qquad \sigma_M^2(R) = \frac{\operatorname{tr}Q_{F-}^{MQG}}{\operatorname{tr}Q_{F+,\infty}^{MQG}}\left[-\left(\Gamma_k \bar{u}_\infty + d\right) r_{k,\infty}\,\delta_{I_+} + q_s\right], \tag{4.31}$$

with

$$N_{\infty,kk} = \frac{1}{2}\left[\Gamma_k \bar{u}_\infty + d\right]\delta_{I_-} - q_s\, r_{k,\infty}^{-1}.$$

Above, $I_-$ represents the unstable modes with $\Gamma_k \bar{u}_\infty + d > 0$ while $I_+$ is the
stable ones with $\Gamma_k \bar{u}_\infty + d \le 0$. We usually choose $f(R) = \sqrt{\operatorname{tr}R}$, and
$q_s = d_s \lambda_{\max}\left(Q_{F,\infty}\right)$ ($\lambda_{\max}$ the largest eigenvalue of $Q_{F,\infty}$) serves as one additional tuning
parameter to control the model responses.
The three classes of closure models GC1, GC2, and MQG all satisfy equilibrium
statistical fidelity in the unperturbed climate. The most sophisticated closure, MQG,
satisfies the statistical symmetry $\operatorname{tr}Q_F^{MQG} \equiv 0$; GC2 satisfies this only at the unperturbed
statistical equilibrium, while GC1 never satisfies this statistical symmetry.
With the statistical energy conservation principle for L-96 as a guideline, we expect
that the parameters in MQG and GC2 can both be calibrated according to the blending
of information and kicked response from Section 4.2.3, with highly improved predictions
and UQ for the response to external forcing, while GC1 is less skillful. The detailed

study in [117] confirms this. Below is a brief summary. We use the traditional value
F = 8 in the L-96 model.

4.3.3 Calibration and Training Phase

In Figure 4.1 we show the results of the calibration strategy using the optimization
formula in (4.19), which combines information theory and the kicked response, for the three
classes of imperfect models. The optimized information errors for
GC2 and MQG are smaller than for GC1, consistent with our discussion of
the symmetry in the nonlinear energy of each method. The same can be observed
from the plots of the response operators: for GC2 and MQG, good agreement for
the mean state always implies a good fit for the total variance, while large errors in
the total variance appear with GC1 even though the mean state is fit very well.

4.3.4 Testing Imperfect Model Prediction Skill and UQ with Different Forced Perturbations

We have achieved the optimal model parameters by tuning response operators in the
training phase with the help of information theory. This optimal model can minimize
the information barrier in model predictions and offer uniform performance in
response to various perturbations. To validate this point, we compare the
model improvement in prediction skill under various forcing perturbations.
Specifically, we choose four different perturbed external forcing forms representing
distinct dynamical features; they are plotted in Figure 4.2. The first two are
ramp-type perturbations of the external forcing, driving the system smoothly from
equilibrium to a perturbed state with higher or lower energy; this can be viewed as a
simple model mimicking a climate change scenario. Next, mimicking a seasonal
cycle, we also check the case of periodic perturbations. Finally, random white noise
forcing is applied to test the models' ability to handle random effects. All
perturbations $\delta F$ have an amplitude (or variance) of 10% of the
equilibrium value F = 8.
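The four perturbation classes can be written down explicitly; the profiles below are an illustrative sketch (the text fixes only the 10% amplitude about F = 8, not the exact ramp shapes or time scales, which are assumed here):

```python
import numpy as np

F_eq, amp = 8.0, 0.8                      # 10% of the equilibrium forcing
t = np.linspace(0.0, 30.0, 601)
rng = np.random.default_rng(1)

ramp_up   = F_eq + amp * 0.5 * (1.0 + np.tanh(t - 5.0))   # smooth upward ramp
ramp_down = F_eq - amp * 0.5 * (1.0 + np.tanh(t - 5.0))   # smooth downward ramp
periodic  = F_eq + amp * np.sin(2.0 * np.pi * t / 5.0)    # seasonal-cycle analog
random_f  = F_eq + amp * rng.standard_normal(t.size)      # white-noise forcing

print(np.isclose(ramp_up[-1], F_eq + amp))   # True: ramp saturates at the perturbed state
```

The ramps drive the forcing smoothly to a new constant value, the periodic case oscillates about the equilibrium, and the random case fluctuates about F = 8 with the prescribed variance.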
Figures 4.3, 4.4, 4.5 and 4.6 compare three imperfect model performances under
the four different forcing perturbations. To illustrate the improvement in predic-
tion skill through this information-response framework, the model predictions with
optimal parameters from the training phase are displayed together with another pre-
diction result using a non-optimal parameter by fitting the mean only in the training
phase. We can regard this imperfect model as a familiar sophisticated version of
the strategy from climate science when only the mean is tuned. But more, we show
the model outputs for the mean and total variance with closure methods GC1, GC2,

Fig. 4.1 Training the imperfect models in the training phase against the full-dimensional model by tuning
model parameters. Time-averaged information distances in the linear response operators for the
three closure models GC1, GC2, and MQG as the model parameter changes are shown in the first
row (the point where the information error is minimized is marked with a circle). The second row
compares the optimized imperfect model response operators for the mean and total variance with
the truth for all three imperfect models


Fig. 4.2 External forcing perturbations δ F (upward ramp forcing, downward ramp forcing, periodic
forcing, and random forcing) to test the model sensitivity with unperturbed forcing F = 8 in the
unperturbed equilibrium

Fig. 4.3 Upward ramp-type forcing: imperfect model predictions of the mean state and total variance
for the closure methods GC1 (dotted-dashed blue), GC2 (dashed green), and MQG (solid red) are
compared with the truth (thick solid black) from Monte-Carlo simulation. For comparison between
different parameter values, results with the optimal parameter (left column) and one non-optimal case
fitting only the mean in the training phase (right column) are displayed together

MQG compared with the truth from Monte-Carlo simulation. As expected, the model
prediction skill increases as the calibration of the nonlinear flux becomes more
detailed, from GC1 to GC2 to MQG.

Fig. 4.4 Downward ramp-type forcing: imperfect model predictions of the mean state and total variance
for the closure methods GC1 (dotted-dashed blue), GC2 (dashed green), and MQG (solid red) are
compared with the truth (thick solid black) from Monte-Carlo simulation. For comparison between
different parameter values, results with the optimal parameter (left column) and one non-optimal case
fitting only the mean in the training phase (right column) are displayed together

Fig. 4.5 Periodic forcing: imperfect model predictions of the mean state and total variance for the
closure methods GC1 (dotted-dashed blue), GC2 (dashed green), and MQG (solid red) are compared
with the truth (thick solid black) from Monte-Carlo simulation. For comparison between different
parameter values, results with the optimal parameter (left column) and one non-optimal case fitting
only the mean in the training phase (right column) are displayed together

4.3.5 Reduced-Order Modeling for Complex Turbulent Dynamical Systems

For reduced-order modeling on a lower-dimensional subspace one needs to deal with
the new difficulty that the statistical symmetry $\operatorname{tr}Q_F = 0$ is no longer satisfied on the
subspace. Nevertheless, there is a systematic version of GC2 utilizing the statistical
energy conservation principle for L-96 from Chapter 3 which can be calibrated in

Fig. 4.6 Random forcing: imperfect model predictions of the mean state and total variance for the
closure methods GC1 (dotted-dashed blue), GC2 (dashed green), and MQG (solid red) are compared
with the truth (thick solid black) from Monte-Carlo simulation. For comparison between different
parameter values, results with the optimal parameter (left column) and one non-optimal case fitting
only the mean in the training phase (right column) are displayed together

a training phase using Section 4.2.3 to produce very skillful predictions and UQ
using only three reduced modes for the L-96 model [117]; the three modes consist
of the two most energetic modes and the mean state, $\bar{u}$. Current research on
reduced-order modeling applies the statistical energy principle from Chapter 3,
together with the training calibration phase from Section 4.2.3, to the one-layer
models with complex geophysical effects from Chapters 2 and 3 [145] and to
two-layer baroclinic turbulence [146, 158].
Chapter 5
State Estimation, Data Assimilation, or Filtering for Complex Turbulent Dynamical Systems

State estimation, also called filtering or data assimilation, is the process of obtaining
an accurate statistical state estimate of a natural system from partial observations
of the true signal from nature. In many contemporary applications in science and
engineering real-time state estimation of a turbulent system from nature involving
many active degrees of freedom is needed to provide an accurate initial statistical state
in order to make accurate prediction and UQ of the future state. This is obviously a
problem with significant practical impact. Important contemporary examples involve
the real-time filtering of weather and climate as well as engineering applications
such as the spread of hazardous plumes or pollutants. The same mathematical issues
of overcoming the curse of ensemble size or the curse of dimension and making
judicious model errors for complex turbulent dynamical systems, discussed earlier in
Chapters 1 and 4 for prediction and UQ, apply to state estimation of turbulent dynamical
systems [112]; a key new challenge is to exploit the additional information from the
observations in a judicious fashion. This is a very active and important contemporary
research topic. In this chapter we provide a brief introduction to important recent
directions in the applied mathematical research for filtering to entice the reader to
further study but do not attempt a pedagogical development (see [8, 90, 112, 148] for
basic treatments). Of course these methods for state estimation are intimately linked
with prediction and UQ of complex turbulent dynamical systems.
Here is a brief summary of the topics treated below. In Section 5.1 we introduce
state estimation in complex turbulent systems by discussing the intuitive
and appealing inverse problem of recovering the turbulent velocity field from noisy
Lagrangian tracers, a central problem in contemporary oceanography with recent
rigorous mathematical theory [29, 31, 32, 55, 58]. Under natural hypotheses, the
Lagrangian tracer problem is a special example of the fact that many interesting
non-Gaussian turbulent dynamical systems, despite their nonlinearity, have the hidden

© Springer International Publishing Switzerland 2016


A.J. Majda, Introduction to Turbulent Dynamical Systems in Complex Systems,
Frontiers in Applied Dynamical Systems: Reviews and Tutorials 5,
DOI 10.1007/978-3-319-32217-9_5

structure of conditional Gaussian systems [29], and an introduction to this important
topic is given in Section 5.2. The geoscience community [9, 44, 81, 112] introduced
the finite ensemble Kalman filter to cope with the curse of ensemble size in filtering
complex turbulent dynamical systems, and this approach is popular and often very
useful in both engineering and geoscience. Section 5.3 contains an introduction to
that topic as well as recent rigorous mathematical theories for the nonlinear stability
of these methods [83, 168, 169]. Section 5.4 covers the important topic of multi-scale
state estimation algorithms for complex turbulent systems; here the conditional
Gaussian framework introduced in Section 5.2 is the starting point for the discussion
of a suite of three novel multi-scale methods of varying complexity, together with
recent promising results obtained with these novel algorithms, which are a source of
further mathematical analysis and practical algorithms.

5.1 Filtering Noisy Lagrangian Tracers for Random Fluid Flows

An important practical inverse problem is the recovery of a turbulent velocity field
from noisy Lagrangian tracers moving with the fluid flow. Thus, we observe L noisy
Lagrangian trajectories $X_j(t)$, $1 \le j \le L$, with dynamics

$$\frac{dX_j}{dt} = v(X_j(t), t) + \sigma_j \dot{W}_j, \qquad 1 \le j \le L, \tag{5.1}$$
where $\sigma_j$ is the noise strength. The goal is to recover an accurate statistical estimate of
the turbulent velocity field $v(x, t)$ from these L measurements $X_j(t)$. This appealing
inverse problem is central in contemporary oceanography [55, 58] and has attracted
the attention of many applied mathematical scientists [10, 89, 151, 152, 160]. There
has been recent major progress in mathematically rigorous theory for these inverse
problems for both incompressible flows [31] and compressible flows [32], including
a novel assessment of model error for approximate filters in the compressible case
utilizing pathwise error estimates through information theory [21, 29].
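The observation model (5.1) is easy to sketch numerically; the following toy example advects noisy tracers in an assumed single-mode incompressible velocity field on the periodic unit square, an illustrative stand-in for the random fields analyzed in [31]:

```python
import numpy as np

rng = np.random.default_rng(2)
L, dt, nsteps, sigma = 8, 1e-3, 1000, 0.1     # illustrative values

def velocity(X):
    """Divergence-free single-mode field v = (sin(2 pi y), sin(2 pi x))."""
    return np.stack([np.sin(2*np.pi*X[:, 1]), np.sin(2*np.pi*X[:, 0])], axis=1)

X = rng.random((L, 2))                        # initial tracer positions
for n in range(nsteps):                       # Euler-Maruyama for dX = v dt + sigma dW
    X = X + velocity(X)*dt + sigma*np.sqrt(dt)*rng.standard_normal((L, 2))
    X %= 1.0                                  # wrap to the periodic domain

print(X.shape)   # (8, 2): positions of the L noisy tracers at the final time
```

The noisy tracer paths X are the observations; the inverse problem is to recover the underlying velocity field from them.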
Below is a brief sketch of the main results and the strategy of proof for the random
incompressible case. The principal difficulty is that the Lagrangian tracer equation
in (5.1) is strongly nonlinear even if the underlying fluid flow is very simple. Nevertheless,
despite the inherent nonlinearity in measuring noisy Lagrangian tracers, these
problems have the hidden structure of a conditional Gaussian filtering system under
general hypotheses [31, 32, 95]; this means that there are closed equations for the
conditional mean and covariance of the filter with random coefficients, involving
random matrix Riccati equations for the covariance. Such a general hidden conditional
Gaussian structure for partially observed turbulent dynamical systems is discussed
in Section 5.2 [29]. The next key step in the incompressible periodic setting is to
prove geometric ergodicity (see Section 3.4) of the tracer paths with respect to the
uniform measure, conditional on the random velocity field. The final key step is to prove

a mean field limit for the random Riccati equations as time goes to infinity and the
number of tracers L becomes large with an explicit deterministic limit.
Here the main results proved for the incompressible case [31] through the above
strategy are summarized informally:
• The posterior covariance matrix approaches a deterministic diagonal matrix $R_L$
which scales as $L^{-1/2}$. See Theorem 3.3, part (i) in [31];
• The posterior mean, i.e. the maximum likelihood estimator produced by the filter,
converges to the true value of the signal. See Theorem 3.3, parts (ii) and (iii) in
[31];
• The total uncertainty reduction gained by the observations, measured either
in relative entropy or mutual information, asymptotically increases as $\frac{1}{4}|\mathcal{K}| \ln L$,
where $|\mathcal{K}|$ is the number of modes included; in other words, we gain $\frac{1}{4}$ nat for each
mode with each additional order of magnitude of L. See Corollary 3.4 in [31].
Thus the information gained as the number of tracers becomes large increases very
slowly and has a practical information barrier. All of these theoretical points above
are confirmed by careful numerical experiments.
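As a back-of-the-envelope illustration of this logarithmic barrier (with an assumed mode count, not a computation from [31]):

```python
import numpy as np

K = 10                                    # number of resolved modes (illustrative)
gain = lambda L: 0.25 * K * np.log(L)     # uncertainty reduction ~ (1/4)|K| ln L, in nats

for L in (10, 100, 1000, 10000):
    print(L, round(gain(L), 2))
# every factor of 10 in the number of tracers adds the same modest
# increment 0.25*K*ln(10), so the information gain saturates very slowly
```

Even a hundredfold increase in tracers only doubles the gain relative to L = 10, which is the practical information barrier noted above.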
Additional new multi-scale phenomena occur in the important practical case of
random compressible flows, since the optimal filter is very expensive and includes
the effect of fast compressible gravity waves; thus reduced filters for the multi-scale
system are needed. The basic theorem in the compressible case involves a
fast wave averaging principle which establishes rigorously that the much cheaper
reduced filters have the same skill as the optimal filter, in the limit of fast rotation, on
bounded time intervals for the slow part of the random compressible velocity field
[32]. A practical rigorous theory of these model errors utilizing information theory
was developed recently [29]. Once again, careful numerical experiments confirm the
mathematical theory and reveal further multi-scale phenomena in these filters.

5.2 State Estimation for Nonlinear Turbulent Dynamical Systems Through Hidden Conditional Gaussian Statistics

The existence of closed analytic formulas with random coefficients for filtering
nonlinear noisy Lagrangian tracers was a key step in the rigorous analysis discussed
in Section 5.1. Here we show that such a hidden conditional Gaussian structure often
occurs in practically observed turbulent dynamical systems, and we list many other
applications [27].
Partition the state variable $u = (u_I, u_{II})$ of a turbulent dynamical system into
partially observed variables $u_I$ and remaining dynamic variables $u_{II}$, which need
state estimation and prediction.
The conditional Gaussian systems are turbulent dynamical systems which have
the following abstract form,
68 5 State Estimation, Data Assimilation, or Filtering …

$$
\begin{aligned}
du_I &= \left[A_0(t, u_I) + A_1(t, u_I)\,u_{II}\right]dt + \Sigma_I(t, u_I)\,dW_I(t), \qquad &(5.2\text{a})\\
du_{II} &= \left[a_0(t, u_I) + a_1(t, u_I)\,u_{II}\right]dt + \Sigma_{II}(t, u_I)\,dW_{II}(t), \qquad &(5.2\text{b})
\end{aligned}
$$

where $u_I(t)$ and $u_{II}(t)$ are vector state variables; $A_0, A_1, a_0, a_1, \Sigma_I$ and $\Sigma_{II}$ are
vectors and matrices that depend only on time $t$ and the state variables $u_I$; and $W_I(t)$ and
$W_{II}(t)$ are independent Wiener processes. Once the observed path $u_I(s)$ for $s \le t$
is given, $u_{II}(t)$ conditioned on $u_I(s)$ becomes a Gaussian process with mean $\bar{u}_{II}(t)$
and covariance $R_{II}(t)$, i.e.,

$$p\left(u_{II}(t)\,\middle|\,u_I(s \le t)\right) \sim \mathcal{N}\left(\bar{u}_{II}(t), R_{II}(t)\right), \tag{5.3}$$

provided that the noise matrix $\Sigma_{II}(t, u_I)$ is nonsingular. Here and below $\mathcal{N}(\bar{u}, R)$
denotes a Gaussian random variable with mean $\bar{u}$ and covariance $R$. Despite the
conditional Gaussianity, the coupled system (5.2) remains highly nonlinear and is
able to capture non-Gaussian features such as skewed or fat-tailed distributions
as observed in nature [15, 139].
One of the desirable features of the conditional Gaussian system (5.2) is that the
conditional distribution in (5.3) has the following closed analytic form [95],

d ūII (t) =[a0 (t, uI ) + a1 (t, uI )ūII ]dt + (RII A∗1 (t, uI ))( I  ∗I )−1 (t, uI )×
[duI − (A0 (t, uI ) + A1 (t, uI )ūII )dt],
 (5.4)
dRII (t) = a1 (t, uI )RII + RII a1∗ (t, uI ) + ( II  ∗II )(t, uI )

−(RII A∗1 (t, uI ))( I  ∗I )−1 (t, uI )(RII A∗1 (t, uI ))∗ dt.

The exact and accurate solutions in (5.4) provide a general framework for studying
continuous-time filtering and uncertainty quantification of the conditional Gaussian
system (5.2). In filtering the turbulent system (5.2), if $u_I(s \le t)$ is the observed
process, then the posterior states of the unobserved process $u_{II}(t)$ in (5.3) are updated
following the analytic formula in (5.4) associated with the nonlinear filter.
In the special case where the coefficients $A_0, a_0$ are linear in $u_I$ and the matrices
$A_1(t), a_1(t), \Sigma_I(t), \Sigma_{II}(t)$ depend only on time with no nonlinear dependence on
$u_I$, the above formulas reduce to the celebrated Gaussian Kalman–Bucy filter [79, 80]
for continuous time, with noisy observations of the variables $u_I$ and optimal
estimation of the variables $u_{II}(t)$. The key advantage of the general systems in (5.2)
is that they can be conditionally Gaussian while remaining highly non-Gaussian,
even with intermittency and fat tails.

5.2.1 Examples and Applications of Filtering Turbulent Dynamical Systems as Conditional Gaussian Systems

We begin with examples of the triad models discussed earlier in Section 2.3
and Chapter 3 as instructive building blocks of many turbulent dynamical
systems. Here we show that a wide class of triad models has the filtering structure of

conditional Gaussian systems. Somewhat surprisingly, this class includes the noisy
version of the celebrated three-mode model of chaotic dynamics due to Lorenz, the
L-63 model [97]. This example illustrates the non-Gaussian features in the general
framework discussed here involving conditional Gaussian distributions. Further
practical generalizations of these conditional Gaussian ideas to multi-scale data
assimilation algorithms are discussed in Section 5.4.

5.2.1.1 Triad Models and the Noisy Lorenz Model

The nonlinear coupling in triad systems is generic for the nonlinear coupling between any
three modes in larger systems with quadratic nonlinearities. Here, we introduce the
general form of the triad models that belongs to the conditional Gaussian framework
(5.2),

$$
\begin{aligned}
du_I &= \left(L_{11} u_I + L_{12} u_{II} + F_1\right)dt + \sigma_1\, dW_I,\\
du_{II} &= \left(L_{22} u_{II} + L_{21} u_I + \gamma\, u_{II} + F_2\right)dt + \sigma_2\, dW_{II},
\end{aligned} \tag{5.5}
$$

where $u_I = u_1$ and $u_{II} = (u_2, u_3)^T$, and the coefficients $L_{11}, L_{12}, L_{21}, L_{22}$ and $\gamma$ are
functions of only the observed variable. In (5.5), either $u_I$ or $u_{II}$ can be regarded as the
observed variable, and correspondingly the other one becomes the unresolved variable
that requires filtering. The triad model (5.5) has wide applications in atmosphere and
ocean science. One example is the stochastic mode reduction model (also known as
the MTV model) [121–124], which includes both a wave–mean flow triad model and a
climate scattering triad model for the barotropic equations [122]. Another example of
(5.5) involves the slow–fast waves in the coupled atmosphere–ocean system [49–52,
112], where one slow vortical mode interacts with two fast gravity modes with the
same Fourier wavenumber.
With the following choice of the matrices and vectors in (5.5),

$$
u_I = x, \qquad u_{II} = (y, z)^T, \qquad L_{11} = -\sigma, \qquad L_{12} = (\sigma, 0), \qquad L_{21} = (\rho, 0)^T, \qquad \sigma_1 = \sigma_x,
$$
$$
L_{22} = \begin{pmatrix} -1 & 0 \\ 0 & -\beta \end{pmatrix}, \qquad \gamma = \begin{pmatrix} 0 & -x \\ x & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} \sigma_y \\ \sigma_z \end{pmatrix},
$$

the triad model (5.5) becomes the noisy Lorenz 63 (L-63) model [97],

$$
\begin{aligned}
dx &= \sigma(y - x)\,dt + \sigma_x\, dW_x,\\
dy &= \left[x(\rho - z) - y\right]dt + \sigma_y\, dW_y,\\
dz &= (xy - \beta z)\,dt + \sigma_z\, dW_z.
\end{aligned} \tag{5.6}
$$
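This identification can be verified numerically; the sketch below assumes the reading $L_{21} u_I = (\rho x, 0)^T$ and checks that the triad-form drift of (5.5) reproduces the L-63 drift of (5.6) at an arbitrary state:

```python
import numpy as np

rho, sigma, beta = 28.0, 10.0, 8.0/3.0

def triad_drift(x, uII):
    """Drift of (5.5) under the identification u_I = x, u_II = (y, z)^T."""
    L11, L12 = -sigma, np.array([sigma, 0.0])
    L21 = np.array([rho, 0.0])                     # so L21 * u_I = (rho*x, 0)^T
    L22 = np.array([[-1.0, 0.0], [0.0, -beta]])
    gamma = np.array([[0.0, -x], [x, 0.0]])        # skew-symmetric: energy-conserving
    dI = L11*x + L12 @ uII
    dII = L22 @ uII + L21*x + gamma @ uII
    return dI, dII

def l63_drift(x, y, z):
    """Drift of the L-63 model (5.6)."""
    return sigma*(y - x), np.array([x*(rho - z) - y, x*y - beta*z])

x, y, z = 1.3, -0.7, 2.1
dI, dII = triad_drift(x, np.array([y, z]))
eI, eII = l63_drift(x, y, z)
print(bool(np.isclose(dI, eI)) and np.allclose(dII, eII))   # True
```

The skew-symmetry of the coupling matrix is what makes the nonlinear interactions energy-conserving, consistent with the remark below on the noisy L-63 model.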

As is known, adopting the following parameters



Fig. 5.1 Simulations of the noisy L-63 model (5.6). Row (a): $\sigma_x = \sigma_y = \sigma_z = 0$; row (b): $\sigma_x = \sigma_y = \sigma_z = 5$; row (c): $\sigma_x = \sigma_y = \sigma_z = 10$

$$\rho = 28, \qquad \sigma = 10, \qquad \beta = 8/3, \tag{5.7}$$

the deterministic version of (5.6) has chaotic solutions, where the trajectory of the
system traces the famous butterfly profile on the attractor. This feature is preserved
in the presence of small or moderate noise in (5.6); see Figure 5.1 for trajectories of
(5.6) with $\sigma_x = \sigma_y = \sigma_z = 0, 5$ and $10$. Note that the noisy L-63 model possesses
the property of energy-conserving nonlinear interactions.
The noisy L-63 model (5.6) equipped with the parameters (5.7) is utilized as a test
model in [27]. Here is a brief summary. They first study filtering the unresolved
trajectories given one realization of the noisy observations. Then an efficient conditional
Gaussian ensemble mixture approach, described next in Section 5.2.1.2, is designed
to approximate the time-dependent PDF associated with the unresolved variables,
which requires only a small ensemble of the observational trajectories. In both studies,
the effect of model error due to noise inflation and underdispersion is studied.
Underdispersion occurs in many models for turbulence, since they have too much
dissipation [141] due to inadequate resolution and deterministic parameterization
of unresolved features, while noise inflation is adopted in many imperfect forecast
models to reduce the lack of information [9, 81, 112] and suppress catastrophic
filter divergence [73, 168].
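The closed filter equations (5.4) for this example discretize readily. Below is a minimal Euler–Maruyama sketch for the noisy L-63 model (5.6), with $x$ observed and $(y, z)$ filtered; the step size, initial condition, and noise levels are illustrative choices, not values from [27]:

```python
import numpy as np

rho, sig, beta = 28.0, 10.0, 8.0/3.0
sx, sy, sz = 5.0, 5.0, 5.0
dt, nsteps = 1e-3, 20000
rng = np.random.default_rng(3)

# --- generate one realization of the signal and record the observed x(t) ---
u = np.array([1.0, 1.0, 1.0])                      # (x, y, z)
xs = np.empty(nsteps + 1); xs[0] = u[0]
for n in range(nsteps):
    x, y, z = u
    drift = np.array([sig*(y - x), x*(rho - z) - y, x*y - beta*z])
    u = u + drift*dt + np.array([sx, sy, sz])*np.sqrt(dt)*rng.standard_normal(3)
    xs[n + 1] = u[0]

# --- conditional Gaussian filter: Euler step of the mean/covariance (5.4) ---
mean, R = np.zeros(2), 10.0*np.eye(2)
QII = np.diag([sy**2, sz**2])                      # Sigma_II Sigma_II^*
for n in range(nsteps):
    x = xs[n]
    A0, A1 = np.array([-sig*x]), np.array([[sig, 0.0]])   # observed x-dynamics
    a0 = np.array([rho*x, 0.0])                    # unobserved (y, z) drift pieces
    a1 = np.array([[-1.0, -x], [x, -beta]])
    G = (R @ A1.T) / sx**2                         # gain; (Sigma_I Sigma_I^*)^{-1} = 1/sx^2
    innov = (xs[n + 1] - x) - (A0 + A1 @ mean)*dt  # du_I minus its expected drift
    mean = mean + (a0 + a1 @ mean)*dt + G @ innov
    R = R + (a1 @ R + R @ a1.T + QII - G @ (A1 @ R))*dt

print(np.allclose(R, R.T))   # True: the posterior covariance stays symmetric
```

Here $A_0 = -\sigma x$ and $A_1 = (\sigma, 0)$ come from the observed $x$ equation, while $a_0, a_1$ collect the $(y, z)$ drift; the random Riccati equation for $R$ is the last line of the loop.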

5.2.1.2 Recovering the Time-Dependent PDF of the Unobserved Variables Utilizing Conditional Gaussian Mixtures

One important issue in uncertainty quantification for turbulent systems is to recover
the time-dependent PDF associated with the unobserved processes. In a typical scenario,
the phase space of the unobserved variables is quite large while that of the
observed ones remains moderate or small. The classical approaches involve solving

the Fokker–Planck equation or adopting Monte Carlo simulation, both of which
become quite expensive as the dimension increases, a difficulty known as the "curse of
dimensionality" [39, 112]. For conditional Gaussian systems, the PDF associated with
the unobserved processes can be approximated with high accuracy by an efficient
conditional Gaussian ensemble mixture, where only a small ensemble of observed
trajectories is needed due to their relatively low dimension, which is thus computationally
affordable. Note that the idea here is similar to that of the blended methods for filtering
high-dimensional turbulent systems [118, 144, 160] discussed later in Section 5.4.
Below, we provide a general framework for utilizing conditional Gaussian mixtures
to approximate the time-dependent PDF associated with the unobserved processes.
Test examples of this approach are based on the 3D noisy L-63 system [27],
although the method can easily be generalized to systems with a large number of
unobserved variables.
Let us recall the observed variables $u_I$ and the unobserved variables $u_{II}$ in the
conditional Gaussian system (5.2). Their joint distribution is denoted by

$$p(u_I, u_{II}) = p(u_I)\, p(u_{II}|u_I).$$

For simplicity here, assume we have L independent observational trajectories
$u_I^1, \ldots, u_I^L$ starting from the same location, so that they are equally weighted.
The marginal distribution of $u_I$ is approximated by

$$p(u_I) \approx \frac{1}{L}\sum_{i=1}^{L} \delta\left(u_I - u_I^i\right). \tag{5.8}$$

The marginal distribution of $u_{II}$ at time $t$ is expressed by

$$p(u_{II}) = \int p(u_I, u_{II})\, du_I = \int p(u_I)\, p(u_{II}|u_I)\, du_I \approx \frac{1}{L}\sum_{i=1}^{L} p\left(u_{II}\,\middle|\,u_I^i\right), \tag{5.9}$$

where for each observation $u_I^i$, according to the analytically closed form (5.4),

$$p\left(u_{II}(t)\,\middle|\,u_I^i(s \le t)\right) \sim \mathcal{N}\left(\bar{u}_{II}^i(t), R_{II}^i(t)\right). \tag{5.10}$$

Thus, the PDF associated with the unobserved variable $u_{II}$ is approximated utilizing
(5.9) and (5.10). Note that in many practical problems involving turbulent systems,
such as in oceanography, the dimension of the observed variables is much lower
than that of the unobserved ones. Thus, only a small ensemble size L is needed to
approximate the low-dimensional marginal distribution $p(u_I)$ in (5.8) and recover
the marginal distribution $p(u_{II})$ associated with the unobserved process with this
conditional Gaussian ensemble mixture approach. For example [27], extensive tests

of the evolving non-Gaussian PDF demonstrate high accuracy for the noisy L-63
model with L = 100, while comparably accurate simulations by direct
Monte Carlo require 50,000 realizations.
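The mixture construction (5.8)-(5.10) is a few lines of code; the sketch below uses synthetic stand-ins for the filter outputs $(\bar{u}_{II}^i, R_{II}^i)$ rather than results from [27], and crudely checks that the resulting mixture PDF integrates to one:

```python
import numpy as np

rng = np.random.default_rng(4)
L, d = 100, 2
means = rng.standard_normal((L, d))                      # synthetic \bar{u}_II^i(t)
covs = np.array([(0.5 + rng.random())*np.eye(d) for _ in range(L)])  # R_II^i(t)

def mixture_pdf(u, means, covs):
    """Equal-weight Gaussian mixture p(u_II) = (1/L) sum_i N(u; mean_i, R_i), Eq. (5.9)."""
    d = means.shape[1]
    diff = u - means                                     # (L, d) deviations
    quad = np.einsum('li,lij,lj->l', diff, np.linalg.inv(covs), diff)
    norm = np.sqrt((2.0*np.pi)**d * np.linalg.det(covs))
    return np.mean(np.exp(-0.5*quad) / norm)

# Crude check that the mixture integrates to one on a grid.
xs = np.linspace(-8.0, 8.0, 41)
h = xs[1] - xs[0]
mass = sum(mixture_pdf(np.array([a, b]), means, covs) for a in xs for b in xs) * h**2
print(round(mass, 2))   # ~ 1.0
```

In practice the means and covariances would come from running the closed formula (5.4) along each of the L observed trajectories; only the low-dimensional observed marginal needs an ensemble.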

5.2.1.3 Other Applications of the Conditional Gaussian Framework to Filtering Non-Gaussian Complex Systems Including Model Error

Here is a brief list of other applications to complex turbulent dynamical systems
with model error. More applications of the basic conditional Gaussian framework
to judicious model errors for dyad models and parameter estimation can be found
in [27]. Recently, the conditional Gaussian nonlinear filter was adopted for filtering
the stochastic skeleton model for the Madden–Julian oscillation (MJO) [28,
166], where equatorial waves and moisture are filtered given observations of
the highly intermittent envelope of convective activity. As discussed in Section 5.1,
another application of this exact and accurate nonlinear filter involves filtering
turbulent flow fields utilizing observations from noisy Lagrangian tracer trajectories
[31, 32], where an information barrier was shown as the number of tracers increases
[31] and a multiscale filtering strategy was studied for the system with coupled
slow vortical modes and fast gravity waves [32], including the effect of model
errors [29]. In addition, a family of low-order physics-constrained nonlinear
stochastic models with intermittent instability and unobserved variables, which belong
to the conditional Gaussian family, was utilized for predicting the MJO and the
monsoon indices [25, 26, 30]. See Section 4.2.4.1 for a discussion of these models.
The effective filtering scheme was adopted for the on-line initialization of the
unobserved variables, which facilitates a skillful ensemble prediction algorithm. As will be
discussed in Section 5.4, other applications that fit into the conditional Gaussian framework,
sometimes without explicit exact formulas, include the cheap exactly solvable
forecast models in dynamic stochastic superresolution of sparsely observed turbulent
systems [19, 82], stochastic superparameterization for geophysical turbulence [91,
110, 112], and blended particle filters for large-dimensional chaotic systems [118,
144] that capture non-Gaussian features in an adaptively evolving low-dimensional
subspace through particles interacting with conditional Gaussian statistics on the
remaining phase space.

5.3 Finite Ensemble Kalman Filters (EnKF): Applied Practice, Mathematical Theory, and New Phenomena

With the growing importance of accurate weather forecasting and the expanding
availability of geophysical measurements, data assimilation has never been more vital
to society. Ensemble-based assimilation methods, including the Ensemble Kalman

filters (EnKF) [44] and Ensemble square root filters (ESRF) [9, 16], are crucial
components of data assimilation which are applied ubiquitously across the geophys-
ical sciences [81, 112]. The EnKF and ESRF are data assimilation methods used to
combine high dimensional, nonlinear dynamical models with observed data. Ensemble methods are indispensable tools in science and engineering and have enjoyed great success in the geophysical sciences because they provide computationally cheap, small-ensemble approximations of the state of extremely high-dimensional turbulent forecast models. Despite their widespread application, the theoretical understanding of
these methods remains underdeveloped.

5.3.1 EnKF and ESRF Formulation

We now briefly describe the EnKF and ESRF algorithms. Let $\Psi$ be the forecast model, with the true model state satisfying $U_n = \Psi(U_{n-1})$ for all integers $n \ge 1$ and with some given (possibly random) initial state $U_0$. At each time step we make an observation $Z_n = HU_n + \xi_n$, where $H$ is the observation matrix and the $\xi_n$ are i.i.d. random variables distributed as $\mathcal{N}(0,\Gamma)$; for simplicity we take $\Gamma = I$. The objective of data assimilation is to use the forecast model to combine an estimate of the previous state $U_{n-1}$ with the new observational data $Z_n$ to produce an estimate of the current state $U_n$.
In both the EnKF and ESRF algorithms, an ensemble $\{V_n^{(k)}\}_{k=1}^K$ is used as an empirical estimate of the posterior distribution of the state $U_n$ given the history of observations $Z_1, \ldots, Z_n$. The empirical mean of the ensemble is a useful estimate of the state of the model and the empirical covariance matrix provides a quantification of uncertainty.

The EnKF algorithm is the iteration of two steps, the forecast step and the analysis step. In the forecast step, the time $n-1$ posterior ensemble $\{V_{n-1}^{(k)}\}_{k=1}^K$ is evolved to the forecast ensemble $\{\hat V_n^{(k)}\}_{k=1}^K$, where $\hat V_n^{(k)} = \Psi(V_{n-1}^{(k)})$. The primary function of the forecast ensemble is to represent uncertainty in the forecast model; this uncertainty is quantified via the empirical covariance matrix $\hat C_n$:

$$\hat C_n = \frac{1}{K-1} \sum_{k=1}^{K} \big(\hat V_n^{(k)} - \bar{\hat V}_n\big) \otimes \big(\hat V_n^{(k)} - \bar{\hat V}_n\big), \qquad (5.11)$$

where $\bar{\hat V}_n = K^{-1} \sum_{k=1}^{K} \hat V_n^{(k)}$. In the analysis step, the time $n$ observation $Z_n$ is assimilated with the forecast ensemble to produce the posterior ensemble $\{V_n^{(k)}\}_{k=1}^K$. The assimilation update of each ensemble member is described by a Kalman type update in a possibly nonlinear setting. The update uses a perturbed observation $Z_n^{(k)} = Z_n + \xi_n^{(k)}$, where the $\xi_n^{(k)}$ form an i.i.d. sequence distributed identically to $\xi_n$. The Kalman update is then given by

$$V_n^{(k)} = \hat V_n^{(k)} - \hat C_n H^T \big(I + H \hat C_n H^T\big)^{-1} \big(H \hat V_n^{(k)} - Z_n^{(k)}\big). \qquad (5.12)$$
74 5 State Estimation, Data Assimilation, or Filtering …

The noise perturbations $\xi_n^{(k)}$ are introduced to maintain the classical Kalman prior-posterior covariance relation

$$C_n = \hat C_n - \hat C_n H^T \big(I + H \hat C_n H^T\big)^{-1} H \hat C_n \qquad (5.13)$$

in an average sense, where $C_n$ denotes the sample covariance of the posterior ensemble.
The ESRF algorithms considered in this article are the Ensemble Transform Kalman filter (ETKF) [16] and the Ensemble Adjustment Kalman Filter (EAKF) [9]. Both filters employ the same forecast step as EnKF, but differ from EnKF (and from each other) in the analysis step. In both ETKF and EAKF, the posterior ensemble mean $\bar V_n$ is updated from the forecast mean $\bar{\hat V}_n$ via

$$\bar V_n = \bar{\hat V}_n - \hat C_n H^T \big(I + H \hat C_n H^T\big)^{-1} \big(H \bar{\hat V}_n - Z_n\big). \qquad (5.14)$$

Given the updated mean, to compute the update for each ensemble member it is sufficient to compute the posterior ensemble spread matrix $S_n = \big[V_n^{(1)} - \bar V_n, \ldots, V_n^{(K)} - \bar V_n\big]$. This is computed using the similarly defined forecast spread matrix $\hat S_n$. Any reasonable choice of $S_n$ should satisfy the Kalman covariance identity (5.13), where, by definition, $C_n = (K-1)^{-1} S_n S_n^T$. The ETKF algorithm achieves (5.13) by setting $S_n = \hat S_n T_n$, where $T_n$ is the transform matrix defined by

$$T_n = \Big(I + (K-1)^{-1} \hat S_n^T H^T H \hat S_n\Big)^{-1/2}.$$
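The covariance identity that the transform is designed to satisfy can be checked numerically. The following is a minimal sketch, assuming $\Gamma = I$ as above; the function name and matrix shapes are illustrative choices:

```python
import numpy as np

def etkf_spread_update(S_hat, H):
    """ETKF transform of the forecast spread matrix: S_n = S_hat @ T_n with
    T_n = (I + (K-1)^{-1} S_hat^T H^T H S_hat)^{-1/2}, computed via a
    symmetric eigendecomposition (unit observation noise assumed)."""
    K = S_hat.shape[1]
    HS = H @ S_hat
    A = np.eye(K) + HS.T @ HS / (K - 1)   # K x K, symmetric positive definite
    lam, G = np.linalg.eigh(A)
    T = G @ np.diag(lam ** -0.5) @ G.T    # symmetric inverse square root of A
    return S_hat @ T
```

Dividing the transformed spread's outer product by $K-1$ reproduces the Kalman posterior covariance, which is a convenient unit test for any square root filter implementation.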

The EAKF algorithm achieves (5.13) by setting $S_n = A_n \hat S_n$, where $A_n$ is the adjustment matrix defined by

$$A_n = \hat Q \hat R\, \hat G^T (I + D)^{-1/2}\, \hat G\, \hat R^{\dagger} \hat Q^T.$$

Here $\hat Q \hat R$ is the SVD decomposition of $\hat S_n$, $\hat G^T D \hat G$ is the diagonalization of $(K-1)^{-1} \hat R^T \hat Q^T H^T H \hat Q \hat R$, and $\dagger$ indicates the pseudoinverse of a matrix. For more details on EnKF and ESRF see [112].
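One forecast–analysis cycle of the perturbed-observation EnKF, (5.11)–(5.12), can be sketched in a few lines. The function below is a minimal illustration under the assumptions above ($\Gamma = I$); the forecast model `Psi` and observation matrix `H` are placeholders supplied by the user, not part of any particular application:

```python
import numpy as np

def enkf_step(ensemble, z, Psi, H, rng):
    """One EnKF cycle: forecast each member with the model Psi, then apply a
    perturbed-observation Kalman analysis following (5.11)-(5.12)."""
    K = ensemble.shape[1]                      # ensemble size (columns are members)
    # Forecast step: push each member through the forecast model.
    Vf = np.column_stack([Psi(ensemble[:, k]) for k in range(K)])
    Vf_bar = Vf.mean(axis=1, keepdims=True)
    S = Vf - Vf_bar                            # forecast spread matrix
    C = S @ S.T / (K - 1)                      # empirical covariance (5.11)
    # Analysis step: Kalman gain with unit observation noise covariance.
    M = H.shape[0]
    G = C @ H.T @ np.linalg.inv(np.eye(M) + H @ C @ H.T)
    # Perturbed observations keep the posterior spread statistically correct.
    Zk = z[:, None] + rng.standard_normal((M, K))
    return Vf - G @ (H @ Vf - Zk)              # posterior ensemble (5.12)
```

For a well-spread ensemble and fully observed state, the posterior ensemble mean is pulled strongly toward the observation, as expected from the Kalman gain.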

5.3.2 Catastrophic Filter Divergence

Recent rigorous mathematical theory is described below that studies the curious
numerical phenomenon known as catastrophic filter divergence [54, 73]. In [54, 73],
it was numerically demonstrated that state estimates provided by ensemble based
methods can explode to machine infinity, despite the forecast model being dissipative

and satisfying the absorbing ball property [165], which demands that the true state
is always absorbed by a bounded region of the state space as illustrated in Chapter 1.
In [54, 73], the authors argue that catastrophic filter divergence is strongly associ-
ated with alignment of forecast ensemble members. Specifically, in [54] the forecast
model is a five dimensional Lorenz-96 model from Chapter 2 with one observed vari-
able. Alignment of the forecast ensemble is caused by stiffness in the forecast ODE,
as evidenced by a strongly negative Lyapunov exponent. When the ensemble aligns
in a subspace perpendicular to the observed direction, the analysis update can shift
the ensemble (within the subspace) to points that lie on higher energy trajectories
of the forecast model. Since the observation is perpendicular to the ensemble subspace, the observation does not constrain the update, and hence the analysis update can shift the ensemble in ways that degrade the performance of the filter. In the next forecast step,
the stiffness of the forecast leads to re-alignment of the ensemble, but on much higher
energy trajectories than in the previous forecast step.
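The dissipative setting of these experiments is easy to reproduce. The sketch below integrates a five dimensional Lorenz-96 truth signal with a standard illustrative forcing value and exhibits the absorbing ball property; the EnKF divergence experiments of [54] themselves are beyond this sketch:

```python
import numpy as np

def lorenz96(x, F=8.0):
    """Right-hand side of the Lorenz-96 model from Chapter 2,
    dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F, here with five variables."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def integrate(x0, dt=0.01, steps=5000, F=8.0):
    """Fourth-order Runge-Kutta integration of the dissipative truth signal."""
    x = np.array(x0, dtype=float)
    traj = np.empty((steps, x.size))
    for i in range(steps):
        k1 = lorenz96(x, F)
        k2 = lorenz96(x + 0.5 * dt * k1, F)
        k3 = lorenz96(x + 0.5 * dt * k2, F)
        k4 = lorenz96(x + dt * k3, F)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[i] = x
    return traj
```

A simple energy estimate shows $\frac{d}{dt}\tfrac12|x|^2 = -|x|^2 + F\sum_i x_i$, so the trajectory is absorbed into a bounded ball; catastrophic filter divergence is striking precisely because the filter estimate escapes this ball while the truth cannot.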

5.3.3 Rigorous Examples of Catastrophic Filter Divergence

Reference [83] confirms the above intuition by constructing a two dimensional nonlinear map with the absorbing ball property, together with a class of linear observation operators, such that catastrophic filter divergence occurs for a set of initial data with positive measure, through the alignment amplification mechanism above, for any finite ensemble size. This is the first instance with a rigorous proof that finite ensemble filters with a nonlinear forecast model satisfying the absorbing ball property can suffer drastic filter malfunction, and it sheds light on when such malfunction should be expected and how it can be avoided.

5.3.4 Rigorous Nonlinear Stability and Geometric Ergodicity for Finite Ensemble Kalman Filters

Despite their widespread application, the theoretical understanding of finite ensemble Kalman filters remains underdeveloped. In the practical setting of high dimensional
nonlinear turbulent forecast models with small ensemble size, the opposite of the
conventional statistical setting, the focus has been on well-posedness [84] and rig-
orous nonlinear stability and geometric ergodicity [168, 169]. Geometric ergodicity
guarantees that the nonlinear filter has a unique attractor which nonlinearly attracts
all reasonable initial data at an exponential rate. The proofs depend on building an
augmented Lyapunov function incorporating the observable energy [168] and prov-
ing with suitably nonlinear adaptive additive inflation that the Lyapunov function of
the nonlinear forecast model absorbs the effects of any linear observation operator
[169]. These rigorous results eliminate catastrophic filter divergence and the adaptive
additive inflation algorithm is very promising as a potential practical algorithm. See
[92] for an application to geophysical turbulence.

An important mathematical challenge is to understand the accuracy of the statistical mean and fidelity of the statistical covariance estimator from EnKF and ESRF
with high dimensional turbulent forecast models and small ensemble size. Practical
finite ensemble filters for large dimensional turbulent systems crucially use additive
and multiplicative inflation combined with covariance localization in order to have
skillful filtering performance [81, 90, 112]. These are sources of bias and model error
which make a rigorous numerical analysis an especially challenging and significant
problem.

5.4 Mathematical Strategies and Algorithms for Multi-scale Data Assimilation

Data assimilation of turbulent signals is an important and challenging problem because of the extremely large dimension of the signals and the incomplete, partial,
noisy observations which usually mix the large scale mean flow and small scale
fluctuations. See Chapter 7 of [112] for examples of new phenomena due to this
multi-scale coupling through the observations even for linear systems. Due to the
limited computing power in the foreseeable future, it is desirable to use multi-scale
forecast models which are cheap and fast to mitigate the curse of dimensionality in
turbulent systems; thus model errors from imperfect forecast models are unavoidable
in the development of a data assimilation method in turbulence. I briefly discuss a suite
of multi-scale data assimilation methods which use stochastic Superparameterization
as the forecast model.
As a reduced or cheap forecast model, various multi-scale methods were intro-
duced to mitigate the curse of dimensionality of turbulent systems. Among others,
conventional Superparameterization is a multi-scale algorithm that was originally
developed for the purpose of parameterizing unresolved cloud processes in tropical atmospheric convection [56, 57, 110]. This conventional Superparameterization
resolves the large scale mean flow on a coarse grid in a physical domain while the
fluctuating parts are resolved using a fine grid high resolution simulation on periodic
domains embedded in the coarse grid. A much cheaper version of Superparameteri-
zation, called stochastic Superparameterization [66, 67, 110, 111], replaces the non-
linear eddy terms by quasilinear stochastic processes on formally infinite embedded
domains where the stochastic processes are Gaussian conditional to the large scale
mean flow. The key ingredient of these multi-scale data assimilation methods is the
systematic use of conditional Gaussian mixtures which make the methods efficient
by filtering a subspace whose dimension is smaller than the full state. This condi-
tional Gaussian closure approximation results in a seamless algorithm without using
the high resolution space grid for the small scales and is much cheaper than the con-
ventional Superparameterization, with significant success in difficult test problems
[66, 67, 64] including the MMT model [62, 66] and ocean turbulence [63, 65, 68]
mentioned earlier in Section 2.4.
I briefly discuss multi-scale data assimilation or filtering methods for turbulent sys-
tems, using stochastic Superparameterization as the forecast model in the practically

important setting where the observations mix the large and small scales. The key
idea of the multi-scale data assimilation method is to use conditional Gaussian mix-
tures [40, 118] whose distributions are compatible with Superparameterization. The
method uses particle filters (see [12] and Chapter 15 of [112]) or ensemble filters on
the large scale part [62, 63] whose dimension is small enough so that the non-Gaussian
statistics of the large scale part can be calculated from a particle filter whereas the
statistics of the small scale part are conditionally Gaussian given the large scale part.
This framework is not restricted to Superparameterization as the forecast model and
other cheap forecast models can also be employed. See [18] for another multi-scale
filter with quasilinear Gaussian dynamically orthogonality method as the forecast
method in an adaptively evolving low dimensional subspace without using Super-
parameterization. We note that data assimilation using Superparameterization has
already been discussed in [74] with noisy observations of the large scale part of the
signal alone. There it was shown that, even in this restricted setting, ignoring the small scale fluctuations, even when they are rapidly decaying, can completely degrade the filter performance compared with the high skill obtained using Superparameterization. Here
in contrast to [74] we consider multi-scale data assimilation methods with noisy
observations with contributions from both the large and small scale parts of the sig-
nal, which is a more difficult problem than observing only the large scale because
it requires accurate estimation of statistical information of the small scales [62, 63,
91]. Also mixed observations of the large and small scale parts occur typically in real
applications. For example, in geophysical fluid applications, the observed quantities
such as temperature, moisture, and the velocity field necessarily mix both the large
and small scale parts of the signal [38, 112].
A suite of multi-scale data assimilation methods [91] is tested on a conceptual
dynamical model for turbulence which was mentioned in Section 2.4 and developed
recently [115]. Here I briefly introduce these models, superparameterization in this
context, and multi-scale data assimilation. The conceptual model is the simplest
model for anisotropic turbulence and is given by a K + 1 dimensional stochastic dif-
ferential equation (SDE) with deterministic energy conserving nonlinear interactions
between the large scale mean flow and the smaller scale fluctuating components. The
fluctuating parts have a statistical equilibrium state for a given large scale mean flow, but they develop instability through the chaotic behavior of the large scale mean flow, and this instability in turn generates nontrivial nonlinear feedback on the large scale mean flow.

5.4.1 Conceptual Dynamical Models for Turbulence and Superparameterization

The conceptual dynamical model introduced and studied in [115] is a simple $K+1$ dimensional SDE mimicking the interesting features of anisotropic turbulence even for a small number $K$. Thus it is a useful test bed for multi-scale algorithms and

strategies for data assimilation. In this section we briefly review the conceptual
dynamical model with its interesting features resembling actual turbulence [47, 170,
171]. Also stochastic Superparameterization for the conceptual model using Gaussian
closure for the small scales conditional to the large scale variable will be discussed
in detail.

5.4.1.1 Conceptual Dynamical Model for Turbulence

In [115], a low dimensional stochastic dynamical system is introduced and studied as
a conceptual dynamical model for anisotropic turbulence. It is a simple K +1 dimen-
sional SDE which captures the key features of vastly more complicated turbulent
systems. The model involves a large scale mean flow, u, and turbulent fluctuations,
u = (u 1 , u 2 , . . . , u K ), on a wide range of spatial scales with energy-conserving
wave-mean flow interactions as well as stochastic forcing in the fluctuations. Here
and below we abuse notation compared with Chapters 3 and 4 in the sense that $\bar u$ denotes a large spatial scale mean and $u'$ denotes the small spatial scale fluctuations.
Although the model is not derived quantitatively from the Navier–Stokes equation, it
mimics key statistical features of vastly more complex anisotropic turbulent systems
in a qualitative fashion [47, 170, 171]: (1) The large scale mean is usually chaotic but
more predictable than the smaller scale fluctuations; (2) The large-scale mean flow
and the smaller-scale fluctuations have nontrivial nonlinear interactions which con-
serve energy; (3) There are wide ranges of scales for the fluctuations. The large scale
components contain more energy than the smaller scale components. Also the large
scale fluctuating components decorrelate faster in time than the mean flow while the
smaller scale fluctuating components decorrelate faster in time than the larger scale
components. (4) The overall turbulent field has a nearly Gaussian PDF while the
large scale mean flow has a sub-Gaussian PDF. The larger scale components of fluc-
tuations are nearly Gaussian while the smaller scale components are intermittent, and
have fat tailed PDFs, i.e., much more extreme events than a Gaussian distribution.
Following the above discussion, the conceptual dynamical model for turbulence introduced in [115] is the following $K+1$ dimensional stochastic differential equation:

$$\frac{d\bar u}{dt} = -d\,\bar u + \gamma \sum_{k=1}^{K} (u_k')^2 - \alpha\,\bar u^3 + F,$$
$$\frac{du_k'}{dt} = -d_k\, u_k' - \gamma\,\bar u\, u_k' + \sigma_k \dot W_k, \quad 1 \le k \le K, \qquad (5.15)$$

where the $\dot W_k$ are independent white noises for each $k$. The mean scalar variable $\bar u$ represents the largest scale, and a family of small scale variables, $u_k'$, $1 \le k \le K$, represent contributions to the turbulent fluctuations, with $u' = \sum_{k=1}^{K} u_k'$ the total turbulent fluctuation. The large scale $\bar u$ can be regarded as the large scale spatial average of the turbulent dynamics at a single grid point in a more complex system, while $u_k'$ is the amplitude of the $k$th Fourier cosine mode evaluated at a grid point. Thus it is straightforward to generalize the conceptual model to many large-scale grid points, which yields a coupled system of equations on the large scales [115].

There are random forces on the fluctuating turbulent modes, $u_k'$, $1 \le k \le K$, to mimic the nonlinear interactions between turbulent modes, while the large scale mean flow $\bar u$ has only a deterministic constant force $F$. But the large scale $\bar u$ can have fluctuating, chaotic dynamics in time through interactions with the turbulence and its own intrinsic dynamics. The reader easily verifies that the nonlinear interactions alone in (5.15) conserve the total energy of the mean and fluctuations,

$$E = \frac{1}{2}\left(\bar u^2 + \sum_{k=1}^{K} (u_k')^2\right). \qquad (5.16)$$

The large scale damping, $d$, can be positive with $\alpha = 0$ or negative with $\alpha > 0$, but it is essential to have $d_k > 0$ in order for the turbulence to have a statistical steady state. For a fixed $\gamma > 0$, the large scale can destabilize the smaller scales in the turbulent fluctuations intermittently: whenever $-d_k - \gamma\bar u > 0$, the $k$th mode is unstable, and the chaotic fluctuation of $\bar u$ creates intermittent instability in $u_k'$, $1 \le k \le K$. Thus, the overall system can have a statistical steady state while there is intermittent instability on the small scales creating non-Gaussian intermittent behavior in the system. It can also be shown that (5.15) is geometrically ergodic under general hypotheses, which means that a unique smooth ergodic invariant measure exists with exponential convergence of suitable statistics from time averages in the long time limit. More details can be found in [115] with more mathematical intuition and typical solutions exhibiting all this behavior.
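A direct Euler–Maruyama discretization of (5.15) makes these properties easy to explore numerically. The sketch below uses purely illustrative parameter values; the damping spectrum, forcing, and noise levels are hypothetical choices, not those of [115]:

```python
import numpy as np

def simulate_conceptual_model(T=50.0, dt=1e-3, K=5, F=2.0, gamma=1.2,
                              d_bar=0.1, alpha=1.0, seed=0):
    """Euler-Maruyama integration of the K+1 dimensional conceptual model
    (5.15); all parameter values here are illustrative placeholders."""
    rng = np.random.default_rng(seed)
    k = np.arange(1, K + 1)
    d_k = 0.5 + 0.1 * k            # small scale damping, d_k > 0 (hypothetical)
    sigma_k = 0.5 / k              # decaying noise spectrum (hypothetical)
    n = int(T / dt)
    u_bar, u = 1.0, np.zeros(K)
    path = np.empty(n)
    for i in range(n):
        dW = np.sqrt(dt) * rng.standard_normal(K)
        # energy-conserving wave-mean flow interaction terms of (5.15)
        du_bar = (-d_bar * u_bar + gamma * np.sum(u**2)
                  - alpha * u_bar**3 + F) * dt
        du = (-d_k - gamma * u_bar) * u * dt + sigma_k * dW
        u_bar, u = u_bar + du_bar, u + du
        path[i] = u_bar
    return path
```

With these (stable) parameter choices the large scale trajectory remains bounded, consistent with the cubic damping and the energy-conserving coupling; intermittent instability requires parameter regimes where $\bar u$ fluctuates below $-d_k/\gamma$.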

5.4.1.2 Superparameterization in the Conceptual Model

Now we describe the stochastic Superparameterization of the conceptual model (5.15). Stochastic Superparameterization [64, 66, 67, 110] is a seamless multi-scale method parameterizing the small scale eddy terms by quasilinear stochastic processes embedded in a formally infinite domain instead of periodic domains in physical space as in conventional Superparameterization [56, 57]. In stochastic Superparameterization, which we call Superparameterization here, the large scale mean flow is resolved on a coarse grid in a physical domain while the fluctuating small scales are closed by stochastic processes under a Gaussian assumption conditional to the large scale mean flow [66, 67, 110, 111].

In Superparameterization, it is implicitly assumed that there is modest scale separation in time between $\bar u$ and $u_k'$, $1 \le k \le K$. Due to the scale separation assumption, the nonlinear term containing $u_k'$ in the equation for $\bar u$ is replaced by its statistical average, that is, its variance $R_{k,k}$:

$$\frac{d\bar u}{dt} = -d\,\bar u + \gamma \sum_{k=1}^{K} R_{k,k} - \alpha\,\bar u^3 + F. \qquad (5.17)$$

For the equations of $u_k'$, $1 \le k \le K$, on the other hand, $\bar u$ is regarded as a fixed parameter and the equations for the mean and variance of $u_k'$ are closed under the Gaussian assumption:

$$\frac{d\langle u_k' \rangle}{dt} = -(d_k + \gamma \bar u)\,\langle u_k' \rangle, \quad 1 \le k \le K,$$
$$\frac{dR_{k,k}}{dt} = -2(d_k + \gamma \bar u)\,R_{k,k} + \tilde\sigma_k^2, \qquad (5.18)$$

where there is no cross correlation, that is,

$$\frac{dR_{k,l}}{dt} = 0, \quad k \ne l. \qquad (5.19)$$
Due to the approximations in Superparameterization, using the same noise level $\sigma_k$, $1 \le k \le K$, as the true signal does not guarantee the same stationary state variance, and thus Superparameterization uses a tunable noise level $\tilde\sigma_k$, $1 \le k \le K$, to match the stationary state variance of Superparameterization with that of the true signal. Note that stationary state information of the true signal, such as the climatology in atmospheric science, is usually assumed to be available in data assimilation, and here we also assume that this stationary state information is available for Superparameterization. Instead of tuning each noise level separately, Superparameterization mimics the spectrum of the true signal by setting a relation between the frozen stationary state variance $\tilde\sigma_k^2/\big(2(d_k + \gamma \bar u_\infty)\big)$, with $\bar u = \bar u_\infty$ held fixed, and the true signal spectrum, where $\bar u_\infty$ is the statistical stationary state mean of $\bar u$. That is, Superparameterization tunes the relation coefficient $A$ in

$$\frac{\tilde\sigma_k^2}{2(d_k + \gamma \bar u_\infty)} = A \times \big(k\text{th mode stationary state variance of the true model}\big) \qquad (5.20)$$

so that the actual stationary variance of the numerical solution by Superparameterization matches the true signal variance. Note that $A$ is not necessarily 1 due to the variability in $\bar u$.
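The closed system (5.17)–(5.18) is a small set of deterministic ODEs and can be integrated directly. The sketch below uses forward Euler with hypothetical parameter values; the damping and noise spectra are illustrative and are not tuned via (5.20):

```python
import numpy as np

def superparam_forecast(u_bar, m, R, dt, steps, K=5, F=2.0, gamma=1.2,
                        d_bar=0.1, alpha=1.0, d_k=None, sigma_tilde=None):
    """Integrate the Superparameterization closure (5.17)-(5.18): the large
    scale u_bar is coupled to the conditional means m[k] and variances R[k]
    of the small scale modes. Parameter values are illustrative."""
    if d_k is None:
        d_k = 0.5 + 0.1 * np.arange(1, K + 1)
    if sigma_tilde is None:
        sigma_tilde = 0.5 / np.arange(1, K + 1)
    for _ in range(steps):
        lam = d_k + gamma * u_bar                        # conditional damping rates
        du_bar = (-d_bar * u_bar + gamma * R.sum()
                  - alpha * u_bar**3 + F) * dt           # large scale equation (5.17)
        m = m - lam * m * dt                             # conditional mean of u_k'
        R = R + (-2.0 * lam * R + sigma_tilde**2) * dt   # conditional variance (5.18)
        u_bar = u_bar + du_bar
    return u_bar, m, R
```

Because the closure is deterministic, a single integration replaces an ensemble of small scale simulations, which is the source of the method's efficiency.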
Multi-scale Data Assimilation Algorithms with Conditionally Gaussian Filters
As mentioned earlier, an important issue is that the observations mix the large and small scale parts, $\bar u$ and $u'$ respectively. That is, observations $v \in \mathbb{R}^M$ mix $\bar u$ and $u'$ through a nonlinear observation operator $G(\bar u, u')$:

$$v = G(\bar u, u') + \sigma_\theta, \qquad (5.21)$$

where $\sigma_\theta$ is the observational noise error with probability distribution $p_\theta\big(v - G(\bar u, u')\big)$. One approach to deal with mixed observations is to treat
the contribution from the small scales as a component of observation error (which
is known as ‘representation error’ or ‘representativeness error’ [33, 96]) and use the
method of [74] which is a multi-scale method with observation of only the large scale

variables. But this approach has a limitation in that it only provides the prediction for
the large scales. The multi-scale method proposed in [91] can provide predictions
for the energetic small scale modes in addition to the large scales.
I only briefly discuss here a multi-scale data assimilation method using particle
filters on the large scale u and Superparameterization as the forecast method but
there is an alternative [91] using finite ensemble filters (Chapter 9 of [112]) on the
large scales.

5.4.1.3 Particle Filters with Superparameterization (PF-SP)

Superparameterization retains the large scale variables by resolving them on a coarse grid, while the effect of the small scales on the large scales is parameterized by approximating the small scales on local or reduced spaces. Stochastic Superparameterization discussed in the previous section uses Gaussian closure for the small scales conditional to the large scale variable $\bar u$ with $\bar u \in \mathbb{R}^{N_1}$ [66, 67, 110, 111]. Thus we consider a multi-scale filtering algorithm with forecast prior distributions given by the conditional distribution

$$p^f(u) = p^f(\bar u, u') = p^f(\bar u)\, p_G(u'\,|\,\bar u), \qquad (5.22)$$

where $p_G(u'\,|\,\bar u)$ is a Gaussian distribution conditional to $\bar u$:

$$p_G(u'\,|\,\bar u) = \mathcal{N}\big(\bar u'(\bar u)^f,\ R'(\bar u)^f\big). \qquad (5.23)$$

Here we assume that $N_1$ is sufficiently small that particle filters (see Chapter 15 of [112]) can be applied to the large scales. For the low dimensional variable $\bar u$, the marginal distribution of $\bar u$ can be approximated by $Q$ particles:

$$p^f(\bar u) = \sum_{j=1}^{Q} p_j^f\,\delta(\bar u - \bar u_j), \qquad (5.24)$$

where the $p_j^f \ge 0$ are particle weights such that $\sum_j p_j^f = 1$. After the forecast step, where Superparameterization is applied to each particle member, we have the following general form for the prior distribution $p^f(u)$:

$$p^f(u) = p^f(\bar u, u') = \sum_{j=1}^{Q} p_j^f\,\delta(\bar u - \bar u_j)\, p_G(u'\,|\,\bar u_j) = \sum_{j=1}^{Q} p_j^f\,\delta(\bar u - \bar u_j)\,\mathcal{N}\big(\bar u'(\bar u_j)^f,\ R'(\bar u_j)^f\big), \qquad (5.25)$$

which is a conditional Gaussian mixture distribution where each summand is a Gaussian distribution conditional to $\bar u_j$. Gaussian mixtures have already been used in data assimilation [7, 76, 162], but the multi-scale method developed here is different in that conditional Gaussian distributions are applied in the reduced subspace $u'$ with particle approximations only in the lower dimensional subspace $\bar u$. Thus the proposed multi-scale data assimilation method can be highly efficient and fast in comparison with conventional data assimilation methods which use the whole space for the filter.
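To make the structure of the conditional Gaussian mixture (5.25) concrete, the following sketch draws joint samples $(\bar u, u')$ from such a prior; the data layout, with lists of per-particle conditional means and covariances, is an illustrative choice:

```python
import numpy as np

def sample_mixture_prior(weights, u_bars, means, covs, n, rng):
    """Draw n samples (u_bar, u') from the conditional Gaussian mixture
    (5.25): pick a particle j according to its weight, then sample u' from
    the Gaussian conditional on u_bar_j."""
    idx = rng.choice(len(weights), size=n, p=weights)
    samples = []
    for j in idx:
        u_prime = rng.multivariate_normal(means[j], covs[j])
        samples.append((u_bars[j], u_prime))
    return samples
```

The particle approximation lives only in the low dimensional $\bar u$ variable, while the potentially high dimensional $u'$ is handled analytically through its conditional Gaussian statistics.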
For a general nonlinear observation operator $G$ in (5.21) and observational noise distribution $p_\theta\big(v - G(\bar u_j, u')\big)$, the posterior distribution is not necessarily of the same form as the prior distribution. If we restrict the multi-scale observation operator mildly and assume that the observational noise is Gaussian, $p_\theta\big(v - G(\bar u_j, u')\big) = \mathcal{N}(0, r_\theta)$, the posterior distribution has the same form as the prior distribution, (5.25).

Proposition 5.1 Assume that the prior distribution from the forecast is of the form (5.25) and that the observations have the following structure:

$$v = G(\bar u, u') + \sigma_\theta = \bar G \bar u + G'(\bar u)\, u' + \sigma_\theta, \qquad (5.26)$$

where $G'(\bar u_j)$ has rank $M$. Then the posterior distribution in the analysis step taking into account the observations (5.26) is of the form (5.25):

$$p^a(u) = p^a(\bar u, u') = \sum_{j=1}^{Q} p_j^a\,\delta(\bar u - \bar u_j)\,\mathcal{N}\big(\bar u'(\bar u_j)^a,\ R'(\bar u_j)^a\big). \qquad (5.27)$$

The new mixture weights are

$$p_j^a = \frac{p_j^f I_j}{\sum_{k=1}^{Q} p_k^f I_k}, \qquad (5.28)$$

where $I_j = \int p(v\,|\,\bar u_j, u')\, p(u'\,|\,\bar u_j)\, du'$, and for each particle $\bar u_j$, the posterior mean and covariance of $u'$, $\bar u'(\bar u_j)^a$ and $R'(\bar u_j)^a$ respectively, are

$$\bar u'(\bar u_j)^a = \bar u'(\bar u_j)^f + \mathcal{K}\big(v - \bar G \bar u_j - G'(\bar u_j)\,\bar u'(\bar u_j)^f\big),$$
$$R'(\bar u_j)^a = \big(I - \mathcal{K}\, G'(\bar u_j)\big)\, R'(\bar u_j)^f, \qquad (5.29)$$

where the Kalman gain matrix $\mathcal{K}$ is given by

$$\mathcal{K} = R'(\bar u_j)^f\, G'(\bar u_j)^T \big(G'(\bar u_j)\, R'(\bar u_j)^f\, G'(\bar u_j)^T + r_\theta\big)^{-1}. \qquad (5.30)$$

For the proof of Proposition 5.1, see the supplementary material of [118].

Using Superparameterization as the cheap forecast model and applying Proposition 5.1 in the analysis step summarizes the multi-scale algorithm. Numerical results and implementation details can be found in [91].
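The analysis step of Proposition 5.1 can be sketched compactly. The function below implements the weight update (5.28) and the conditional Kalman update (5.29)–(5.30) for Gaussian observation noise; the function names and data layout are illustrative assumptions, not the implementation of [91]:

```python
import numpy as np

def pf_sp_analysis(weights, u_bars, means, covs, v, G_bar, G_prime, r_theta):
    """Analysis step of PF-SP following Proposition 5.1: update the particle
    weights via (5.28) and the conditional Gaussian statistics of u' via
    (5.29)-(5.30). G_prime(u_bar) returns the small scale observation matrix."""
    Q = len(weights)
    new_w = np.empty(Q)
    new_means, new_covs = [], []
    M = len(v)
    for j in range(Q):
        Gp = G_prime(u_bars[j])
        innov = v - G_bar @ u_bars[j] - Gp @ means[j]
        # Marginal likelihood I_j of particle j (Gaussian integral over u')
        Sig = Gp @ covs[j] @ Gp.T + r_theta
        new_w[j] = weights[j] * np.exp(-0.5 * innov @ np.linalg.solve(Sig, innov)) \
                   / np.sqrt((2 * np.pi) ** M * np.linalg.det(Sig))
        # Kalman update of the conditional mean and covariance, (5.29)-(5.30)
        Kal = covs[j] @ Gp.T @ np.linalg.inv(Sig)
        new_means.append(means[j] + Kal @ innov)
        new_covs.append((np.eye(covs[j].shape[0]) - Kal @ Gp) @ covs[j])
    return new_w / new_w.sum(), new_means, new_covs
```

Each particle carries its own conditional Gaussian, so the cost scales with the number of particles $Q$ times one Kalman update, rather than with the full state dimension of a conventional particle filter.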

5.4.2 Blended Particle Methods with Adaptive Subspaces for Filtering Turbulent Dynamical Systems

In the multi-scale data assimilation algorithms discussed above based on Superparameterization, the subspace of particles defined by $\bar u$ is fixed. An attractive idea is
to change the subspace with particles adaptively in time to capture the non-Gaussian
features as they change in time. Very accurate filtering algorithms based on these ideas, utilizing this adaptive strategy and the proposition in Section 5.4.1.3, have been developed [118, 144]. Nonlinear statistical forecast
models like versions of MQG described in Chapter 4 are implemented in the adap-
tive algorithm. In particular, the paper [144] also contains many detailed numerical
experiments and interesting counterexamples to more naive strategies for multi-scale
data assimilation.

5.4.3 Extremely Efficient Multi-scale Filtering Algorithms: SPEKF and Dynamic Stochastic Superresolution (DSS)

The stochastic parameterized extended Kalman filters (SPEKF) are a class of nonlinear filters with exact statistical equations for the mean and covariance of nonlinear forecast models that learn hidden parameters “on the fly” from the observed data. The parameters represent adaptive additive and multiplicative bias corrections from model error. These filters explicitly introduce judicious model error and utilize the conditional Gaussian structure developed in Section 5.2. The book [112] contains many examples and successful applications of this method.
Dynamic Stochastic Superresolution (DSS) uses the same idea but in addition exploits the aliased information in the observations to super-resolve a multi-scale turbulent signal [19, 82]. Nontrivial applications of DSS include recovering geophysical turbulence from surface satellite observations [82] and filtering “black swans” and dispersive wave turbulence [19] with severe judicious model errors. An interesting mathematical problem is to understand the reasons for the skill of these radical methods.
References

1. R.V. Abramov, Short-time linear response with reduced-rank tangent map. Chin. Ann. Math.
Ser. B 30(5), 447–462 (2009)
2. R.V. Abramov, G. Kovačič, A.J. Majda, Hamiltonian structure and statistically relevant con-
served quantities for the truncated Burgers-Hopf equation. Commun. Pure Appl. Math. 56(1),
1–46 (2003)
3. R.V. Abramov, A.J. Majda, Blended response algorithms for linear fluctuation-dissipation for
complex nonlinear dynamical systems. Nonlinearity 20(12), 2793 (2007)
4. R.V. Abramov, A.J. Majda, New approximations and tests of linear fluctuation-response for
chaotic nonlinear forced-dissipative dynamical systems. J. Nonlinear Sci. 18(3), 303–341
(2008)
5. R.V. Abramov, A.J. Majda, A new algorithm for low-frequency climate response. J. Atmos.
Sci. 66(2), 286–309 (2009)
6. R.V. Abramov, A.J. Majda, Low-frequency climate response of quasigeostrophic wind-driven
ocean circulation. J. Phys. Oceanogr. 42(2), 243–260 (2012)
7. D.L. Alspach, V.S. Samant, H.W. Sorenson, Practical control algorithms for nonlinear sto-
chastic systems and investigations of nonlinear filters (Technical report, DTIC Document,
1980)
8. B.D.O. Anderson, J.B. Moore, Optimal Filtering (Courier Corporation, 2012)
9. J.L. Anderson, An ensemble adjustment Kalman filter for data assimilation. Mon. Weather
Rev. 129(12), 2884–2903 (2001)
10. A. Apte, C.K.R.T. Jones, A.M. Stuart, A Bayesian approach to Lagrangian data assimilation.
Tellus A 60(2), 336–347 (2008)
11. N. Aubry, W.-Y. Lian, E.S. Titi, Preserving symmetries in the proper orthogonal decomposi-
tion. SIAM J. Sci. Comput. 14(2), 483–505 (1993)
12. A. Bain, D. Crisan, Fundamentals of Stochastic Filtering, Stochastic Modelling and Applied Probability (Springer, 2009)
13. N.A. Bakas, P.J. Ioannou, Structural stability theory of two-dimensional fluid flow under
stochastic forcing. J. Fluid Mech. 682, 332–361 (2011)
14. T.L. Bell, Climate sensitivity from fluctuation dissipation: some simple model tests. J. Atmos.
Sci. 37(8), 1700–1707 (1980)
15. J. Berner, G. Branstator, Linear and nonlinear signatures in the planetary wave dynamics of
an AGCM: probability density functions. J. Atmos. Sci. 64(1), 117–136 (2007)
16. C.H. Bishop, B.J. Etherton, S.J. Majumdar, Adaptive sampling with the ensemble transform
Kalman filter. Part I: theoretical aspects. Mon. Weather Rev. 129(3), 420–436 (2001)
17. M. Branicki, N. Chen, A.J. Majda, Non-Gaussian test models for prediction and state estima-
tion with model errors. Chin. Ann. Math. Ser. B 34(1), 29–64 (2013)

© Springer International Publishing Switzerland 2016 85


A.J. Majda, Introduction to Turbulent Dynamical Systems in Complex Systems,
Frontiers in Applied Dynamical Systems: Reviews and Tutorials 5,
DOI 10.1007/978-3-319-32217-9

18. M. Branicki, A.J. Majda, Quantifying uncertainty for predictions with model error in non-
Gaussian systems with intermittency. Nonlinearity 25(9), 2543 (2012)
19. M. Branicki, A.J. Majda, Dynamic stochastic superresolution of sparsely observed turbulent
systems. J. Comput. Phys. 241, 333–363 (2013)
20. M. Branicki, A.J. Majda, Fundamental limitations of polynomial chaos for uncertainty quan-
tification in systems with intermittent instabilities. Commun. Math. Sci. 11(1), 55–103 (2013)
21. M. Branicki, A.J. Majda, Quantifying Bayesian filter performance for turbulent dynamical
systems through information theory. Commun. Math. Sci. 12(5), 901–978 (2014)
22. K.P. Burnham, D.R. Anderson, Model Selection and Multimodel Inference: A Practical
Information-theoretic Approach (Springer Science & Business Media, Berlin, 2003)
23. D. Cai, A.J. Majda, D.W. McLaughlin, E.G. Tabak, Dispersive wave turbulence in one
dimension. Phys. D 152, 551–572 (2001)
24. G.F. Carnevale, M. Falcioni, S. Isola, R. Purini, A. Vulpiani, Fluctuation-response relations
in systems with chaotic behavior. Phys. Fluids A 3(9), 2247–2254 (1991)
25. N. Chen, A.J. Majda, Predicting the cloud patterns for the boreal summer intraseasonal oscil-
lation through a low-order stochastic model. Math. Clim. Weather Forecast. 1(1), 1–20 (2015)
26. N. Chen, A.J. Majda, Predicting the real-time multivariate Madden-Julian oscillation index
through a low-order nonlinear stochastic model. Mon. Weather Rev. 143(6), 2148–2169 (2015)
27. N. Chen, A.J. Majda, Filtering nonlinear turbulent dynamical systems through conditional
Gaussian statistics. Mon. Weather Rev. (2016, accepted)
28. N. Chen, A.J. Majda, Filtering the stochastic skeleton model for the Madden-Julian oscillation.
Mon. Weather Rev. 144, 501–527 (2016)
29. N. Chen, A.J. Majda, Model error in filtering random compressible flows using noisy
Lagrangian tracers. Mon. Weather Rev. (2016, accepted)
30. N. Chen, A.J. Majda, D. Giannakis, Predicting the cloud patterns of the Madden-Julian Oscil-
lation through a low-order nonlinear stochastic model. Geophys. Res. Lett. 41(15), 5612–5619
(2014)
31. N. Chen, A.J. Majda, X.T. Tong, Information barriers for noisy Lagrangian tracers in filtering
random incompressible flows. Nonlinearity 27(9), 2133 (2014)
32. N. Chen, A.J. Majda, X.T. Tong, Noisy Lagrangian tracers for filtering random rotating com-
pressible flows. J. Nonlinear Sci. 25(3), 451–488 (2015)
33. S.E. Cohn, An introduction to estimation theory. J. Meteorol. Soc. Jpn. Ser. II 75, 147–178
(1997)
34. P. Constantin, C. Foias, R. Temam, Attractors Representing Turbulent Flows, vol. 53, no. 314
(American Mathematical Soc., Providence, 1985)
35. W. Cousins, T.P. Sapsis, Quantification and prediction of extreme events in a one-dimensional
nonlinear dispersive wave model. Phys. D 280, 48–58 (2014)
36. N. Cressie, C.K. Wikle, Statistics for Spatio-Temporal Data (Wiley, Chichester, 2011)
37. D.T. Crommelin, A.J. Majda, Strategies for model reduction: comparing different optimal
bases. J. Atmos. Sci. 61(17), 2206–2217 (2004)
38. R. Daley, Atmospheric Data Analysis, Cambridge Atmospheric and Space Science Series
(Cambridge University Press, Cambridge, 1991)
39. F. Daum, J. Huang, Curse of dimensionality and particle filters, in Proceedings of the 2003
IEEE Aerospace Conference, vol. 4 (IEEE, 2003), pp. 4_1979–4_1993
40. A. Doucet, S. Godsill, C. Andrieu, On sequential Monte Carlo sampling methods for Bayesian
filtering. Stat. Comput. 10(3), 197–208 (2000)
41. E. Weinan, J.C. Mattingly, Y. Sinai, Gibbsian dynamics and ergodicity for the stochastically
forced Navier-Stokes equation. Commun. Math. Phys. 224(1), 83–106 (2001)
42. E.S. Epstein, Stochastic dynamic prediction. Tellus 21(6), 739–759 (1969)
43. E.S. Epstein, R.J. Fleming, Depicting stochastic dynamic forecasts. J. Atmos. Sci. 28(4),
500–511 (1971)
44. G. Evensen, The ensemble Kalman filter: theoretical formulation and practical implementa-
tion. Ocean Dyn. 53(4), 343–367 (2003)

45. B.F. Farrell, P.J. Ioannou, Structural stability of turbulent jets. J. Atmos. Sci. 60(17), 2101–
2118 (2003)
46. B.F. Farrell, P.J. Ioannou, Structure and spacing of jets in barotropic turbulence. J. Atmos.
Sci. 64(10), 3652–3665 (2007)
47. U. Frisch, Turbulence: The Legacy of AN Kolmogorov (Cambridge University Press, Cam-
bridge, 1995)
48. C. Gardiner, Stochastic Methods: A Handbook for the Natural and Social Sciences, 4th edn.
(Springer, Berlin, 2009)
49. B. Gershgorin, A.J. Majda, A nonlinear test model for filtering slow-fast systems. Commun.
Math. Sci. 6(3), 611–649 (2008)
50. B. Gershgorin, A.J. Majda, Filtering a nonlinear slow-fast system with strong fast forcing.
Commun. Math. Sci. 8(1), 67–92 (2010)
51. B. Gershgorin, A.J. Majda, A test model for fluctuation-dissipation theorems with time-
periodic statistics. Phys. D 239(17), 1741–1757 (2010)
52. B. Gershgorin, A.J. Majda, Quantifying uncertainty for climate change and long-range fore-
casting scenarios with model errors. Part I: Gaussian models. J. Clim. 25(13), 4523–4548
(2012)
53. A. Gluhovsky, E. Agee, An interpretation of atmospheric low-order models. J. Atmos. Sci.
54(6), 768–773 (1997)
54. G.A. Gottwald, A.J. Majda, A mechanism for catastrophic filter divergence in data assimilation
for sparse observation networks. Nonlinear Process. Geophys. 20(5), 705–712 (2013)
55. J. Gould, D. Roemmich, S. Wijffels, H. Freeland, M. Ignaszewski, X. Jianping, S.
Pouliquen, Y. Desaubies, U. Send, K. Radhakrishnan et al., Argo profiling floats bring new
era of in situ ocean observations. Eos 85(19), 179–184 (2004)
56. W.W. Grabowski, An improved framework for superparameterization. J. Atmos. Sci. 61(15),
1940–1952 (2004)
57. W.W. Grabowski, P.K. Smolarkiewicz, CRCP: a cloud resolving convection parameterization
for modeling the tropical convecting atmosphere. Phys. D 133(1), 171–178 (1999)
58. A. Griffa, A.D. Kirwan Jr., A.J. Mariano, T. Özgökmen, H.T. Rossby, Lagrangian Analysis
and Prediction of Coastal and Ocean Dynamics (Cambridge University Press, Cambridge,
2007)
59. A. Gritsun, G. Branstator, Climate response using a three-dimensional operator based on the
fluctuation-dissipation theorem. J. Atmos. Sci. 64(7), 2558–2575 (2007)
60. A. Gritsun, G. Branstator, A.J. Majda, Climate response of linear and quadratic functionals
using the fluctuation-dissipation theorem. J. Atmos. Sci. 65(9), 2824–2841 (2008)
61. A.S. Gritsun, V.P. Dymnikov, Barotropic atmosphere response to small external actions: theory
and numerical experiments. Izv.-Russ. Acad. Sci. Atmos. Ocean. Phys. 35, 511–525 (1999)
62. I. Grooms, Y. Lee, A.J. Majda, Ensemble Kalman filters for dynamical systems with unre-
solved turbulence. J. Comput. Phys. 273, 435–452 (2014)
63. I. Grooms, Y. Lee, A.J. Majda, Ensemble filtering and low-resolution model error: covari-
ance inflation, stochastic parameterization, and model numerics. Mon. Weather Rev. 143(10),
3912–3924 (2015)
64. I. Grooms, Y. Lee, A.J. Majda, Numerical schemes for stochastic backscatter in the inverse
cascade of quasigeostrophic turbulence. Multiscale Model. Simul. 13(3), 1001–1021 (2015)
65. I. Grooms, A.J. Majda, Efficient stochastic superparameterization for geophysical turbulence.
Proc. Natl. Acad. Sci. 110(12), 4464–4469 (2013)
66. I. Grooms, A.J. Majda, Stochastic superparameterization in a one-dimensional model for
wave turbulence. Commun. Math. Sci. 12(3), 509–525 (2014)
67. I. Grooms, A.J. Majda, Stochastic superparameterization in quasigeostrophic turbulence. J.
Comput. Phys. 271, 78–98 (2014)
68. I. Grooms, A.J. Majda, S.K. Smith, Stochastic superparameterization in a quasigeostrophic
model of the Antarctic Circumpolar Current. Ocean Model. 85, 1–15 (2015)

69. M. Hairer, On Malliavin’s Proof of Hörmander’s Theorem. Lecture notes at University of
Warwick. arXiv:1103.1998
70. M. Hairer, A.J. Majda, A simple framework to justify linear response theory. Nonlinearity
23(4), 909–922 (2010)
71. M. Hairer, J.C. Mattingly, Ergodicity of the 2d Navier-Stokes equations with degenerate
stochastic forcing. Ann. Math. 164(3), 993–1032 (2006)
72. J. Harlim, A. Mahdi, A.J. Majda, An ensemble Kalman filter for statistical estimation of
physics constrained nonlinear regression models. J. Comput. Phys. 257, 782–812 (2014)
73. J. Harlim, A.J. Majda, Catastrophic filter divergence in filtering nonlinear dissipative systems.
Commun. Math. Sci. 8(1), 27–43 (2010)
74. J. Harlim, A.J. Majda, Test models for filtering and prediction of moisture-coupled tropical
waves. Q. J. R. Meteorol. Soc. 139(670), 119–136 (2013)
75. P. Holmes, J.L. Lumley, G. Berkooz, Turbulence, Coherent Structures, Dynamical Systems
and Symmetry (Cambridge University Press, Cambridge, 1998)
76. I. Hoteit, X. Luo, D.-T. Pham, Particle Kalman filtering: a nonlinear Bayesian framework for
ensemble Kalman filters (2011). arXiv:1108.0168
77. T.Y. Hou, W. Luo, B. Rozovskii, H.-M. Zhou, Wiener chaos expansions and numerical solu-
tions of randomly forced equations of fluid mechanics. J. Comput. Phys. 216(2), 687–706
(2006)
78. E.T. Jaynes, Information theory and statistical mechanics. Phys. Rev. 106(4), 620 (1957)
79. R.E. Kalman, R.S. Bucy, New results in linear filtering and prediction theory. J. Basic Eng.
83(1), 95–108 (1961)
80. R.E. Kalman, A new approach to linear filtering and prediction problems. J. Basic Eng. 82(1),
35–45 (1960)
81. E. Kalnay, Atmospheric Modeling, Data Assimilation, and Predictability (Cambridge Uni-
versity Press, Cambridge, 2003)
82. S.R. Keating, A.J. Majda, K.S. Smith, New methods for estimating ocean eddy heat
transport using satellite altimetry. Mon. Weather Rev. 140(5), 1703–1722 (2012)
83. D. Kelly, A.J. Majda, X.T. Tong, Concrete ensemble Kalman filters with rigorous catastrophic
filter divergence. Proc. Natl. Acad. Sci. 112(34), 10589–10594 (2015)
84. D.T.B. Kelly, K.J.H. Law, A.M. Stuart, Well-posedness and accuracy of the ensemble Kalman
filter in discrete and continuous time. Nonlinearity 27(10), 2579 (2014)
85. R. Kleeman, Measuring dynamical prediction utility using relative entropy. J. Atmos. Sci.
59(13), 2057–2072 (2002)
86. O.M. Knio, H.N. Najm, R.G. Ghanem et al., A stochastic projection method for fluid flow: I.
basic formulation. J. Comput. Phys. 173(2), 481–511 (2001)
87. R.H. Kraichnan, D. Montgomery, Two-dimensional turbulence. Rep. Prog. Phys. 43(5), 547
(1980)
88. S. Kullback, R.A. Leibler, On information and sufficiency. Ann. Math. Stat. 22(1), 79–86
(1951)
89. L. Kuznetsov, K. Ide, C.K.R.T. Jones, A method for assimilation of Lagrangian data. Mon.
Weather Rev. 131(10), 2247–2260 (2003)
90. K. Law, A. Stuart, K. Zygalakis, Data Assimilation: A Mathematical Introduction, vol. 62
(Springer, Berlin, 2015)
91. Y. Lee, A.J. Majda, Multiscale methods for data assimilation in turbulent systems. Multiscale
Model. Simul. 13(2), 691–713 (2015)
92. Y. Lee, A.J. Majda, D. Qi, Preventing catastrophic filter divergence using adaptive additive
inflation for baroclinic turbulence. Mon. Weather Rev. (2016, submitted)
93. C.E. Leith, Climate response and fluctuation dissipation. J. Atmos. Sci. 32(10), 2022–2026
(1975)
94. M. Lesieur, Turbulence in Fluids, vol. 40 (Springer Science & Business Media, Berlin, 2012)
95. R.S. Liptser, A.N. Shiryaev, Statistics of Random Processes II: Applications, vol. 2
(Springer, Berlin, 2001)

96. A.C. Lorenc, Analysis methods for numerical weather prediction. Q. J. R. Meteorol. Soc.
112(474), 1177–1194 (1986)
97. E.N. Lorenz, Deterministic nonperiodic flow. J. Atmos. Sci. 20(2), 130–141 (1963)
98. E.N. Lorenz, Predictability: a problem partly solved. Proc. Semin. Predict. 1(1) (1996)
99. E.N. Lorenz, K.A. Emanuel, Optimal sites for supplementary weather observations: simula-
tion with a small model. J. Atmos. Sci. 55(3), 399–414 (1998)
100. A.J. Majda, Statistical energy conservation principle for inhomogeneous turbulent dynamical
systems. Proc. Natl. Acad. Sci. 112(29), 8937–8941 (2015)
101. A.J. Majda, R. Abramov, B. Gershgorin, High skill in low-frequency climate response through
fluctuation dissipation theorems despite structural instability. Proc. Natl. Acad. Sci. 107(2),
581–586 (2010)
102. A.J. Majda, R.V. Abramov, M.J. Grote, Information Theory and Stochastics for Multiscale
Nonlinear Systems, vol. 25 (American Mathematical Soc, Providence, 2005)
103. A.J. Majda, M. Branicki, Lessons in uncertainty quantification for turbulent dynamical sys-
tems. Discrete Cont. Dyn. Syst. 32(9) (2012)
104. A.J. Majda, C. Franzke, B. Khouider, An applied mathematics perspective on stochastic
modelling for climate. Philos. Trans. R. Soc. Lond. A: Math. Phys. Eng. Sci. 366(1875),
2427–2453 (2008)
105. A.J. Majda, B. Gershgorin, Quantifying uncertainty in climate change science through empir-
ical information theory. Proc. Natl. Acad. Sci. 107(34), 14958–14963 (2010)
106. A.J. Majda, B. Gershgorin, Improving model fidelity and sensitivity for complex systems
through empirical information theory. Proc. Natl. Acad. Sci. 108(25), 10044–10049 (2011)
107. A.J. Majda, B. Gershgorin, Link between statistical equilibrium fidelity and forecasting skill
for complex systems with model error. Proc. Natl. Acad. Sci. 108(31), 12599–12604 (2011)
108. A.J. Majda, B. Gershgorin, Elementary models for turbulent diffusion with complex physical
features: eddy diffusivity, spectrum and intermittency. Philos. Trans. R. Soc. Lond. A: Math.
Phys. Eng. Sci. 371(1982), 20120184 (2013)
109. A.J. Majda, B. Gershgorin, Y. Yuan, Low-frequency climate response and fluctuation-
dissipation theorems: theory and practice. J. Atmos. Sci. 67(4), 1186–1201 (2010)
110. A.J. Majda, I. Grooms, New perspectives on superparameterization for geophysical turbu-
lence. J. Comput. Phys. 271, 60–77 (2014)
111. A.J. Majda, M.J. Grote, Mathematical test models for superparametrization in anisotropic
turbulence. Proc. Natl. Acad. Sci. 106(14), 5470–5474 (2009)
112. A.J. Majda, J. Harlim, Filtering Complex Turbulent Systems (Cambridge University Press,
Cambridge, 2012)
113. A.J. Majda, J. Harlim, Physics constrained nonlinear regression models for time series. Non-
linearity 26(1), 201 (2013)
114. A.J. Majda, R. Kleeman, D. Cai, A mathematical framework for quantifying predictability
through relative entropy. Methods Appl. Anal. 9(3), 425–444 (2002)
115. A.J. Majda, Y. Lee, Conceptual dynamical models for turbulence. Proc. Natl. Acad. Sci.
111(18), 6548–6553 (2014)
116. A.J. Majda, D.W. McLaughlin, E.G. Tabak, A one-dimensional model for dispersive wave
turbulence. J. Nonlinear Sci. 7(1), 9–44 (1997)
117. A.J. Majda, D. Qi, Improving prediction skill of imperfect turbulent models through statistical
response and information theory. J. Nonlinear Sci. 26(1), 233–285 (2016)
118. A.J. Majda, D. Qi, T.P. Sapsis, Blended particle filters for large-dimensional chaotic dynamical
systems. Proc. Natl. Acad. Sci. 111(21), 7511–7516 (2014)
119. A.J. Majda, I. Timofeyev, Remarkable statistical behavior for truncated Burgers-Hopf dynam-
ics. Proc. Natl. Acad. Sci. 97(23), 12413–12417 (2000)
120. A.J. Majda, I. Timofeyev, Low-dimensional chaotic dynamics versus intrinsic stochastic noise:
a paradigm model. Phys. D 199(3), 339–368 (2004)
121. A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, Models for stochastic climate prediction. Proc.
Natl. Acad. Sci. 96(26), 14687–14691 (1999)

122. A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, A mathematical framework for stochastic cli-
mate models. Commun. Pure Appl. Math. 54(8), 891–974 (2001)
123. A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, A priori tests of a stochastic mode reduction
strategy. Phys. D 170(3), 206–252 (2002)
124. A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, Systematic strategies for stochastic mode reduc-
tion in climate. J. Atmos. Sci. 60(14), 1705–1722 (2003)
125. A.J. Majda, I. Timofeyev, Statistical mechanics for truncations of the Burgers-Hopf equation:
a model for intrinsic stochastic behavior with scaling. Milan J. Math. 70(1), 39–96 (2002)
126. A.J. Majda, X.T. Tong, Ergodicity of truncated stochastic Navier-Stokes with deterministic
forcing and dispersion. J. Nonlinear Sci. (2015, accepted)
127. A.J. Majda, X.T. Tong, Geometric ergodicity for piecewise contracting processes with appli-
cations for tropical stochastic lattice models. Commun. Pure Appl. Math. 69(6), 1110–1153
(2015)
128. A.J. Majda, X.T. Tong, Intermittency in turbulent diffusion models with a mean gradient.
Nonlinearity 28(11), 4171 (2015)
129. A.J. Majda, X. Wang, The emergence of large-scale coherent structure under small-scale
random bombardments. Commun. Pure Appl. Math. 59(4), 467–500 (2006)
130. A.J. Majda, X. Wang, Nonlinear Dynamics and Statistical Theories for Basic Geophysical
Flows (Cambridge University Press, Cambridge, 2006)
131. A.J. Majda, X. Wang, Linear response theory for statistical ensembles in complex systems
with time-periodic forcing. Commun. Math. Sci. 8(1), 145–172 (2010)
132. A.J. Majda, Y. Yuan, Fundamental limitations of ad hoc linear and quadratic multi-level
regression models for physical systems. Discrete Contin. Dyn. Syst. B 17(4), 1333–1363
(2012)
133. M.E. Maltrud, G.K. Vallis, Energy spectra and coherent structures in forced two-dimensional
and beta-plane turbulence. J. Fluid Mech. 228, 321–342 (1991)
134. U.M.B. Marconi, A. Puglisi, L. Rondoni, A. Vulpiani, Fluctuation-dissipation: response theory
in statistical physics. Phys. Rep. 461(4), 111–195 (2008)
135. B.J. Marston, E. Conover, T. Schneider, Statistics of an unstable barotropic jet from a cumulant
expansion. J. Atmos. Sci. 65, 1955 (2008)
136. J.C. Mattingly, A.M. Stuart, Geometric ergodicity of some hypo-elliptic diffusions for particle
motions. Markov Process. Relat. Fields 8(2), 199–214 (2002)
137. J.C. Mattingly, A.M. Stuart, D.J. Higham, Ergodicity for SDEs and approximations: locally
Lipschitz vector fields and degenerate noise. Stoch. Process. Appl. 101(2), 185–232 (2002)
138. H.N. Najm, Uncertainty quantification and polynomial chaos techniques in computational
fluid dynamics. Annu. Rev. Fluid Mech. 41, 35–52 (2009)
139. J.D. Neelin, B.R. Lintner, B. Tian, Q. Li, L. Zhang, P.K. Patra, M.T. Chahine, S.N. Stechmann,
Long tails in deep columns of natural and anthropogenic tropospheric tracers. Geophys. Res.
Lett. 37(5), L05804 (2010)
140. D.R. Nicholson, Introduction to Plasma Theory (Cambridge Univ Press, Cambridge, 1983)
141. T.N. Palmer, A nonlinear dynamical perspective on model error: a proposal for non-local
stochastic-dynamic parametrization in weather and climate prediction models. Q. J. R. Mete-
orol. Soc. 127(572), 279–304 (2001)
142. C. Penland, P.D. Sardeshmukh, The optimal growth of tropical sea surface temperature anom-
alies. J. Clim. 8(8), 1999–2024 (1995)
143. S.B. Pope, Turbulent Flows (IOP Publishing, Bristol, 2001)
144. D. Qi, A.J. Majda, Blended particle methods with adaptive subspaces for filtering turbulent
dynamical systems. Phys. D 298, 21–41 (2015)
145. D. Qi, A.J. Majda, Low-dimensional reduced-order models for statistical response and uncer-
tainty quantification: barotropic turbulence with topography. Phys. D (2016, submitted)
146. D. Qi, A.J. Majda, Low-dimensional reduced-order models for statistical response and uncer-
tainty quantification: two-layer baroclinic turbulence. J. Atmos. Sci. (2016, submitted)
147. D. Qi, A.J. Majda, Predicting fat-tailed intermittent probability distributions in passive scalar
turbulence with imperfect models through empirical information theory. Commun. Math. Sci.
14(6), 1687–1722 (2016)

148. S. Reich, C. Cotter, Probabilistic Forecasting and Bayesian Data Assimilation (Cambridge
University Press, Cambridge, 2015)
149. J.C. Robinson, Infinite-Dimensional Dynamical Systems: An Introduction to Dissipative Par-
abolic PDEs and the Theory of Global Attractors, vol. 28 (Cambridge University Press,
Cambridge, 2001)
150. M. Romito, Ergodicity of the finite dimensional approximation of the 3d Navier-Stokes equa-
tions forced by a degenerate noise. J. Stat. Phys. 114(1–2), 155–177 (2004)
151. H. Salman, K. Ide, C.K.R.T. Jones, Using flow geometry for drifter deployment in Lagrangian
data assimilation. Tellus A 60(2), 321–335 (2008)
152. H. Salman, L. Kuznetsov, C.K.R.T. Jones, K. Ide, A method for assimilating Lagrangian data
into a shallow-water-equation ocean model. Mon. Weather Rev. 134(4), 1081–1101 (2006)
153. R. Salmon, Lectures on Geophysical Fluid Dynamics (Oxford University Press, Oxford, 1998)
154. T.P. Sapsis, Attractor local dimensionality, nonlinear energy transfers and finite-time insta-
bilities in unstable dynamical systems with applications to two-dimensional fluid flows, in
Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering
Sciences, vol. 469 (The Royal Society, 2013), p. 20120550
155. T.P. Sapsis, P.F.J. Lermusiaux, Dynamically orthogonal field equations for continuous sto-
chastic dynamical systems. Phys. D 238(23), 2347–2360 (2009)
156. T.P. Sapsis, A.J. Majda, Blended reduced subspace algorithms for uncertainty quantification
of quadratic systems with a stable mean state. Phys. D 258, 61–76 (2013)
157. T.P. Sapsis, A.J. Majda, Blending modified Gaussian closure and non-Gaussian reduced sub-
space methods for turbulent dynamical systems. J. Nonlinear Sci. 23(6), 1039–1071 (2013)
158. T.P. Sapsis, A.J. Majda, Statistically accurate low-order models for uncertainty quantification
in turbulent dynamical systems. Proc. Natl. Acad. Sci. 110(34), 13705–13710 (2013)
159. T.P. Sapsis, A.J. Majda, A statistically accurate modified quasilinear Gaussian closure for
uncertainty quantification in turbulent dynamical systems. Phys. D 252, 34–45 (2013)
160. L. Slivinski, E. Spiller, A. Apte, B. Sandstede, A hybrid particle-ensemble Kalman filter for
Lagrangian data assimilation. Mon. Weather Rev. 143(1), 195–211 (2015)
161. S.K. Smith, A local model for planetary atmospheres forced by small-scale convection. J.
Atmos. Sci. 61(12), 1420–1433 (2004)
162. H.W. Sorenson, D.L. Alspach, Recursive Bayesian estimation using Gaussian sums. Auto-
matica 7(4), 465–479 (1971)
163. K. Srinivasan, W.R. Young, Zonostrophic instability. J. Atmos. Sci. 69(5), 1633–1656 (2012)
164. D.W. Stroock, S. Karmakar, Lectures on Topics in Stochastic Differential Equations (Springer,
Berlin, 1982)
165. R. Temam, Infinite-Dimensional Dynamical Systems in Mechanics and Physics, vol. 68
(Springer Science & Business Media, Berlin, 2012)
166. S. Thual, A.J. Majda, S.N. Stechmann, A stochastic skeleton model for the MJO. J. Atmos.
Sci. 71(2), 697–715 (2014)
167. S.M. Tobias, K. Dagon, B.J. Marston, Astrophysical fluid dynamics via direct statistical
simulation. Astrophys. J. 727(2), 127 (2011)
168. X.T. Tong, A.J. Majda, D. Kelly, Nonlinear stability and ergodicity of ensemble based Kalman
filters. Nonlinearity 29(2), 657–691 (2016)
169. X.T. Tong, A.J. Majda, D. Kelly, Nonlinear stability of the ensemble Kalman filter with
adaptive covariance inflation. Commun. Math. Sci. 14(5), 1283–1313 (2016)
170. A.A. Townsend, The Structure of Turbulent Shear Flow (Cambridge University Press, Cam-
bridge, 1980)
171. G.K. Vallis, Atmospheric and Oceanic Fluid Dynamics: Fundamentals and Large-Scale Cir-
culation (Cambridge University Press, Cambridge, 2006)
172. G.K. Vallis, M.E. Maltrud, Generation of mean flows and jets on a beta plane and over
topography. J. Phys. Oceanogr. 23(7), 1346–1362 (1993)
