Hidden Markov Processes: Theory and Applications to Biology

Ebook555 pages6 hours

Hidden Markov Processes: Theory and Applications to Biology

Name: Hidden Markov Processes: Theory and Applications to Biology
Brand: Princeton University Press
Rating: 5.0 (1 reviews)

By M. Vidyasagar

Rating: 5 out of 5 stars

5/5

()

Read preview

About this ebook

This book explores important aspects of Markov and hidden Markov processes and the applications of these ideas to various problems in computational biology. The book starts from first principles, so that no previous knowledge of probability is necessary. However, the work is rigorous and mathematical, making it useful to engineers and mathematicians, even those not interested in biological applications. A range of exercises is provided, including drills to familiarize the reader with concepts and more advanced problems that require deep thinking about the theory. Biological applications are taken from post-genomic biology, especially genomics and proteomics.

The topics examined include standard material such as the Perron-Frobenius theorem, transient and recurrent states, hitting probabilities and hitting times, maximum likelihood estimation, the Viterbi algorithm, and the Baum-Welch algorithm. The book contains discussions of extremely useful topics not usually seen at the basic level, such as ergodicity of Markov processes, Markov Chain Monte Carlo (MCMC), information theory, and large deviation theory for both i.i.d and Markov processes. The book also presents state-of-the-art realization theory for hidden Markov models. Among biological applications, it offers an in-depth look at the BLAST (Basic Local Alignment Search Technique) algorithm, including a comprehensive explanation of the underlying theory. Other applications such as profile hidden Markov models are also explored.

Skip carousel

LanguageEnglish

PublisherPrinceton University Press

Release dateAug 24, 2014

ISBN9781400850518

Author

M. Vidyasagar

Related to Hidden Markov Processes

Titles in the series (33)

Skip carousel

Selfsimilar Processes
Ebook
Selfsimilar Processes
byPaul Embrechts
Rating: 4 out of 5 stars
4/5
Self-Regularity: A New Paradigm for Primal-Dual Interior-Point Algorithms
Ebook
Self-Regularity: A New Paradigm for Primal-Dual Interior-Point Algorithms
byJiming Peng
Rating: 5 out of 5 stars
5/5
Thermodynamics: A Dynamical Systems Approach
Ebook
Thermodynamics: A Dynamical Systems Approach
byWassim M. Haddad
Rating: 5 out of 5 stars
5/5
Graph Theoretic Methods in Multiagent Networks
Ebook
Graph Theoretic Methods in Multiagent Networks
byMehran Mesbahi
Rating: 5 out of 5 stars
5/5
Matrix Completions, Moments, and Sums of Hermitian Squares
Ebook
Matrix Completions, Moments, and Sums of Hermitian Squares
byMihály Bakonyi
Rating: 5 out of 5 stars
5/5
Positive Definite Matrices
Ebook
Positive Definite Matrices
byRajendra Bhatia
Rating: 5 out of 5 stars
5/5
Wave Scattering by Time-Dependent Perturbations: An Introduction
Ebook
Wave Scattering by Time-Dependent Perturbations: An Introduction
byG. F. Roach
Rating: 5 out of 5 stars
5/5
Matrices, Moments and Quadrature with Applications
Ebook
Matrices, Moments and Quadrature with Applications
byGene H. Golub
Rating: 5 out of 5 stars
5/5
Distributed Control of Robotic Networks: A Mathematical Approach to Motion Coordination Algorithms
Ebook
Distributed Control of Robotic Networks: A Mathematical Approach to Motion Coordination Algorithms
byFrancesco Bullo
Rating: 5 out of 5 stars
5/5
Robust Optimization
Ebook
Robust Optimization
byAharon Ben-Tal
Rating: 5 out of 5 stars
5/5
Control Theoretic Splines: Optimal Control, Statistics, and Path Planning
Ebook
Control Theoretic Splines: Optimal Control, Statistics, and Path Planning
byMagnus Egerstedt
Rating: 5 out of 5 stars
5/5
Topics in Quaternion Linear Algebra
Ebook
Topics in Quaternion Linear Algebra
byLeiba Rodman
Rating: 5 out of 5 stars
5/5
The Traveling Salesman Problem: A Computational Study
Ebook
The Traveling Salesman Problem: A Computational Study
byDavid L. Applegate
Rating: 5 out of 5 stars
5/5
Totally Nonnegative Matrices
Ebook
Totally Nonnegative Matrices
byShaun M. Fallat
Rating: 5 out of 5 stars
5/5
Auxiliary Signal Design for Failure Detection
Ebook
Auxiliary Signal Design for Failure Detection
byStephen L. Campbell
Rating: 0 out of 5 stars
0 ratings
Modern Anti-windup Synthesis: Control Augmentation for Actuator Saturation
Ebook
Modern Anti-windup Synthesis: Control Augmentation for Actuator Saturation
byLuca Zaccarian
Rating: 5 out of 5 stars
5/5
Mathematical Analysis of Deterministic and Stochastic Problems in Complex Media Electromagnetics
Ebook
Mathematical Analysis of Deterministic and Stochastic Problems in Complex Media Electromagnetics
byG. F. Roach
Rating: 5 out of 5 stars
5/5
Chaotic Transitions in Deterministic and Stochastic Dynamical Systems: Applications of Melnikov Processes in Engineering, Physics, and Neuroscience
Ebook
Chaotic Transitions in Deterministic and Stochastic Dynamical Systems: Applications of Melnikov Processes in Engineering, Physics, and Neuroscience
byEmil Simiu
Rating: 5 out of 5 stars
5/5
Algebraic Curves over a Finite Field
Ebook
Algebraic Curves over a Finite Field
byJ. W. P. Hirschfeld
Rating: 5 out of 5 stars
5/5
Hidden Markov Processes: Theory and Applications to Biology
Ebook
Hidden Markov Processes: Theory and Applications to Biology
byM. Vidyasagar
Rating: 5 out of 5 stars
5/5
Stability and Control of Large-Scale Dynamical Systems: A Vector Dissipative Systems Approach
Ebook
Stability and Control of Large-Scale Dynamical Systems: A Vector Dissipative Systems Approach
byWassim M. Haddad
Rating: 3 out of 5 stars
3/5
Entropy
Ebook
Entropy
byAndreas Greven
Rating: 5 out of 5 stars
5/5
Genomic Signal Processing
Ebook
Genomic Signal Processing
byIlya Shmulevich
Rating: 5 out of 5 stars
5/5
Max Plus at Work: Modeling and Analysis of Synchronized Systems: A Course on Max-Plus Algebra and Its Applications
Ebook
Max Plus at Work: Modeling and Analysis of Synchronized Systems: A Course on Max-Plus Algebra and Its Applications
byBernd Heidergott
Rating: 5 out of 5 stars
5/5
Rays, Waves, and Scattering: Topics in Classical Mathematical Physics
Ebook
Rays, Waves, and Scattering: Topics in Classical Mathematical Physics
byJohn Adam
Rating: 0 out of 5 stars
0 ratings
Impulsive and Hybrid Dynamical Systems: Stability, Dissipativity, and Control
Ebook
Impulsive and Hybrid Dynamical Systems: Stability, Dissipativity, and Control
byWassim M. Haddad
Rating: 5 out of 5 stars
5/5
Analytic Theory of Global Bifurcation: An Introduction
Ebook
Analytic Theory of Global Bifurcation: An Introduction
byBoris Buffoni
Rating: 0 out of 5 stars
0 ratings
Mathematical Methods in Elasticity Imaging
Ebook
Mathematical Methods in Elasticity Imaging
byHabib Ammari
Rating: 0 out of 5 stars
0 ratings
A Dynamical Systems Theory of Thermodynamics
Ebook
A Dynamical Systems Theory of Thermodynamics
byWassim M. Haddad
Rating: 0 out of 5 stars
0 ratings
Formal Verification of Control System Software
Ebook
Formal Verification of Control System Software
byPierre-Loïc Garoche
Rating: 0 out of 5 stars
0 ratings

Related ebooks

Skip carousel

Mathematical Methods: Linear Algebra / Normed Spaces / Distributions / Integration
Ebook
Mathematical Methods: Linear Algebra / Normed Spaces / Distributions / Integration
byJacob Korevaar
Rating: 0 out of 5 stars
0 ratings
Theory of Markov Processes
Ebook
Theory of Markov Processes
byE. B. Dynkin
Rating: 0 out of 5 stars
0 ratings
Spectral Radius of Graphs
Ebook
Spectral Radius of Graphs
byDragan Stevanovic
Rating: 0 out of 5 stars
0 ratings
Selfsimilar Processes
Ebook
Selfsimilar Processes
byPaul Embrechts
Rating: 4 out of 5 stars
4/5
Studies in the Theory of Random Processes
Ebook
Studies in the Theory of Random Processes
byA. V. Skorokhod
Rating: 0 out of 5 stars
0 ratings
Brownian Motion, Martingales, and Stochastic Calculus
Ebook
Brownian Motion, Martingales, and Stochastic Calculus
byJean-François Le Gall
Rating: 0 out of 5 stars
0 ratings
Markov Processes for Stochastic Modeling
Ebook
Markov Processes for Stochastic Modeling
byOliver Ibe
Rating: 5 out of 5 stars
5/5
Introduction to the Theory of Infiniteseimals
Ebook
Introduction to the Theory of Infiniteseimals
byElsevier Books Reference
Rating: 0 out of 5 stars
0 ratings
Dynamic Probabilistic Systems, Volume II: Semi-Markov and Decision Processes
Ebook
Dynamic Probabilistic Systems, Volume II: Semi-Markov and Decision Processes
byRonald A. Howard
Rating: 0 out of 5 stars
0 ratings
Fixed Point Theory and Graph Theory: Foundations and Integrative Approaches
Ebook
Fixed Point Theory and Graph Theory: Foundations and Integrative Approaches
byMonther Alfuraidan
Rating: 5 out of 5 stars
5/5
Invitation to Dynamical Systems
Ebook
Invitation to Dynamical Systems
byEdward R. Scheinerman
Rating: 5 out of 5 stars
5/5
Generatingfunctionology
Ebook
Generatingfunctionology
byHerbert S. Wilf
Rating: 4 out of 5 stars
4/5
Introduction to Abstract Analysis
Ebook
Introduction to Abstract Analysis
byMarvin E. Goldstein
Rating: 0 out of 5 stars
0 ratings
Introduction to Spectral Theory in Hilbert Space: North-Holland Series in Applied Mathematics and Mechanics
Ebook
Introduction to Spectral Theory in Hilbert Space: North-Holland Series in Applied Mathematics and Mechanics
byGilbert Helmberg
Rating: 0 out of 5 stars
0 ratings
An Introduction to Information Theory
Ebook
An Introduction to Information Theory
byFazlollah M. Reza
Rating: 0 out of 5 stars
0 ratings
Applied Automata Theory
Ebook
Applied Automata Theory
byJulius T. Tou
Rating: 0 out of 5 stars
0 ratings
The Convolution Transform
Ebook
The Convolution Transform
byIsidore Isaac Hirschman
Rating: 0 out of 5 stars
0 ratings
Theory of Formal Systems. (AM-47), Volume 47
Ebook
Theory of Formal Systems. (AM-47), Volume 47
byRaymond M. Smullyan
Rating: 4 out of 5 stars
4/5
Introduction to Computational Science: Modeling and Simulation for the Sciences - Second Edition
Ebook
Introduction to Computational Science: Modeling and Simulation for the Sciences - Second Edition
byAngela B. Shiflet
Rating: 3 out of 5 stars
3/5
Artificial and Mathematical Theory of Computation: Papers in Honor of John McCarthy
Ebook
Artificial and Mathematical Theory of Computation: Papers in Honor of John McCarthy
byVladimir Lifschitz
Rating: 0 out of 5 stars
0 ratings
Visualizing Quaternions
Ebook
Visualizing Quaternions
byAndrew J. Hanson
Rating: 3 out of 5 stars
3/5
Algebraic Methods in Statistical Mechanics and Quantum Field Theory
Ebook
Algebraic Methods in Statistical Mechanics and Quantum Field Theory
byDr. Gérard G. Emch
Rating: 0 out of 5 stars
0 ratings
Hodge Theory (MN-49)
Ebook
Hodge Theory (MN-49)
byEduardo Cattani
Rating: 0 out of 5 stars
0 ratings
A First Course in Functional Analysis
Ebook
A First Course in Functional Analysis
byMartin Davis
Rating: 0 out of 5 stars
0 ratings
Modern Multidimensional Calculus
Ebook
Modern Multidimensional Calculus
byMarshall Evans Munroe
Rating: 0 out of 5 stars
0 ratings
Introduction to Combinatorial Analysis
Ebook
Introduction to Combinatorial Analysis
byJohn Riordan
Rating: 5 out of 5 stars
5/5
Counterexamples in Probability: Third Edition
Ebook
Counterexamples in Probability: Third Edition
byJordan M. Stoyanov
Rating: 0 out of 5 stars
0 ratings
Fourier Series and Orthogonal Polynomials
Ebook
Fourier Series and Orthogonal Polynomials
byDunham Jackson
Rating: 0 out of 5 stars
0 ratings
Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition
Ebook
Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition
byJames Jeffers
Rating: 0 out of 5 stars
0 ratings
Local Fractional Integral Transforms and Their Applications
Ebook
Local Fractional Integral Transforms and Their Applications
byXiao-Jun Yang
Rating: 0 out of 5 stars
0 ratings

Biology For You

Skip carousel

Anatomy 101: From Muscles and Bones to Organs and Systems, Your Guide to How the Human Body Works
Ebook
Anatomy 101: From Muscles and Bones to Organs and Systems, Your Guide to How the Human Body Works
byKevin Langford
Rating: 4 out of 5 stars
4/5
Peptide Protocols: Volume One
Ebook
Peptide Protocols: Volume One
byMD William A. Seeds
Rating: 4 out of 5 stars
4/5
The Obesity Code: the bestselling guide to unlocking the secrets of weight loss
Ebook
The Obesity Code: the bestselling guide to unlocking the secrets of weight loss
byJason Fung
Rating: 4 out of 5 stars
4/5
Ultralearning: Master Hard Skills, Outsmart the Competition, and Accelerate Your Career
Ebook
Ultralearning: Master Hard Skills, Outsmart the Competition, and Accelerate Your Career
byScott H. Young
Rating: 4 out of 5 stars
4/5
Lifespan: Why We Age—and Why We Don't Have To
Ebook
Lifespan: Why We Age—and Why We Don't Have To
byDavid A. Sinclair
Rating: 4 out of 5 stars
4/5
Sapiens: A Brief History of Humankind
Ebook
Sapiens: A Brief History of Humankind
byYuval Noah Harari
Rating: 4 out of 5 stars
4/5
Dopamine Detox: Biohacking Your Way To Better Focus, Greater Happiness, and Peak Performance
Ebook
Dopamine Detox: Biohacking Your Way To Better Focus, Greater Happiness, and Peak Performance
byNick Trenton
Rating: 3 out of 5 stars
3/5
Outlive Diet Recipes: Over 60 Delicious and Healthy Recipes To Help You Live 10 Decades Younger in The Outlive Plan
Ebook
Outlive Diet Recipes: Over 60 Delicious and Healthy Recipes To Help You Live 10 Decades Younger in The Outlive Plan
byJesse Smith
Rating: 4 out of 5 stars
4/5
Fantastic Fungi: How Mushrooms Can Heal, Shift Consciousness, and Save the Planet
Ebook
Fantastic Fungi: How Mushrooms Can Heal, Shift Consciousness, and Save the Planet
byPaul Stamets
Rating: 5 out of 5 stars
5/5
This Will Make You Smarter: 150 New Scientific Concepts to Improve Your Thinking
Ebook
This Will Make You Smarter: 150 New Scientific Concepts to Improve Your Thinking
byJohn Brockman
Rating: 4 out of 5 stars
4/5
Why We Sleep: Unlocking the Power of Sleep and Dreams
Ebook
Why We Sleep: Unlocking the Power of Sleep and Dreams
byMatthew Walker
Rating: 4 out of 5 stars
4/5
The Confident Mind: A Battle-Tested Guide to Unshakable Performance
Ebook
The Confident Mind: A Battle-Tested Guide to Unshakable Performance
byDr. Nate Zinsser
Rating: 0 out of 5 stars
0 ratings
The Code Breaker: Jennifer Doudna, Gene Editing, and the Future of the Human Race
Ebook
The Code Breaker: Jennifer Doudna, Gene Editing, and the Future of the Human Race
byWalter Isaacson
Rating: 4 out of 5 stars
4/5
Homo Deus: A Brief History of Tomorrow
Ebook
Homo Deus: A Brief History of Tomorrow
byYuval Noah Harari
Rating: 4 out of 5 stars
4/5
The Grieving Brain: The Surprising Science of How We Learn from Love and Loss
Ebook
The Grieving Brain: The Surprising Science of How We Learn from Love and Loss
byMary-Frances O'Connor
Rating: 4 out of 5 stars
4/5
The Truth About COVID-19: Exposing The Great Reset, Lockdowns, Vaccine Passports, and the New Normal
Ebook
The Truth About COVID-19: Exposing The Great Reset, Lockdowns, Vaccine Passports, and the New Normal
byDoctor Joseph Mercola
Rating: 3 out of 5 stars
3/5
The Winner Effect: The Neuroscience of Success and Failure
Ebook
The Winner Effect: The Neuroscience of Success and Failure
byIan H. Robertson
Rating: 5 out of 5 stars
5/5
How Emotions Are Made: The Secret Life of the Brain
Ebook
How Emotions Are Made: The Secret Life of the Brain
byLisa Feldman Barrett
Rating: 4 out of 5 stars
4/5
Woman: An Intimate Geography
Ebook
Woman: An Intimate Geography
byNatalie Angier
Rating: 4 out of 5 stars
4/5
Gut: The Inside Story of Our Body's Most Underrated Organ (Revised Edition)
Ebook
Gut: The Inside Story of Our Body's Most Underrated Organ (Revised Edition)
byGiulia Enders
Rating: 4 out of 5 stars
4/5
The Book of Fungi: A Life-Size Guide to Six Hundred Species from around the World
Ebook
The Book of Fungi: A Life-Size Guide to Six Hundred Species from around the World
byPeter Roberts
Rating: 4 out of 5 stars
4/5
Mother of God: An Extraordinary Journey into the Uncharted Tributaries of the Western Amazon
Ebook
Mother of God: An Extraordinary Journey into the Uncharted Tributaries of the Western Amazon
byPaul Rosolie
Rating: 4 out of 5 stars
4/5
The Seven Sins of Memory: How the Mind Forgets and Remembers
Ebook
The Seven Sins of Memory: How the Mind Forgets and Remembers
byDaniel L. Schacter
Rating: 4 out of 5 stars
4/5
Jaws: The Story of a Hidden Epidemic
Ebook
Jaws: The Story of a Hidden Epidemic
bySandra Kahn
Rating: 4 out of 5 stars
4/5
All That Remains: A Renowned Forensic Scientist on Death, Mortality, and Solving Crimes
Ebook
All That Remains: A Renowned Forensic Scientist on Death, Mortality, and Solving Crimes
bySue Black
Rating: 4 out of 5 stars
4/5
Lies My Gov't Told Me: And the Better Future Coming
Ebook
Lies My Gov't Told Me: And the Better Future Coming
byRobert W. Malone
Rating: 4 out of 5 stars
4/5
Vax-Unvax: Let the Science Speak
Ebook
Vax-Unvax: Let the Science Speak
byRobert F. Kennedy, Jr.
Rating: 5 out of 5 stars
5/5
Honeybee Democracy
Ebook
Honeybee Democracy
byThomas D. Seeley
Rating: 4 out of 5 stars
4/5
Other Minds: The Octopus, the Sea, and the Deep Origins of Consciousness
Ebook
Other Minds: The Octopus, the Sea, and the Deep Origins of Consciousness
byPeter Godfrey-Smith
Rating: 4 out of 5 stars
4/5
Written in Bone: Hidden Stories in What We Leave Behind
Ebook
Written in Bone: Hidden Stories in What We Leave Behind
bySue Black
Rating: 4 out of 5 stars
4/5

Related podcast episodes

Skip carousel

A New Map Traces the Limits of Computation: A major advance in computational complexity reveals deep connections between the classes of problems that computers can — and can’t — possibly do.
Podcast episode
A New Map Traces the Limits of Computation: A major advance in computational complexity reveals deep connections between the classes of problems that computers can — and can’t — possibly do.
byQuanta Science Podcast
0 ratings
0% found this document useful
Julian Havil, "Curves for the Mathematically Curious" (Princeton UP, 2019): You don’t have to be mathematically curious to appreciate Julian’s talent for weaving mathematics and history together...
Podcast episode
Julian Havil, "Curves for the Mathematically Curious" (Princeton UP, 2019): You don’t have to be mathematically curious to appreciate Julian’s talent for weaving mathematics and history together...
byNew Books in Mathematics
0 ratings
0% found this document useful
Partial Differential Equations: Origins, Developments and Roles in the Changing World - Gui-Qiang George Chen: Professor Gui-Qiang G. Chen presents in his inaugural lecture several examples to illustrate the origins, developments, and roles of partial differential equations in our changing world.
Podcast episode
Partial Differential Equations: Origins, Developments and Roles in the Changing World - Gui-Qiang George Chen: Professor Gui-Qiang G. Chen presents in his inaugural lecture several examples to illustrate the origins, developments, and roles of partial differential equations in our changing world.
byThe Secrets of Mathematics
0 ratings
0% found this document useful
A Twisted Path to Equation-Free Prediction: Complex natural systems defy standard mathematical analysis, so one ecologist is throwing out the equations.
Podcast episode
A Twisted Path to Equation-Free Prediction: Complex natural systems defy standard mathematical analysis, so one ecologist is throwing out the equations.
byQuanta Science Podcast
0 ratings
0% found this document useful
Being Bayesian: This episode explores the root concept of what it is to be Bayesian: describing knowledge of a system probabilistically, having an appropriate prior probability, know how to weigh new evidence, and following Bayes's rule to compute the revised...
Podcast episode
Being Bayesian: This episode explores the root concept of what it is to be Bayesian: describing knowledge of a system probabilistically, having an appropriate prior probability, know how to weigh new evidence, and following Bayes's rule to compute the revised...
byData Skeptic
0 ratings
0% found this document useful
#54 Gary Marcus and Luis Lamb - Neurosymbolic models
Podcast episode
#54 Gary Marcus and Luis Lamb - Neurosymbolic models
byMachine Learning Street Talk (MLST)
0 ratings
0% found this document useful
Leila Schneps and Coralie Colmez, “Math on Trial” (Basic Books, 2013): You may well have seen “Numb3rs,” a TV show in which mathematicians help solve crimes. It’s fiction. But, as Leila Schneps and Coralie Colmez show in their eye-opening new book Math on Trial: How Numbers Get Used and Abused in the Court Room (Basic Boo...
Podcast episode
Leila Schneps and Coralie Colmez, “Math on Trial” (Basic Books, 2013): You may well have seen “Numb3rs,” a TV show in which mathematicians help solve crimes. It’s fiction. But, as Leila Schneps and Coralie Colmez show in their eye-opening new book Math on Trial: How Numbers Get Used and Abused in the Court Room (Basic Boo...
byNew Books in Mathematics
0 ratings
0% found this document useful
[MINI] Long Short Term Memory: Thanks to our sponsor brilliant.org/dataskeptics A Long Short Term Memory (LSTM) is a neural unit, often used in Recurrent Neural Network (RNN) which attempts to provide the network the capacity to store information for longer periods of time. An...
Podcast episode
[MINI] Long Short Term Memory: Thanks to our sponsor brilliant.org/dataskeptics A Long Short Term Memory (LSTM) is a neural unit, often used in Recurrent Neural Network (RNN) which attempts to provide the network the capacity to store information for longer periods of time. An...
byData Skeptic
0 ratings
0% found this document useful
Complex Geometries
Podcast episode
Complex Geometries
byModellansatz
0 ratings
0% found this document useful
Complex Geometries: Modellansatz 086
Podcast episode
Complex Geometries: Modellansatz 086
byModellansatz - English episodes only
0 ratings
0% found this document useful
From classical to non-classical stochastic shortest path problems: Professor Christel Baier delivers the Hillary Term 2024 Strachey Lecture
Podcast episode
From classical to non-classical stochastic shortest path problems: Professor Christel Baier delivers the Hillary Term 2024 Strachey Lecture
byComputer Science
0 ratings
0% found this document useful
How computers have changed the way we do physics - Breaking through the quantum barrier: The power of available computers has now grown exponentially for many decades. The ability to discover numerically the implications of equations and models has opened our eyes to previously hidden aspects of physics.
Podcast episode
How computers have changed the way we do physics - Breaking through the quantum barrier: The power of available computers has now grown exponentially for many decades. The ability to discover numerically the implications of equations and models has opened our eyes to previously hidden aspects of physics.
byTheoretical Physics - From Outer Space to Plasma
0 ratings
0% found this document useful
Homogenization: Modellansatz 116
Podcast episode
Homogenization: Modellansatz 116
byModellansatz - English episodes only
0 ratings
0% found this document useful
Assessing Quantum Computing, with Ignacio Cirac
Podcast episode
Assessing Quantum Computing, with Ignacio Cirac
byLondon Futurists
0 ratings
0% found this document useful
Electromagnetic Field Topology as a Solution to the Boundary Problem of Consciousness
Podcast episode
Electromagnetic Field Topology as a Solution to the Boundary Problem of Consciousness
byBuilding a Science of Consciousness
0 ratings
0% found this document useful
Dynamical Sampling
Podcast episode
Dynamical Sampling
byModellansatz
0 ratings
0% found this document useful
Dynamical Sampling: Modellansatz 173
Podcast episode
Dynamical Sampling: Modellansatz 173
byModellansatz - English episodes only
0 ratings
0% found this document useful
Up in the Electron Clouds—Preston J. MacDougall, Ph.D.—Author & Professor, Middle Tennessee State University: You may or may not remember learning about the periodic table in chemistry class and why it’s shaped the way it is. Dr. Preston MacDougall explains the orbital model that’s behind it, and why orbitals are actually just invented mathematical...
Podcast episode
Up in the Electron Clouds—Preston J. MacDougall, Ph.D.—Author & Professor, Middle Tennessee State University: You may or may not remember learning about the periodic table in chemistry class and why it’s shaped the way it is. Dr. Preston MacDougall explains the orbital model that’s behind it, and why orbitals are actually just invented mathematical...
byFinding Genius Podcast
0 ratings
0% found this document useful
Quantum: Why We Want 'Em: ENCORE Einstein thought that quantum mechanics might be the end of physics, and most scientists felt sure it would never be useful. Today, everything from cell phones to LED lighting is completely dependent on the weird behavior described...
Podcast episode
Quantum: Why We Want 'Em: ENCORE Einstein thought that quantum mechanics might be the end of physics, and most scientists felt sure it would never be useful. Today, everything from cell phones to LED lighting is completely dependent on the weird behavior described...
byBig Picture Science
0 ratings
0% found this document useful
Why the world is simple - Prof Ard Louis: The coding theorem from algorithmic information theory (AIT) - which should be much more widely taught in Physics! - suggests that many processes in nature may be highly biased towards simple outputs.
Podcast episode
Why the world is simple - Prof Ard Louis: The coding theorem from algorithmic information theory (AIT) - which should be much more widely taught in Physics! - suggests that many processes in nature may be highly biased towards simple outputs.
byTheoretical Physics - From Outer Space to Plasma
0 ratings
0% found this document useful
What is Quantum Mechanics?
Podcast episode
What is Quantum Mechanics?
byChemistry Made Simple
0 ratings
0% found this document useful
Anyone Listening? Quantum Cryptography Applications with Vlatko Vedral: Upgrading isn't just for phone systems. Quantum information science tackles the upgrade of old existing technologies, which run by classical physics laws, to those that function in the quantum realm. It's as easy as it sounds: Vlatko Vederal tells...
Podcast episode
Anyone Listening? Quantum Cryptography Applications with Vlatko Vedral: Upgrading isn't just for phone systems. Quantum information science tackles the upgrade of old existing technologies, which run by classical physics laws, to those that function in the quantum realm. It's as easy as it sounds: Vlatko Vederal tells...
byFinding Genius Podcast
0 ratings
0% found this document useful
104: Cumrun Vafa: Puzzles to Unlock The Universe!: Join me in welcoming Prof. Cumrun Vafa (Harvard) in his first major podcast interview! We discussed a wide range of topics including what message hed put into a billion-year time capsule. Plus, we discuss his delightful new book. Beneath all of the compl
Podcast episode
104: Cumrun Vafa: Puzzles to Unlock The Universe!: Join me in welcoming Prof. Cumrun Vafa (Harvard) in his first major podcast interview! We discussed a wide range of topics including what message hed put into a billion-year time capsule. Plus, we discuss his delightful new book. Beneath all of the compl
byInto the Impossible With Brian Keating
0 ratings
0% found this document useful
Excitable dynamics in a molecularly-explicit model of cell motility: Mixed-mode oscillations and beyond
Podcast episode
Excitable dynamics in a molecularly-explicit model of cell motility: Mixed-mode oscillations and beyond
byPaperPlayer biorxiv cell biology
0 ratings
0% found this document useful
Material Science with Houlong Zhuang at Q2B Paris
Podcast episode
Material Science with Houlong Zhuang at Q2B Paris
byThe New Quantum Era
0 ratings
0% found this document useful
Splitting Waves: Modellansatz 062
Podcast episode
Splitting Waves: Modellansatz 062
byModellansatz - English episodes only
0 ratings
0% found this document useful
#037 - Tour De Bayesian with Connor Tann
Podcast episode
#037 - Tour De Bayesian with Connor Tann
byMachine Learning Street Talk (MLST)
0 ratings
0% found this document useful
Dawning of the Era of Logical Qubits with Dr Vladan Vuletic
Podcast episode
Dawning of the Era of Logical Qubits with Dr Vladan Vuletic
byThe New Quantum Era
0 ratings
0% found this document useful
Oxford Mathematics Public Lectures: Marc Lackenby - Knotty Problems: Knots are a familiar part of everyday life, for example tying your tie or doing up your shoe laces. They play a role in numerous physical and biological phenomena, such as the untangling of DNA when it replicates.
Podcast episode
Oxford Mathematics Public Lectures: Marc Lackenby - Knotty Problems: Knots are a familiar part of everyday life, for example tying your tie or doing up your shoe laces. They play a role in numerous physical and biological phenomena, such as the untangling of DNA when it replicates.
byThe Secrets of Mathematics
0 ratings
0% found this document useful
Nanophotonics: Modellansatz 066
Podcast episode
Nanophotonics: Modellansatz 066
byModellansatz - English episodes only
0 ratings
0% found this document useful

Skip carousel

Matrix Multiplication Inches Closer to Mythic Goal
Quanta
Article
Matrix Multiplication Inches Closer to Mythic Goal
Mar 23, 2021
5 min read
Imaginary Numbers May Be Essential for Describing Reality
Quanta
Article
Imaginary Numbers May Be Essential for Describing Reality
Mar 3, 2021
4 min read
Physicists Attack Math’s $1,000,000 Question
Quanta
Article
Physicists Attack Math’s $1,000,000 Question
Apr 4, 2017
Physicists are attempting to map the distribution of the prime numbers to the energy levels of a particular quantum system.
4 min read
Math After COVID-19
Quanta
Article
Math After COVID-19
Apr 28, 2020
4 min read
What Question Will You Be Remembered For?
Nautilus
Article
What Question Will You Be Remembered For?
Feb 7, 2018
3 min read
Mathematicians Crack the Cursed Curve
Quanta
Article
Mathematicians Crack the Cursed Curve
Dec 7, 2017
3 min read
Math’s Beautiful Monsters: How a destructive idea paved the way for modern math.
Nautilus
Article
Math’s Beautiful Monsters: How a destructive idea paved the way for modern math.
Apr 3, 2014
Much like its creator, Karl Weierstrass’ monster came from nowhere. After four years at university spent drinking and fencing, Weierstrass had left empty handed. He eventually took a teaching course and spent most of the 1850s as a schoolteacher in B
10 min read
Visionary Mathematician Vladimir Voevodsky Dies at 51
Quanta
Article
Visionary Mathematician Vladimir Voevodsky Dies at 51
Oct 11, 2017
3 min read
Why the Laws of Physics Are Inevitable
Nautilus
Article
Why the Laws of Physics Are Inevitable
Dec 9, 2019
5 min read
Kolmogorov Complexity and Our Search for Meaning
Nautilus
Article
Kolmogorov Complexity and Our Search for Meaning
Aug 2, 2018
Was it a chance encounter when you met that special someone or was there some deeper reason for it? What about that strange dream last night—was that just the random ramblings of the synapses of your brain or did it reveal something deep about your u
8 min read
Math’s Beautiful Monsters
Nautilus
Article
Math’s Beautiful Monsters
Oct 26, 2017
Much like its creator, Karl Weierstrass’ monster came from nowhere. After four years at university spent drinking and fencing, Weierstrass had left empty handed. He eventually took a teaching course and spent most of the 1850s as a schoolteacher in B
10 min read
A Simple Visual Proof of a Powerful Idea in Graph Theory
Nautilus
Article
A Simple Visual Proof of a Powerful Idea in Graph Theory
Sep 8, 2017
2 min read
Why Is Chaos Theory Important?
All About Space
Article
Why Is Chaos Theory Important?
Jan 3, 2020
1 min read
Why Mathematicians Should Stop Naming Things After Each Other
Nautilus
Article
Why Mathematicians Should Stop Naming Things After Each Other
Sep 2, 2020
Any student of modern math must know what it feels like to drown in a well of telescoping terminology. For a high-profile example, let’s take the Calabi-Yau manifold, made famous by string theory. A Calabi-Yau manifold is a compact, complex Kähler ma
6 min read
Business applications For Quantum computing
Rotman Management
Article
Business applications For Quantum computing
May 1, 2022
COMPUTERS DO ARITHMETIC. Underlying every amazing application of computers today is math, calculated using binary digits or ‘bits.’ The original computers of the early 1950s could perform about 465 multiplications per second — much faster than the ‘h
11 min read
New Quantum Algorithms Finally Crack Nonlinear Equations
Quanta
Article
New Quantum Algorithms Finally Crack Nonlinear Equations
Jan 5, 2021
4 min read
What Goes On in a Proton? Quark Math Still Conflicts With Experiments.
Quanta
Article
What Goes On in a Proton? Quark Math Still Conflicts With Experiments.
May 6, 2020
5 min read
‘Rope-jumping’ Rotor Could Pave Way For Molecular Machines
Futurity
Article
‘Rope-jumping’ Rotor Could Pave Way For Molecular Machines
Jul 15, 2018
Researchers have created a new type of molecular rotor that shows promise for future development as a functional machine capable of manipulating matter at atomic and subatomic levels. The research could transform multiple branches of chemistry, along
2 min read
Why Physics Is Not a Discipline: Physics is not just what happens in the Department of Physics.
Nautilus
Article
Why Physics Is Not a Discipline: Physics is not just what happens in the Department of Physics.
Apr 21, 2016
Have you heard the one about the biologist, the physicist, and the mathematician? They’re all sitting in a cafe watching people come and go from a house across the street. Two people enter, and then some time later, three emerge. The physicist says,
12 min read
“Q’s” Real Gadgets Could Go Beyond Lasers Inside Pens
PC Pro Magazine
Article
“Q’s” Real Gadgets Could Go Beyond Lasers Inside Pens
Jul 8, 2021
Recently, I read Quantum Computing: How it Works and Why it Could Change the World by Amit Katwala. It’s a great intro to the present state of and prospects for quantum computing – not too technical, wasting little space on the basics but avoiding th
2 min read
Quantum Simulators An Overview
Techfastly
Article
Quantum Simulators An Overview
Oct 1, 2021
4 min read
The Puzzle Of Quantum Reality
NPR
Article
The Puzzle Of Quantum Reality
Mar 20, 2018
5 min read
Light Could Solve One Of Quantum Computing’s Big Problems
Futurity
Article
Light Could Solve One Of Quantum Computing’s Big Problems
May 8, 2018
Researchers have demonstrated how infrared laser pulses can shift electrons between two different states, the classic 1 and 0, in a thin sheet of semiconductor. The technique could help to solve a major issue with quantum computing. “Ordinary electro
3 min read
Why Does The Universe Exist?
BBC Science Focus Magazine
Article
Why Does The Universe Exist?
Jun 30, 2021
9 min read
Machine Learning Pushes Quantum Computing Forward
Futurity
Article
Machine Learning Pushes Quantum Computing Forward
Mar 18, 2020
4 min read
Computer Science Proof Unveils Unexpected Form of Entanglement
Nautilus
Article
Computer Science Proof Unveils Unexpected Form of Entanglement
Aug 3, 2022
5 min read
Are Neural Networks About to Reinvent Physics?
Nautilus
Article
Are Neural Networks About to Reinvent Physics?
Nov 21, 2019
Can AI teach itself the laws of physics? Will classical computers soon be replaced by deep neural networks? Sure looks like it, if you’ve been following the news, which lately has been filled with headlines like, “A neural net solves the three-body p
9 min read
The Universe Began With a Big Melt, Not a Big Bang
Nautilus
Article
The Universe Began With a Big Melt, Not a Big Bang
Oct 5, 2017
There are two tantalizing mysteries about our universe, one dealing with its final fate and the other with its beginning, that have intrigued cosmologists for decades. The community has always believed these to be independent problems—but what if the
10 min read
The Quest To Find The Graviton May Need To Go Quantum
Futurity
Article
The Quest To Find The Graviton May Need To Go Quantum
Mar 16, 2020
2 min read
Moore’s Law Is About to Get Weird: Never mind tablet computers. Wait till you see bubbles and slime mold.
Nautilus
Article
Moore’s Law Is About to Get Weird: Never mind tablet computers. Wait till you see bubbles and slime mold.
Feb 12, 2015
I’ve never seen the computer you’re reading this story on, but I can tell you a lot about it. It runs on electricity. It uses binary logic to carry out programmed instructions. It shuttles information using materials known as semiconductors. Its brai
7 min read

Related categories

Skip carousel

Reviews for Hidden Markov Processes

Rating: 5 out of 5 stars

5/5

1 rating0 reviews

Book preview

Hidden Markov Processes - M. Vidyasagar

Preface

Every author aspiring to write a book should address two fundamental questions: (i) Who is the targeted audience? (ii) What does he wish to say to them? In the world of literature, it is often said that every novel is autobiographical to some extent or another. Adapting this maxim to the current situation, I would say that every book I have ever written has been aimed at readers who are in the situation in which I found myself at the time that I embarked on the book-writing project. To put it another way, every book I have written has been an attempt to make it possible for my readers to circumvent some of the difficulties that I myself faced when learning a subject that was new to me.

In the present instance, for the past few years I have been interested in the broad area of computational biology. With the explosion in the sheer quantity of biological data, and an enhanced understanding of the fundamental mechanisms of genomics and proteomics, there is now greater interest than ever in this topic. I got very interested in hidden Markov processes (HMPs) when I realized that several researchers were applying HMPs to problems in computational biology. Thus, after spending virtually my entire research career in blissful ignorance of all matters stochastic, I got down to try to learn something about Markov processes and HMPs. At the same time, I was trying to learn enough basic biology, and to read the literature on the applications of Markov and hidden Markov methods in computational biology. The Markov process literature and the computational biology literature each presented its own sets of problems. The choices regarding the level and contents of the book have been dictated primarily by a desire to enable my readers to learn these topics more easily than I could.

Hidden Markov processes (HMPs) were introduced into the statistics literature as far back as 1966 [12]. Starting in the mid 1970s [9, 10], HMPs have been used in speech recognition, which is perhaps the earliest application of HMPs in a nonmathematical context. The paper [50] contains a wonderful survey of most of the relevant theory of HMPs. In recent years, Markov processes and HMPs have also been used in computational biology. Popular algorithms for finding genes from a genome are based either on Markov models, such as GLIMMER and its extensions [113, 36], or on hidden Markov models, such as GENSCAN [25, 26]. Methods for classifying proteins into one of several families make use of a very special type of hidden Markov model known as a profile hidden Markov model. See [142] for a recent survey. Finally, the BLAST algorithm, which is universally used to carry out sequence alignment in an efficient though probabilistic manner, makes use of some elements of large deviation theory for i.i.d. processes. Thus any text aspiring to provide the mathematical foundations of these methods would need to cover all of these topics in mathematics.

Most existing books on Markov processes invariably focus on processes with infinite state spaces. Books such as [78, 121] restrict themselves to Markov processes with countable state spaces, since in this case many of the technicalities associated with uncountable state spaces disappear. From a mathematician’s standpoint, the case of a finite state space is not worth expounding separately, since the extension from a finite state space to a countably infinite state space usually comes for free. However, even the simplified theory as in [121] is inaccessible to many if not most engineers, and certainly to most biologists. A typical engineer or mathematically inclined biologist can cope with Markov processes with finite state spaces because they can be analyzed using only matrices, eigenvalues, eigenvectors, and the like. At the same time, books on Markov processes with finite state spaces seldom go beyond computing stationary distributions, and generally ignore advanced topics such as ergodicity, parameter estimation, and large deviation theory. Yet, as I have stated above, these advanced ideas are used in computational biology, even if this fact is not always highlighted. There are some notable exceptions such as [116] that discusses ergodicity of Markov processes, and [100] that discusses parameter estimation; but neither of these books discusses large deviation theory, even for i.i.d. processes, let alone for Markov processes.

Thus the current situation with respect to books on Markov processes can be summarized as follows: There is no treatment of advanced notions using only elementary techniques, and in an elementary setting. In contrast, in the present book the focus is exclusively on stochastic processes assuming values in a finite set, so that technicalities are kept to an absolute minimum. By restricting attention to Markov processes with finite state spaces, I try to capture most of the interesting phenomena such as ergodicity and large deviation theory, while giving elementary proofs that are accessible to anyone who knows undergraduate-level linear algebra.

In the area of HMPs, most of the existing texts discuss only the computation of various likelihoods, such as the most likely state trajectory corresponding to a given observation sequence (also known as the Viterb algorithm), or to the determination of the most likely parameter set for a hidden Markov model of a given, prespecified order (also known as the Baum-Welch algorithm). In contrast, very little attention is paid to realization theory, that is, constructing a hidden Markov model on the basis of its statistics. Perhaps the reason is that, until recently, realization theory for hidden Markov processes was not in very good shape, despite the publication of the monumental paper [6]. In the present book, I have attempted to remedy the situation by including a thorough discussion of realization theory for hidden Markov processes, based primarily on the paper [133].

Finally, I decided to include a discussion of large deviation theory for both i.i.d. processes as well as Markov processes. The material on large deviation theory for i.i.d. processes, especially the so-called method of types, is used in the proofs of the BLAST algorithm, which is among the most widely used algorithms in computational biology, used for sequence alignment. I decided also to include a discussion of large deviation theory for Markov processes, even though there aren’t any applications to computational biology, at least as of now. This discussion is a compendium of various results that are scattered throughout the literature, and is based on the paper [135].

At present there are several engineers and mathematicians who would like to contribute to computational biology and suggest suitable algorithms. Such persons face some obvious difficulties, such as the need to learn (I am tempted to say memorize) a great deal of unfamiliar terminology. Mathematicians are accustomed to a reductionist approach to their subject whereby everything follows from a few simply stated axioms. Such persons are handicapped by the huge differences in the styles of exposition between the engineering/mathematics community on the one hand and the biology community on the other hand. Perhaps nothing illustrates the terminology gap better than the fact that the very phrase reductionist approach means entirely different things to mathematicians and to biologists. To mathematicians reductionism means reducing a subject to its core principles. For instance, if a ring is defined as a set with two binary associative operations etc., then certain universal conclusions can be drawn that apply to all rings. In contrast, to biologists reductionism means constructing the simplest possible exemplar of a biological system, and studying it in great detail, in the hope that conclusions derived from the simplified system can be extrapolated to more complex exemplars. Or to put it another way, biological reductionism is based on the premise that a biological system is merely an aggregation of its parts, each of which can be studied in isolation. The difficulties with this premise are obvious to anyone familiar with emergent behavior, whereby complex systems exhibit behavior that has not been explicitly programmed into them; but despite that, reductionism (in the biological sense) is widely employed, perhaps due to the inherent complexity of biological systems.

Computational biology is a vast subject, and is constantly evolving. In choosing topics from computational biology for inclusion in the book, I restricted myself to genomics and proteomics, as these are perhaps the two aspects of biology that are the most reductionist in the sense described above. Even within genomics and proteomics, I have restricted myself to those algorithms that have a close connection with the Markov and HMP theory described here. Thus I have omitted any discussion of, to cite just one example, neural network-based methods. Readers wishing to find an encyclopedic treatment of many aspects of computational biology are referred to [11, 53]. But despite laying out these boundary conditions, I still decided not to attempt a thorough and up-to-date treatment of all available algorithms in genomics and proteomics that are based on hidden Markov processes. The main reason is that the actual details of the algorithms keep changing very rapidly, whereas the underlying theory does not change very much over time. For this reason, the material in Part 3 of the book on computational biology only presents the flavor of various algorithms, with up-to-date references.

I hope that the book would not only assist biologists and other users of the theory to gain a better understanding of the methods they use, but also spur the engineering and statistics research community to study some new and interesting research problems.

I would like to conclude by dedicating this book to the memory of Jan Willems, who passed away just a few months before it was finished. Jan was a true scholar, a personal friend, and a role model for aspiring researchers everywhere. With his passing, the world of control theory is infinitely poorer. Dedicating this book to him is but a small recompense for all that I have learned from him over the years. May his soul rest in peace!

M. Vidyasagar

Dallas and Hyderabad

December 2013

PART 1

Preliminaries

Chapter One

Introduction to Probability and Random Variables

1.1 INTRODUCTION TO RANDOM VARIABLES

1.1.1 Motivation

Probability theory is an attempt to formalize the notion of uncertainty in the outcome of an experiment. For instance, suppose an urn contains four balls, colored red, blue, white, and green respectively. Suppose we dip our hand in the urn and pull out one of the balls at random. What is the likelihood that the ball we pull out will be red? If we make multiple draws, replacing the drawn ball each time and shaking the urn thoroughly before the next draw, what is the likelihood that we have to make at least ten draws before we draw a red ball for the first time? Probability theory provides a mathematical abstraction and a framework where such issues can be addressed.

When there are only finitely many possible outcomes, probability theory becomes relatively simple. For instance, in the above example, when we draw a ball there are only four possible outcomes, namely: {R, B, W, G} with the obvious notation. If we draw two balls, after replacing the first ball drawn, then there 4² = 16 possible outcomes, represented as {RR, … , GG}. In such situations, one can get by with simple counting arguments. The counting approach can also be made to work when the set of possible outcomes is countably infinite.¹ This situation is studied in Section 1.3. However, in probability theory infinity is never very far away, and counting arguments can lead to serious logical inconsistencies if applied to situations where the set of possible outcomes is uncountably infinite. The great Russian mathematician A. N. Kolmogorov invented axiomatic probability theory in the 1930s precisely to address the issues thrown up by having uncountably many possible outcomes. Subsequent developments in probability theory have been based on the axiomatic foundation laid out in [81].

Example 1.1 Let us return to the example above. Suppose that all the four balls are identical in size and shape, and differ only in their color. Then it is reasonable to suppose that drawing any one color is as likely as drawing any other color, neither more nor less. This leads to the observation that the likelihood of drawing a red ball (or any other ball) is 1/4 = 0.25.

Example 1.2 Now suppose that the four balls are all spherical, and that their diameters are in the ratio 4 : 3 : 2 : 1 in the order red, blue, white, and green. We can suppose that the likelihood of our fingers touching and drawing a particular ball is proportional to its surface area. In this case, it follows that the likelihoods of drawing the four balls are in the proportion 4² : 3² : 2² : 1² or 16 : 9 : 4 : 1 in the order red, blue, white, and green. This leads to the conclusion that

Example 1.3 There can be instances where such analytical reasoning can fail. Suppose that all balls have the same diameter, but the red ball is coated with an adhesive resin that makes it more likely to stick to our fingers when we touch it. The complicated interaction between the surface adhesion of our fingers and the surface of the ball may be too difficult to analyze, so we have no recourse other than to draw balls repeatedly and see how many times the red ball comes out. Suppose we make 1,000 draws, and the outcomes are: 451 red, 187 blue, 174 white, and 188 green. Then we can write

is used instead of P to highlight the fact that these are simply observed frequencies, and not the true but unknown probabilities. Often the observed frequency of an outcome is referred to as its empirical probability, or the empirical estimate of the true but unknown probability based on a particular set of experiments. It is tempting to treat the observed frequencies as true probabilities, but that would not be correct. The reason is that if the experiment is repeated, the outcomes would in general be quite different. The reader can convince himself/herself of the difference between frequencies and probabilities by tossing a coin ten times, and another ten times. It is extremely unlikely that the same set of results will turn up both times. One of the important questions addressed in this book is: Just how close are the observed frequencies to the true but unknown probabilities, and just how quickly do these observed frequencies converge to the true probabilities? Such questions are addressed in Section 1.3.3.

1.1.2 Definition of a Random Variable and Probability

Suppose we wish to study the behavior of a random variable X that can assume one of only a finite set of values belonging to a set A = {a1, … , an}. The set A of possible values is often referred to as the alphabet of the random variable. For example, in the ball-drawing experiment discussed in the preceding subsection, X can be thought of as the color of the ball drawn, and assumes values in the set {R, B, W, G}. This example, incidentally, serves to highlight the fact that the set of outcomes can consist of abstract symbols, and need not consist of numbers. This usage, adopted in this book, is at variance from the convention in many mathematics texts, where it is assumed that A is a subset of the real numbers R. However, since biological applications are a prime motivator for this book, it makes no sense to restrict A in this way. In genomics, for example, A consists of the four symbol set of nucleic acids, or nucleotides, usually denoted by {A, C, G, T}. Moreover, by allowing A to consist of arbitrary symbols, we also allow explicitly the possibility that there is no natural ordering of these symbols. For instance, in this book the nucleotides are written in the order A, C, G, T purely to follow the English alphabetical ordering. But there is no consensus on the ordering in biology texts. Thus any method of analysis that is developed here must be permutation independent. In other words, if we choose to order the symbols in the set A in some other fashion, the methods of analysis must give the same answers as before.

Now we give a general definition of the notion of probability, and introduce the notation that is used throughout the book.

Definition 1.1 Given an integer n, the n-dimensional simplex Sn is defined as

Thus Sn consists of all nonnegative vectors whose components add up to one.

Definition 1.2 Suppose A = {a1, … , an} is a finite set. Then a probability distribution on the set A is any vector µ ∈ Sn.

The interpretation of a probability distribution µ on the set A is that we say

to be read as "the probability that the random variable X equals xi is µi." Thus, if A = {R, B, W, G} and µ = [0.25 0.25 0.25 0.25], then all the four outcomes of drawing the various colored balls are equally likely. This is the case in Example 1.1. If the situation is as in Example 1.2, where the balls have different diameters in the proportion 4 : 3 : 2 : 1, the probability distribution is

If we now choose to reorder the elements of the set A in the form {R, W, G, B}, then the probability distribution gets reordered correspondingly, as

Thus, when we speak of the probability distribution µ on the set A, we need to specify the ordering of the elements of the set.

The way we have defined it above, a probability distribution associates a weight with each element of the set A of possible outcomes. Thus µ can be thought of as a map from A into the interval [0, 1]. This notion of a weight of individual elements can be readily extended to define the weight of each subset of A. This is called the probability measure Pµ associated with the distribution µ. Suppose A ⊆ A. Then we define

where IA(·) is the so-called indicator function of the set A, defined by

So (1.2) states that the probability measure of the set A, denoted by Pµ(A), is the sum of the probability weights of the individual elements of the set A. Thus, whereas µ maps the set A into [0, 1], the corresponding probability measure Pµ maps the power set 2A (that is, the collection of all subsets of A) into the interval [0, 1].

In this text, we need to deal with three kinds of objects:

• A probability distribution µ on a finite set A.

• A random variable X assuming values in A, with the probability distribution µ.

• A probability measure Pµ on the power set 2A, associated with the probability distribution µ.

We will use whichever interpretation is most convenient and natural in the given context. As for notation, throughout the text, boldface Greek letters such as µ denote probability distributions. The probability measure corresponding to µ is denoted by Pµ. Strictly speaking, we should write Pµ, but for reasons of aesthetics and appearance we prefer to use Pµ. Similar notation applies to all other boldface Greek letters.

From (1.2), it follows readily that the empty set  has probability measure zero, while the complete set A has probability measure one. This is true irrespective of what the underlying probability distribution µ is. Moreover, the following additional observations are easy consequences of (1.2):

Theorem 1.3 Suppose A is a finite set and µ is a probability distribution on A, and let Pµ denote the corresponding probability measure on A. Then

1. 0 ≤ Pµ(A) ≤ 1 ∀A ⊆ A.

2. Pµ() = 0 and Pµ(A) = 1.

3. If A, B are disjoint subsets of A, then

In the next paragraph we give a brief glimpse of axiomatic probability theory in a general setting, where the set A of possible outcomes is not necessarily finite. This paragraph is not needed to understand the remainder of the book, and therefore the reader can skip it with no aftereffects. In axiomatic probability theory, one actually begins with generalizations of the two properties above. One starts with a collection of subsets S of A that has three properties:

1. Both the empty set  and A itself belong to S.

2. If A belongs to S, so does its complement Ac.

3. If {A1, A2, …} is a countable collection of sets belonging to Salso belongs to S.

Such a collection S is called a σ-algebra of subsets of A, and the pair (A, S) is called a measurable space. Note that, on the same set A, it is possible to define different σ-algebras. Given a measurable space (A, S), a probability measure P is defined to be a function that maps the σ-algebra S into [0, 1], or in other words a map that assigns a number P(A) ∈ [0, 1] to each set A belonging to S, such that two properties hold.

1. P() = 0 and P(A) = 1.

2. If {A1, A2, …} is a countable collection of pairwise disjoint sets belonging to S, then

Starting with just these two simple sets of axioms, together with the notion of independence (introduced later), it is possible to build a tremendously rich edifice of probability theory; see [81].

In the case where the set A is either finite or countably infinite, by tradition one takes S to be the collection of all subsets of A, because any other σ-algebra S′ of subsets of A must in fact be a subalgebra of S, in the sense that every set that is contained in the collection S′ must also belong to S. Now suppose P is a probability measure on S. Then P assigns a weight P({ai}) =: µi to each element ai ∈ A. Moreover, if A is a subset of A, then the measure P(A) is just the sum of the weights assigned to individual elements of A. So if we let µ denote the sequence of nonnegative numbers µ := (µi, i = 1, 2, …), then, in conformity with earlier notation, we can identify the probability measure P with P. Conversely, if µ then the associated probability measure Pµ is defined for every subset A ⊆ A by

where as before IA(·) denotes the indicator function of the set A.

If A is finite, and if {A1, A2, …} is a countable collection of pairwise disjoint sets, then the only possibility is that all but finitely many sets are empty. So Property 3 above can be simplified to:

which is precisely Property 2 from Theorem 1.3. In other words, the case where A is countably infinite is not any more complicated than the case where A is a finite set. This is why most elementary books on Markov chains assume that the underlying set A is countably infinite. But if A is an uncountable infinite set (such as the real numbers, for example), this approach based on assigning weights to individual elements of the set A does not work, and one requires the more general version of the theory of probability as introduced in [81].

At this point the reader can well ask: But what does it all mean? As with much of mathematics, probability theory exists at many distinct levels. It can be viewed as an exercise in pure reasoning, an intellectual pastime, a challenge to one’s wits. While that may satisfy some persons, the theory would have very little by way of application to real situations unless the notion of probability is given a little more concrete interpretation. So we can think of the probability distribution µ as arising in one of two ways. First, the distribution can be postulated, as in the previous subsection. Thus if we are drawing from an urn containing four balls that are identical in all respects save their color, it makes sense to postulate that each of the four outcomes is equally likely. Similarly, if the balls are identical except for their diameter, and if we believe that the likelihood of drawing a ball is proportional to the surface area, then once again we can postulate that the four components of µ are in proportion to the surface areas (or equivalently, to the diameter squared) of the four balls. Then the requirement that the components of µ must add up to one gives the normalizing constant. Second, the distribution can be estimated, as with the adhesive-coated balls in Example 1.3. In this case there is a true but unknown probability vector µ, and our estimate of µ. Then we can try to develop theories that allow us to say how close is to µ, and with what confidence we can make this statement. This question is addressed in Section 1.3.3.

1.1.3 Function of a Random Variable, Expected Value

Suppose X is a random variable assuming values in a finite set A = {a1, … , an}, with the probability measure Pµ and the probability distribution µ. Suppose f is a function mapping the set A into another set B. Since A is finite, it is clear that the set {f(a1), … , f(an)} is finite. So there is no loss of generality in assuming that the set B (the range of the function f) is also a finite set. Moreover, it is not assumed that the values f(a1), … , f(an) are distinct. Thus the image of the set A under the function f can have fewer than n elements. Now f(X) is itself a random variable. Moreover, the distribution of f(X) can be computed readily from the distribution of X. Suppose µ ∈ Sn is the distribution of X. Thus µi = Pr{X = xi}. To compute the distribution of f(X), we need to address the possibility that f(a1), … , f(an) may not be distinct elements. Let B = {b1, … , bm} denote the set of all possible outcomes of f(X), and note that m ≤ n. Then

In other words, the probability that f(X) = bj is the sum of the probabilities of all the preimages of bj under the function f.

Example 1.4 Suppose that in Example 1.1, we receive a payoff of $2 if we draw a green ball, we pay a penalty of $1 if we draw a red ball, and we neither pay a penalty nor receive a payment if we draw a white ball or a blue ball. The situation can be represented by defining X = {R, B, W, G}, Y = {−1, 0, 2}, and f(R) = −1, f(G) = 2, and f(W) = f(B) = 0. Moreover, if the balls are identical except for color, then

Example 1.5 We give an example from biology, which is discussed in greater detail in Chapter 8. Let D equal {A, C, G, T}, the set of DNA nucleotides, and let R equal {A, C, G, U}, the set of RNA nucleotides. As explained in Chapter 8, during the transcription phase of DNA replication thymine (T) gets replaced by uracil (U). Then, during the translation phase of DNA replication, each triplet of RNA nucleotides, known as a codon, gets converted into one of 20 amino acids, or into the STOP symbol. The map that associates each codon with the corresponding amino acid (or the STOP symbol) is called the genetic code. The complete genetic code was discovered by Hargobind mhorana, building upon earlier work by Marshall Nirenberg, and is shown in Figure 1.1. For this pioneering effort, Khorana and Nirenberg were awarded the Nobel Prize in medicine in 1968, along with Robert Holley who discovered tRNA (translation RNA). Further details can be found in Section 8.1.2.

Now let A = R³, the set of codons, which has cardinality 4³ = 64, and let B denote the set of amino acids plus the STOP symbol, which has cardinality 21. Then the genetic code can be thought of as a function f : A → B. From Figure 1.1 it is clear that the number of preimages f−1(b) varies quite considerably for each of the 21 elements of B, ranging from a high of 6 for leucine (Leu) down to 1 for tryptophan (Trp). Thus, if we know the frequency of distribution of codons in a particular stretch of genome (a frequency distribution on A), we can convert this into a corresponding frequency distribution of amino acids and stop codons.

Figure 1.1 The Genetic Code in Tabular Form. Reproduced with permission from Stephen Carr.

Enjoying the preview?

Page 1 of 1

Hidden Markov Processes: Theory and Applications to Biology

About this ebook

M. Vidyasagar

Read more from M. Vidyasagar

Related authors

Related to Hidden Markov Processes

Titles in the series (33)

Related ebooks

Biology For You

Related podcast episodes

Related articles

Related categories

Reviews for Hidden Markov Processes

What did you think?

Book preview

Hidden Markov Processes - M. Vidyasagar

Preface

1.1.1 Motivation

1.1.2 Definition of a Random Variable and Probability

1.1.3 Function of a Random Variable, Expected Value