
CSCE6810

Advanced Topics in Computational Life Science

Mathematical Models in Biology: An Introduction. Ch.4. Modeling Molecular Evolution

Department of Computer Science and Engineering, University of North Texas

Mentors: Dr. Armin Mikler, Dr. Rajeev Azad

4.4. Matrix Models of Base Substitutions: Introduction & Markov Model

Pages: 138-147

Presenter: Sultanah Al-Shammari 11 Sep 2012

CSCE6810: Mathematical Models in Biology

Vectors & Matrices


Vector: a list of n real numbers, usually written as a column. Example:

$$V = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$

Matrix: an m x n matrix is a two-dimensional rectangular array of real numbers with m rows and n columns. Example:

$$M = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$

- If the number of columns equals the number of rows, the matrix is a square matrix.

There is no difference between a vector with n entries and an n x 1 matrix.
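As a small illustration of the last point, the sketch below stores the example vector both as a flat list and as a 3 x 1 matrix. The helper names `shape` and `is_square` are made up for this example and use plain Python lists rather than any particular matrix library.

```python
# The example vector as a flat list, and the same data as an n x 1 matrix
# (a list of rows, each row holding a single entry).
v = [1, 2, 3]
v_as_matrix = [[x] for x in v]  # 3 rows, 1 column

# The example 3 x 3 matrix, stored as a list of rows.
M = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]


def shape(matrix):
    """Return (rows, columns) for a rectangular list-of-rows matrix."""
    return len(matrix), len(matrix[0])


def is_square(matrix):
    """A matrix is square when it has as many rows as columns."""
    rows, cols = shape(matrix)
    return rows == cols
```

Here `shape(v_as_matrix)` is `(3, 1)`, while `M` is square.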


Ch. 2: 2.1-2.3 Linear Models and Matrix Algebra


4.4 Matrix Models of Base Substitutions: Introduction


Goal: create a basic model of molecular evolution using probability and matrix algebra.

Problem: Model the mutation process over one time step, assuming that only base substitutions can occur (no deletions, insertions, or inversions).

Solution:
1) Model the ancestral sequence probabilistically. Each site in the sequence is one of the four bases A, G, C, or T, occurring with probabilities $P_A$, $P_G$, $P_C$, and $P_T$, where

$$P_A + P_G + P_C + P_T = 1, \qquad \mathbf{p}_0 = (P_A, P_G, P_C, P_T)$$



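2) Specify the 16 conditional probabilities of observing a base substitution:

$$P_{i|j} = P(S_1 = i \mid S_0 = j), \qquad i, j = A, G, C, T$$

3) Create a 4 x 4 matrix to hold the 16 conditional probabilities:

$$M = \begin{pmatrix}
P_{A|A} & P_{A|G} & P_{A|C} & P_{A|T} \\
P_{G|A} & P_{G|G} & P_{G|C} & P_{G|T} \\
P_{C|A} & P_{C|G} & P_{C|C} & P_{C|T} \\
P_{T|A} & P_{T|G} & P_{T|C} & P_{T|T}
\end{pmatrix}$$

For example, $P_{C|A}$ is the chance that base A mutates to base C. The entries in each column sum to 1, e.g. $P_{A|G} + P_{G|G} + P_{C|G} + P_{T|G} = 1$.

Each column holds the entries that refer to the same ancestral base S0, and each row holds the entries that refer to the same descendant base S1.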
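The row/column convention can be checked with a short sketch. The numbers in `M` below are hypothetical values chosen only for illustration, and the helper names (`column_sums`, `columns_sum_to_one`, `substitution_probability`) are not from the text.

```python
BASES = ["A", "G", "C", "T"]

# Hypothetical values for illustration only.
# M[i][j] = P(S1 = BASES[i] | S0 = BASES[j]):
# the column index is the ancestral base, the row index is the descendant base.
M = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]


def column_sums(matrix):
    """Sum the entries of each column (one sum per ancestral base)."""
    n_cols = len(matrix[0])
    return [sum(row[j] for row in matrix) for j in range(n_cols)]


def columns_sum_to_one(matrix, tol=1e-9):
    """Check that every entry is non-negative and every column sums to 1."""
    nonnegative = all(x >= 0 for row in matrix for x in row)
    sums_ok = all(abs(s - 1.0) <= tol for s in column_sums(matrix))
    return nonnegative and sums_ok


def substitution_probability(matrix, ancestral, descendant):
    """Look up P(S1 = descendant | S0 = ancestral)."""
    i = BASES.index(descendant)  # row: descendant base
    j = BASES.index(ancestral)   # column: ancestral base
    return matrix[i][j]
```

With these made-up numbers, `substitution_probability(M, "A", "C")` reads off the entry in row C, column A, which is the chance that base A mutates to base C.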



The mutation process can be expressed using the vector p0 and the matrix M. Let us now multiply them:

$$M\mathbf{p}_0 =
\begin{pmatrix}
P_{A|A} & P_{A|G} & P_{A|C} & P_{A|T} \\
P_{G|A} & P_{G|G} & P_{G|C} & P_{G|T} \\
P_{C|A} & P_{C|G} & P_{C|C} & P_{C|T} \\
P_{T|A} & P_{T|G} & P_{T|C} & P_{T|T}
\end{pmatrix}
\begin{pmatrix} P_A \\ P_G \\ P_C \\ P_T \end{pmatrix}
=
\begin{pmatrix}
P_{A|A}P_A + P_{A|G}P_G + P_{A|C}P_C + P_{A|T}P_T \\
P_{G|A}P_A + P_{G|G}P_G + P_{G|C}P_C + P_{G|T}P_T \\
P_{C|A}P_A + P_{C|G}P_G + P_{C|C}P_C + P_{C|T}P_T \\
P_{T|A}P_A + P_{T|G}P_G + P_{T|C}P_C + P_{T|T}P_T
\end{pmatrix}$$



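Consider the fourth entry of the product:

$$P_{T|A}P_A + P_{T|G}P_G + P_{T|C}P_C + P_{T|T}P_T$$

This is the probability that a site in S1 has base T. To see why, look at the first term:

$$P_{T|A}P_A = P(S_1 = T \mid S_0 = A)\,P(S_0 = A),$$

which is the same as $P(S_1 = T \text{ and } S_0 = A)$.

Using Eq. (4.1), page 132, and applying similar reasoning to the other terms, the entry equals

$$P(S_1 = T \text{ and } S_0 = A) + P(S_1 = T \text{ and } S_0 = G) + P(S_1 = T \text{ and } S_0 = C) + P(S_1 = T \text{ and } S_0 = T)$$

This is the sum of the probabilities of four mutually exclusive events. By the addition rule, it is the probability of the union of the four events, which is $P(S_1 = T)$.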



Applying similar reasoning to the other entries of the product, we find:

$$M\mathbf{p}_0 = \mathbf{p}_1$$

where:
p0 is the vector of probabilities P(S0 = i) of the various bases occurring in S0,
p1 is the vector of probabilities P(S1 = j) of the various bases occurring in S1,
M is a transition matrix,
and i, j = A, G, C, T.
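The relationship between the joint probabilities and the product M p0 can be sketched numerically. The values of `M` and `p0` below are hypothetical and chosen only so the arithmetic is easy to follow; `joint_probabilities` and `mat_vec` are illustrative helper names.

```python
BASES = ["A", "G", "C", "T"]

# Hypothetical numbers for illustration only.
# M[i][j] = P(S1 = BASES[i] | S0 = BASES[j]); each column sums to 1.
M = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
p0 = [0.4, 0.3, 0.2, 0.1]  # P(S0 = A), P(S0 = G), P(S0 = C), P(S0 = T)


def joint_probabilities(matrix, p):
    """Entry [i][j] is P(S1 = BASES[i] and S0 = BASES[j]) = M[i][j] * p[j]."""
    return [[m_ij * p_j for m_ij, p_j in zip(row, p)] for row in matrix]


def mat_vec(matrix, p):
    """Matrix-vector product: entry i sums the joint probabilities in row i."""
    return [sum(row) for row in joint_probabilities(matrix, p)]


# p1[i] = P(S1 = BASES[i]), obtained by summing over the four ancestral bases.
p1 = mat_vec(M, p0)
```

Each entry of `p1` is a sum of four mutually exclusive joint probabilities, and the entries of `p1` again sum to 1 (up to floating-point rounding), as a probability vector should.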


Example Base Substitutions: Page 132


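Suppose we have a 40-base ancestral DNA sequence S0 and its descendant S1:

S0 = ACTTGTCGGATGATCAGCGGTCCATGCACCTGACAACGGT
S1 = ACATGTTGCTTGACGACAGGTCCATGCGCCTGAGAACGGC

Number of sites with each combination of bases (columns: base in S0; rows: base in S1):

S1 \ S0    A    G    C    T
A          7    0    1    1
G          1    9    2    0
C          0    2    7    2
T          1    0    1    6

Dividing each column by its total gives the estimated transition matrix:

$$M = \begin{pmatrix}
.778 & 0 & .091 & .111 \\
.111 & .818 & .182 & 0 \\
0 & .182 & .636 & .222 \\
.111 & 0 & .091 & .667
\end{pmatrix}$$

The base frequencies in S0 and S1, in the order (A, G, C, T), are:

$$\mathbf{p}_0 = \begin{pmatrix} .225 \\ .275 \\ .275 \\ .225 \end{pmatrix}, \qquad
\mathbf{p}_1 = \begin{pmatrix} .225 \\ .300 \\ .275 \\ .200 \end{pmatrix}$$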
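The numbers in this example can be recomputed directly from the two sequences. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the estimate simply divides each column of counts by its total.

```python
from collections import Counter

BASES = "AGCT"  # row/column order used throughout: A, G, C, T

S0 = "ACTTGTCGGATGATCAGCGGTCCATGCACCTGACAACGGT"
S1 = "ACATGTTGCTTGACGACAGGTCCATGCGCCTGAGAACGGC"


def base_frequencies(seq):
    """Fraction of sites occupied by each base, in the order A, G, C, T."""
    counts = Counter(seq)
    return [counts[b] / len(seq) for b in BASES]


def substitution_counts(ancestor, descendant):
    """counts[i][j] = number of sites with S1 = BASES[i] and S0 = BASES[j]."""
    pairs = Counter(zip(ancestor, descendant))
    return [[pairs[(a, d)] for a in BASES] for d in BASES]


def estimate_transition_matrix(counts):
    """Divide each column by its total so that every column sums to 1."""
    n = len(counts)
    col_totals = [sum(row[j] for row in counts) for j in range(n)]
    return [[row[j] / col_totals[j] for j in range(n)] for row in counts]


counts = substitution_counts(S0, S1)
M = estimate_transition_matrix(counts)
p0 = base_frequencies(S0)
p1 = base_frequencies(S1)

# Multiplying the estimated matrix by p0 reproduces the observed p1.
M_p0 = [sum(M[i][j] * p0[j] for j in range(4)) for i in range(4)]
```

Because M is estimated from the same pair of sequences, `M_p0` agrees with `p1` up to floating-point rounding.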


4.4 Matrix Models of Base Substitutions: Markov Model

Andrei Markov, Russian mathematician



A Markov model describes a system that must be in one of n different states, but may switch from one state to another over time.

Important assumption:
What happens to the system over a given time step depends only on:

the state the system is in at the start of that step (the current state of the system), and
the transition probabilities.
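The assumption above can be sketched as a simulation in which the next state is drawn using only the current state and the matrix of transition probabilities. The matrix values are hypothetical, and `step` and `simulate` are illustrative names; the random draw uses `random.choices` from the standard library.

```python
import random

STATES = ["A", "G", "C", "T"]

# Hypothetical transition probabilities for illustration only.
# M[i][j] = P(next state = STATES[i] | current state = STATES[j]).
M = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]


def step(state, matrix, rng):
    """Draw the next state using only the current state and the matrix."""
    j = STATES.index(state)
    column = [row[j] for row in matrix]  # distribution of the next state
    return rng.choices(STATES, weights=column, k=1)[0]


def simulate(start, matrix, n_steps, rng):
    """Follow the system for n_steps; earlier history is never consulted."""
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1], matrix, rng))
    return path


rng = random.Random(0)  # fixed seed so repeated runs give the same path
path = simulate("A", M, 10, rng)
```

Note that `step` receives only the current state, never the earlier part of `path`, which is exactly the restriction the assumption imposes.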


Markov Model/Chain/Process
A Markov chain is a mathematical system that undergoes transitions from one state to another, between a finite number of possible states.
http://en.wikipedia.org/wiki/Markov_chain

The Markov property is stated as: "the future is independent of the past given the present."
http://www.columbia.edu/~ks20/stochastic-I/stochastic-I-TimeReversibility.pdf



DNA Substitution Model:

The system is a site in a DNA sequence: the site is initially in one of 4 states (A, G, C, or T).
p0: vector of initial probabilities that the system is in each of these states (all entries ≥ 0).
Markov/transition matrix M (4 x 4): holds the conditional probabilities (all entries ≥ 0, and the sum of each column = 1).

Assumption: each site in the sequence behaves identically to, and independently of, every other site.
This assumption is not very reasonable for DNA in some genes. Why?



Why is it not very reasonable, for DNA in some genes, to assume that each site in the sequence behaves identically and independently?

The genetic code allows many changes in the third site of each codon to have no effect on the product of the gene, so sites do not all behave identically.
Since genes may produce proteins, a change at one site may well be tied to changes at another (dependence).
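The point about the third codon site can be illustrated with a few entries of the standard genetic code. The table below is deliberately partial (only two codon families are listed), and `third_site_changes` is an illustrative helper, not something from the text.

```python
# A few entries of the standard genetic code (DNA codons -> amino acid).
# Only two codon families are included; this is not the full table.
CODON_TABLE = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
}


def third_site_changes(codon):
    """For each substitution at the third site, report whether the
    encoded amino acid stays the same (True) or changes (False)."""
    original = CODON_TABLE[codon]
    result = {}
    for base in "AGCT":
        if base == codon[2]:
            continue  # not a substitution
        mutated = codon[:2] + base
        result[mutated] = CODON_TABLE[mutated] == original
    return result


silent_for_gga = third_site_changes("GGA")  # every third-site change keeps Gly
mixed_for_gat = third_site_changes("GAT")   # only GAT -> GAC keeps Asp
```

For the glycine codon GGA every third-site substitution leaves the amino acid unchanged, while for GAT only one of the three does, so third sites are under different constraints than first and second sites.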


Markov Model: Applications


Weather
Speech Recognition
Bioinformatics
Public Health

Useful Tutorial:
http://www.youtube.com/watch?v=7KGdE2AK_MQ&feature=relmfu
