LECTURE 1:
SEQUENCE ALIGNMENT
DEFINITION
Sequence Alignment
a. In bioinformatics, sequence alignment is a way of arranging
DNA, RNA, or protein sequences to identify regions of similarity
that may be a consequence of functional, structural, or
evolutionary relationships between the sequences.
b. Sequence alignments are also used for non-biological sequences,
such as calculating the edit distance cost between strings in a
natural language or in financial data.
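As a rough illustration of the edit distance mentioned in point b, the sketch below computes the Levenshtein distance between two strings. It assumes unit cost for every insertion, deletion, and substitution; the function name `edit_distance` and the sample words are illustrative and not part of the lecture.

```python
def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance with unit costs (an assumption of this sketch)."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, b in enumerate(t, start=1):
            substitution = prev[j - 1] + (0 if a == b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, which is enough when the distance itself is needed but not the sequence of edits.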
DEFINITION
Sequence Alignment
Sequence alignment is one of the most important tasks in
bioinformatics. Mismatches can be interpreted as point mutations
(substitutions), and gaps as insertion or deletion mutations (indels).
DEFINITION - Sequence Alignment
Sequence alignment is important for:
* Prediction of function
* Database searching
* Gene finding
* Sequence divergence
* Sequence assembly
DEFINITION - Sequence Alignment
Find the similarity between two (or more) DNA sequences
by finding a good alignment between them.
Without gaps: 5/15 = 33 % identical positions
CCATCAAGTCC
CCATGTACAGAGTCC

With gaps: 11/15 = 73 % identical positions
CCAT---CA-AGTCC
CCATGTACAGAGTCC
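The two percentages on this slide can be reproduced with a short sketch. It assumes that identity is counted as the number of identical, non-gap columns divided by the length of the longer row; the helper name `percent_identity` is illustrative.

```python
def percent_identity(row1: str, row2: str) -> float:
    """Identical non-gap columns as a percentage of the longer row."""
    length = max(len(row1), len(row2))
    # zip stops at the shorter row, so unpaired trailing letters count as non-matches.
    matches = sum(1 for x, y in zip(row1, row2) if x == y and x != "-")
    return 100.0 * matches / length


# Ungapped comparison: only 5 of 15 positions agree.
print(round(percent_identity("CCATCAAGTCC", "CCATGTACAGAGTCC")))      # 33
# After inserting gaps, 11 of 15 columns agree.
print(round(percent_identity("CCAT---CA-AGTCC", "CCATGTACAGAGTCC")))  # 73
```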
DEFINITION - Sequence Alignment
HOW IT WORKS
Before alignment:
CCATCAAGTCC
CCATGTACAGAGTCC
After alignment (gaps inserted):
CCAT---CA-AGTCC
CCATGTACAGAGTCC
DEFINITION - Sequence Alignment
HOW IT WORKS
A mismatch in the alignment is associated with a mutation,
whereas a gap (the "-" sign) is associated with an insertion
or a deletion.

DNA-sequence-1: tcctctgcctctgccatcat---caaccccaaagt
Alignment:      |||| ||| ||||| |||||   ||||||||||||
DNA-sequence-2: tcctgtgcatctgcaatcatgggcaaccccaaagt
DEFINITION - Sequence Alignment
Scoring: match +8, mismatch -5, gap -3

 C  -  -  -  T  T  A  A  C  T
 C  G  G  A  T  C  A  -  -  T
+8 -3 -3 -3 +8 -5 +8 -3 -3 +8 = +12

Alignment score: 12
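The column-by-column sum above can be checked with a small sketch. The values +8, -5, and -3 are read off the slide's example; the function name `alignment_score` is an illustrative choice.

```python
MATCH, MISMATCH, GAP = 8, -5, -3  # values taken from the slide's example


def alignment_score(row1: str, row2: str) -> int:
    """Sum per-column scores of two equal-length aligned rows."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    score = 0
    for x, y in zip(row1, row2):
        if x == "-" or y == "-":
            score += GAP        # a letter facing a gap
        elif x == y:
            score += MATCH      # identical letters
        else:
            score += MISMATCH   # different letters
    return score


print(alignment_score("C---TTAACT", "CGGATCA--T"))  # 12
```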
Alignment METHODS
Computational approaches to sequence
alignment generally fall into two categories:
a. Global alignments and
b. Local alignments (BLAST).
Alignment METHODS
Global alignment vs Local alignment
Global alignment attempts to match as much of both sequences as possible, end to end.
Tools for global alignment are based on the Needleman-Wunsch algorithm.
Local alignment looks for the regions with the highest density of matches. Tools for
local alignment are based on the Smith-Waterman algorithm.
Both algorithms are derived from the basic dynamic programming algorithm.
Global alignment:
LGPSSKQTGKGS-SRIWDN
LN-ITKSAGKGAIMRLGDA

Local alignment:
-------TGKG--------
-------AGKG--------
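A minimal sketch of global alignment by dynamic programming, in the spirit of Needleman-Wunsch, is given below. It assumes a simple linear scoring scheme (match +8, mismatch -5, gap -3, borrowed from the scoring example earlier in the lecture). Real tools usually use substitution matrices and more elaborate gap penalties, so this is an illustration rather than a reference implementation.

```python
def needleman_wunsch(a: str, b: str, match: int = 8, mismatch: int = -5, gap: int = -3):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    # F[i][j] = best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap  # a[:i] against nothing: i gaps
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                          F[i - 1][j] + gap,    # a[i-1] faces a gap
                          F[i][j - 1] + gap)    # b[j-1] faces a gap
    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return F[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


score, top, bottom = needleman_wunsch("CCATCAAGTCC", "CCATGTACAGAGTCC")
print(score)  # 76 (11 matches and 4 gap columns)
print(top)
print(bottom)
```

Several alignments can share the optimal score, so the rows printed here may place the gaps differently from the slide while scoring the same.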
LOCAL Alignment : BLAST
BLAST (Basic Local Alignment Search Tool) is a bioinformatics tool that is
closely tied to the use of biological sequence databases.
A BLAST search against a sequence database lets researchers find nucleic
acid or protein sequences that are similar to a particular sequence they
have.
This is useful, for example, for:
* Finding similar genes in several organisms
* Checking the validity of sequencing results
* Checking the function of a sequenced gene
The algorithm underlying BLAST is sequence alignment.
LOCAL Alignment : BLAST
Other alignment methods that preceded BLAST include the
"Needleman-Wunsch" and "Smith-Waterman" methods.
The Needleman-Wunsch method builds a global alignment between two
sequences, that is, an alignment over the entire length of the
sequences.
The Smith-Waterman method produces a local alignment, that is, an
alignment over parts of the sequences.
Both methods use dynamic programming and are only practical for
aligning two sequences (pairwise alignment).
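A minimal sketch of local alignment in the spirit of Smith-Waterman follows. It differs from the global case in two ways: cell scores are never allowed to drop below zero, and the traceback starts at the highest-scoring cell and stops at the first zero. The scoring values (match +8, mismatch -5, gap -3) and the example sequences are assumptions made for illustration; protein alignments would normally use a substitution matrix instead.

```python
def smith_waterman(a: str, b: str, match: int = 8, mismatch: int = -5, gap: int = -3):
    """Return (score, aligned_a, aligned_b) for one best local alignment."""
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                    # start a fresh alignment here
                          H[i - 1][j - 1] + s,  # extend diagonally
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score falls to zero.
    row_a, row_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(row_a)), "".join(reversed(row_b))


# The shared core GATTACA is recovered; the unrelated flanks are ignored.
print(smith_waterman("CCCCGATTACACCCC", "TTTTGATTACATTTT"))  # (56, 'GATTACA', 'GATTACA')
```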
LOCAL Alignment