Professional Documents
Culture Documents
Tri Basuki Kurniawan, Zuwairie Ibrahim, Muhammad Faiz Mohamed Saaid, and Azli Yahya
Faculty of Electrical Engineering, Universiti Teknologi Malaysia, 81310 UTM Skudai,
Johor Darul Takzim, Malaysia.
Abstract
In DNA based computation, the design of good DNA sequences has turned out to be a
fundamental problem and one of the most practical and important research topics. Although the
design of DNA sequences is dependent on the protocol of biological experiments, it is highly
required to establish a method for the systematic design of DNA sequences, which could be
applied to various design constraints. Much works have focused on designing the DNA sequences
to obtain a set of good DNA sequences. In this paper, Ant System (AS) is proposed to solve the
DNA sequence optimization problem. AS, which is the first approach proposed in Ant Colony
Optimization (ACO), uses some ants to search the solutions based on the pheromone information.
A model is adapted, which consists of four nodes representing four DNA bases. The results of the
proposed approach are compared with other methods, such as evolutionary algorithm.
1.0 Introduction
The objective of the DNA sequence optimization problem is basically to obtain a set of
DNA sequences, where each sequence is unique or cannot be hybridized with other sequences in
the set. In this paper, the objective functions and constraints from [9] are used. Two objective
functions, namely Hmeasure and similarity, are chosen to estimate the uniqueness of each DNA
sequence. Moreover, two additional objective functions, which are hairpin and continuity, are used
in order to prevent secondary structure of a DNA sequence. Furthermore, two constraints, which
are GC% and melting temperature, are used to keep uniform chemical characteristics.
DNA sequence optimization is actually a multi-objective optimization problem. However,
the problem is converted into single-objective problem, which can be formulated as follows:
subjected to Tm and GCcontent constraints, where fi is the objective function for each i ∈ {Hmeasure,
similarity, hairpin, continuity}, and ωi is the weight for each fi. In this study, the weights are
defined by the user.
[τ (r , s )]α [η ( r , s)]β
α β
if s ∈ J k (r )
pk (r , s) = ∑ [τ (r , u )] [η (r , u )] (2)
u∈ J k ( r )
0 otherwise
where Jk(r) is a set of feasible components; that is, edges (r, s) where s is a city not yet visited by
the ant k. Eq. (2) also includes other edges (r, u), where u refers to all cities not yet visited by the
ant k. The parameters α and β control the relative importance of the pheromone versus the heuristic
information.
The pheromone τ(r, s) is associated with the edge joining cities i and j, and the pheromone
updating is done as follows:
τ (r , s) = (1 − ρ ) ⋅ τ (r , s) + ∆τ (r , s) (3)
where ρ is the evaporation rate and ∆τ (r , s ) is the quantity of pheromone laid on edge (r, s) by ant
k:
1 if (r , s ) ∈ sequence done by ant k (4)
∆τ ( r , s ) = Q
0 otherwise
During the initialization step, all parameters, such as α, ρ, and nk were set using default
parameters for ACO [13], α was set to 1 and ρ was set to 0.5. Number of mutation (N) was set as
half of number of ants and 500 iterations were used. The set of sequences consists of 7 sequences
(nk) of 20-mer length.
The DNA parameters are initialized with values in [14]. Hmeasure and similarity employed
continuous parameters (hcon and scon) equal to six bases and discontinuous parameters (hdis and sdis)
Fig 1. Finite state machine as a model for constructing a DNA sequence.
is 17%. For continuity, the threshold value was 2. Hairpin formation was set at least six base-
pairings and a six base loop. The melting temperature was computed base on the nearest-neighbour
(NN) method with 1 M salt concentration and 10 nM nucleic acid concentration.
In this paper, the multi-objective optimization problem is converted into single-objective
problem. Since it is difficult to find the proper weight values for every objective [15], the weight in
Eq. (1) were set equal to 1.
Based on the proposed model and algorithm, 100 independent runs of the AS approach for
DNA sequence optimization were executed and the average values are calculated. Comparisons
are made with the previously proposed ACS approach in [9] and MOEA method in [14] as shown
in Table 1 and Fig. 2.
Our sequences show lower Hmeasure value but a bit higher similarity value. For sequences
generated by ACS, no continuity is observed, whereas the continuity value of sequences generated
by the proposed approach and MOEA are 0.1 and 1.3, respectively. The total objective values
show that the result obtained is the best compared to others.
5.0 Conclusion
Ant System has been implemented without heuristic information for DNA sequence
optimization with four objective functions: Hmeasure, similarity, continuity, and hairpin and two
constraints: GCcontent and melting temperature. The DNA sequences obtained from the proposed
approach were compared with sequences designed by ACS and MOEA approach. The results
show that AS can generate relatively better in total objective values than MOEA approach.
6.0 Acknowledgement
7.0 References
[1] L.M. Adleman, Molecular computation of solutions to combinatorial problem, Science, 266, 1994.
[2] A. Brenneman and A. Condon, Strand design for biomolecular computation, Theory: Comput. Sci, 287,
2002, 39–58.
[3] S. Kashiwamura, A. Kameda, M. Yamamoto, A. Ohuchi, Two-step search for DNA sequence design,
Proceedings of the ITC-CSCC 2003, 2003, 1815–1818.
Table 1. The comparison results of AS approach, ACS [9], and MOEA [14].
The average value of objective for 100 time running of AS approach
C Hr Hm Sm Total
Average 0.1 0.5 36.8 57.9 95.1
Standard Deviation 0.3 0.8 8.5 8.8 1.7
The DNA sequences taken from [9]
Average 0.0 0.0 54.1 51.9 106.0
Standard Deviation 0.0 0.0 12.4 7.8 10.5
The DNA sequences taken from [14]
Average 1.3 0.0 52.0 52.4 107.1
Standard Deviation 3.4 0.0 8.3 6.7 5.5
Remark: C, Hr, Hm, and Sm denote continuity, hairpin, Hmeasure, and similarity objectives values, respectively
Fig 2. Comparison of average fitness of AS, ACS [9], and MOEA [14]. Here, the number of
sequences = 7 and length = 20.
[4] A.J. Hartemink, D.K. Gifford, and J. Khodor, Automated constraint based nucleotide sequence
selection for DNA computation, Proceedings of the 4th DIMACS Workshop DNA Based Computers,
1998, 227–235.
[5] A.G. Frutos, A.J. Thiel, A.E. Condon, L.M. Smith, and R.M. Corn, DNA computing at surfaces: four
base mismatch word design, Proceedings of the 3rd DIMACS Workshop DNA Based Computers, 1997.
[6] U. Feldkamp, S. Saghafi, W. Banzhaf, and H. Rauhe, DNA sequence generator - A program for the
construction of DNA sequences, Proceedings of the 7th International Workshop DNA Based
Computers, 2001, 179–188.
[7] F. Tanaka, M. Nakatsugawa, M. Yamamoto, T. Shiba, and A. Ohuchi, Developing support system for
sequence design in DNA computing, Proceedings of the 7th International Workshop DNA Based
Computers, 2001, 340–349.
[8] F. Tanaka, A. Kameda, M. Yamamoto, and A. Ohuchi, Thermodynamic parameter based on nearest-
neighbor model for DNA sequences with a single-bulge loop, Biochemistry, 43 (22), 2004, 7143–7150.
[9] T.B. Kurniawan, N.K. Khalid, Z. Ibrahim, M. Khalid, and M. Middendorf, An ant colony system for
DNA sequence design based on thermodynamics, Proceedings of IASTED-ACST, 2008, 144-149.
[10] M. Dorigo, Optimization, learning, and natural algorithms, PhD Thesis, Politechnico di Milano, Italy,
1992.
[11] V. Maniezzo, A. Colorni, and M. Dorigo, The ant system applied to the quadratic assignment problem,
Universite Libre de Bruxelles, Belgium, Tech. Rep. IRIDIA/94-28, 1994.
[12] N. McMcMellan, RNA secondary structure prediction using ant colony optimisation, Master Thesis,
School of Informatics, University of Edinburgh, 2006.
[13] M. Dorigo and T. Stutzle, Ant Colony Optimization, Massachusets Institute of Technology, 2004.
[14] S.Y. Shin, I.H. Lee, D. Kim, and B.T. Zhang, Multiobjective evolutionary optimization of DNA
sequences for reliable DNA computing, IEEE Transaction on Evolutionary Computation, 9 (2), 2005,
143-158.
[15] Y. Jin, M. Olhofer, and B. Sendhoff, Dynamic weighted aggregation for evolutionary multi-objective
optimization: How does it work and how?, In Proceeding of GECCO Conference, 2001.