Combinatorial Chemistry
Principles and Techniques
Árpád Furka
Preface
Combinatorial technologies invented in the 1980s made it possible to produce new
compounds in practically unlimited numbers. New strategies and technologies have also been
developed that make it possible to screen very large numbers of compounds and to identify the
useful components of mixtures containing millions of different substances. This dramatically
changed the drug discovery process in the pharmaceutical industry and the way researchers design
their experiments. Instead of preparing and examining a single compound, families of new
substances are synthesized and screened. In addition, combinatorial thinking and practice have
proved useful in areas outside pharmaceutical research, for example in the search for more
effective catalysts and in materials research.
Combinatorial chemistry has become an accepted new branch of chemistry. It is the
subject of numerous books, journals, international conferences and university courses. This book
is written for university students and young researchers. The author feels it important to make it
freely available to all potential readers. For this reason the book will be published exclusively in
electronic form that can be downloaded free of charge from appropriate Web sites.
The author wishes to express his appreciation to Dr. József L. Margitfalvi of the Central
Chemical Research Institute of the Hungarian Academy of Sciences and Dr. György Kéri of
Gedeon Richter Ltd, Budapest, for reading parts of the manuscript and for their valuable
suggestions.
The mother tongue of the author is Hungarian. Despite all efforts the text certainly contains
grammatical errors. Correcting these errors would, of course, be important, and the help of
readers in this respect would be highly appreciated. If you can help, please contact the author by
e-mail: afurka@szerves.chem.elte.hu.
Árpád Furka
Table of Contents
Preface
Table of Contents
1. Introduction
1.1. Birth of the combinatorial approach
1.2. The translated version of the document notarized in 1982
1.3. Publication of the split-mix combinatorial synthesis
References
2. The solid phase synthesis
2.1. Solid supports
2.1.1. Crosslinked polystyrene
2.1.2. Polyethylene glycol (PEG) grafted supports
2.1.3. Inorganic supports
2.1.4. Non-bead form supports
2.2. Linkers, anchors
2.3. Protecting groups
2.3.1. Protection of amino groups
2.3.2. Protection of carboxyl groups
2.3.3. Protection of other functional groups
2.3.4. Coupling reagents for peptide synthesis
2.4. Solid phase synthesis of organic molecules
2.5. Solid phase reagents and scavenger resins in solution phase synthesis
References
3. Parallel synthesis. Synthesis of compound arrays based on saving reaction time
3.1. The parallel synthesis
3.1.1. The multipin method of Geysen
3.1.2. The SPOT technique of Frank
3.1.3. Other devices for parallel synthesis
3.1.4. Parallel synthetic methods with reduced number of operations
3.1.4.1. Synthesis of oligonucleotides on paper discs
3.1.4.2. The tea-bag synthesis
3.2. The Ugi multicomponent reactions
3.3. Solution phase combinatorial synthesis
3.3.1. Dendrimer supported synthesis
3.3.2. Separations using fluorous tags and fluorous solvents
3.3.3. Application of solid phase reagents
3.3.4. The use of scavengers in solution phase reactions
3.4. Automation in parallel synthesis
1. Introduction
The discovery of new materials has played an important role in the history of mankind. Many
of the materials discovered affected everyday life. The impact of some of them was so decisive
that long historical eras were named after them: bronze gave its name to the Bronze Age, for
example, and iron to the Iron Age.
Life today is also largely shaped by the materials we use. Our standard of living could
not be the same without semiconductors, insulators, adhesives, synthetic fibers, drugs, pesticides,
paints etc. In order to improve our life, more and more useful materials and compounds need to
be discovered. The question is how to do that. When we need a new bridge or want to build a
skyscraper, for example, these objects are first designed and then built according to the plans.
Can we follow this route when we wish to make a new superconductor or a new drug? Certainly
not. Our theoretical knowledge may be sufficient for designing a bridge or a skyscraper, but it is
definitely not enough for designing a new, more effective drug or a superconductor working at
or near room temperature. We do not know exactly how superconductivity or other important
properties of materials depend on their structure. Drugs exert their effects through interactions
with proteins or other molecules found in living organisms. The rules governing these
interactions, however, are largely unknown. The rational design of drugs has had some
successes. In this approach the drug candidates are designed by computer, based on the already
known three-dimensional structures of the target proteins. Both the ligand molecule and the
protein itself, however, can take up a practically unlimited number of conformations, and that
leads to difficulties. The consequence is that mostly the traditional approach is followed: series
of compounds are synthesized, then the useful drug candidates are identified by trial and error.
In practice, thousands of compounds need to be prepared and tested in order to find a drug
candidate.
In pharmaceutical research one of the bottlenecks was the synthesis of the very large
number of compounds needed in the discovery process. Before 1980 the traditional approach was
used: the compounds were prepared one at a time, and they were also tested one by one. In
industry, however, sophisticated methods had been developed and applied in order to improve
productivity in the mass production of goods. It seems worthwhile to compare the production of
compounds to that of automobiles. Compounds are mostly prepared step by step from the
starting materials. Automobiles are also assembled from parts. The drug candidates are unique
substances, all differing from each other. Automobiles are also unique products, since they can
differ, for example, in their color, in their engines, in their transmission etc. They certainly
differ from each other in their locks and keys.
The first car manufacturers in the world were Panhard & Levassor in 1889 and Peugeot in
1891 [1]. These French manufacturers did not standardize their car models; each car was different
from the others. The first standardized car was the Benz Velo. Benz manufactured 134 identical
Velos in 1895.
Ransom Eli Olds invented the basic concept of the assembly line in 1901. Henry Ford
improved it and installed it in his car factory in 1913. As a result, by 1927, 15 million
Ford Model Ts had been manufactured. As a result of further improvements and the application
of automation, today the streets are full of cars. This shows the power of organizing the process
of production and of applying automation. These methods, which proved to be very successful in
industry, were not applied at all in the mass production of compounds.
After 1980 the situation began to change. Several innovative papers were published which
radically changed our theory and practice in designing and preparing new substances for
pharmaceutical research and other areas of application. The new synthetic and screening
procedures and, which is also very important, the new way of thinking introduced in these papers
founded a rapidly growing new scientific field, Combinatorial Chemistry, revolutionized
pharmaceutical research and are gradually expanding to other areas within and outside chemistry.
The new methods were developed in several laboratories. The way of thinking that led to these
methods was probably different in each case. The reasons that led to the development of the
combinatorial synthesis of peptides in the author's laboratory are described below.
Although the figures expressing the number of components in peptide libraries were far
from being as frightening as the number of possible protein sequences, they still seemed very
large when the possibility of their synthesis was considered. I thought that many useful bioactive
peptides could - supposedly - be found among the largely unknown components of the libraries.
For this reason the nonexistent peptide libraries reminded me of exceptionally rich gold reefs
awaiting exploitation. Gold can be produced by mining out all the gold-containing rock and then
separating the gold from the useless stone.
Table 1.1. The number of possible peptide sequences.

Number of residues    Name             Number of sequences
2                     Dipeptides                       400
3                     Tripeptides                    8,000
4                     Tetrapeptides                160,000
5                     Pentapeptides              3,200,000
6                     Hexapeptides              64,000,000
7                     Heptapeptides          1,280,000,000
Exploitation of the peptide libraries could be achieved via the synthesis of all possible
sequences followed by screening them against all potential targets. At that time, however, even
the synthesis of all, say, pentapeptides seemed absolutely impossible. We usually prepared one
peptide at a time, mostly by solid phase synthesis (the details of this synthetic method are
described later), with an elongation rate of one amino acid a day.
Figure 1.1. The optimized synthesis of peptides from three amino acids (A, E and R). Starting
from the solid support, the resin-bound amino acids A, E and R are each divided into three
portions and extended to the nine dipeptides AA, EA, RA, AE, EE, RE, AR, ER and RR.
At this rate, the synthesis of all the 3.2 million pentapeptides would have taken 3.2 × 5 = 16
million days, that is, about 43.8 thousand years of uninterrupted work.
The synthesis could have been optimized by reducing the number of necessary coupling
steps to an absolute minimum. This can be achieved by using the already prepared peptides
as starting materials in the synthesis of the longer ones. This is illustrated in Figure 1.1. A peptide
library is prepared by solid phase synthesis using three amino acids (A, E and R). First the amino
acids are attached to the solid support (resin). Then each resin carrying one attached amino acid
is divided into three portions, and the synthesis is continued by coupling a different amino acid
to each portion, and so on. In the first step 3 couplings are carried out, exactly the number of
products formed. In the second step 9 couplings are needed and 9 dipeptides are formed on the
resin samples. In general, the number of coupling steps in such an optimized synthesis of a
peptide library is the same as the total number of products formed in the whole synthetic
process. If the 3.2 million pentapeptides are prepared, the number of coupling steps is the sum
of the numbers of amino acids + dipeptides + tripeptides + tetrapeptides + pentapeptides:
20 + 400 + 8,000 + 160,000 + 3,200,000 = 3,368,420
Supposing again a rate of one coupling per day, the time needed in years is obtained by
dividing the above figure by 365. The result is 9,228 years. This shows that optimization
of the synthesis reduces the time of the synthetic process from 43,800 years to 9,228 years, which
is still far too long to be realizable.
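The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are chosen here for illustration, and the rate of one coupling per day is the assumption stated in the text.

```python
# Coupling-step counts for a complete peptide library built from 20 amino acids.
AMINO_ACIDS = 20
DAYS_PER_YEAR = 365  # one coupling per day, as assumed in the text


def one_by_one_steps(length, k=AMINO_ACIDS):
    """Every peptide is made separately: k**length peptides, `length` couplings each."""
    return length * k ** length


def optimized_steps(length, k=AMINO_ACIDS):
    """Shorter peptides are reused as starting materials: one coupling per product."""
    return sum(k ** i for i in range(1, length + 1))


naive = one_by_one_steps(5)
optimized = optimized_steps(5)
print("one by one:", naive, "couplings,", round(naive / DAYS_PER_YEAR), "years")
print("optimized: ", optimized, "couplings,", round(optimized / DAYS_PER_YEAR), "years")
```

Calling `optimized_steps(2, k=3)` reproduces the small example of Figure 1.1: 3 couplings in the first step plus 9 in the second.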
I considered the accessibility of all peptide sequences to be very important, and around
1980 I began to think about potential solutions for their synthesis. It took only a short time to find
one which would work at least in principle. The idea was also based on the method of solid
phase synthesis developed by Professor Merrifield [5]. According to this first idea, the single
amino acid used in each coupling step of the solid phase preparation of peptides would be
replaced by an equimolar mixture of 20 different amino acids. This would lead - at least in
principle - to the formation of a rapidly growing number of sequences, and finally a full peptide
library could be cleaved from the support in the form of a mixture. It was clear, however, that in
such couplings the products are expected to form in unequal molar quantities as a consequence of
the differences in the reactivity of the amino acids. The differences in molar quantities would be
amplified in each successive coupling step, leading to a mixture of uncertain composition. I felt
that a better solution might exist, and I kept rethinking the problem again and again. In the early
spring of 1982 I spent a weekend in a little town in the south-east of Hungary, forgetting for
once the whole diversity problem. To my great surprise, however, the next morning I awoke with
the perfect solution in my mind. The method based on this idea is known nowadays as the
split-mix procedure.
The split-mix method opened the possibility of producing peptide mixtures containing
millions of components. Such mixtures, however, seemed unacceptable in the conventional drug
discovery practice, where single compounds were used in pure form. For this reason there was an
urgent need to present, in addition, an efficient strategy for the identification of the bioactive
substance that may be present in the complex synthetic mixture. This task, however, looked
similar to finding the proverbial needle in a huge haystack. Nevertheless I was able to develop a
theoretical solution in a very short time. I called it the synthetic back-searching strategy, which
later proved to be in principle identical with the "iteration strategy" published by others.
I was fully aware of the importance of the combinatorial approach in pharmaceutical
research, but one of the leading Hungarian pharmaceutical companies, which I contacted, showed
no interest at all. In addition, the split-mix method was considered by the patent attorneys only
as a potential research tool, and for this reason it was judged not to be patentable. They
suggested, however, that I describe the method in a document and - in order to give me some
support in potential future priority disputes - have it notarized. I did so, and the document,
written in Hungarian - in which the principles of combinatorial chemistry including both
synthesis and screening were first clearly explained - was notarized in May 1982. A photo of the
first and last pages of the document is shown in Figure 1.2.
Figure 1.2. The photo of the first and last page of the 1982 document
The 1982 document, as shown in Figure 1.2, was written in Hungarian. This is the
first authentic document in which the principles of combinatorial chemistry are described. The
translated version can be seen below.
potential therapeutic effect. This fact motivates the intensive international and domestic research
activity in this field.
Two approaches, different in principle, offer themselves in the search for peptides
bearing new biological effects:
1. Isolation of peptides from living organisms based on their previously known
biological effects.
2. Preparation of peptides by synthesis with subsequent determination of their biological
effects.
Until now the isolation procedure has proved to be more effective in spite of the fact that this
method is also very laborious. This may be explained by the fact that the number of possible
peptides grows rapidly with the number of residues, so even the synthesis of all tetrapeptides (160
thousand) seems to be a hopeless task. If we consider the 20 natural amino acids, the dependence
of the number ($N_n$) of possible peptides on the number of residues ($n$) is expressed by the
following formula:
$$N_n = 20^n$$
If the n-residue peptides are synthesized stepwise and independently, the number of the required
synthetic steps ($S_n$) can be calculated as follows:
$$S_n = (n-1)\,20^n$$
Note that a synthetic step means a complete coupling cycle, that is, in addition to the
coupling step itself it includes the operations connected with the protecting groups, too.
With good organization, that is, by choosing a systematic synthesis route, the number of
synthetic steps can be reduced. The minimum number of synthetic steps is:
$$S_n = \sum_{i=2}^{n} 20^i$$
The synthesized peptides are supposed to be submitted to screening tests. Since several
tests have to be done on each peptide, the total number of the required screening tests is
hopelessly large. If the number of kinds of screening tests is denoted by $t$, the total number of
screening tests is expressed by the following equation:
$$T_n = t \cdot 20^n$$
Table 1.2 shows the possible number of peptides depending on the number of residues,
the number of synthetic steps required for their synthesis, and the number of screening tests,
calculated with 10 different tests (t=10). The figures - which are rounded - clearly show that
even the synthesis and testing of all tripeptides would be an almost hopeless venture.
Because of the very large number of possible peptides, the stepwise synthesis of all
peptides - even in the case of small ones - is an unrealizable task. The large number of the
screening experiments constitutes a further problem. The proposal to be outlined on the next
pages will try to somewhat improve this almost hopeless situation.
Table 1.2.
Possible number of peptides ($N_n$) containing different numbers of residues ($n$),
the number of synthetic steps required for their synthesis ($S_n$) in an optimized
process, furthermore the number of screening experiments ($T_n$) calculated
with 10 different screening tests (t=10)
(the figures are rounded)

n     N_n             S_n             T_n
2     4 hundred       4 hundred       4 thousand
3     8 thousand      8 thousand      80 thousand
4     160 thousand    168 thousand    2 million
5     3 million       3 million       30 million
6     64 million      67 million      640 million
7     1 billion       1 billion       13 billion
8     25 billion      26 billion      256 billion
9     512 billion     537 billion     5 trillion
10    10 trillion     10 trillion     102 trillion
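The three formulas behind Table 1.2 can be evaluated directly. The following is a minimal sketch in Python; the function name `table_row` is hypothetical, and the values it prints are exact, whereas the table gives them in rounded form.

```python
def table_row(n, t=10, k=20):
    """Exact values of N_n, S_n and T_n for n-residue peptides."""
    peptides = k ** n                                  # N_n = 20^n
    min_steps = sum(k ** i for i in range(2, n + 1))   # S_n = sum of 20^i for i = 2..n
    tests = t * peptides                               # T_n = t * 20^n
    return peptides, min_steps, tests


print("n", "N_n", "S_n", "T_n")
for n in range(2, 11):
    print(n, *table_row(n))
```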
Systematic search for biologically active small peptides through synthesis and screening of
peptide mixtures
The proposal to be outlined here constitutes a research project which makes it possible to
search for biologically active peptides with a much greater chance of success than before. As I
write down this project I am fully aware of its potential importance in industry. It is also clear
that its realization is possible only through the cooperation of different institutions. Primarily
the participation of the pharmaceutical industry is desirable, since the investments can be
recovered through the pharmaceutical industry.
The essence of the proposal is that, instead of the one by one synthesis of peptides, peptide
mixtures should be prepared containing several hundred or several thousand peptides in
approximately 1 to 1 molar ratio, and these peptide mixtures should be submitted to screening
tests. It will be shown that in this way much labor can be saved both in the synthetic work and in
the screening experiments. In the first stage one has to determine whether or not the mixture
shows any biological effect. If a biological effect is observed, it has, of course, to be determined
which component (or components) is responsible for the activity.
Method for synthesis of peptide mixtures
Since not single peptides but rather mixtures of peptides are synthesized, post-synthetic
purification and removal of by-products are out of the question. Because of this, the classical
method of synthesis (in solution) cannot be used either. In the synthesis of peptide mixtures the
solid phase method has to be applied. Note that the 20 amino acids are not necessarily all used
in the syntheses. In some cases more than 20 amino acids may be used, for example if - in
addition - non-common amino acids are intended to be used as building blocks. Fewer than 20
amino acids may be used, for example, in decapeptides, since the synthesis of all such peptides
seems to be unrealistic and one has to compromise by using fewer kinds of amino acids. Let
$k_i$ denote the number of amino acids intended to be varied in the i-th position. The numbers
of amino acids varied in the C-terminal and N-terminal positions are $k_1$ and $k_n$, respectively.
Realization of the synthesis
The resin is divided into $k_1$ equal portions (that is, into as many portions as the number
of amino acids intended to be varied at the C-terminus of the peptides). Each portion of resin is
then coupled with one of the $k_1$ kinds of amino acids, and the amino-protecting group is
removed from every sample. A small quantity is removed from every sample and set aside for
later use, then the samples are thoroughly mixed. The mixture of aminoacyl resins is then divided
into $k_2$ equal portions, each of them is coupled with one of the $k_2$ kinds of protected amino
acids, and the amino-protecting groups are removed from each sample. Before mixing, small
samples are again removed and set aside. The mixture of dipeptides is cleaved from a small
portion of the mixed resin to be used in biological tests. The rest of the mixed resin is divided
into $k_3$ equal parts, and the amino acids intended to occupy the third position are coupled to
them. The synthesis is continued likewise until the mixture of n-residue peptides is reached.
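The divide, couple and mix cycle described above can be sketched in Python. This is an idealized model, not a description of laboratory practice: it assumes that every portion obtained by dividing the mixed resin contains every sequence formed so far and that every coupling goes to completion. The function name `split_mix` and the one-letter amino acid codes are chosen here only for illustration.

```python
def split_mix(blocks_per_position):
    """Idealized split-mix synthesis.

    blocks_per_position[i] lists the amino acids varied in position i + 1,
    counted from the C-terminus. Returns the set of sequences formed and the
    number of coupling operations carried out.
    """
    pool = {""}      # the bare resin, written as an empty sequence
    couplings = 0
    for blocks in blocks_per_position:
        portions = []
        for amino_acid in blocks:   # one resin portion per amino acid
            # The new residue is attached at the N-terminal end, so it is
            # written on the left of the sequence already on the resin.
            portions.append({amino_acid + seq for seq in pool})
            couplings += 1
        pool = set().union(*portions)   # mixing recombines the portions
    return pool, couplings


library, couplings = split_mix(["AER", "AER", "AER"])
print(len(library), "tripeptides from", couplings, "couplings")
```

With three amino acids varied in three positions, the sketch yields 27 tripeptides from 9 couplings.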
It is worthwhile to add some notes. As in an ordinary solid phase synthesis, one has to
make an effort to achieve good conversion by applying the reagents in excess. Fortunately,
however, conversions lower than 100%, or minor unwanted splitting reactions, do not cause
problems as serious as in ordinary syntheses. The labour requirement could be significantly
reduced by using mixtures of properly protected amino acids in the acylation reactions. This,
however, does not seem to be an acceptable solution because of the differences in the reactivity
of the activated amino acids, which would lead to the formation of peptides in significantly
different concentrations, thus causing problems in the screening experiments. Formation of
peptides in equal concentrations can only be assured by mechanical mixing of the samples
followed by dividing them into equal portions. This makes a complete conversion possible for
every amino acid component. The possibility of acylations with mixtures of several amino acids
of identical reactivity might be a matter of further consideration. Smaller differences in
reactivity could be compensated by properly selected molar ratios of the amino acid derivatives
in the mixture. In the following calculations, however, the possibility of acylations with mixtures
of amino acid derivatives will be left out of consideration.
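Why coupling with mixtures leads to unequal concentrations can be illustrated with a simplified model. In the Python sketch below it is assumed that, in every coupling, each amino acid of the mixture is incorporated in proportion to its relative reactivity, independently of the peptide it is attached to. Both this model and the reactivity values are assumptions made here for illustration only; they are not taken from the document.

```python
from itertools import product
from math import prod


def mixture_coupling_fractions(reactivity, n):
    """Mole fraction of every n-residue sequence when each coupling uses a mixture.

    Simplified model (assumption): each amino acid is incorporated in
    proportion to its relative reactivity in every coupling step.
    """
    total = sum(reactivity.values()) ** n
    return {
        "".join(seq): prod(reactivity[a] for a in seq) / total
        for seq in product(reactivity, repeat=n)
    }


# Hypothetical relative reactivities, chosen only for illustration.
reactivity = {"A": 1.0, "E": 1.5, "R": 2.0}
for n in (1, 3, 5):
    fractions = mixture_coupling_fractions(reactivity, n)
    ratio = max(fractions.values()) / min(fractions.values())
    print(n, "residues: most/least abundant ratio =", round(ratio, 2))
```

In this model the ratio between the most and the least abundant peptide grows as the n-th power of the largest reactivity ratio, whereas dividing and mixing the resin keeps the components equimolar.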
The number of peptides formed in the synthesis, that is, the number of components in the
peptide mixtures - in the general case - can be calculated by the following formula:
$$N_n = k_1 k_2 \cdots k_{n-1} k_n$$
If the same number ($k$) of amino acids is varied in every position,
$$N_n = k^n$$
The number of synthetic steps in the synthesis of a peptide mixture containing $N_n$ peptides
(considering the attachment of the first amino acid to the resin as a separate step) is:
$$S_n = k_1 + k_2 + \cdots + k_{n-1} + k_n$$
If the same number ($k$) of amino acids is varied in each position,
$$S_n = nk$$
The formulae show the advantage of the synthesis of peptide mixtures: the number of
synthetic steps is calculated by summing the numbers of the varied amino acids, while the
number of peptides is given by the product of the numbers of the varied amino acids.
One example: the synthesis of the mixture of tetrapeptides prepared by varying the 20 kinds
of amino acids needs only 80 synthetic steps! Note that in the same run all shorter peptides
- that is, the 400 dipeptides and the 8,000 tripeptides - are formed, too. The traditional synthesis
of these peptides would need 168,400 synthetic steps. A different comparison: with 80 steps of
the traditional method only about 30 tetrapeptides can be synthesized.
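The product and sum rules can be compared numerically. The short Python sketch below uses hypothetical function names and takes the list of the numbers of amino acids varied in each position as input. The traditional step count follows the minimum-step formula of the document, the sum of 20^i for i from 2 to n.

```python
from math import prod


def split_mix_counts(k):
    """k[i] is the number of amino acids varied in position i + 1.

    Returns (number of peptides formed, number of synthetic steps).
    """
    return prod(k), sum(k)


def traditional_min_steps(n, k=20):
    """Minimum number of steps when the peptides are made one by one."""
    return sum(k ** i for i in range(2, n + 1))


peptides, steps = split_mix_counts([20, 20, 20, 20])
print(peptides, "tetrapeptides in", steps, "steps by mixing and dividing")
print(traditional_min_steps(4), "steps by the traditional route")
```

The list need not be uniform: `split_mix_counts([20, 10, 5])`, for example, describes a library in which fewer amino acids are varied in the later positions.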
Screening of peptide mixtures
Peptide mixtures - in the first approximation - are synthesized to determine whether or
not they contain a biologically active component. It is supposed - although it needs experimental
verification - that screening experiments can be made with mixtures, too. This offers a great
advantage over the traditional method, since the number of screening tests is reduced by a factor
equal to the number of components of the mixture. For example, the mixture of the 8,000
tripeptides can be examined by a single series of tests. If there is an active peptide among them,
one of the $t$ executable tests gives a positive result. If the number of active peptides is more
than one, then, of course, more tests may give a positive result. In the synthesis of the mixture of
n-residue peptides it is worthwhile to test the shorter peptides, too. The synthesis is designed to
allow for this. Taking this requirement into account, and the number of kinds of tests being $t$,
the total number of the executable tests is:
$$T_n = t(n-1)$$
Although this equation certainly holds, its realization in practice deserves some notes. There
is - without any doubt - an upper limit to the number of components of the peptide mixtures to be
submitted to screening tests. It is difficult to estimate this number without experiments. The
mixtures may probably contain many thousands of components and, as far as can be judged today,
the method outlined above is limited by the possibilities of the screening tests rather than by the
number of the required synthetic steps. If there are too many components in the mixture, too large
samples have to be applied in the screening experiments to achieve an observable effect for a
single component. The mixture supposedly contains a number of more or less active analogs, and
their effects probably add up. Nevertheless, an unsurpassable limit to the number of
components certainly exists. Therefore in certain cases it may prove useful to examine the effect
of the n-residue mixtures without final mixing. In other cases the synthesis should be designed
so as not to surpass the optimal number of components.
"Back-searching" for the active peptide
If the peptide mixture is found to contain an active component, that is, if the mixture
shows a new type of biological effect, then the further task is the isolation and structure
determination of the active peptide followed by its synthesis. Once the mixture containing the
active component or components is in our hands, the isolation can be carried out using effective
separation methods, since these make it possible to separate the active compound even from
thousands of inactive components. It is possible, however, to follow a different method, too. This
will be outlined here. This approach to the identification of the active peptides is supposed to be
less tedious than the isolation methods; moreover, it supplies additional information concerning
the structure-effect relationship. Applicability of the method requires a procedure for the
quantitative determination of activity. For the sake of simplicity let's suppose that the mixture
contains a single effective component (besides analogs having the same kind of effect but smaller
activity).
Back-searching step No. 1
The experiments are started with the $k_n$ samples set aside in the synthesis of the
n-residue peptides before final mixing. The mixtures of n-residue peptides are cleaved from each
resin sample. The mixtures of peptides differ from each other only in the n-th (that is, the
N-terminal) residue of their component peptides. Each peptide mixture is submitted to a
quantitative activity determination. This shows how the activity depends on the terminal amino
acid residue; that is, in this way we can determine the N-terminal residue of the active peptide,
and in addition it shows the effect of its replacement by other amino acid residues. Let's suppose,
for example, that the N-terminal residue in the sample showing the highest activity (as well as in
the active peptide) is Phe (phenylalanine). Note that if there are several samples showing equally
high activity, it is practical to choose as the N-terminal residue of the active peptide the cheapest
or the synthetically least problematic amino acid. This note holds for the subsequent
back-searching steps, too.
Back-searching step No. 2
The experiment is continued with the $k_{n-1}$ samples set aside in the synthetic stage of the
(n-1)-residue peptides. The amino acid determined before, that is, Phe in our example, is coupled
to each sample. Cleavage of the peptides from the support gives $k_{n-1}$ different peptide
mixtures. Their common feature is that every peptide has Phe in the N-terminal position. By
submitting the peptide mixtures to quantitative screening experiments one can determine the
amino acid residue occupying position n-1 (that is, the pre-N-terminal position) in the active
peptide. This experiment also shows the effect on activity of substituting this amino acid with
other ones. Let's suppose that the pre-N-terminal amino acid is Arg (arginine). It should be noted
that in this back-searching step Phe is coupled to $k_{n-1}$ samples and the same number
($k_{n-1}$) of screening experiments has to be done. Not all of the $t$ kinds of tests are required,
only the one that proved positive before. Consequently the number of synthetic steps and the
number of screening experiments are the same: $k_{n-1}$. Note also that in the previous
back-searching step only screening tests are done (their number is $k_n$); synthetic steps are not
needed.
Back-searching step No. 3
This and the subsequent back-searching steps may be realized using two different
approaches. The peptides in the samples taken aside during the synthesis have to be elongated to
contain n residues, so that their N-terminal section carries the amino acid residues
assuring activity. This can be done in two ways: either by stepwise coupling with amino acids
(in our example with protected Arg, then Phe) or by coupling in a single step with a previously
synthesized oligopeptide having the required sequence (in our example Phe.Arg). The required
synthetic steps in the two approaches differ significantly. The number of screening
experiments, however, is the same in both cases. Let's turn now to back-searching step No. 3.
Stepwise elongation
Let's take the $k_{n-2}$ samples taken aside in the synthesis of the (n-2)-residue peptides. Each
sample is coupled first with protected Arg, then with protected Phe. After cleaving the peptides
from the support, each of the $k_{n-2}$ peptide mixtures is submitted to activity tests to determine
the amino acid residue occupying the third position counting from the N-terminal end. The number
of screening tests to be executed is $k_{n-2}$. The number of the required synthetic steps is
$2k_{n-2}$. The shorter the peptides to be elongated, the larger the multiplying factor preceding k.
The numerical value of the factor is equal to the number of amino acids to be coupled in the
elongation process.
Elongation with oligopeptide
A previously synthesized dipeptide (in our example Phe.Arg) is coupled to each of the $k_{n-2}$
samples taken aside and the process is continued as described above. The number of screening
tests is also $k_{n-2}$. The number of synthetic steps (leaving out of consideration the synthesis
of the oligopeptide) is also $k_{n-2}$. This procedure seems to be more economical. In practice it
means that the active peptide is synthesized in parallel with the screening tests using the
classical method started from the N-terminus. Small fractions of the growing peptide are sacrificed
in the back-searching steps. This back-searching method has the great advantage (in addition to the
fact that it needs fewer synthetic steps) that when the back-searching procedure is finished the
active peptide is synthesized, too.
Back-searching of more than one active peptide
The synthetic peptide mixtures may contain several active peptides showing
different effects. In these cases the number of back-searching steps is larger by a factor
equal to the number of the differing active peptides. That is, if the number of active peptides
is a, the values deduced above are multiplied by a. The presence in the mixture of
peptides having different effects may complicate the back-searching process, especially in the
case of peptides with opposing effects. This, however, is not treated in detail here.
The back-searching process ends when the sequences of all active peptides are
determined by applying either the oligopeptide or the stepwise elongation method.
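
The back-searching procedure described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not part of the text: there is a single active peptide (a = 1), a mixture counts as active if it contains that peptide, and oligopeptide elongation is used. The function name `back_search`, the example amino acid sets and the activity test are hypothetical.

```python
from itertools import product

def back_search(alphabets, is_active):
    """Identify the active peptide position by position, N-terminus first.

    alphabets[i] holds the amino acids used in coupling position i+1
    (coupling order: C-terminus first, N-terminus last).
    Returns (sequence in coupling order, couplings used, screening tests).
    """
    n = len(alphabets)
    tail = ()        # residues already determined, in coupling order
    couplings = 0    # oligopeptide elongation: one coupling per sample
    tests = 0
    for i in range(n - 1, -1, -1):
        found = None
        for residue in alphabets[i]:
            if tail:
                couplings += 1   # elongate the sample taken aside
            tests += 1           # screen the resulting mixture
            # sample taken aside after coupling `residue` in position i+1,
            # elongated with the residues determined so far
            mixture = (prefix + (residue,) + tail
                       for prefix in product(*alphabets[:i]))
            if found is None and any(is_active(p) for p in mixture):
                found = residue
        if found is None:
            raise ValueError("no active sample found")
        tail = (found,) + tail
    return tail, couplings, tests

# Hypothetical three-position library; the active tripeptide is Phe.Arg.Gly
alphabets = [["G", "A", "L"], ["G", "R", "K"], ["F", "Y", "W"]]
active = ("G", "R", "F")                       # coupling order (C -> N)
result = back_search(alphabets, lambda p: p == active)
print(result)  # (('G', 'R', 'F'), 6, 9)
```

With three amino acids per position, the sketch uses 3 + 3 = 6 couplings and 3 + 3 + 3 = 9 screening tests in the back-search, in line with the step counts given in the text for the oligopeptide approach.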
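Total number of synthetic steps and screening tests summarized
for the whole synthesis and back-searching process

Number of synthetic steps using oligopeptide elongation
In synthesis:       $S_n = \sum_{i=1}^{n} k_i$
In back-searching:  $S_n = a \sum_{i=1}^{n-1} k_i$
Total:              $S_n = (a+1) \sum_{i=1}^{n-1} k_i + k_n$
If all $k_i = k$:   $S_n = [n(a+1) - a]k$

Number of synthetic steps using stepwise elongation
In synthesis:       $S_n = \sum_{i=1}^{n} k_i$
In back-searching:  $S_n = a \sum_{i=1}^{n-1} (n-i) k_i$
Total:              $S_n = \sum_{i=1}^{n} k_i + a \sum_{i=1}^{n-1} (n-i) k_i$
If all $k_i = k$:   $S_n = nk + ak \sum_{i=1}^{n-1} i$

Number of screening tests, equally valid using the oligopeptide and stepwise elongation
In synthesis:       $T_n = t(n-1)$
In back-searching:  $T_n = a \sum_{i=1}^{n} k_i$
Total:              $T_n = t(n-1) + a \sum_{i=1}^{n} k_i$
If all $k_i = k$:   $T_n = t(n-1) + ank$

Example: n = 5, k = 20, t = 10, a = 1
Synthetic steps, oligopeptide elongation:  180
Synthetic steps, stepwise elongation:      300
Screening tests:                           140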
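
The totals of the worked example (n = 5, k = 20, t = 10, a = 1) can be checked numerically. The sketch below assumes that a sample taken aside at position i needs n - i couplings in stepwise elongation, which follows from the description of back-searching steps 2 and 3 (one coupling for the (n-1)-residue samples, two for the (n-2)-residue samples). The function names are illustrative only.

```python
def steps_oligopeptide(k, a):
    """Synthetic steps: synthesis plus back-searching with oligopeptide elongation.

    k is the list [k_1, ..., k_n]; a is the number of active peptides.
    """
    return sum(k) + a * sum(k[:-1])

def steps_stepwise(k, a):
    """Synthetic steps: synthesis plus back-searching with stepwise elongation."""
    n = len(k)
    # a sample taken aside at position i needs n - i single couplings
    return sum(k) + a * sum((n - i) * k_i for i, k_i in enumerate(k[:-1], start=1))

def screening_tests(k, a, t):
    """Screening tests: t kinds of tests in synthesis plus the back-searching tests."""
    n = len(k)
    return t * (n - 1) + a * sum(k)

k = [20] * 5   # n = 5 positions, k = 20 amino acids in each
print(steps_oligopeptide(k, a=1))      # 180
print(steps_stepwise(k, a=1))          # 300
print(screening_tests(k, a=1, t=10))   # 140
```

Passing a list with unequal entries gives the totals for libraries in which the number of amino acids varies from position to position.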
(year 2004)
References
1. http://inventors.about.com/library/weekly/aacarsassemblya.htm
2. L. B. Smillie, Á. Furka, N. Nagabhushan, K. J. Stevenson, C. O. Parkes Nature 1968, 218,
343.
3. A. Einstein The Meaning of Relativity, Princeton University Press, 1955, 5th Ed., Princeton,
NJ, p. 107.
4. A. Linde Scientific American 1994, November, p. 48.
5. R. B. Merrifield J. Am. Chem. Soc. 1963, 85, 2149.
6. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP, Utrecht, The
Netherlands, 1988, Vol. 5, p. 47.
7. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Proceedings of the 10th International
Symposium of Medicinal Chemistry, Budapest, Hungary, 1988, p. 288, Abstract P-168.
8. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Int. J. Peptide Protein Res. 1991, 37, 487.
9. R. A. Houghten, C. Pinilla, S. E. Blondelle, J. R. Appel, C. T. Dooley, J. H. Cuervo Nature
1991, 354, 84.
10. K. S. Lam, S. E. Salmon, E. M. Hersh, V. J. Hruby, W. M. Kazmierski, R. J. Knapp Nature
1991, 354, 82.
11. Á. Furka, I. Hargittai Periodica Polytechnica Ser. Chem. 2004, 48, No. 1, p. 13.
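The twenty α-amino acids that are components of proteins are listed in Table 2.1.

Table 2.1. The twenty α-amino acids that are components of proteins

Amino acid       One letter symbol   Side chain (-R)
Alanine          A                   -CH3
Arginine         R                   -(CH2)3NH(C=NH)NH2
Asparagine       N                   -CH2CONH2
Aspartic acid    D                   -CH2COOH
Cysteine         C                   -CH2SH
Glutamine        Q                   -(CH2)2CONH2
Glutamic acid    E                   -(CH2)2COOH
Glycine          G                   -H
Histidine        H                   -CH2(4-imidazolyl)
Isoleucine       I                   -CH(CH3)CH2CH3
Leucine          L                   -CH2CH(CH3)2
Lysine           K                   -(CH2)4NH2
Methionine       M                   -(CH2)2SCH3
Phenylalanine    F                   -(benzyl)
Proline          P                   (ring, see below)
Serine           S                   -CH2OH
Threonine        T                   -CH(CH3)OH
Tryptophan       W                   -CH2(3-indolyl)
Tyrosine         Y                   -(4-hydroxybenzyl)
Valine           V                   -CH(CH3)2

The only exception is proline, in which the side chain and the amino group form a ring.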
In the traditional way, peptides are synthesized in solution from properly protected amino
acids.
Z-NH-CH(R1)-COOX + H2N-CH(R2)-COOB → Z-NH-CH(R1)-CO-NH-CH(R2)-COOB

The carboxyl group of one amino acid is protected (by protecting group B) while the
amino group is free. The amino group of the other amino acid is protected (Z) and the carboxyl
group is activated (X) in order to make it capable of acylating the other amino acid.
1: Z-NH-CH(R)-COOH + Cl-CH2-[resin] → Z-NH-CH(R)-COO-CH2-[resin]
2: Z-NH-CH(R)-COO-CH2-[resin] → NH2-CH(R)-COO-CH2-[resin]
3: Z-NH-CH(R1)-COX + NH2-CH(R)-COO-CH2-[resin] → Z-NH-CH(R1)-CO-NH-CH(R)-COO-CH2-[resin]
4: Z-NH-CH(R1)-CO-NH-CH(R)-COO-CH2-[resin] → NH2-CH(R1)-CO-NH-CH(R)-COOH + [resin]
Figure 2.1. Solid phase synthesis of a dipeptide.
1: Attachment of the first N-protected amino acid to the solid support. 2: Removal of the
protecting group (Z) from the amino group of the attached amino acid. 3: Coupling the second
N-protected amino acid to the attached one. 4: Cleaving the dipeptide from the solid support and
removing the protecting group.
[Figure: reaction vessel; labels: Resin, Frit]
the coupling reactions a mixture of reagents is added which cleaves the peptide from the resin
and removes the protecting groups. The synthesized peptide can be recovered from the filtrate.
In the solid phase synthesis the amino acids and the reagents can be added in excess to
drive the reactions to completion. The excess of the amino acids and reagents can easily be
removed by filtration. The coupling step can even be repeated to ensure complete conversion.
The traces of the reagents are removed by repeated washings and the product of coupling remains
on the filter in pure form.
As outlined above the elongation of the peptide chain on the support is realized in
identical coupling cycles (of course the added protected amino acid may vary from cycle to
cycle). This opens the possibility of automation. In fact Professor Merrifield and his colleagues
constructed and published an automatic peptide synthesizer [2]. Today many kinds of solid phase
peptide synthesizers are commercially available. In addition, solid phase automatic synthesizers
have also been developed for the preparation of other kinds of organic compounds (see later).
Linker
Start compound
The core ensures the insolubility of the support, determines the swelling properties, while the
linker provides the functional group for attachment of the start compound and determines the
reaction conditions for the cleavage of the product. The linker itself and the covalent bond
formed with the start compound must be stable under the reaction conditions of the synthesis.
solvents. Table 2.2 shows the swelling factor (ml/g) of 1% crosslinked polystyrene in different
solvents.
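Table 2.2. Swelling factor of 1% crosslinked polystyrene in different solvents

Solvent              Swelling factor (ml/g)
(name missing)       5.5
(name missing)       5.3
(name missing)       5.2
(name missing)       4.9
Acetonitrile         4.7
Dimethylformamide    3.5
Methanol             1.8
Water                -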
Functional groups can be introduced into the resin by two approaches: either by
post-functionalization of the aromatic rings of polystyrene, or by using functionalized styrene in
polymerization.
The bead size of the resin is an important factor that has to be considered in solid phase
synthesis. The reactions are faster when small beads are used, but application of very small beads
may cause problems in filtration. The bead size is characterized either by the diameter of the
beads or by the inversely proportional mesh size. In practice most often the 200-400 mesh (35-75
micron) or the 100-200 mesh (75-150 micron) bead sizes are used. The bead size distribution also
deserves consideration. A narrow bead size distribution is advantageous. The capacity of the
polystyrene beads is around 0.5 mmol/g.
Resin
Linker
Anchor
The anchor can also be considered as a protecting group of one of the functional groups of
the final product and, as such, it determines the reaction conditions by which the product can be
cleaved from the support. A large variety of the commercially available resins contain the already
built in anchor. A series of selected examples are found below.
Merrifield resin.
The Merrifield resin can be used to attach carboxylic acids to the resin. The product can
be cleaved from the resin in carboxylic acid form using HF.
CH2-Cl
Trityl chloride resin.
The trityl chloride resin is much more reactive than the Merrifield resin. It can be used for
attachment of a wide variety of compounds like carboxylic acids, alcohols, phenols, amines and
thiols. The products can be cleaved under mild conditions using a solution of trifluoroacetic acid
(TFA) in varying concentrations (2-50%).
Cl
Hydroxymethyl resin.
The resin can be applied for attachment of activated carboxylic acids and the cleavage
conditions resemble those of the Merrifield resin.
CH2-OH
Wang resin.
The resin is used to bind carboxylic acids. The ester linkage formed has good stability
during the solid phase reactions but its cleavage conditions are milder than those of the Merrifield
resin. Usually 95% TFA is applied. It is frequently used in peptide synthesis.
O-CH3
O
CH2-OH
Aminomethyl resin.
Carboxylic acids in their activated form can be attached to the resin. Since the formed
amide bond is resistant to cleavage, the resin is used when the synthesized products are not
cleaved from the support; they are tested in bound form.
CH2-NH2
NH2
CH
OCH3
OCH3
Photolabile anchors.
Photolabile anchors have been developed that allow cleavage of the product from the
support by irradiation without using any chemical reagents. Such anchors, like the
2-nitrobenzhydrylamine resin below, usually contain a nitro group that absorbs UV light.
CH
NH2
NO2
Traceless anchors.
The initial building block of a multi-step solid phase synthesis needs to have one
functional group (in addition to others) for its attachment to the solid support. It may happen that
in the end product this group is unnecessary and needs to be removed. For this reason anchors
have been developed that can be cleaved without leaving any functionality in the end product at
the cleavage site. These traceless anchors usually contain silicon based linkers.
C6H5-CH2-O-CO-Cl + H2N~ → C6H5-CH2-O-CO-NH~
The Z protection is stable under mildly basic conditions and nucleophilic reagents at ambient
temperature. Cleavage can be brought about by HBr/AcOH, HBr/TFA or catalytic
hydrogenolysis.
The t-butoxycarbonyl (Boc) group.
An alternative choice for amino group protection is the Boc group. Its advantage is that
it can be removed under milder conditions than the Z group.
Me3C-O-CO-NH~
The Boc group is completely stable to catalytic hydrogenolysis and as such is orthogonal to the Z
group. Basic and nucleophilic reagents have no effect on the Boc group and its removal can be
carried out with TFA at room temperature. The most convenient reagent for the
protection reaction is Boc anhydride (Boc2O).
The 9-fluorenylmethoxycarbonyl (Fmoc) group.
The Fmoc group differs from both the Z and Boc groups in that it is very stable to acidic
reagents.
(9-fluorenyl)-CH2-O-CO-NH~
The Fmoc group can be removed under basic conditions. Usually 20% piperidine dissolved in
DMF is used as the reagent. One of the reagents for introducing the Fmoc group is Fmoc-Cl.
benzyl ester:   ~COO-CH2-C6H5
t-butyl ester:  ~COO-C(CH3)3
[Structures of side chain protecting groups: Pmc protection, Mtr protection, nitro protection (O2N-NH-C-NH~), N-Boc protection]
1,3-Diisopropylcarbodiimide (DIC): (CH3)2CH-N=C=N-CH(CH3)2
N-Hydroxybenzotriazole (HOBt)
Another very often used coupling reagent is O-(benzotriazol-1-yl)-N,N,N',N'-tetramethyluronium
hexafluorophosphate (HBTU), which is known not to cause racemization.
One of the bases applied in the coupling reactions is N,N-diisopropylethylamine
(DIPEA).
Cycloaddition reactions
Organometallic reactions
Michael additions
Heterocyclic forming reactions
Multi-component reactions
Olefin forming reactions
Oxidation reactions
Reduction reactions
Substitution reactions
Protection/deprotection reactions
Cleavage from supports
Other types of solid phase reactions
Excellent compilations of these reactions were prepared by Hermkens et al. [15,16],
Á. Furka [17] and W. M. Bennett [18].
2.5. Solid phase reagents and scavenger resins in solution phase synthesis
Solid phase additives are successfully applied in many solution phase synthetic reactions.
In solid phase reactions the substrate is bound to the solid phase carrier and the reagents are in
solution. In solution phase reactions both the substrates and the reagents are in solution. In some
solution phase reactions, however, the reagent is bound to a resin. The advantage of such reagents
is that the by-products of the reagent remain bound to the resin and can be easily removed from
the reaction mixture by filtration. One example is the polymer bound HOBt that is used in
amide-forming reactions.
[Structure: polymer bound HOBt]
More solid phase reagents and examples of their applications are found in the already
mentioned compilations.15-18
Different types of resins can also be used in solution phase reactions for removal of
excess reagents, substrates or by-products. Resins that can be used for such purposes are
named scavenger resins. One example is formylpolystyrene [19,20], which is used for removal of
primary amines from reaction mixtures.
[Structure: formylpolystyrene]
Other examples of scavenger resins and their applications are found in the above
mentioned compilations.15-18
References
1. R. B. Merrifield J. Am. Chem. Soc. 1963, 85, 2149.
2. R. B. Merrifield, J. M. Stewart, N. Jernberg Anal. Chem. 1966, 38, 1905.
3. W. Rapp In G. Jung (Ed) Combinatorial Peptide and Nonpeptide Libraries 1996, VCH,
New York, 425.
4. H. M. Geysen, R. H. Meloen, S. Barteling Proc Natl Acad Sci USA 1984, 81, 3998.
5. http://www.mimotopes.com
6. C. C. Leznoff Acc. Chem. Res. 1978, 11, 327.
7. P. M. Worster, C. R. McArthur, C. C. Leznoff Angew. Chem. 1979, 91, 255.
8. C. C. Leznoff, V. Yedidia Can. J. Chem. 1980, 58, 287.
9. V. Yedidia, C. C. Leznoff Can. J. Chem. 1980, 58, 1144.
10. E. Camps, J. Castells, J. Pi Anales de Quimica 1974, 70, 848.
11. J. M. J. Frechet Tetrahedron 1981, 37, 663.
12. M. J. Farrall, J. M. J. Frechet J. Org. Chem. 1976, 46, 3877.
13. J. M. J. Frechet, C. Schuerch, J. Am. Chem. Soc. 1971, 93, 492.
14. J. I. Crowley, H. Rapoport Acc. Chem. Res. 1976, 9, 135.
15. P. H. H. Hermkens, H. C. J. Ottenheijm, D. Rees Tetrahedron 1996, 52, 4527.
16. P. H. H. Hermkens, H. C. J. Ottenheijm, D. Rees Tetrahedron 1997, 53, 5643.
17. Á. Furka In Combinatorial & Solid Phase Organic Chemistry 1998, Advanced
ChemTech Handbook, Louisville, 35.
18. W. M. Bennett In H. Fenniri (Ed) Combinatorial Chemistry 2000, Oxford University
Press, Oxford, New York, 139.
19. K. G. Dendrinos, A. G. Kalivretenos J. Chem. Soc. Perkin Trans. 1998, 1, 1463.
20. M.V. Creswell, G. L. Bolton, J. C. Hodges, M. Meppen Tetrahedron 1998, 54, 3983.
[Figure panels: side view, top view]
Figure 3.3. Parallel synthesis of five trimers in five (numbered) reaction vessels. The black, gray
and white circles represent building blocks, for example amino acids.
The five trimers are synthesized on solid support (P) in reaction vessels 1 to 5. At the end
of the synthesis, each trimer is individually cleaved from the support and collected in one of the
five vessels designated for storing the end products. The figure demonstrates that in parallel
synthesis the number of reaction vessels is the same as the number of compounds to be prepared.
The number of operations is practically the same as in the one-by-one synthesis of the same
compounds, since the solvents and reagents have to be serially transported into each reaction
vessel. The real advantage is that the reaction time for synthesizing the 5 compounds is
about the same as for preparing a single one. The series of compounds prepared by the parallel and
the other combinatorial methods are called compound libraries.
[Figure: plate with wells arranged in rows 1-8 and columns 1-12; label: Solution]
The sequence of the peptide formed on a pin depended on the order of the amino acids
added to the particular well. The amino acids or their order (or both) were different for each well
so a different peptide formed on each pin. The wells, as well as the pins, were characterized by
their coordinates: rows and columns. By recording the order of the added amino acids into each
well, the expected sequences of the peptides could be determined from the position occupied in
the plate.
If the order of added amino acids in the well row 4/column 9 is Gly, Gly, Arg, Phe, for
example (Figure 3.5), then the sequence of the tetrapeptide formed on the pin row 4/column 9 is
Phe.Arg.Gly.Gly (taking into account that numbering of the amino acids in peptides starts at the
N-terminus).
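
The bookkeeping behind this example is simple enough to sketch in a few lines. The record of additions per well (`plate`) and the function name `pin_sequence` are hypothetical; the only rule taken from the text is that the first amino acid added ends up at the C-terminus, so the written sequence is the addition order reversed.

```python
def pin_sequence(addition_order):
    """Return the peptide on a pin, written from the N-terminus.

    addition_order lists the amino acids in the order they were added
    to the well; the first one is attached to the pin (C-terminus).
    """
    return ".".join(reversed(addition_order))

# Hypothetical record of additions, keyed by (row, column) of the well
plate = {(4, 9): ["Gly", "Gly", "Arg", "Phe"]}
print(pin_sequence(plate[(4, 9)]))  # Phe.Arg.Gly.Gly
```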
The multipin method is still used and the multipin apparatus is a commercially available
product. In such apparatus, however, the pins are no longer coated plastic rods. The coated head of
the rods is replaced by SynPhase crowns or SynPhase lanterns (Figure 3.6) mentioned in
chapter 2.
Scheme 3.1. Solid phase synthesis of 1,4-benzodiazepine derivatives (steps 1-4).
Derivatives of 1,4-benzodiazepines were constructed from 2-aminobenzophenones, amino
acids and alkylating agents (Scheme 3.1). The Fmoc protected 2-aminobenzophenones were first
attached to an acid labile linker (L) then through the linker to the pins (P). After removal of the
protecting group it was coupled with a protected amino acid (1). This was followed by the
removal of the Fmoc protecting group and cyclization (2), then by alkylation of the ring nitrogen
to introduce R4 (3). Finally the product was cleaved from the support (4).
peptide arrays. The synthesis is carried out on cellulose paper membranes derivatized to serve as
anchors for the first amino acids of the sequences to be prepared. Small droplets of solutions of
protected amino acids dissolved in low volatility solvents and coupling reagents are pipetted onto
predefined positions of the membrane (Figure 3.6). The spots thus formed can be considered as
reaction vessels where the conversion reactions of the solid phase synthesis take place. An array
of as many as 2000 peptides can be made on an 8x12 cm paper sheet. The peptides can be
screened on the paper after removing the protecting groups.
Preparation of many organic compounds needs heating. Since the mid-1980s the use of
microwave heating has spread in chemical laboratory practice [9]. This kind of heating considerably
raises the speed of chemical reactions in both solution and solid phase. The reaction times
are typically reduced from days or hours to minutes or seconds, often with increased yields,
too. This type of heating is also applied in the synthesis of combinatorial libraries in order to save
time. In microwave heating the energy is not transferred by conduction or convection, so the
reaction vessel is not heated, only the solvent and the reactants. The energy is absorbed by dipolar
molecules. Molecules that have a larger dipole moment absorb better. For this reason solvents
with a large dielectric constant are preferred.
Although examples of the application of domestic microwave ovens in parallel synthesis of
combinatorial libraries are found in the literature [10], in practice specially constructed heaters
are applied. The experiments carried out in domestic ovens are often difficult to reproduce
because of the uneven electromagnetic field distribution, the pulsed irradiation and the
unpredictable formation of hot spots. Two kinds of specially constructed commercial reactors are
available that work either in multimode or in monomode operation [11,12].
Figure 3.9. In the XP-1500 Plus system of CEM up to 12 samples can be heated
simultaneously (photo: courtesy of CEM)
The multimode reactor has a large cavity like the domestic oven but reflection by the
walls and a mode stirrer ensures a nearly homogenous distribution of the electromagnetic field. In
this reactor the samples can be heated in parallel. A parallel reactor for heating up to 12 samples
is demonstrated in Figure 3.9.
The ovens operating in monomode, on the other hand, heat only one sample at a time. The
vials containing the reactants are delivered serially into the oven. Such a system, the
ExplorerPLS of CEM, is shown in Figure 3.10. The Explorer handles all of the routine
tasks necessary to execute a large number of reactions each day. The system has a sample deck
with interchangeable racks.
Sequences of the two octamers (coupling positions numbered 8 to 1 from left to right):
1: TAATATTA
2: TAGTACTA
At coupling positions 3 and 6 the discs were placed in separate reaction vessels and the
couplings were executed separately. The total number of coupling cycles executed in the
synthesis of the two octamers was 10. In the case of normal parallel synthesis of the same two
compounds 16 coupling cycles would have been needed. This shows that the method is capable
of significantly reducing the number of necessary operations. It seems worthwhile to note that
in a single coupling cycle 18 different operations had to be executed including washings and
dryings.
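
The saving in coupling cycles can be reproduced with a short count. The sketch below assumes that at every coupling position the discs are sorted by the monomer to be added, so that one cycle is run per distinct monomer at that position; the function name `coupling_cycles` is illustrative only.

```python
def coupling_cycles(sequences):
    """Count coupling cycles when discs needing the same monomer share a vessel."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    # one cycle per distinct monomer at each coupling position
    return sum(len({s[pos] for s in sequences}) for pos in range(length))

octamers = ["TAATATTA", "TAGTACTA"]
pooled = coupling_cycles(octamers)          # shared vessels where possible
separate = sum(len(s) for s in octamers)    # ordinary parallel synthesis
print(pooled, separate)  # 10 16
```

The two octamers differ in only two positions, so six cycles are shared and two positions need two cycles each, giving 6 + 4 = 10.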
[Figure: tea-bag; reaction vessels with bags]
[Scheme: Ugi four-component reaction (U-4CR); recoverable labels: COOH, NC, CHO, NH2; components R1-CO-R2, R3-NH-R4, R-NC and HX; acids: H2O, H2S2O3, R5-COOH, HSCN, HOCN, HN3; X = O or S]
Figure 3.14. Products of U-4CRs carried out with different acids. HX is replaced by the acids shown on
the arrows.
[Reaction scheme; recoverable labels: CHO, NH2, Boc, 10% TFA]
These oligomers are soluble, relatively large molecules; their size considerably exceeds
that of the building blocks and reagents used in combinatorial syntheses. Linkers can be attached
to the ends of their branches, so they can serve as soluble supports for combinatorial synthesis
(Figure 3.16).
[Figure 3.16: dendrimer support carrying linkers and building blocks]
Because of the large size of the dendrimer molecules, the products of each coupling step can
easily be separated from the excess of reagents by size exclusion chromatography. After cleaving
the small molecule products from the support, size exclusion chromatography also makes it
possible to separate them from the dendrimer molecules. This kind of separation is usually much
faster than the conventional separation and purification processes but it is much slower than the
simple filtration in the solid phase procedures.
[Reaction schemes with fluorous reagents; recoverable labels: Sn(CH2CH2C6F13)3, Cl-Sn(CH2CH2C6F13)3, Si(CH2CH2C6F13)3, PdCl2(PPh3)2, LiCl/DMF/THF, excess]
Figure 3.19 shows an example: acylation of a secondary amine with a solid
phase active ester [26]. The product can be isolated from the filtrate while the by-product and the
excess of the active ester remain on the filter.
[Figure 3.19: acylation with a solid phase active ester; labels: active ester, product; conditions: CH3-CN, Et3N]
In another example, shown in Figure 3.20, the coupling reagent
1-(3-dimethylaminopropyl)-3-ethylcarbodiimide (EDC) is used in insoluble form (P-EDC) for
coupling an acid (R1COOH) with a secondary amine (R2R3NH). The transformed form of the
reagent can be filtered out and the product can be isolated from the filtrate. Not only reagents but
also catalysts can be used in solid phase form [25].
[Figure 3.20: coupling of R1COOH with R2R3NH using P-EDC; label: product]
[Scheme labels: excess amine R1-NH2; product and remaining amine; filtered out; clean product]
Figure 3.21. The use of solid phase catalyst and scavenger in one reaction
3.4. Automation in parallel synthesis
Like many other important developments in combinatorial chemistry, automation also
began in the field of peptide chemistry. Introduction of the solid phase peptide synthesis
procedure by Merrifield [20] in 1963 opened the possibility for automation. Merrifield not only
invented the new synthetic technology but he also built the first solid phase peptide synthesizer [27].
In the last half century numerous companies developed and commercialized automatic
synthesizers.
The peptides cleaved from the resin are in dissolved form. The solvent needs to be
evaporated or lyophilized. A simple module developed for this purpose is seen in Figure 3.23. It
is designed to prevent liquid bumping. It can be connected to vacuum pump equipped with cold
trap. The module can also be used to evaporate or concentrate fractions after purifications.
Another automatic synthesizer is shown in Figure 3.24. This Model 384 Ultra
High Throughput Synthesizer is also manufactured and sold by aapptec.
Figure 3.24. The Model 384 Ultra High Throughput Synthesizer for solid phase applications
(photo: www.aaptec.com)
Figure 3.25. The Solution, a solution phase parallel synthesizer (photo: www.aaptec.com)
The synthesizer has four reactor blocks each containing a maximum of 96 reaction
vessels. So a maximum of 384 different peptides or other compounds can simultaneously be
prepared. The maximum number of compounds that can be prepared depends on the volume of
the reaction vessels that can vary from 3 to 35 ml. Heating and cooling is optional.
An automatic machine, the Solution, designed for solution phase synthesis is shown in
Figure 3.25. The instrument can be used for solid phase synthesis, too. It can accommodate
fifteen 80 ml reactors or ninety-six smaller reaction vessels (3 or 10 ml). The machine
automatically performs liquid-liquid extractions and can transfer the products to titer plates.
Another machine, the Sophas HTC (Figure 3.26), is manufactured by Zinsser Analytic. It
can be used for both solution phase and solid phase parallel preparations. As many as 600
compounds can be synthesized in a parallel procedure.
The reaction vessels can be heated up to 150 °C and can be cooled to -40 or -80 °C. Inert
reaction conditions are assured. Slurry distribution and sample picking for HPLC during the
synthesis are also possible.
Another liquid handler is also a Gilson product: the Multiple Probe 215 Liquid
Handler/Injector (Figure 3.28.). This device is a large-capacity multiple-probe liquid handler that
processes four or eight samples simultaneously. It also performs injection into four or eight
parallel systems simultaneously. The instrument is ideal for parallel injection into HPLC and
LC/MS systems.
characteristics. Biotage also offers automated and semi-automated systems for production-scale
HPLC and FLASH chromatography processes.
BUCHI Corporation (www.buchi.com)
Buchi produces and offers a large array of laboratory equipment.
Caliper Technologies (www.caliperls.com ).
The company designs and manufactures LabChip devices and systems that enable high-throughput
screening. The LabChip systems replace entire chemical laboratories. Caliper's
microfluidic LabChip devices function like "liquid integrated circuits." They process fluids
containing DNA, proteins, or cells much as semiconductors process electrons, executing biological
tests in seconds. Genes can be analyzed within minutes. Promising drug compounds can be tested
within days instead of months.
Carl Zeiss Jena (www.zeiss.de).
The company has developed a screening system for ultra high throughput screening (UHTS) for
pharmaceutical drug research. The high-performance multimode readers offer 96-channel
parallel detection of fluorescence, absorption and luminescence in 96, 384 and 1,536-well
microtiter plates. The compact workstations and systems contain a new robust technology for
the transport of microtiter plates with a throughput of > 100,000 specimens a day. The
user-friendly software offers simple assay programming.
Cartesian Technologies, Inc. (www.cartesiantech.com).
The company manufactures equipment for pharmaceutical and agricultural research. The
equipment helps automate and increase the process efficiencies in areas such as drug screening,
genomics, and combinatorial chemistry.
Cellomics, Inc. (www.cellomics.com ).
Cellomics Inc.'s mission is to improve the efficiency of the drug discovery process by delivering
a cell-based screening platform that automates target validation and lead optimization using
fluorescence-based assays. Today, the Company's integrated platform consists of proprietary
fluorescence assays, a proprietary, cell-based High Content Screening (HCS) system, and
bioinformatics software.
CyBio (www.cybio-ag.com).
The company offers modular technology platforms for automated drug research, high throughput
screening, liquid handling, luminescence readers.
Genetix (www.genetix.co.uk).
The company offers a multi-tasking robot, offering Colony Picking, Gridding and Liquid
Handling. The 'Q' BOT is an invaluable addition to any laboratory engaged in high throughput
Pharmaceutical, Genomic or Bioresearch screening.
TECAN (www.tecan.com).
The company produces a large portfolio of instruments and systems, such as the Robotic Sample
Processors for automated liquid handling and microplate equipment such as Readers and
Washers.
TekCel (www.tekcel.com).
The company manufactures a family of robotic workbenches, called TekBenches. These
products include liquid handling and assay development, automated microplate sealing/resealing,
storage and retrieval system.
Tomtec (www.tomtec.com).
The company manufactures a complete line of liquid handling systems including harvesters,
96-well pipetters, 384-well pipetters, plate washers, and robotic components and systems.
Zinsser Analytic (www.zinsser-analytic.com)
The company is specialized in developing, producing and distributing innovative laboratory
solutions for liquid handling and automation including systems for combinatorial chemistry and
tools for drug discovery.
Zymark Corporation (www.zymark.com).
The company is a designer and installer of workstation-based laboratory automation
products.
References
1. J. Kehnscherper, G. Kehnscherper, A. Hausen, W. Mochmann A vilg vallsai, Tessloff
& Babilon, 1999.
2. Gy. Takátsy Acta Microbiologica Acad. Sci. Hung. 1955, 3, 191.
3. H. M. Geysen, R. H. Meloen, S. J. Barteling Proc. Natl. Acad. Sci. USA 1984, 81, 3998.
4. B. A. Bunin, J. A. Ellman J. Am. Chem. Soc. 1992, 114, 11997.
5. R. Frank, S. Güler, S. Krause, W. Lindenmayer In Peptides 1990, E. Giralt, D. Andreu
(Eds), 1991, ESCOM, Leiden, 151.
6. R. Frank, Tetrahedron 1992, 48, 9217.
7. S. H. De Witt, J. S. Kiely, C. J. Stankovic, M. C. Schroeder. D. M. R. Cody, M. R. Pavia
Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 6909.
8. H. V. Meyers, G. J. Dilley, T. L. Durgin, T. S. Powers, N. A. Winssinger, H. Zhu, M. R.
Pavia, Molecular Diversity, 1995, 1, 13.
9. P. Lidström, J. Tierney, B. Wathey, J. Westman Tetrahedron, 2001, 57, 9225.
10. B. M. Glass, A. P. Combs In I. Sucholeiki (Ed) High-Throughput Synthesis. Principles
and Practices, Marcel Dekker Inc. 2001, 123.
11. O. Kappe, A. Stadler In G. A. Morales, B. A. Bunin (Eds) Methods in Enzymology,
Combinatorial Chemistry Part B, 2003, Elsevier Academic Press, 197.
12. B. M. Glass, A. P. Combs, S. A. Jackson In G. A. Morales, B. A. Bunin (Eds) Methods in
Enzymology, Combinatorial Chemistry Part B, 2003, Elsevier Academic Press,223.
13. R. Frank, W. Heikens, G. Heisenberg-Moutsis, H. Blcker Nucleic Acid Research 1983,
11, 4365.
14. R. A. Houghten Proc. Natl. Acad. Sci. USA 1985, 82, 5131.
15. I. Ugi Isonitrile chemistry, Academic Press, 1971, 1.
16. I. Ugi Proc. Estonian Acad. Sci. Chem. 1995, 44, 237.
17. A. Laurent, C. F. Gerhardt Liebigs Ann. Chem. 1838, 28, 265.
18. I. Ugi, A. Dömling, B. Ebert In G. Jung (Ed) Combinatorial Chemistry. Synthesis,
Analysis, Screening, Wiley-VCH, 1999, 125.
19. C. Hulme, H. Bienaymé, T. Nixey, B. Chenera, W. Jones, P. Tempest, A. L. Smith In G. A.
Morales and B. A. Bunin (Eds) Methods in Enzymology, Combinatorial Chemistry,
Elsevier Academic Press, 2003, 369, 469.
20. R. B. Merrifield J. Am. Chem. Soc. 1963, 85, 2149.
21. N. K. Terrett Combinatorial Chemistry, Oxford University Press, 1998, 64.
22. I. T. Horváth, J. Rábai Science 1994, 266, 72.
23. D. P. Curran Angew. Chem. Int. Ed. Engl. 1998, 37, 1174.
24. D. P. Curran, M. Hoshino J. Org. Chem. 1996, 61, 6480.
25. B. Linclau, D. P. Curran In I. Sucholeiki (Ed) High-Throughput Synthesis. Principles and
Practices, Marcel Dekker Inc. 2001, 135.
26. S. W. Kaldor, M. G. Siegel Current Opinion in Chem. Biol. 1997, 1, 101.
27. R. B. Merrifield, J. M. Stewart, N. Jernberg Anal. Chem. 1966, 38, 1905.
A third dividing, coupling and mixing step, which is not shown in the figure, would
lead to the formation of a mixture of 27 resin-bound tripeptides (Figure 4.1.b.), and a fourth cycle
would produce 81 tetramers (c, d and e in Figure 4.1.).
In each coupling step the number of products is
tripled: first 3x1=3 resin-bound amino acids, then 3x3=9 resin-bound dipeptides and, if the
process is continued, in the third cycle 3x9=27 resin-bound tripeptides and in the fourth cycle 81
tetrapeptides are formed. This means that the number of peptides increases exponentially with
the number of coupling steps (Table 4.1.).
This is the reason why the split-mix method is so productive. Table 4.1. also shows that
while the number of products increases exponentially from step to step, the number of coupling
cycles in one step, which can be considered a measure of the invested labor, remains constant.
Number of          Number of cycles   Total number   Number of
reaction vessels   in one step        of cycles      peptides
3                  3                  3              3^1 = 3
3                  3                  6              3^2 = 9
3                  3                  9              3^3 = 27
3                  3                  12             3^4 = 81
As the synthesis proceeds, the invested labor increases only linearly. Linear increase of
labor and exponential growth of the number of products: this is the reason for the exceptionally
high efficiency of the method.
Table 4.1. also shows the number of reaction vessels that are needed to execute the
synthesis. This is also only three in each step: one reaction vessel for each amino acid or other
kind of building block. The number of reaction vessels does not depend on the number of
compounds formed in the process.
If 20 different amino acids are used in the synthesis, the number of peptides in each
coupling step is increased by a factor of 20. The number of peptides (Np) can be expressed by the
following formula where n is the number of the coupling steps, that is, the number of amino acids
forming the peptides.
Np = 20^n
After executing 5 coupling cycles with each of the 20 amino acids, for example, more
than 3 million peptides are present in the mixture. Such a synthesis needs neither millions of
reaction vessels nor thousands of years for its execution. It is enough to use 20 reaction
vessels, one for each amino acid, and the pentapeptide library can be prepared in a couple of
days. The number of the executed coupling cycles (Nc), as expressed by the following formula,
increases only linearly with the length of the peptides.
Nc = 20 x n
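The two formulas can be compared numerically. The following short Python sketch is only an illustration (the function names are chosen here and do not come from the original text): it evaluates the number of peptides and the number of coupling cycles for 20 amino acids and for the three building blocks of Table 4.1.

```python
def library_size(building_blocks: int, steps: int) -> int:
    """Number of products after `steps` split-mix cycles (Np = b^n)."""
    return building_blocks ** steps


def coupling_cycles(building_blocks: int, steps: int) -> int:
    """Total number of couplings carried out (Nc = b x n)."""
    return building_blocks * steps


if __name__ == "__main__":
    # 20 amino acids: products grow exponentially, labor only linearly.
    for n in range(1, 6):
        print(n, library_size(20, n), coupling_cycles(20, n))
```

For n = 5 the sketch gives 3,200,000 peptides against only 100 coupling cycles.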
Formation of all possible sequences. Another feature of the split-mix synthesis is that all
possible combinations of amino acid building blocks are represented in the synthesized peptides.
This is clearly shown by the simple example outlined in Figure 4.1. No more sequential orders of
the red, yellow and blue circles can be deduced than those present in the dimers, trimers and
tetramers demonstrated in Figure 4.1. This combinatorial nature of the composition of the
mixtures synthesized by the split-mix method is reflected in their name: "combinatorial libraries."
This combinatorial feature of the split-mix synthesis holds for preparations of non-sequential
(e.g. cyclic and other) libraries, too.
Formation of all possible sequences is the consequence of equally dividing the resin
mixtures into the reaction vessels of the next coupling step. As a result of this operation
all products formed in any reaction vessel are evenly distributed among the
reaction vessels of the next reaction step.
This can be considered as the combinatorial distribution rule that governs the product formation
in the combinatorial process.
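The distribution rule can be followed in a small model. The sketch below is an illustration under simplifying assumptions, not a description of laboratory practice: each reaction vessel is taken to receive a sample of every product in the pool, and every coupling is taken to go to completion. The letters R, Y and B stand for the red, yellow and blue building blocks of Figure 4.1, and sequences are written in the order of coupling.

```python
from itertools import product


def split_mix(building_blocks, cycles):
    """Return the set of sequences present after each split-mix cycle."""
    pool = {""}  # the unreacted resin
    history = []
    for _ in range(cycles):
        # Divide: every vessel receives a sample of everything in the pool.
        # Couple: each vessel appends its own building block to every sequence.
        vessels = [{seq + block for seq in pool} for block in building_blocks]
        # Mix: the contents of all vessels are recombined.
        pool = set().union(*vessels)
        history.append(pool)
    return history


blocks = ["R", "Y", "B"]
history = split_mix(blocks, 4)
sizes = [len(pool) for pool in history]
all_tetramers = {"".join(p) for p in product(blocks, repeat=4)}

if __name__ == "__main__":
    print(sizes, history[-1] == all_tetramers)
```

The pool sizes come out as 3, 9, 27 and 81, and the final pool coincides with the complete set of 81 possible tetramers.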
Formation of the products in one to one molar quantities. Peptides are considered to be
natural compounds although certainly not all peptide sequences are found in nature. Peptide
libraries are most often prepared in order to find biologically active substances among them.
Other kinds of organic libraries are also synthesized for the same purpose. In the identification
process, or in screening of the libraries, the goal is to find the biologically most effective
component of the mixture. Serious problems may arise in screening if the peptides are not present
in equal quantities in the mixture. A low activity component, for example, if it is present in a
large amount, may show a stronger effect than a highly active component present in a much
lower quantity. Therefore, it is important to prepare libraries in which the constituents are present
in equal molar quantities. The split-mix method was designed to comply with this requirement.
After each round of couplings, the resin is thoroughly mixed. This ensures that before
dividing the resin the mixture is nearly homogeneous. If the resin is divided into equal portions, the
previously formed peptides can be supposed to be present in equal number and in equal molar
quantities in each portion. The coupling of any portion of the resin with an amino acid does not
alter the number or the molar ratio of the peptides originally present in the mixture; it simply adds
the same amino acid to each sequence. Consequently, the molar ratio of the newly formed
peptides is expected to be the same as the molar ratio of the originally present ones. That is, the new
sequences are formed in equal molar ratio.
In addition to the execution of couplings with equal portions of resin samples, it is also
important that the couplings are carried out on spatially separated samples, adding a single amino
acid to each sample. This makes it possible to use appropriate chemistry to drive each coupling
reaction to completion regardless of the reactivity of the amino acids. As a result, both the
number of peptides and their equimolar ratio are preserved in every portion and in each step.
It is worthwhile to note that the equimolarity could be altered at will if for some reason this
were advantageous. Unequal portions would simply have to be used in some couplings. Applying a
larger portion of resin in one of the couplings, for example, would result in the formation of a
subgroup of products in larger molar quantity.
The parallel nature of the split-mix synthesis and formation of individual compounds. The
split-mix procedure has another intrinsic feature which plays an important role in screening and
gives a unique character to the method: in any individual bead of the solid support, only one kind
of peptide is formed. This may seem surprising at first glance, but becomes quite understandable
upon closer examination. In Figure 4.2. the fate of a randomly selected bead is followed in a three
step coupling process.
Let's suppose that the bead in the first reaction step randomly finds itself in the reaction
vessel where the coupling is done with the red amino acid. Consequently, the red amino acid
is attached to all of the functional groups of this bead and to those of all of the other beads
in the same vessel. In the next step the bead is in the vessel where the yellow amino
acid is added. This amino acid is coupled to all coupling sites. In the third step, for similar
reasons, all dipeptides are elongated with the blue amino acid. Thus all peptide molecules that
form in the bead are the same. The sequence of all peptides is blue-yellow-red (the reversed
order of couplings).
The beads behave in the process like independent reaction vessels. The content of these
reaction vessels is not interchanged with those of the other ones. Any selected bead randomly
travels through the successive reaction vessels and the final sequence stores the information about
the route the bead traveled in the course of the synthesis.
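The fate of the individual beads can also be followed in a small simulation. The sketch below rests on assumptions made only for illustration: 2700 beads, ideal random mixing, portioning into exactly equal parts and complete couplings. Each bead is represented by the list of building blocks coupled to it, so its final sequence records the route it traveled.

```python
import random
from collections import Counter


def simulate_beads(n_beads, building_blocks, cycles, seed=0):
    """Follow every bead through `cycles` rounds of mixing, portioning and coupling."""
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]  # coupling history of each bead
    for _ in range(cycles):
        rng.shuffle(beads)  # mixing
        for i, bead in enumerate(beads):
            vessel = i % len(building_blocks)  # portioning into equal parts
            bead.append(building_blocks[vessel])  # coupling in that vessel
    # The sequence is written in the reversed order of the couplings.
    return ["-".join(reversed(bead)) for bead in beads]


peptides = simulate_beads(2700, ["red", "yellow", "blue"], 3)
counts = Counter(peptides)

if __name__ == "__main__":
    print(len(counts), "different sequences on", len(peptides), "beads")
    print(min(counts.values()), max(counts.values()))
```

Every bead ends up with exactly one sequence, and with about 100 beads per sequence all 27 tripeptides are represented, although the number of beads carrying each of them scatters around the mean.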
The formation of one substance in each bead is a very important feature of the split-mix
synthesis. If the products are cleaved individually from separated beads, they can be
examined as individual substances like those produced by parallel synthesis. Furthermore, if
the compounds are not cleaved from the beads and only the protecting groups are removed, the
products can also be tested as individual substances. The libraries in which the products remain
bound to the beads are called tethered libraries. Such libraries can be prepared by attaching
the first amino acid to the resin by a cleavage-resistant bond. The possibility of screening the
products as individual compounds, like those produced in parallel synthesis, is an enormous
advantage in applications.
When comparing the products of the split-mix synthesis with those produced in a parallel
process, attention has to be called to an important difference. In parallel synthesis not only do
individual substances form in the reaction vessels, but the position of the reaction vessel in the
reaction block unambiguously determines the identity of the product. The coordinates (row and
column) identify the expected products since the synthetic history of each well, that is, the added
reagents and their order, is exactly known. The situation in split-mix synthesis is different.
Although each bead contains a single product, it is not possible to identify its content easily. All
beads look the same and the synthetic history of the beads is unknown. This means that if we
determine in some way or other that the content of a selected bead, say a peptide, has a useful
property, all that we know is the length of the peptide, and this is the same for all components
of the library. Neither the amino acid composition nor the sequence of the amino acids is known.
If we want to know these data, they have to be determined in a separate process that is called
deconvolution. In the case of peptide libraries this can be done by sacrificing at least a part of the
content of the bead for sequence determination.
If a component of a tethered peptide library is examined, the best choice is to submit the
bead to sequence determination using a peptide sequencer.4 On the other hand, if the peptide is
examined in cleaved form the appropriate choice is to determine the sequence by mass
spectrometry.5
After carrying out the synthesis of a peptide library, all peptides can be cleaved from the
support, and in this way a mixture of free peptides is formed. These libraries are called soluble
peptide libraries. In such a library millions of peptides may be present, and finding a bioactive
peptide among them seems, at first glance, like finding a needle in a haystack. Nevertheless,
appropriate strategies have been developed to solve the problem. These strategies will be
described later.
Applicability of the split-mix method in the synthesis of organic libraries. Although the
split-mix method was developed with the intention of preparing large numbers of peptides, it was clear
from the beginning that the method would be applicable to the synthesis of different families of
other kinds of organic compounds, too. The series of non-peptide compounds are usually called
organic libraries. Since most organic compounds are prepared by multi-step synthesis, it is quite
obvious that the split-mix synthesis can be used for the preparation of organic libraries. It has to be
made clear, however, that the split-mix method can only be applied in the synthesis of organic libraries if
the chemistry of the process is well developed. The advent of the combinatorial era brought to
light the importance of solid phase organic reactions and, as a result of intensive
development, a large number of previously described solution phase organic reactions have been
optimized for the solid phase (see the second chapter). These reactions are applied in both the parallel and
the split-mix approach. From the point of view of pharmaceutical research and many other
applications, organic libraries are very important. Peptides are not the most preferred drug
candidates because of their high susceptibility to enzymatic degradation. The ideal drug leads are
small organic compounds due to their, in general, more favorable pharmacodynamic properties.
The use of organic libraries prepared by the split-mix method brings about a problem that
does not occur with peptide libraries. Identification of a library component formed in a bead is
not as easy as in the case of peptide libraries. Determination of the structure of an organic
compound is in most cases more complicated than that of a peptide. In order to circumvent this
problem, different encoding methods have been developed.
Figure 4.3. Beads encoded by sequence (a) and binary code (b)
The encoded synthesis of organic libraries follows the general route of the split-mix
method with one exception. A fourth operation is added to the usual three of the coupling
cycle: coupling the units of the code to the beads.
Figure 4.4. First cycle of an encoded synthesis. Green circles: support; yellow, blue and red
circles: organic building blocks; white, grey and black squares: units of the code.
In the binary encoding system the coding units are different organic molecules and their
combination forms the code. In one of the binary encoding methods the encoding molecules are
halobenzenes carrying a hydrocarbon chain of varying length (pink structures in Figure 4.3.b)
attached to the beads through a cleavable spacer.
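[Structure of a coding unit, showing the linker (with HOOC and NO2 groups) and the
electrophoretic tag (hydrocarbon chain ( )n, O, aryl group Ar).]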
The structures of some aryl groups that appear in the electrophoretic tags are shown below.
[Structures: aryl groups carrying different numbers of chlorine (Cl) substituents.]
It is characteristic of the binary encoding technique that the coding units do not form a
sequence. It is simply their presence which codes for the organic building blocks and their
position. In their original paper Ohlmeyer and his colleagues11 demonstrated the method for
encoding peptide sequences. Using 18 different coding units arranged according to a binary coding
format, the authors were able to code all sequences in a 117,649 member peptide library formed
by varying 7 amino acids (D, E, I, K, L, Q and S) in six positions. The presence of the coding
units could be determined after cleavage in a single step by electron capture gas chromatography.
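The numbers quoted above can be checked with a few lines of Python. The sketch assumes, as an interpretation that is not stated explicitly in the text, that each position is marked by its own group of tags and that at least one tag of the group is always present, so that k tags can distinguish 2^k - 1 building blocks.

```python
amino_acids = 7
positions = 6

# Smallest k with 2**k - 1 >= amino_acids (the all-absent pattern is not used).
tags_per_position = amino_acids.bit_length()
total_tags = tags_per_position * positions
library_size = amino_acids ** positions

if __name__ == "__main__":
    print(tags_per_position, total_tags, library_size)
```

Three tags per position and six positions give the 18 coding units, and 7^6 gives the 117,649 sequences of the library.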
Table 4.2. shows a simple example constructed in order to demonstrate the principle of
binary encoding. Nine different tags (T1-T9) are used to encode the structure of 343 organic
molecules synthesized by using the building blocks A1-A7, B1-B7 and C1-C7 in the first, second
and third coupling step, respectively.
Coupling #1         Coupling #2         Coupling #3
Blocks  Codes       Blocks  Codes       Blocks  Codes
A1      T1          B1      T4          C1      T7
A2      T2          B2      T5          C2      T8
A3      T3          B3      T6          C3      T9
A4      T2T1        B4      T5T4        C4      T8T7
A5      T3T1        B5      T6T4        C5      T9T7
A6      T3T2        B6      T6T5        C6      T9T8
A7      T3T2T1      B7      T6T5T4      C7      T9T8T7
It can be seen that the tags T1, T2 and T3 and mixtures formed from them
encode A1 to A7. Similarly, T4, T5 and T6 encode B1 to B7, and T7, T8 and T9 encode C1 to C7. It can
be read from the table that the code for the compound formed from the building blocks A1B1C3,
for example, is T9T4T1. Similarly, A2B3C4 and A7B7C7 are encoded by T8T7T6T2 and
T9T8T7T6T5T4T3T2T1, respectively.
The table shows that a building block in most cases is coded by more than one tag. These
tags are attached to the beads in a single operation using the mixtures of the tags as reagents. The
binary encoding system proved to be very successful in practice.
Encoding tags other than halobenzenes have also been proposed and successfully used in
practice.
Figure 4.6. Manual device for split-mix synthesis. Vertical and tilted position.
The device is an aluminum tube mounted on a laboratory shaker. On one side of the tube
there are two rows of altogether 20 holes to which reaction vessels can be attached. The reaction
vessels that are normally used for solid phase synthesis (Figure 2.3.) were inserted into the holes
and tightened by applying plastic rings. The unused holes were stoppered. One end of the
aluminum tube was attached to a waste container and the system could be evacuated by a water
pump. The tube could be twisted around its axis. The figure shows the tube in two positions. The
left and right photos show the reaction vessels in nearly vertical and tilted positions, respectively.
The vertical position of the reaction vessels was used when the resin was portioned into
them, when reagents or solvents were added and when solutions were removed. The reaction
vessels stayed in tilted position and shaking was applied during the coupling reactions and the
removal of protecting groups.
In the synthesis of peptide libraries 200-400 mesh resin (capacity 0.5 mmol/g) was used
and swollen in DMF prior to portioning. This resin contains about 10 million beads per gram. The
following operations were typical in a coupling cycle.
Portioning. The resin was suspended in DMF/DCM (2:1 v/v) in a round bottom container
and was continuously mixed by bubbling nitrogen through it. The density of the solvent mixture
was very near to that of the resin, so the slurry could be kept homogeneous during the portioning
operation, which was carried out by pipetting equal volumes of the slurry into the reaction vessels.
After the first round of pipetting a small volume remained in the flask. This was diluted with the
solvent and the pipetting was repeated in order to transfer all of the resin into the reaction
vessels.
Removal of the terminal protecting group. The protecting group was removed by shaking
with a solution of TFA (Boc strategy) or piperidine (Fmoc strategy), and the resin was then
washed several times with solvents.
Coupling. A DMF solution of the protected amino acid containing HOBt and a solution of
DIC were added and the vessels were shaken for about one hour. After removing the solution,
the resin was washed several times with DMF and DCM.
Mixing. After addition of DCM/DMF (2:1 v/v) the reaction vessels were removed from
their place and their content was poured into the round bottom container. The remainder was also
washed into the container where the slurry was mixed by nitrogen bubbling.
At the end of the synthesis a deprotecting cocktail was added to the thoroughly washed
resin and, after shaking, the solution was separated from the resin by filtration and dried.
Productivity. In the synthesis of peptide libraries from 20 amino acids (although cysteine
was usually omitted) using the manual device (Figure 4.6.), one elongation step, that is, one
coupling with each of the 20 amino acids, could easily be realized in one day. Taking this speed
as standard, the number of synthesized peptides is shown in Table 4.3.
The data of Table 4.3. are striking. In as short a time as a week more than 1 billion peptides
can be prepared. This really shows the exceptionally high productivity of the method. Before
using the method, we did not even dream about anything comparable to this.
It is worthwhile to note that during the synthesis of a peptide library of a given length all
the libraries of the shorter peptides are also formed. Of course, if needed, samples of all these
libraries can be separated from the resin mixtures. If a pentapeptide library is synthesized in 5 g
resin, and in the tripeptide and tetrapeptide phase 12.5 mg and 250 mg resin is removed,
respectively, we end up with a tripeptide, a tetrapeptide and a pentapeptide library and in these
libraries the components are present in practically the same molar quantities.
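The quantities in this example can be verified by dividing the weight of each resin sample by the number of components it carries. The sketch below assumes 20 amino acids in every position and uses the weight of the resin as a measure of the molar quantity, neglecting the weight of the growing peptides; both are simplifications made here for illustration.

```python
def resin_per_component_ug(resin_mg: float, n_components: int) -> float:
    """Micrograms of resin carrying each library component."""
    return resin_mg * 1000.0 / n_components


total_mg = 5000.0                        # 5 g of resin at the start
tri_mg, tetra_mg = 12.5, 250.0           # samples removed after cycles 3 and 4
penta_mg = total_mg - tri_mg - tetra_mg  # resin carried through to cycle 5

shares = {
    "tripeptides": resin_per_component_ug(tri_mg, 20 ** 3),
    "tetrapeptides": resin_per_component_ug(tetra_mg, 20 ** 4),
    "pentapeptides": resin_per_component_ug(penta_mg, 20 ** 5),
}

if __name__ == "__main__":
    for name, ug in shares.items():
        print(f"{name}: {ug:.2f} ug of resin per component")
```

Each component is carried by roughly 1.5 micrograms of resin in all three libraries, which is why their molar quantities are practically the same.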
Identification of the components in mixtures containing thousands or millions of peptides
is impossible. Therefore, when the split-mix synthesis was first tried experimentally, very simple
libraries were prepared containing only 9 to 180 peptides. The components of the synthesized
peptide mixtures were identified by two-dimensional high-voltage paper electrophoresis. In order
to facilitate the identification, software was developed. Using this software, the sequences of
the expected peptides could be generated by computer. Based on the sequences the computer also
calculated the molecular weights and the electric charges of the peptides in two different (pH 2 and
pH 6.5) buffers. Based on these data, the relative electrophoretic mobilities were derived and
transformed into two-dimensional electrophoretic maps. The computer-predicted maps were
compared with the experimental ones, so the products of the synthesis could be identified.
The software made it possible to generate all components of huge peptide libraries. These
were the first examples of what are today called virtual libraries. Figure 4.7. shows the predicted
electrophoretic map of the hexapeptide library containing 64 million components. Migrations in
horizontal and vertical directions are supposed to occur at pH 6.5 and pH 2, respectively.
YYYYYY is the last generated sequence.
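How a virtual library can be generated is shown by the following minimal Python sketch. It is not the original software, whose details are not given here. The sequences are produced one by one without storing the library, and the amino acids are taken in the alphabetical order of their one-letter codes, an assumption that happens to agree with YYYYYY being the last sequence.

```python
from itertools import islice, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes in alphabetical order


def virtual_library(length, alphabet=AMINO_ACIDS):
    """Lazily yield every sequence of the given length."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


first_five = list(islice(virtual_library(6), 5))
size = len(AMINO_ACIDS) ** 6  # number of hexapeptides
last = AMINO_ACIDS[-1] * 6    # product() ends with the last letter repeated

if __name__ == "__main__":
    print(first_five, size, last)
```

The generator starts with AAAAAA, and the complete hexapeptide library would contain 64,000,000 sequences.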
(i) A Teflon reaction block (1) with 36 reaction vessels and one collection vessel for
combining and mixing the resin.
(ii) A rack (2) for the bottles holding the solutions of monomers, which can be either protected
amino acids or other kinds of building blocks.
(iii) Two arms, each moving in x,y directions and holding a probe. The needle-like probe of
Arm 1 (3) transfers solvents, solutions of the monomers and solutions of reagents into the
reaction vessels. This probe is able to spray the solvent, so it can be used to wash
the walls of the reaction vessels and that of the collection vessel. The probe of Arm 2
(4) has a wide tip and transfers slurries of resin from the reaction vessels to the
collection vessel and back. This probe is also capable of transferring solvent into the
collection and reaction vessels.
(iv) Five small bottles to hold solutions of reagents (5).
(v) The synthesizer has 3 bottles (6) for storing solvents and a waste container (7).
(vi) The computer seen in the photo controls all operations.
The computer can easily be programmed to control the synthesis of different kinds of
libraries using different kinds and different number of monomers in each step, applying reagents
in different molar concentrations. Double or triple coupling and different coupling times etc. can
also be programmed.
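[Figure: layout of the table top; the labels 1-7 mark the numbered positions referred to in
the text and R1-R5 the reagent bottles.]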
Both the collection vessel and the reaction vessels are equipped with frits at the bottom so
their liquid content can be removed by applying vacuum. The whole reaction block can be shaken
at adjustable speeds by an orbital shaker.
The solutions of the monomers, usually protected amino acids, are placed into bottles (4)
that are stored in places defined by a rack. The bottles are closed by a septum. A group of 5 bottles
is also found in defined places on the table top. The coupling reagents are stored in these bottles
(R1-R5), which are also closed by a septum. There are also two cleaning-waste stations on the
table top: one for Arm 1 (5) and another for Arm 2 (6). The standby position of the arms is
above the center of these stations. In the left cleaning station (5) the tip of the probe of Arm 1 can
be cleaned by solvent. This is very important. When Arm 1 transfers the solution of a building block
into a reaction vessel, the needle-like probe penetrates the septum of the container, is
immersed into the solution, removes the programmed volume of solution and transfers it into the
programmed reaction vessel. In order to avoid cross contamination, both the inside and the
outside of the probe must be washed. This happens at station (5).
Before starting to work with the machine both arms must be calibrated: Arm 1 for
reaction vessels, collection vessel, monomer bottles, reagent bottles and the left cleaning station
(5) and Arm 2 for reaction vessels, collection vessels and the right cleaning station (6). These are
the places that are visited by the two arms in the course of the synthesis. As a result of calibration
the exact x,y coordinates of the reaction vessels, collection vessel, reagent bottles, monomer
bottles and the cleaning-waste stations are stored in the memory of the computer. The z
coordinates of the arms also need to be calibrated.
The content of each monomer container must also be defined as well as the content of the
5 reagent bottles and the 3 system fluids placed in the 3 solvent bottles.
The computer can be instructed to carry out the stepwise operations of the synthetic
procedure by commands that are entered into the software (ChemFiles). Sequential execution of
all commands of the ChemFile results in fully automatic realization of the synthetic process.
Commands.
Flush. The command is used to clean and prime the system fluid lines at the beginning of
a synthesis or when a system fluid bottle is changed.
Split. The probe of Arm 2 removes equal volumes from the resin slurry present in the
collection vessel and transfers them into the defined reaction vessels. The volumes and the
repetition of the whole process can be specified.
Combine. By this command the probe of Arm 2 removes a defined volume of resin slurry
from a defined group of reaction vessels and delivers it into the collection vessel for mixing. The
command can also be used to combine the liquid from several reaction vessels into a single
reaction vessel. In this case the number of the destination vessel also needs to be entered.
Mix. The command is used to shake the whole reaction block. It allows mechanical
(vortex) mixing and in addition nitrogen bubble mixing for the collection vessel.
Dispense sequence. A defined volume of liquid is dispensed from the containers of a
source rack into the containers of a destination rack.
Dispense system fluid. The command allows a specified amount of a selected system fluid
to be delivered to a range of vessels defined as destination rack.
Transfer. The command provides for the transfer of a specified volume of reagent (from a
defined reagent bottle) to a range of reaction vessels. Liquid can also be moved between any
calibrated positions on the tabletop.
Empty. The empty command establishes a time value in hours, minutes or seconds for
applying vacuum for emptying the liquid from the reaction vessels or the collection vessel.
Wash. The wash command is usually applied to wash down the resin from the walls of the
collection and reaction vessels after mixing. The liquid is sprayed to the walls and the vessel is
simultaneously emptied.
Wait for. The command offers a timer with a range from seconds to hours, during which
all operations are paused.
Repeat. The repeat command allows the user to develop loops within the ChemFiles in
order to repeat an operation or a sequence of operations.
Besides construction of the ChemFile, preparation for the synthesis of a peptide library
involves the assignment of the amino acids to the bottles of the rack and filling the bottles with the
solutions of the protected amino acids containing HOBt. The reagents also need to be assigned to
their bottles and filled into them. An example of the assignment is shown in Table 4.4.
Reagent:  DIC, HBTU, DIEA, MeOH, Piperidine
Solvent:  NMP, DMF, NMP, DMF
The system fluids (solvents) also have to be assigned and filled into the bottles. A possible
assignment is seen in Table 4.5.
Solvent (SF)
DMF
DMF/DCM 2:1
DCM
A peptide library is built up from amino acids. In construction of the ChemFile the
Library Builder function of the software can be used to assign the amino acids to the different
coupling positions and define the reaction vessels to where they are delivered for coupling.
Coupling position 1 means the C-terminus of the peptides. The amino acids occupying this
position are coupled first to the support. In order to prepare a full library the same amino acids
need to be assigned to all coupling positions. Table 4.6. shows the data of the Library Builder
when a partial pentapeptide library is prepared. It can be seen that the number of amino acids
assigned to the different coupling positions is different. In the coupling positions 1 and 4, for
example, 18 and 14 amino acids are used, respectively. Although it is not seen in Table 4.6. the
library builder calculates the expected number of peptides in the library (884,520) and based on
the quantity and mesh size of the resin also gives the number of beads per peptide.
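The number quoted for the partial library can be reproduced from the amino acids listed for the five coupling positions in Table 4.6. The following Python sketch only illustrates the arithmetic and is not the Library Builder itself. The resin quantity of 1 g is a hypothetical value, and the figure of 10 million beads per gram is the one quoted earlier for 200-400 mesh resin.

```python
from math import prod

# Amino acids assigned to coupling positions 1 to 5, as listed in Table 4.6.
assignment = {
    1: "ARNDQEGHIKLMFPSTVW",
    2: "ARNDQEGHIKLSTVW",
    3: "ARNDQEGHIKLMFPSTVW",
    4: "ARNDQEGMFPSTVW",
    5: "ARNDKLMFPSTVW",
}

n_peptides = prod(len(residues) for residues in assignment.values())


def beads_per_peptide(resin_g, beads_per_gram, peptides):
    """Average number of beads carrying each peptide of the library."""
    return resin_g * beads_per_gram / peptides


if __name__ == "__main__":
    print(n_peptides)
    # Hypothetical example: 1 g of resin with 10 million beads per gram.
    print(round(beads_per_peptide(1.0, 10_000_000, n_peptides), 1))
```

The product 18 x 15 x 18 x 14 x 13 gives the 884,520 peptides, and 1 g of such resin would provide about 11 beads per peptide.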
The ChemFile is a series of commands arranged in the order of execution. Table 4.7.
shows the first ten rows of a ChemFile. The execution involves the swelling and washing of the
resin, diluting the resin with solvent and the first round of splitting the resin transferring 2.5 ml of
slurry into each of 18 reaction vessels.
Coupling position   Amino acids
8, 7, 6             (not used)
5                   A R N D K L M F P S T V W
4                   A R N D Q E G M F P S T V W
3                   A R N D Q E G H I K L M F P S T V W
2                   A R N D Q E G H I K L S T V W
1                   A R N D Q E G H I K L M F P S T V W
The synthesis normally ends with the resin combined in the collection vessel; in addition
to this, there are other choices, too.
In order to be able to calculate the weight of full peptide libraries, let's suppose that only
one peptide is responsible for the biological activity. Let's also arbitrarily fix the quantity of this
peptide (and therefore all peptides in the mixture) to 1 pmol. The real quantity requirement,
depending on the sensitivity of the screening experiment and other factors, can easily be deduced
from this quantity.
Table 4.8. Number of peptides in libraries depending on the number of varied positions
Number of           Number of peptides
varied positions
2                   361
3                   6,859
4                   130,321
5                   2,476,099
6                   47,045,881
7                   893,871,739
8                   16,983,563,041
9                   322,687,697,779
10                  6,131,066,257,801
Table 4.9. Approximate weight of libraries containing each peptide in 1 pmol quantity
Number of           Weight   Units
varied positions
2                   92       ng
3                   3        µg
4                   65       µg
5                   2        mg
6                   35       mg
7                   765      mg
8                   17       g
9                   353      g
10                  7        kg
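The figures of Tables 4.8. and 4.9. can be approximated by a short calculation. The sketch below assumes 19 varied amino acids, which matches the numbers of Table 4.8., and an average residue mass of 119.4 g/mol. The latter is an assumption chosen here because it roughly reproduces Table 4.9.; the value used by the author is not given.

```python
AVG_RESIDUE_MASS = 119.4  # g/mol, assumed average mass of one amino acid residue
WATER = 18.0              # g/mol, one water per free peptide chain
PMOL = 1e-12              # mol


def n_peptides(varied_positions, amino_acids=19):
    """Number of peptides in a full library."""
    return amino_acids ** varied_positions


def library_weight_g(varied_positions, amount_mol=PMOL):
    """Approximate weight of a library containing `amount_mol` of each peptide."""
    peptide_mass = varied_positions * AVG_RESIDUE_MASS + WATER
    return n_peptides(varied_positions) * amount_mol * peptide_mass


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, n_peptides(n), f"{library_weight_g(n):.3g} g")
```

Under these assumptions the dipeptide library weighs about 93 ng and the decapeptide library about 7.4 kg, in line with the rounded values of Table 4.9.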
The quantity of the resin that is needed for the synthesis is expected to be, and really is, an
even bigger problem. Table 4.10. shows the weight of the resin needed to prepare all peptides in
1 pmol quantity. In practice, these quantities are expected to be even higher than indicated in
Table 4.10. because the libraries are usually prepared not for a single but for a series of
experiments, and the screening tests may also have lower sensitivity. Problems may occur in
handling such large quantities of resin and, if the number of the varied positions is high enough,
it is practically impossible to carry out the synthesis. Consequently, the weight of the resin needs
to be considered carefully.
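[Table 4.10.: only the unit columns are preserved. For 2 to 10 varied positions the total
quantity is given in pmol, nmol, nmol, µmol, µmol, µmol, mmol, mmol and mol, and the weight of
the resin in ng, µg, µg, mg, mg, g, g, g and kg.]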
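The order of magnitude of the resin quantities can be estimated from the total molar quantity of the peptides and the capacity of the resin. The sketch below assumes 19 varied amino acids and a capacity of 0.5 mmol/g, the value quoted earlier for the resin used with the manual device. Whether Table 4.10. was calculated with the same capacity is not known.

```python
def resin_weight_g(varied_positions, amino_acids=19, amount_pmol=1.0,
                   capacity_mmol_per_g=0.5):
    """Resin needed to carry `amount_pmol` of every peptide of a full library."""
    total_mol = amino_acids ** varied_positions * amount_pmol * 1e-12
    return total_mol / (capacity_mmol_per_g * 1e-3)


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, f"{resin_weight_g(n):.3g} g")
```

With these assumptions the resin requirement grows from less than a microgram for two varied positions to about 12 kg for ten.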
Another problem which deserves consideration before beginning a synthesis is the ratio
of the number of resin beads to the number of expected peptides. Since only one
peptide forms in each bead, the maximum number of peptides is limited by the number of beads.
Furthermore, two essential operations of the method, mixing and portioning, are governed by
probability. As a consequence, if the number of beads is equal to the number of peptides, not
all peptides are expected to form, and deviations from equimolarity are also expected. For this
reason, formation of all expected peptides, as well as their near equimolarity, is ensured only if
the number of beads well exceeds the number of peptides. A tenfold excess of beads can be
considered quite safe.
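The role of probability can be illustrated with a rough estimate. The sketch below assumes, as an idealization that is not taken from the text, that every bead independently ends up carrying a uniformly random sequence; the function name is hypothetical.

```python
def expected_missing_fraction(n_peptides: int, n_beads: int) -> float:
    """Expected fraction of sequences carried by no bead at all.

    Assumption: each bead independently receives a uniformly random
    sequence (an idealized model of mixing and portioning).
    """
    return (1.0 - 1.0 / n_peptides) ** n_beads


n_peptides = 19 ** 3  # 6,859 tripeptides built from 19 amino acids
for excess in (1, 3, 10):
    frac = expected_missing_fraction(n_peptides, excess * n_peptides)
    print(f"{excess:>2}-fold bead excess: {frac:.5%} of the peptides expected to be missing")
```

Under this model roughly a third of the sequences would be absent when the number of beads only equals the number of peptides, while with a tenfold excess the expected loss becomes negligible.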
For the reasons outlined above, when very complex libraries are prepared, it is desirable to
choose as small a bead size as possible, for example, 200-400 mesh (diameter: 38-75 µm) resin.
Each gram of this resin contains about 10 million beads. Table 4.11 shows the quantity of resin
needed if the number of beads equals or exceeds 10 times the number of peptides. The data in
Table 4.11 clearly demonstrate that, for practical reasons, the number of varied positions in
full libraries is limited to about 6 or 7.
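The order of magnitude of these resin quantities can be recomputed from the figures quoted in the text. The following is only a sketch: it assumes 19 varied amino acids per position (as in Table 4.8), 10 million beads per gram and a tenfold bead excess, and it is not a reproduction of Table 4.11.

```python
BEADS_PER_GRAM = 10_000_000  # figure quoted in the text for 200-400 mesh resin
AMINO_ACIDS = 19             # amino acids varied in each position (assumption, as in Table 4.8)


def resin_grams(varied_positions: int, bead_excess: int = 10) -> float:
    """Grams of resin needed so that beads = bead_excess x number of peptides."""
    n_peptides = AMINO_ACIDS ** varied_positions
    return n_peptides * bead_excess / BEADS_PER_GRAM


for n in range(2, 11):
    print(f"{n:>2} varied positions: {resin_grams(n):,.3f} g")
```

With these assumptions six varied positions need about 47 g of resin, seven need almost 0.9 kg and eight need about 17 kg, which agrees with the practical limit of 6 or 7 varied positions.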
The difficulties arising from the overwhelmingly large number of peptides in some full
libraries can be circumvented by preparing their partial libraries. One may follow two different
approaches for doing this:
1. Reducing the number of the varied amino acids;
2. Reducing the number of the varied positions.
It is, of course, possible to combine the two approaches. It seems worthwhile to consider in some
detail both possibilities.
Table 4.12 shows that the number of components in the libraries can effectively be
reduced by reducing the number of the varied amino acids.
Of course, the chemist is not restricted to using the same number of amino acids in all
positions. Table 4.13 shows an example of an octapeptide library that is constructed
by varying a different number of amino acids in different positions.
Position    Number of varied amino acids
1           10
2           8
3           12
4           9
5           4
6           19
7           4
8           12

Total number of peptides: 31,518,720
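The total is simply the product of the numbers of amino acids varied in the individual positions. A minimal sketch of the arithmetic (the function name is hypothetical):

```python
from math import prod

# Number of varied amino acids in positions 1 to 8 (Table 4.13)
varied_per_position = [10, 8, 12, 9, 4, 19, 4, 12]


def library_size(counts):
    """Number of components = product of the per-position counts."""
    return prod(counts)


print(library_size(varied_per_position))  # 31518720
print(library_size([19] * 8))             # 16983563041, the full library of Table 4.8
```

Compared with the full octapeptide library, the partial library is smaller by a factor of more than 500.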
Intuition plays an important role when one decides which amino acids can be omitted from
the synthesis. One has to be aware, however, that if a partial library is prepared and an amino acid
critical to the activity of the potential active peptide happens to be among the omitted ones, the
active peptide, and with it the activity of the whole library, is lost.
It is very convenient to prepare less complex libraries by reducing the number of varied
positions by one, two or even more. Each fixed position reduces the number of peptides by a
factor of 19. The partial heptapeptide library of Table 4.14, for example, which has three non-varied
positions, has only 130,321 components.
Consideration of the above-mentioned features of peptide libraries may, perhaps, help the
potential user to be aware of the limitations of the library method, to formulate a realistic research
plan and, when possible, to circumvent the difficulties.
The examples considered in this section were peptide libraries. The
conclusions, however, hold for organic libraries, too.
Coupling position    Amino acids
1                    A, F, G, H, R
2                    H, I, K, L
3                    D, E, F, T
4                    A, G, K, T, W
If a full pentapeptide library, composed in every step from the twenty natural amino acids,
is represented as shown in Figure 4.10, the sequences of the peptides can be read along the lines
drawn through one of the amino acids found in each of the five rows of the figure. In the case of
pentapeptides 3.2 million different lines can be drawn, each representing one of the theoretically
possible 3.2 million pentapeptide sequences.
Both libraries represented in Table 4.15 and Figure 4.10 can be considered full libraries.
Any library constructed from the same amino acids, but with their number reduced
in one or more coupling positions relative to Table 4.15 or Figure 4.10, can be
considered a partial library.
[Figure 4.10. Five rows, one for each coupling position, each listing the 20 amino acids:
A C D E F G H I K L M N P Q R S T V W Y]
The library of Table 4.16, for example, is a partial library of the full library of Table 4.15.
In the synthesis of the partial library of Table 4.16, R and F are omitted in coupling positions 1
and 3, respectively, and so the number of components is reduced from 400 to 240.
Coupling position    Amino acids
1                    A, F, G, H
2                    H, I, K, L
3                    D, E, T
4                    A, G, K, T, W
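The relation between the two libraries can be checked by enumerating them. In the sketch below the sequences are written with coupling position 1 on the left, which is only a notational choice made for this illustration.

```python
from itertools import product

full_sets = ["AFGHR", "HIKL", "DEFT", "AGKTW"]    # Table 4.15, coupling positions 1 to 4
partial_sets = ["AFGH", "HIKL", "DET", "AGKTW"]   # Table 4.16, R and F omitted


def enumerate_library(position_sets):
    """All sequences obtained by picking one amino acid from each position."""
    return {"".join(seq) for seq in product(*position_sets)}


full_library = enumerate_library(full_sets)
partial_library = enumerate_library(partial_sets)

print(len(full_library), len(partial_library))  # 400 240
print(partial_library <= full_library)          # True: every component is also in the full library
```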
The library of Figure 4.11 is a partial library of the full library represented in Figure 4.10.
In the synthesis of this library all of the 20 amino acids are varied in coupling steps 1, 3, 4 and 5.
In coupling step 2, however, a single amino acid, glycine, is used, and it is coupled to the
resin without previous portioning. Coupling position 2 is a non-varied position and glycine is the
amino acid occupying coupling position 2 in all peptides. All sequence lines cross glycine. The
number of sequences, and consequently the number of possible sequence lines, is only 160,000.
This is the number of the components of the full library divided by 20.
As will be shown later, partial libraries that have a single non-varied position play an
important role in screening. They are often called sub-libraries.13
The simplest and synthetically most easily accessible sub-libraries are
those that form at the end of a split-mix synthesis when the final mixing is omitted. Their
non-varied position is the last coupling position. If 20 amino acids are varied in the synthesis, a single
split-mix run needs to be carried out (without the last mixing) and the process ends up with 20
sub-libraries. These sub-libraries have another interesting feature: if they are mixed, a full library
is formed. As Figure 4.12 shows, this feature is shared by any full set of sub-libraries having the
same non-varied position. The non-varied positions in sets a, b and c are coupling positions 1, 2
and 3, respectively, and it can be seen that each of the three sets forms a full library.
Figure 4.11. A partial pentapeptide library with one non-varied position
As was pointed out, a set of sub-libraries in which the non-varied position is the last
coupling position, like set c of Figure 4.12, can be synthesized in a single run. In sets like
a and b the non-varied positions are the first and an intermediate coupling position, respectively. The
simplest way to prepare the components of these sets is to synthesize them separately, one by
one.
There are sub-library sets that do not form a full library. An example is shown in
Figure 4.13. In sub-libraries c, b and a the blue amino acid occupies the first, second and third
coupling position, respectively. Both the all-yellow and the all-red sequences (marked by an
arrow), for example, as well as others, would be missing from the mixture of the three
sub-libraries, while other trimers are present in duplicate or triplicate (all blue).
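Both kinds of sub-library sets can be compared by enumeration. The sketch below uses three building blocks with the hypothetical one-letter labels B (blue), Y (yellow) and R (red); positions are counted from the left purely for illustration.

```python
from collections import Counter
from itertools import product

BLOCKS = "BYR"  # hypothetical labels: blue, yellow, red
LENGTH = 3


def sub_library(fixed_position, fixed_block):
    """All trimers that carry fixed_block in fixed_position (0-based)."""
    return ["".join(s) for s in product(BLOCKS, repeat=LENGTH)
            if s[fixed_position] == fixed_block]


full = {"".join(s) for s in product(BLOCKS, repeat=LENGTH)}

# Complete set: the same non-varied position, occupied in turn by every block.
complete_set = Counter(seq for b in BLOCKS for seq in sub_library(0, b))
# Incomplete set: blue fixed in the first, second and third position in turn.
blue_set = Counter(seq for pos in range(LENGTH) for seq in sub_library(pos, "B"))

print(set(complete_set) == full, max(complete_set.values()))  # True 1
print(sorted(full - set(blue_set)))      # the 8 trimers that contain no blue are missing
print(blue_set["BBB"], blue_set["BBY"])  # 3 2
```

Mixing the complete set gives every trimer exactly once, while the blue set lacks all trimers without blue and contains some trimers two or three times.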
As already pointed out, when the number of components in a full library is too large to
synthesize it in a single run, it is practical to prepare it in portions. It has to be taken into account,
however, that some partial libraries are impractical to prepare because their completion to a full
library requires too much work.13 It is impractical, for example, to prepare L2 as a portion of the
full library L1 because L3 does not complete it to L1. The total number of components in L2 and
L3 is only 32, while L1 contains 256 tetramers. Several other libraries would need to
be prepared in order to complete L2+L3 to L1. L4 and L5, however, are two reasonable choices
for portions of L1. L4 and L5 have 128 components each, which add up to 256.
                        L1         L2     L3     L4         L5
Varied amino acids      A,D,E,F    A,D    E,F    A,D        E,F
(one row per            A,D,E,F    A,D    E,F    A,D,E,F    A,D,E,F
coupling position)      A,D,E,F    A,D    E,F    A,D,E,F    A,D,E,F
                        A,D,E,F    A,D    E,F    A,D,E,F    A,D,E,F
Number of components    256        16     16     128        128
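The difference between the two ways of portioning L1 can be verified by enumerating the libraries. This is only an illustrative check; the helper name is hypothetical.

```python
from itertools import product


def lib(*position_sets):
    """All tetramers obtained by picking one amino acid per position."""
    return {"".join(s) for s in product(*position_sets)}


L1 = lib("ADEF", "ADEF", "ADEF", "ADEF")
L2 = lib("AD", "AD", "AD", "AD")
L3 = lib("EF", "EF", "EF", "EF")
L4 = lib("AD", "ADEF", "ADEF", "ADEF")
L5 = lib("EF", "ADEF", "ADEF", "ADEF")

print(len(L1), len(L2), len(L3), len(L4), len(L5))  # 256 16 16 128 128
print(L2 | L3 == L1, len(L1 - (L2 | L3)))           # False 224
print(L4 | L5 == L1, L4.isdisjoint(L5))             # True True
```

L2 and L3 together leave 224 tetramers of L1 unprepared, whereas L4 and L5 cover L1 completely and without overlap.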
L, G, P
L, G
K, M, F
D, I
A, G
A combinatorial synthesis using these amino acids as building blocks produces a library that
contains all of the five pentapeptides. The number of components of the library is 3×2×3×2×2 = 72.
This means that in addition to the 5 wanted peptides 67 extra peptides are formed. The number of
coupling cycles in this synthesis is 3+2+3+2+2 = 12. If the 5 peptides are prepared by parallel
synthesis, the number of coupling cycles is 25.
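The bookkeeping of this example can be sketched in a few lines. The five target sequences below are hypothetical: they were chosen only so that their per-position amino acids reproduce the sets listed above (3, 2, 3, 2 and 2 choices).

```python
from math import prod

# Hypothetical target peptides, consistent with the per-position sets above
wanted = ["LLKDA", "GGMIG", "PLFDA", "LGKIG", "GLMDA"]

# Amino acids that must be coupled in each position to cover all targets
position_sets = [sorted(set(column)) for column in zip(*wanted)]

library_size = prod(len(s) for s in position_sets)
combinatorial_cycles = sum(len(s) for s in position_sets)
parallel_cycles = sum(len(p) for p in wanted)

print(position_sets)
print(library_size, library_size - len(wanted))  # 72 67
print(combinatorial_cycles, parallel_cycles)     # 12 25
```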
Cohen and Skiena49 showed that the total number of components of the libraries can be
reduced by increasing the number of coupling cycles and by properly designing the synthesis.
They developed software that makes it possible to optimize the total number of components of the
libraries that contain the wanted compounds against the number of coupling cycles needed in the
synthesis.
One of their examples is the synthesis of an arbitrary set of 496 pentapeptides. One may
ask whether the parallel method or the split-mix synthesis is more advantageous for the
preparation of such a set. The parallel synthesis needs 2480 coupling cycles, which involves
much work. They showed that, by applying their optimization method, the synthesis of a
20,000-member pentapeptide library that contains all of the 496 arbitrarily selected pentapeptides
needs only 324 coupling cycles. One may decide which is better: reducing the number of
coupling cycles by a factor of ca. 8 and accepting the presence of ca. 19,500 additional
components in the mixture, or preparing the individual compounds in parallel by investing
several times more labor. The choice certainly depends on additional factors, too. The use of
modified versions of the split-mix synthesis that apply macroscopic solid support units (see
later) offers a much better choice for the preparation of arbitrarily selected series of compounds.
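[Figure: scheme of the binary synthesis. In each of the four cycles the resin is divided into two
halves; one half is coupled with an amino acid, the other half is not coupled, and the two halves
are then mixed.]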
Row    Coupl. with    Products
0      (none)         0
1      L              0, L
2      G              0, L, G, GL
3      R              0, L, G, GL, R, RL, RG, RGL
4      E              0, L, G, GL, R, RL, RG, RGL, E, EL, EG, EGL, ER, ERL, ERG, ERGL
The zero in the table means unchanged empty resin. The final mixture contains the
products found in the last row of the table including a fraction of the resin that remains
unchanged. One of the products is ERGL, that is, the peptide formed from the four amino acids
used in the synthesis and its sequence reflects the coupling order of the amino acids. The other
products can be derived from the sequence of this tetrapeptide. All the sequences and amino
acids that can theoretically be derived from this root tetrapeptide sequence by deletion of partial
sequences or amino acids are found in the mixture. The components of the mixture are present in
equimolar quantities.
As was pointed out by Fodor and his colleagues, the number of peptides formed in the
binary synthesis (N) can be calculated from the number of coupling cycles (c) according to a
simple formula:
N = 2^c
This indicates a very high efficiency, since N grows exponentially with the number of
coupling cycles. In 22 coupling cycles, for example, more than 4 million components form. The
numbers of components in the groups of peptides of different lengths follow a binomial
distribution. When c = 10 the total number of components is 1024. The length of the peptides varies
from 0 to 10. The number of peptides of each length is indicated in brackets: 0(1),
1(10), 2(45), 3(120), 4(210), 5(252), 6(210), 7(120), 8(45), 9(10) and 10(1).
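These numbers are easy to reproduce. The sketch below assumes that every amino acid occurs only once in the root sequence, so that the number of products of length k is the binomial coefficient "c choose k"; the function name is hypothetical.

```python
from math import comb


def binary_library(c: int):
    """Total number of products (2**c) and the number of products of each length 0..c."""
    by_length = [comb(c, k) for k in range(c + 1)]
    return 2 ** c, by_length


total, by_length = binary_library(10)
print(total)      # 1024
print(by_length)  # [1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1]
print(2 ** 22)    # 4194304
```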
One has to note that the N = 2^c formula gives the maximum number of components,
which is reached only when every amino acid is represented only once in the root sequence. In
many cases this requirement cannot be met, for example, when the 20 L-amino
acids are used as building blocks in more than 20 coupling cycles. Multiple occurrences of amino
acids in the root sequence have two consequences:
(i) the number of components in the synthesized library is less than calculated from
the formula, and
(ii) the equimolarity of components is no longer maintained, since some compounds
form in multiple molar quantities.
If an amino acid appears more than once in the root sequence, some peptides may form
from more than one source. This happens, for example, if the binary synthesis is carried out on
the basis of the following root sequence: EGGL. The products derived from this sequence are:
0, L, G, GL, G, GL, GG, GGL, E, EL, EG, EGL, EG, EGL, EGG, EGGL
The products G, GL, EG and EGL appear twice in the list, so their quantity relative to the
other components of the mixture is doubled. The paper of Sebestyén et al.15 describes tricks
for avoiding the formation of the products in non-equal quantities.
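The effect of a repeated amino acid can be followed step by step in a small simulation of the binary synthesis. This is a sketch with a hypothetical function name; the counts are relative molar amounts, and the empty string stands for the unchanged resin (0).

```python
from collections import Counter


def binary_synthesis(root: str) -> Counter:
    """Products of a binary synthesis with their relative molar amounts.

    Coupling starts at the right-hand (C-terminal) end of the root sequence.
    """
    pool = Counter({"": 1})  # "" is the unchanged empty resin
    for amino_acid in reversed(root):
        # One half is coupled, the other half is left unchanged, then both are mixed.
        coupled = Counter({amino_acid + seq: n for seq, n in pool.items()})
        pool = pool + coupled
    return pool


for root in ("ERGL", "EGGL"):
    products = binary_synthesis(root)
    doubled = sorted(seq for seq, n in products.items() if n > 1)
    print(root, len(products), sum(products.values()), doubled)
```

For ERGL the run gives 16 different products in equal amounts, while for EGGL only 12 different products are found and G, GL, EG and EGL are present in double quantity.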
The binary synthesis may prove useful for exploration whether or not deletions in a region
of a longer peptide lead to bioactive fragment(s).
Another disadvantage relative to the split-mix method is that the one bead one product
feature is completely lost. Since mixtures of amino acids are used in every coupling step, instead
of single products, mixtures are formed in the beads. The synthetic history of every bead is the
same. As a consequence, within the limits of statistics, the content of the beads is the same. This
means that every bead contains all components of the library. The loss of the one bead one
product feature is a very significant disadvantage. The library can be analyzed only as a mixture.
The individual components of the library are absolutely not accessible.
The method has been applied mainly in preparation of peptide libraries but non-peptide
libraries have also been synthesized.19
The method has disadvantages, too. The very important feature of the split-mix method
that a single compound forms in one bead, is completely lost. In addition, the separation of the
support from the reaction mixture is not as simple as filtering out the bead form resin.
[Figure 4.15/g. Locations of the nine dipeptides:]
AA AG AK
GA GG GK
KA KG KK
In Figure 4.15 the irradiated areas are white and those shadowed by the mask are gray.
The synthesis of the 9 dipeptides is completed in 6 cycles of irradiation and coupling (a to f). It is
remarkable that more peptides (3^2 = 9) form than the number of executed coupling cycles (6),
which is characteristic of combinatorial processes. The dipeptides are formed in the locations
shown in Figure 4.15/g.
It is worthwhile to note that in fact only two different masks are needed in the synthesis:
the two shown in Figures 4.15/a and 4.15/b. Mask positions d, c and f can be produced by
rotation of mask a by 90°, 180° and 270°, respectively.
If the synthesis is continued, finer masks need to be used in each elongation step, and in
each elongation step the number of components of the library increases exponentially, as in
the split-mix synthesis. In the next elongation step, for example, the masks 4.15/h and 4.15/i
would be needed, and the other mask positions could be produced by rotation of these two. Mask
position j, for example, could be brought about by rotation of mask 4.15/h by 180°. After
completing the next 6 coupling cycles with the amino acids A, G and K (at mask positions h, i, j,
and at those positions brought about by rotating these by 90°), 3^4 = 81 tetrapeptides would form. As
expected in a combinatorial synthesis, among these 81 tetrapeptides all sequences would be
represented that can be deduced by inserting the three amino acids (A, G and K) into
coupling positions 3 and 4. After completing the couplings, before the library undergoes testing,
the protecting groups, of course, have to be removed. In the testing experiments the products
remain attached to the slide.
The light directed synthesis in some respects is very similar to the split-mix method. In
the synthesis of peptide libraries the invested work, that is the number of the executed coupling
steps (Nc), linearly increases with the lengths of the peptides (n, the number of amino acids in the
peptides), while the number of the components of the library (Np) increases exponentially with
the length. If 20 amino acids are used in every step of the synthesis, the following formulae
express the invested work and the number of the peptides formed in the process.
Nc = 20n
Np = 20^n
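The contrast between the linear growth of the work and the exponential growth of the library can be tabulated directly. A minimal sketch (the function name is hypothetical):

```python
def light_directed_effort(n: int, amino_acids: int = 20):
    """Coupling steps (Nc) and peptides formed (Np) for peptides of length n."""
    coupling_steps = amino_acids * n  # Nc grows linearly with n
    peptides = amino_acids ** n       # Np grows exponentially with n
    return coupling_steps, peptides


for n in range(1, 6):
    nc, n_pep = light_directed_effort(n)
    print(f"n={n}: Nc={nc:>3}  Np={n_pep:,}")

# The dipeptide example with the three amino acids A, G and K:
print(light_directed_effort(2, amino_acids=3))  # (6, 9)
```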
There are also differences relative to the split-mix method. One of the differences is that
in the light directed method the couplings of one elongation step cannot be executed in parallel,
as in the split-mix procedure. The couplings with the single amino acids need to be carried out
serially, one after the other. Furthermore, the light directed method is, of course, unsuitable for
preparing libraries in large quantities.
Another difference provides a very significant advantage for the light directed method.
The identity of every product formed on the surface of the slide is exactly known. There is no
need for a separate analytical process in order to identify the products. If the masks, their
positions and the order of their application as well as the coupling order of the amino acids are
known, the identity of the products in every location of the slide can be deduced. This makes
application of the libraries in the testing experiments very simple.
It has already been mentioned that the binary synthesis was first introduced and
demonstrated in conjunction with the light directed synthesis. Its principle is that in each
elongation step only half of the slide is submitted to coupling, while the other half remains unchanged.
Figure 4.16 demonstrates the synthesis based on the same ERGL root sequence that was used in
demonstrating the binary synthesis realized by the split-mix method. The white regions of the
slide are irradiated and then coupled with the indicated amino acid. The gray regions remain
unchanged. Four elongation steps are executed (a, b, c and d) as shown in Figure 4.16. The
products are found in Figure 4.16/e. It can be seen that the products are the same as those formed
in the split-mix binary synthesis (row 4 in Table 4.17).
Figure 4.16. Binary synthesis on the basis of the root sequence ERGL using the light directed
method
In the introductory publication around one thousand peptides were synthesized on the
surface of the slide. Since that time the number of substances produced on a slide has increased
very significantly. In practice, the method is applied for making oligonucleotide chips that
are extensively used in nucleic acid analysis.26 On a chip surface of less than 1.5 cm² about
500,000 different oligonucleotides can be synthesized, and a single silicon wafer may contain 49
to 400 different oligomer arrays. The light directed synthesis was developed at an American
company, Affymax, and the chips are manufactured and commercialized by Affymetrix. More
details of the method can be found on the home page of the company.27
DNA of the phage can be considered as an encoding tag since the sequence of the peptide can be
determined (after amplification) by sequencing the proper portion of the DNA.
In order to fulfill this goal the microscopic solid support units (beads) applied in the
original method had to be replaced by macroscopic ones. In addition, these macroscopic units had
to be labeled somehow in order to enable the experimenter to identify the product formed in each
unit. This means that the number of support units applied in the synthesis needs to be the
same as the number of products. Before starting the synthesis a product has to be assigned to
each unit, and the units have to be properly labeled. Assignment of the product involves assignment of the
building blocks and their coupling order. In addition, during the synthetic process the units have
to be distributed one by one into the reaction vessels according to the structure assigned to the
products.
The simplest way to label the units is to assign numbers to them from 1 to n, where n is the
number of compounds to be prepared. The building blocks and their order can be recorded in a list
or in a computer. Once the number of the unit is known, the operator can read from the list or
from the computer which building block needs to be coupled to the unit in a given reaction step.
In other words, the operator can identify the reaction vessel into which the unit has to be
transferred in the given phase of the synthesis. This is demonstrated in Figure 4.17.
Box a in the figure shows 9 dipeptides assigned to the units 1 to 9. The sequences to be
synthesized in the units 1, 2 and 3, for example, contain G in the first coupling position;
consequently they have to be placed into reaction vessel b, where in the first synthetic step G is
coupled to all units. The amino acid G coupled to the units is indicated in bold in the
sequences. The arrows show where the units need to be transferred for the second coupling. The
numbers on the arrows show the numbers of the units. So the units 1, 2 and 3 are transferred into
the reaction vessels e, f and g, respectively. Similarly, the units 4, 5 and 6 from reaction vessel c
are transferred into reaction vessels e, f and g, respectively.
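The list that guides the operator can be written as a simple lookup. The sketch below uses the assignment of box a as far as it can be read from the figure; the vessel labels b, c, e, f and g follow from the text, while d = A is inferred by elimination and the function name is hypothetical.

```python
# Unit number -> assigned dipeptide (box a of Figure 4.17)
assigned = {1: "GG", 2: "LG", 3: "AG", 4: "GL", 5: "LL",
            6: "AL", 7: "GA", 8: "LA", 9: "AA"}

# Amino acid coupled in each reaction vessel.
# b = G, c = L and e, f, g = G, L, A follow from the text; d = A is an inference.
first_step_vessel = {"G": "b", "L": "c", "A": "d"}
second_step_vessel = {"G": "e", "L": "f", "A": "g"}


def route(unit: int):
    """Vessels a unit has to visit; the right-hand residue is coupled first."""
    sequence = assigned[unit]
    return first_step_vessel[sequence[-1]], second_step_vessel[sequence[0]]


for unit in sorted(assigned):
    print(unit, assigned[unit], *route(unit))
```

Units 1, 2 and 3, for example, are routed to vessel b and then to e, f and g, in agreement with the description above.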
Two essentially different approaches have been developed to solve the problem of the
identification of the units. In one of the approaches physical labels are attached to the units. In the
other approach no labels are used. The units are encoded by the position they occupy in space.
[Figure 4.17, box a. Dipeptides assigned to the units 1 to 9:
GG-1, LG-2, AG-3, GL-4, LL-5, AL-6, GA-7, LA-8, AA-9]
4.5.1. Encoding by attached labels. The radiofrequency and optical encoding methods
The first approach for the modified split-mix synthesis using macroscopic solid support
units was developed independently in two laboratories.32,33 In the suggested methods the solid
support units are permeable capsules containing resin and the labels are radiofrequency tags that
are also enclosed into the capsules. The method was commercialized by IRORI34.
A capsule enclosing the resin and the radiofrequency (Rf) tag is shown in Figure
4.18. The capsules are made of polypropylene and are named MicroKans at IRORI. Their length and
diameter are 18 mm and 7 mm, respectively. They can enclose 25-30 mg of resin, making it possible to
produce around 25 µmol of compound in them. The Rf tag is a small microelectronic device in a glass
cover; its length is 13 mm and its diameter is 3 mm. The tags have a permanent 40-bit code etched
into their memory and can receive and emit radiofrequency signals. When placed in a
radiofrequency field they re-emit their code.35 The MicroKans are available in 5 different sizes.
Their volume varies from 250 µL to 660 µL.
Other kinds of support units have also been introduced. One type of them is the MicroTube.
Figure 4.19 shows a MicroTube, which is a plastic tube also containing an Rf tag. The
surface of the plastic tube is covered with a radiolytically grafted and functionalized polystyrene
layer. The length of a MicroTube is 15 mm and the diameter is 6 mm. The capacity is about 30
µmol.
A third kind of support unit carries an optical coding system: the "Laser Optical Synthesis
Chips". The supports are 1x1 cm polystyrene grafted square plates. The medium carrying the
code is a 3x3 mm ceramic plate in the center of the synthesis support (Figure 4.20.). The code is
etched into the ceramic support by a CO2 laser in the form of a two dimensional bar code that can
be read by a special scanner.36
Cleavage stations are also available at IRORI together with other items that are needed in the
synthesis including, for example, a device that makes it possible to easily fill the MicroKans with
exact quantities of dry resin.
The key operation in the synthesis is sorting. Since every unit has to be scanned and
delivered separately, the manual sorting process is relatively slow. Only several hundred or a
maximum of 1000 compounds are usually prepared using this method. It definitely does not make it
possible to prepare thousands of compounds in a single run. The automatic sorting machine
developed at IRORI solves this problem.
The principle of the automatic sorting is outlined in Figure 4.22. After each chemical step
the capsules are transferred from the reaction vessels into a larger vessel and thoroughly washed.
The pooled and washed units are then further transferred into the vibratory bowl (D) of the sorter.
Vibration of the bowl then forces the units into a tube (E). At the solenoid gate (B) the antenna
(C) reads the code and the computer (A) determines the destination of the unit. The destination is
one of the containers (F) that represents a reaction vessel, into which the capsule needs to be
delivered for the next synthetic step. The delivery is executed by the X-Y movement of the
delivery mechanism. After sorting, of course, the capsules collected in each vessel are reacted
with a different monomer.
The automatic sorter can accommodate up to 10,000 units that can be sorted into a
maximum of 48 containers. The sorting speed is 1000 units per hour.
The radiofrequency tagging and the visual coding of the support units are used in manual
sorting systems developed by other companies, too. The Australian company, Mimotopes, offers
two kinds of solid support units shown in Figure 4.23: SynPhase Crowns (a) and SynPhase
Lanterns (b). Their surfaces are grafted and functionalized. Both kinds of units can be coded by
attaching Rf tags to them (c and d). One end of the Rf tag fits into the holes in the crowns and
lanterns and so it can be firmly attached to them. Scanning of the units is done as described for the
IRORI method. A color tagging system has also been developed at the company that uses 8
different colors in the form of colored rings. The color system can be applied to both crowns and
lanterns. Stems are firmly attached to the crowns and lanterns by inserting them into their
holes. These stems hold the code forming rings (Figure 4.23. e and f).
Figure 4.23. Radiofrequency and color coding of crowns and lanterns:
a: SynPhase Crown, b: SynPhase Lantern, c: SynPhase Crown with Rf tag, d: SynPhase Lantern
with Rf tag, e: color coded SynPhase Crown, f: color coded SynPhase Lantern
The position of a ring on the stem encodes the reaction step and its color encodes the
building block. A list needs to be prepared in advance in which building blocks are assigned to
the positions and colors of the rings. The codes are read visually and the units are distributed
manually among the reaction vessels of the next reaction step according to the data of the list. The
8 colors in 4 reaction steps allow encoding 8^4 = 4096 units.
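The color code can be pictured as a lookup table. In the sketch below the palette, the building block names and the function name are all hypothetical; only the scheme (ring position = reaction step, ring color = building block) is taken from the text.

```python
from itertools import product

# Hypothetical palette of the 8 ring colors
COLORS = ["red", "orange", "yellow", "green", "blue", "violet", "black", "white"]
STEPS = 4

# Hypothetical list prepared in advance: (reaction step, ring color) -> building block
building_block = {(step, color): f"BB{step}-{i + 1}"
                  for step in range(1, STEPS + 1)
                  for i, color in enumerate(COLORS)}


def decode(rings):
    """rings: ring colors read along the stem, one for each reaction step."""
    return [building_block[(step, color)] for step, color in enumerate(rings, start=1)]


all_codes = list(product(COLORS, repeat=STEPS))
print(len(all_codes))                             # 4096
print(decode(("red", "blue", "blue", "white")))   # ['BB1-1', 'BB2-5', 'BB3-5', 'BB4-8']
```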
The Encore technique is demonstrated in Figure 4.24. The 960 lanterns are evenly
distributed among 10 reaction vessels (a). The content of each reaction vessel is reacted with
one of the 10 building blocks of the first reaction step. At the end of this step the content of all
the 96 lanterns placed in one reaction vessel is the same.
Before coupling with the second building block the lanterns are strung on stainless steel
or polyethylene stringing tools (c). Each string contains 10 lanterns and a labeling color ring; the
positions of the lanterns are counted from the ring. Each lantern of a string comes from a
different reaction vessel (b). This way 96 identical strings are formed, and within a string the
content of each lantern is different. The color rings that label the strings come in 8 different
colors.
The 96 strings are distributed into 8 groups of 12 strings labeled with the same color. The
figure shows two examples (b and d). The 12 strings having the same color are transferred into
the same reaction vessel in order to react in step 2 with the same building block. As a result, each
of the 8 reaction vessels contains 12 strings (e). After the second synthetic step the strings are
rearranged into 12 numbered reaction vessels (f). One string from each of the 8 reaction vessels is
transferred into one numbered vessel. This way each of the 8 strings of one reaction vessel carries
a different color. After the third synthetic step the lanterns are transferred into ten 96-well plates
for cleaving the formed compounds from the support. One lantern goes into each well of the
plate. The content of each well is identified from three recorded data: the position of the lantern on
the string, the color of the string and the number of the third step reaction vessel. The position on the
string, the color of the string and the number of the reaction vessel identify the first, second and
third building block, respectively.
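The bookkeeping of the Encore technique can be checked with a short enumeration. This is a sketch only: building blocks, colors and vessels are represented by plain index numbers.

```python
from itertools import product

N1, N2, N3 = 10, 8, 12  # building blocks used in steps 1, 2 and 3

lanterns = []
# Step 1: a string takes one lantern from each of the 10 vessels, so the
#         position on the string records the first building block.
# Step 2: strings of the same color react in the same vessel, so the color
#         records the second building block.
# Step 3: each numbered vessel receives one string of every color, so the
#         vessel number records the third building block.
for color, vessel3 in product(range(N2), range(N3)):  # 96 strings
    for position in range(N1):                        # 10 lanterns on each string
        lanterns.append((position, color, vessel3))

strings = {(color, vessel3) for _, color, vessel3 in lanterns}
print(len(strings))                       # 96
print(len(lanterns), len(set(lanterns)))  # 960 960
```

Every lantern carries a different combination of the three recorded data, so the 960 lanterns correspond to 10×8×12 = 960 different compounds.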
The Encore technology has been commercialized and a number of tools were developed
and made available for simplifying the operations.
The inventors of the method considered different shapes for the units, different patterns
for their redistribution and different sorting devices.40 In the following pages the use of two kinds
of units and two kinds of manual sorting devices are described applying a fast redistribution
pattern named Semi-Parallel Sorting.
The support units are Mimotopes SynPhase Crowns and SynPhase Lanterns, shown
in Figure 4.25 (see also Figure 4.23. a and b). The crowns are used attached to stems (a and c),
which make stringing possible. The stems are available at Mimotopes in different colors. Lanterns
can also be used attached to stems (d), but they can also be used by themselves (g) since they have
a hole in their center that makes stringing possible. The commercial stems are modified.
They have a drilled hole to allow the string to be passed through, and they are carved to keep the holes
parallel and facilitate threading while they are in the sorting device (b). The modified stems can
be used repeatedly. An empty stem (e) is used to label the head of the strings and a half stem (f)
to label the tail of the strings.
Figure 4.26 shows strung crowns and lanterns with full stem labels at the heads and
half stems at the tails. The two ends of the string must be distinguishable. Labeling at least one
end of the strings makes it possible to define the position of the crowns unequivocally. The position of
the units on the string is counted from head to tail.
The number of strings that are formed in a synthetic step depends on the number of
building blocks used in that step since every building block needs a different string. Each string is
placed into a different reaction vessel for coupling. The strings themselves must be numbered or
otherwise labeled. The simplest way to label the strings is to make visible scratches on the stem
marking the head (Figure 4.25./e). Using colored stems is also a possibility.
The string itself must be resistant to solvents and other reaction conditions occurring in
the synthesis. In preparation of peptide libraries a polyethylene fishing line proved applicable.
The use of the String Synthesis is demonstrated with the preparation of a tripeptide library on
SynPhase Crowns. Five amino acids are used as building blocks in each coupling step.
Consequently 5 strings need to be formed in each step and the number of the expected tripeptides
is 5×5×5 = 125. The number of crowns on one string is 25. After threading the crowns, each string
is placed into a reaction vessel carrying the same number as the string (Figure 4.27). In each
reaction vessel the coupling is done with a different amino acid.
Redistributions can be carried out using very simple devices that can be easily made by a
machine shop. Two different devices are constructed for sorting: one for crowns and another one
for lanterns. Both devices operate on the same principle and both contain two identical pieces: a
source tray and a destination tray (Figure 4.28.). Both pieces of the crown sorter are metal plates
with several numbered parallel slots and bent at the two edges. The pieces of the lantern sorter
are polymer plates with numbered grooves. Before sorting, the crowns hang in the slots and the
lanterns stay in the grooves of the source tray as shown in Figure 4.29.
Figure 4.29. Crowns and lanterns in the sorting device (panels: Crowns, Lanterns)
In the sorting process the crowns or lanterns are pushed into the slots or grooves,
respectively, of the destination tray. It is important to note that in this operation the units preserve
their positions relative to each other when they are redistributed in groups. Figure 4.30 shows the
top view of the crown sorter after the delivery of a group of 5 crowns from slot 5 of the
source tray into slot 1 of the destination tray.
[Figure 4.30: top view of the crown sorter. Source tray and destination tray, each with slots
numbered 1 to 5 and with the Head and Tail ends marked.]
The slots and grooves of the source and destination trays of the sorter are numbered. It is
important to place each source string into the slot of the source tray carrying the same number.
The heads and tails of the source strings must be positioned in the slots of the source tray as
indicated in the figure, otherwise the software (see later) cannot be used. After sorting, the units
are restrung.
The destination strings must be numbered according to the numbers of the destination
slots or grooves, and their heads and tails must correspond to the heads and tails of the
destination slots or grooves.
Figure 4.31. Crowns in the slot of the source plate. a: after loading, b: after removing the
string
The units are loaded into the sorter in stringed form. Figure 4.31. shows the crowns
loaded into the slots while still attached to the string (a). When the units are in their place in the
source plate the string is cut and removed (b).
The units are sorted in a string free form. After sorting they are found in the destination
plate. Before the delivery into the reaction vessels they are restrung (Figure 4.32.).
Figure 4.32. Stringing the crowns in the slot of the destination tray
Semi-Parallel Sorting (SPS). In String Synthesis one solid support unit is assigned to
each product. Except in the last elongation step, the products form in groups (product groups) that
occupy a defined region on the string. The number of units in each product group is the same.
The units of each group need to be evenly distributed among the strings of the next step. Except in
the last distribution step, the units are also transferred in groups (delivery groups). The
number of units in a delivery group is obtained by dividing the number of units in a product group
by the number of destination strings. This calculation is done by computer.
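The group sizes described above can be sketched in a few lines of Python. This is only an illustration for a full combinatorial library, not the author's Visual Basic program; the function name delivery_plan and the dictionary keys are hypothetical.

```python
from math import prod


def delivery_plan(blocks_per_step):
    """Return the total number of support units and the group sizes for each sorting.

    blocks_per_step: number of building blocks used in each coupling step
    of a full library (one support unit per product is assumed).
    """
    total_units = prod(blocks_per_step)
    plan = []
    formed = 1  # number of different products formed so far
    for sorting, n_dest in enumerate(blocks_per_step[1:], start=1):
        formed *= blocks_per_step[sorting - 1]
        product_group = total_units // formed       # units carrying the same product
        delivery_group = product_group // n_dest    # units moved together
        plan.append({
            "sorting": sorting,
            "product_group": product_group,
            "delivery_group": delivery_group,
        })
    return total_units, plan


# Tripeptide example: 5 building blocks in each of the 3 coupling steps.
total, plan = delivery_plan([5, 5, 5])
for row in plan:
    print(row)
```

For three steps with five building blocks each, the sketch gives 125 units, delivery groups of 5 units in the first sorting and single units in the second one.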
Figure 4.33. Semi-Parallel Sorting. Sorting the units of 3 source strings into 3 destination
ones in 5 relative positions of the source and destination trays.
The simplest way of distribution would be to first transfer all the units of one source
string to the destination strings and then continue with the next source string. Semi-Parallel
Sorting, outlined in Figure 4.33., is faster, however.
In positions 1, 2, 3, 4 and 5 of the figure, one, two, three, two and one slots of the source
and destination trays are in alignment (indicated by enhanced lines). From each aligned slot of
the source tray one delivery group of units is transferred into the corresponding slot of the
destination tray. The deliveries in these positions are repeated until all units are transferred.
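Which slots face each other at a given relative tray position can be sketched as follows. The numbering rule is an assumption (the first position aligns the last source slot with destination slot 1), chosen because it agrees with the slot 5 to slot 1 delivery mentioned for the crown sorter; the function names are hypothetical.

```python
def aligned_slots(n_source, n_dest, position):
    """Return the (source_slot, destination_slot) pairs aligned at one tray position.

    Slots are numbered from 1; positions run from 1 to n_source + n_dest - 1.
    Assumption: position 1 aligns the last source slot with destination slot 1.
    """
    pairs = []
    for s in range(1, n_source + 1):
        d = s - n_source + position
        if 1 <= d <= n_dest:
            pairs.append((s, d))
    return pairs


def aligned_counts(n_source, n_dest):
    """Number of aligned slot pairs at every relative position of the trays."""
    return [len(aligned_slots(n_source, n_dest, p))
            for p in range(1, n_source + n_dest)]


print(aligned_counts(3, 3))   # three source and three destination strings
print(aligned_slots(5, 5, 1))
```

With three source and three destination slots the counts are one, two, three, two and one, as stated for the five positions of the figure.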
Figure 4.34. Datasheet of the Excel Book where the starting data are entered.
The software. The software is written in Visual Basic and the data appear in Microsoft
Excel sheets. It can handle up to 1000 crowns, up to 20 reagents (building blocks) and up to 9
reaction steps. Figure 4.34. shows the datasheet of the Excel book where the starting data are
entered. Among the starting data are the number and symbols of the building blocks (monomers)
used in the coupling steps. The symbols are single letter abbreviations. In the case of peptide
synthesis the symbols correspond to the respective amino acids.
In the Excel sheets the data entry areas are yellow. Several data are calculated instantly
and appear in the blue regions of the screen.
[Flow diagram: First coupling, First sorting, Second coupling, Second sorting, Third coupling
(Str 1 to Str 5 in each coupling); Products on String 1 to String 5.]
Among the instantly calculated data are the total number of crowns (or lanterns) needed in
the synthesis and the number of coupling steps (column B), the number of source and destination
slots (or grooves) used in the first and subsequent sorting steps (D and E) and the number of
crowns occupying these slots (F and G). The number of crowns in product groups, that is the
number of units that contain the same product, appears in column H. The number of crowns in a
delivery group, which have to be moved in every sorting cycle from a source to a destination slot,
can be seen in column I.
The program can be started by pressing Ctrl and S together. The results of the calculations
appear (depending on the number of redistributions) in sheets Sort #1 through Sort #9. The
sheets show a block containing the products present in the crowns of the source slots and, below
it, a second block showing the content of the crowns sorted into the destination slots.
Positions of the crowns are counted downward from the top. The number of sheets
showing the results of couplings and sortings is equal to the number of sortings plus one. The last
sheet contains the predicted product distribution on the final strings.
The software is free and can be downloaded via the Internet from the following Web site:
http://szerves.chem.elte.hu/furka
by clicking on the title Excel Book appearing on the lower part of the main page. The software
can be used only by those who have Excel installed on their computer.
Experimental example. Synthesis of a library of 125 tripeptides. The synthesis was carried
out using 125 Mimotopes SynPhase Crowns (capacity 5.3 µmol each) derivatized with Fmoc-Rink
amide linker. The procedure was started with the formation of 5 strings by threading 25
crown units each on Berkley Fire Line fishing line. Five Fmoc protected amino acids were used in
each coupling position. The flow diagram is shown in Figure 4.35. The symbols of the amino
acids used in the couplings are those found in the Datasheet shown in Figure 4.34.; they
are also indicated in Figure 4.35. below the reaction vessels.
Coupling. Couplings were carried out with the strings placed in 100 ml flasks. The protecting
groups were removed by adding 10 ml of 1:1 v/v DMF-piperidine and mixing on an orbital mixer
for 30 minutes. After cleavage of the protecting groups the solutions were decanted and the
strings were washed with 3x15 ml DMF, 15 ml DCM, 15 ml DMF, 15 ml DCM and 2x15 ml
DMF. The deprotection operation was repeated once more, and the strings were finally washed
with 2x15 ml DCM. The strings were dried, then 10 mmol Fmoc amino acid, 10 mmol HOBt and
15 mmol DIC were added in 10 ml NMP solution and mixed on an orbital mixer for 2 hours. The
solution was then decanted and the strings were washed with 3x40 ml DMF, 40 ml DCM and
2x40 ml DMF. The above coupling operation was repeated once more. The strings were finally
washed with 2x40 ml DCM. The crowns, still on the strings, were dried in an oven and then
submitted to sorting.
Sorting. After entering the starting data into the Datasheet (Figure 4.34), it can be read
from the last column that in the first sorting the crowns need to be moved from each slot in
groups of 5. In the second sorting the crowns are moved one by one. The first sorting is
shown in Figure 4.36.
The redistribution of the 125 crowns was completed in nine stages, each representing a
different relative position of the source and destination trays. In stages 1, 2, 3, 4, 5, 6, 7, 8
and 9 the number of transferred groups (of 5 crowns) was 1, 2, 3, 4, 5, 4, 3, 2 and 1,
respectively.
[Figure 4.36: the first sorting. Nine stages, each showing the source tray and the destination
tray with slots numbered 1 to 5.]
After coupling 2, String 1 contained five dipeptides in groups of five crowns. The rest of
the strings (not shown) differed from the first one since different amino acids were coupled to
them. After the second sorting, as the table shows, all products in the crowns of String 1 were
different. Again, the product distribution in the rest of the strings (not shown) was exactly the
same. It is typical in String Synthesis that after redistributions the product distribution on all
strings is the same.
It is also typical that after couplings the strings are different. After the third, that is the
last, coupling not only do the strings differ from each other but the contents of the crowns within
the strings are also different. The positions of the formed tripeptides on the five strings after the
third coupling are shown in Table 4.19.
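The alternation of couplings and sortings can be imitated with a short Python sketch. It is not the author's program: the function names are hypothetical, the order in which a destination string receives the delivery groups (source string 1 first) is an assumption chosen to match the product positions listed for the strings, and the building block coupled to String 1 in the third step (E) is also assumed, since only the products of Strings 2 to 5 can be read from the listing.

```python
def couple(string, block):
    """Couple one building block to every unit; the newest residue is written first."""
    return [block + unit for unit in string]


def sort_strings(sources, n_dest):
    """Split every product group of every source string evenly among the destinations."""
    size = len(sources[0])
    group = 1  # length of the run of identical units at the head = product group size
    while group < size and sources[0][group] == sources[0][0]:
        group += 1
    if group % n_dest:
        raise ValueError("product group cannot be split into equal delivery groups")
    delivery = group // n_dest
    dests = [[] for _ in range(n_dest)]
    for start in range(0, size, group):      # one sorting cycle per product group
        for d in range(n_dest):
            for src in sources:              # assumed order: source string 1 first
                lo = start + d * delivery
                dests[d].extend(src[lo:lo + delivery])
    return dests


def string_synthesis(steps, units_per_string):
    """steps: one string of single letter building block symbols per coupling step."""
    strings = [couple([""] * units_per_string, b) for b in steps[0]]
    for blocks in steps[1:]:
        strings = sort_strings(strings, len(blocks))
        strings = [couple(s, b) for s, b in zip(strings, blocks)]
    return strings


# Third-step symbol for String 1 ("E") is an assumption.
final = string_synthesis(["IFLVG", "EFWYS", "EFWYS"], 25)
print(final[1][:6])   # head of String 2
```

Under these assumptions the head of String 2 reads FEI, FFI, FWI, FYI, FSI, FEF, and all 125 crowns carry different tripeptides.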
Since the redistribution process is directed by computer, String Synthesis is suitable
for automation. Although no automatic machine has yet been constructed, an automatic sorter
designed according to Figure 4.38. would be capable of very rapidly sorting tens of thousands of
support units placed in vertical source tubes (a) arranged in a circle.42 In the sorting process
the units would be dropped through computer controlled electronic gates into the destination
tubes (b), which are rotated stepwise. This arrangement of the tubes would make it possible to
transfer the units simultaneously from all tubes (parallel sorting).
Content of the crowns of String 1 after coupling 1 (Cpl. 1), sorting 1 (Sort 1), coupling 2
(Cpl. 2) and sorting 2 (Sort 2)
Position  Cpl. 1  Sort 1  Cpl. 2  Sort 2
1         I       I       EI      EI
2         I       I       EI      FI
3         I       I       EI      WI
4         I       I       EI      YI
5         I       I       EI      SI
6         I       F       EF      EF
7         I       F       EF      FF
8         I       F       EF      WF
9         I       F       EF      YF
10        I       F       EF      SF
11        I       L       EL      EL
12        I       L       EL      FL
13        I       L       EL      WL
14        I       L       EL      YL
15        I       L       EL      SL
16        I       V       EV      EV
17        I       V       EV      FV
18        I       V       EV      WV
19        I       V       EV      YV
20        I       V       EV      SV
21        I       G       EG      EG
22        I       G       EG      FG
23        I       G       EG      WG
24        I       G       EG      YG
25        I       G       EG      SG
Products on Strings 2 to 5 after the third coupling
Position  Str. 2  Str. 3  Str. 4  Str. 5
1         FEI     WEI     YEI     SEI
2         FFI     WFI     YFI     SFI
3         FWI     WWI     YWI     SWI
4         FYI     WYI     YYI     SYI
5         FSI     WSI     YSI     SSI
6         FEF     WEF     YEF     SEF
7         FFF     WFF     YFF     SFF
8         FWF     WWF     YWF     SWF
9         FYF     WYF     YYF     SYF
10        FSF     WSF     YSF     SSF
11        FEL     WEL     YEL     SEL
12        FFL     WFL     YFL     SFL
13        FWL     WWL     YWL     SWL
14        FYL     WYL     YYL     SYL
15        FSL     WSL     YSL     SSL
16        FEV     WEV     YEV     SEV
17        FFV     WFV     YFV     SFV
18        FWV     WWV     YWV     SWV
19        FYV     WYV     YYV     SYV
20        FSV     WSV     YSV     SSV
21        FEG     WEG     YEG     SEG
22        FFG     WFG     YFG     SFG
23        FWG     WWG     YWG     SWG
24        FYG     WYG     YYG     SYG
25        FSG     WSG     YSG     SSG
followed by rearranging the components of the input library according to their order in the virtual
library then distributed into the starting source strings. Table 4.20 shows a part of an input
library, the generated virtual library and the rearranged input library.
Table 4.20. Sequences in the input, virtual and the rearranged input libraries
      Cherry P.   Virtual    Cherry P.
      Input       Library    Rearranged
1     CITW        CITW       CITW
2     CITA        CITA       CITA
3     CITF        CITF       CITF
4     DGPV        CITV       CITV
5     DGPW        CITG       CIPW
6     DGPA        CIPW       CIPF
7     DGPG        CIPA       CIPV
8     DGPF        CIPF       CIPG
9     DGRV        CIPV       CIRW
10    DGRW        CIPG       CIRF
Since the library to be synthesized is not a complete combinatorial library, the delivery of
the support units from the source strings into the destination ones cannot occur in equal groups.
The software generates tables that guide the redistribution operations in every phase of the
synthetic process. They also make it possible to check for potential errors of the operator.
The building blocks of the library are coded using both the lower case and the capital
letters of the English alphabet (altogether 52 symbols). The sequences of these letters encode
the compounds to be synthesized. The order of coupling positions, that is the order of the
characters in the sequences, goes from left to right. These are practically inverted peptide
sequences.
When entering the input sequences into the computer there are no restrictions concerning
the order of library members, but no gaps are allowed in column A. Column A can accept a
maximum of 15,000 sequences. The order of characters in the input sequences can be reversed (if,
for example, peptide sequences are used) by pressing Ctrl + i (Figure 4.39.). The maximum
number of characters in the sequences, that is the maximum number of building blocks per
compound, is 10.
Execution of the sorting program can be started by pressing Ctrl + e (Figure 4.39). The
execution time depends on the size of the library and on the speed of the computer. In the
execution process the sequences are assigned to support units, then the units are grouped into
strings. The tables guiding the redistributions are then calculated and displayed. Execution of the
program stops at Sheet 13, showing the position of the products on the final strings. Figure 4.40.
shows a part of Sheet 13.
Rearranging the order of the components of the input library. The order of the
components of the input library in column A is usually arbitrary. For this reason they cannot be
directly arranged into strings that are submitted to coupling with the same building blocks.
[Figure 4.39: part of the INPUT SHEET. Column A, rows 1 to 15: CITW, CITA, CITF, CITV, CIPW,
CIPF, CIPV, CIPG, CIRW, CIRF, CIRV, CISW, CISA, CISV, CISG. Other labels on the sheet: CP1,
CP2, CP3, CP4 and Number of compounds.]
The strings that need to be formed usually do not even contain the same number of units.
As a consequence, the order of the components of the input library has to be rearranged into a
form that allows regular redistributions.
[Figure 4.40: part of Sheet 13, the first 10 products on each of the final strings]
Str. 1  Str. 2  Str. 3  Str. 4  Str. 5
AHSW    AHSA    AHSF    AHSV    AHSG
AHRW    AHRA    AHRF    AHRV    AHRG
AHPW    AHTA    AHPF    AHPV    AGSG
AHTW    AGSA    AHTF    AHTV    AGRG
AGSW    AGRA    AGSF    AGSV    AGPG
AGRW    AGPA    AGRF    AGRV    AISG
AGPW    AGTA    AGPF    AGPV    AIRG
AGTW    AISA    AGTF    AGTV    AIPG
AISW    AIRA    AISF    AISV    FHRG
AIRW    AITA    AIRF    AIRV    FGPG
A partial tetrapeptide library is used to illustrate the operations executed by the program.
By analyzing the input library the program first determines the crucial starting data and displays
Based on the above data, a full (virtual) peptide library is generated in which all
components of the input library are present. The number of components in the virtual library is
limited to 30,000. The sequences of the input library are then arranged into the order in which
they appear in the virtual library. Sheet 13 shows the sequences of the virtual library and those
of the input cherry picked and the rearranged cherry picked libraries. The first 10 sequences of
these libraries are reproduced in Table 4.20. The components of the original cherry picked library
in column A of the Input Sheet are replaced by the rearranged library. All further manipulations
are based on this rearranged library: first the sequences are assigned to support units and then
the units are distributed into the starting strings. The occupancy of the starting strings appears
in Sheet 3. The same sheet, in its lower part, contains the guiding tables for the first
redistribution. In the experimental realization of the synthesis the starting strings need to be
formed manually by placing the indicated number of support units on the strings and then
submitting them to the first coupling step. The symbols of the amino acids that need to be coupled
to the strings appear in red. Those of the other amino acids in the sequences are black. Each set
of destination strings formed in a redistribution step occupies one of the Sheets 4 to 12. The
products appear in Sheet 13.
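The generation of the virtual library and the rearrangement of the input library can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and the ordering of the building blocks at each position by their first appearance in the input is an assumption that happens to reproduce the first ten virtual sequences of Table 4.20.

```python
from itertools import product


def rearrange(input_library):
    """Return the full virtual library and the input library in virtual order."""
    length = len(input_library[0])
    # Building blocks found at each coupling position, in order of first appearance
    # (an assumption; the program's real ordering rule is not described).
    blocks = [list(dict.fromkeys(seq[i] for seq in input_library))
              for i in range(length)]
    wanted = set(input_library)
    # The last position varies fastest in the generated virtual library.
    virtual = ["".join(p) for p in product(*blocks)]
    rearranged = [seq for seq in virtual if seq in wanted]
    return virtual, rearranged


# First ten input sequences of Table 4.20, used here as a small stand-alone example.
cherry_picked = ["CITW", "CITA", "CITF", "DGPV", "DGPW",
                 "DGPA", "DGPG", "DGPF", "DGRV", "DGRW"]
virtual, rearranged = rearrange(cherry_picked)
print(len(virtual), virtual[:5])
print(rearranged)
```

Because only ten input sequences are used here, the rearranged list differs from the third column of Table 4.20, which was derived from the complete input library.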
Third sorting
Number of units on the strings: 50 (Str 1), 45 (Str 2), 62 (Str 3), 49 (Str 4)
Str 1   Str 2   Str 3   Str 4   Position
CITV    CIPG    CIRV    CISG    1
CITF    CIPV    CIRF    CISV    2
CITA    CIPF    CIRW    CISA    3
CITW    CIPW    CGRG    CISW    4
CGTV    CGPG    CGRV    CGSG    5
CGTF    CGPV    CGRF    CGSV    6
CGTA    CGPF    CGRA    CGSF    7
CGTW    CGPW    CGRW    CGSA    8
CHTV    CHPV    CHRG    CGSW    9
CHTF    CHPF    CHRV    CHSG    10
Figure 4.41. Sequences on the strings undergoing the third coupling step
As an example, Figure 4.41. shows a part of Sheet 5. This sheet demonstrates the 4 strings
that undergo the third coupling. The figure includes the first 10 tetrapeptide sequences from each
string. The codes of the amino acids that need to be coupled to the respective strings appear at
the top and the numbers of units in each string are found below them. The numbers in the last
column show the position of the units on the strings. The sequences of the strings are printed in
three different colors: the codes of amino acids already coupled to the units in the previous
coupling steps are blue, the amino acids of the actual coupling step appear in red, and the codes
of the amino acids that need to be coupled to the units in the forthcoming coupling steps are
black.
The products of the synthesis and their positions on the strings appear in Sheet 13 (Figure
4.40). The sequences of the products are also shown in reversed form in the lower part of the
sheet (not shown in the table). This makes it possible to read the order of the building blocks as
peptide sequences, too.
Guiding tables for redistribution experiments. As already mentioned, in the case of
cherry picked libraries some components that are present in a full (or virtual) library are missing.
This has two consequences:
(i) The numbers of support units belonging to the strings usually differ. This is
clearly seen, for example, in Figure 4.41 (number of units on the strings).
(ii) The number of units within the groups of identical products may also differ.
[Figure 4.42: part of the guiding table of Sheet 4. A slash marks a source trough from which
nothing is delivered at the given stop.]
Data for guiding   Cycle/stop   Data for checking potential errors in sorting
redistribution     position     Units in source     Units in destination
(source troughs)                troughs             troughs
1    2    3                     1    2    3         1    2    3    4
4    /    /        5/6          0    0    0         50   45   62   49
3    5    /        5/5          4    0    0         50   45   62   45
4    5    5        5/4          7    5    0         50   45   59   40
4    4    4        5/3          11   10   5         50   41   54   35
/    4    3        5/2          15   14   9         46   37   50   35
/    /    4        5/1          15   18   12        42   34   50   35
2    /    /        4/6          15   18   16        38   34   50   35
5    5    /        4/5          17   18   16        38   34   50   33
2    5    3        4/4          22   23   16        38   34   45   28
3    5    5        4/3          24   28   19        38   32   40   25
/    4    3        4/2          27   33   24        35   27   35   25
/    /    4        4/1          27   37   27        31   24   35   25
                   3/6          27   37   31        27   24   35   25
                   3/5          27   37   31        27   24   35   25
                   3/4          30   37   31        27   24   32   25
                   3/3          30   39   31        27   24   30   25
                   3/2          32   43   35        25   20   26   25
                   3/1          32   47   38        21   17   26   25
For these reasons the transfers in redistributions cannot be realized in equal groups. Even
empty groups may occur. As a consequence, in order to be able to execute the redistributions the
number of units of every delivered group has to be calculated and displayed by the computer. The
experiment guiding tables are presented in the lower parts of Sheets 3 to 12. A part of the table
found in Sheet 4 is shown in Figure 4.42.
The table guides the redistribution after the second coupling. This is the second sorting
process, in which the units of three source strings are redistributed into four destination strings.
The guiding data are found in the columns below the title: Data for guiding redistribution. This
sorting is realized in 5 cycles. The figure shows only those guiding numbers that belong to cycles
4 and 5. In each cycle the deliveries occur at 6 different relative positions (stops) of the two trays
of the manual sorter. The relative tray positions of the six stops of a cycle are shown in
Figure 4.43. Taking cycle 5 as an example, the cycle/stop positions change from 5/1 to 5/6. The
same relative tray positions are repeated in all cycles. The support units (crowns or lanterns) are
delivered from the slots/troughs of the upper source tray into those of the lower destination tray.
The slots/troughs of the trays appear as vertical lines. The enhanced lines represent slots/troughs
from which and to which the units are delivered in a particular stop position.
Figure 4.43. The 6 relative tray positions in the 6 stop positions of cycle 5 (cycle/stop
positions 5/1 to 5/6)
Figure 4.42. shows one column for each of the three source strings from which the units
need to be transferred. The cycle/stop numbers are found in the fourth column. The start position
(stop position 1) is at the bottom in all cycles. The numbers of support units that have to be
transferred at a stop position from the source slots/troughs into the destination ones are found in
the columns of the strings, in the same row as the cycle/stop numbers. For example,
in stop position 2 of cycle 5 (5/2) 4 and 3 units are transferred from slots/troughs 2 and 3,
respectively. These transfers are made from slots/troughs 2 and 3 into the destination
slots/troughs 1 and 2, respectively (see Figure 4.43).
The program also makes it possible to check the accuracy of the redistribution in every
phase of execution and to discover a potential error made by the operator. This is done by
displaying the number of units that have to remain in the source slots and appear in the
destination slots after successful transfers. These numbers are found in Figure 4.42. below the
title: Data for checking potential errors in sorting. They are also found in the same row
as the cycle/stop positions. After the mentioned transfers in stop position 5/2, for
example, 14 and 9 units remain in the source slots/troughs 2 and 3, respectively, and 46 and 37
units appear in the destination slots/troughs 1 and 2, respectively.
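How the checking numbers follow from the guiding numbers can be illustrated with a small Python sketch. The function name and the alignment rule (at stop k, source slot s faces destination slot s - n_source + k) are assumptions consistent with the 5/2 example; the starting counts are illustrative values chosen so that the result agrees with the numbers quoted above.

```python
def apply_stop(source, dest, transfers, stop):
    """Apply the deliveries of one stop and return the expected unit counts.

    source, dest: current unit counts per slot/trough (index 0 is slot 1).
    transfers: {source_slot: number_of_units} for the slots aligned at this stop.
    stop: stop position within the cycle, counted from 1.
    """
    source, dest = list(source), list(dest)
    n_source = len(source)
    for s, units in transfers.items():
        d = s - n_source + stop  # destination slot facing source slot s (assumed rule)
        if not 1 <= d <= len(dest):
            raise ValueError(f"source slot {s} is not aligned at stop {stop}")
        if units > source[s - 1]:
            raise ValueError(f"source slot {s} holds fewer than {units} units")
        source[s - 1] -= units
        dest[d - 1] += units
    return source, dest


# Stop 5/2: 4 units leave source slot 2 and 3 units leave source slot 3.
# The counts before the stop are illustrative assumptions.
remaining, received = apply_stop([15, 18, 12], [42, 34, 50, 35], {2: 4, 3: 3}, stop=2)
print(remaining, received)
```

An operator's count that differs from the values returned for a stop would indicate a sorting error at that stop.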
The software developed to guide sorting in the synthesis of cherry picked libraries also
provides a possibility to automate the redistribution process.
4.6. Examples
4.6.1. Split-Mix Synthesis of an encoded benzimidazole library.44
The library synthesized at Affymax (an American company) had three diversity positions
using 36 building blocks in each position. The structure of the components can be described by
the following general formula.
[General formula of the library components: benzimidazole derivative carrying the
substituents R1, R2 and R3]
Since 36 building blocks were used in three positions the number of components was
36x36x36=46,656. R1 and R2 were built into the structure by using amines as building blocks;
their structure is demonstrated in Figure 4.44.
Figure 4.44. Structure of amines used in the synthesis of the benzimidazole library. Total
number: 71
R3 was introduced by aldehyde building blocks. Their structures are represented in Figure
4.45.
[Figure 4.45: structures of the aldehyde building blocks]
The beads were encoded by a special binary type encoding developed at Affymax. The
encoding tags were secondary amines that were built in using the Alloc (allyloxycarbonyl)
protected monomers shown below. R and R' are alkyl chains of various lengths.
[Structure of the Alloc protected tag monomers carrying the alkyl groups R and R']
Encoding tags were used only at the first and second diversity positions. At the end of the
synthesis the samples were not mixed, so encoding at this stage was unnecessary.
When a peptide library is prepared, amino acids are used as building blocks. Their
reactivity is well known, as are the optimal reaction conditions. This is not the case when a
non-peptide library is synthesized. The reactivity of all building blocks has to be carefully
checked and the reaction conditions also need to be optimized. Some otherwise favored building
blocks have to be excluded because of their poor reactivity. It is not uncommon to spend much
more time on the pre-synthetic studies than on the synthesis itself. In the case of the
benzimidazole library synthesis the reaction conditions were also carefully optimized and the
selected building blocks showed good reactivity.
The synthesis was carried out using Tentagel HL NH2 resin as solid support. The first step
was to differentiate the amino groups of each bead: a part of them had to be made suitable for the
attachment of the coding tags and the remaining ones for the acceptance of the first building
block of the product. For this reason a part of the amino groups was protected by Fmoc groups and
the rest was blocked by Boc protecting groups (Figure 4.46).
[Figure 4.46: resin amino groups partly Fmoc protected and partly Boc protected]
This reaction was carried out with 72 g resin (27.4 mmol amine) in DCM in the presence of
DIEA. The reagent was a mixture of 22.7 g (104 mmol) Boc2O and 0.79 g (3.1 mmol) Fmoc-Cl.
The resin was finally treated with piperidine, which removed the Fmoc protecting groups and made
a part (about 1/9) of the amino groups available for attachment of the first coding tags.
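The quantities quoted above can be cross-checked with a few lines of arithmetic. The molar masses (218.25 g/mol for Boc2O and 258.70 g/mol for Fmoc-Cl) are standard values that are not given in the text, and the assumption that all of the Fmoc-Cl reacts with resin amino groups is only a simplification made for this sketch.

```python
BOC2O_G_PER_MOL = 218.25     # di-tert-butyl dicarbonate (value not from the text)
FMOC_CL_G_PER_MOL = 258.70   # 9-fluorenylmethyl chloroformate (value not from the text)


def mmol(grams, g_per_mol):
    """Convert a mass in grams to an amount in millimoles."""
    return 1000.0 * grams / g_per_mol


resin_amine_mmol = 27.4
boc2o_mmol = mmol(22.7, BOC2O_G_PER_MOL)
fmoc_cl_mmol = mmol(0.79, FMOC_CL_G_PER_MOL)

# Assumption: every Fmoc-Cl molecule ends up on a resin amino group.
fmoc_fraction = fmoc_cl_mmol / resin_amine_mmol
# Total acylating reagent relative to the amino groups of the resin.
reagent_excess = (boc2o_mmol + fmoc_cl_mmol) / resin_amine_mmol

print(round(boc2o_mmol, 1), round(fmoc_cl_mmol, 2))
print(round(fmoc_fraction, 3), round(reagent_excess, 2))
```

Under this assumption the Fmoc protected fraction comes out close to one ninth, in line with the value given in the text.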
Table 4.21. Encoding mixtures for the first and second diversity position
RV  Code1  Code2    RV  Code1  Code2    RV  Code1  Code2    RV  Code1  Code2
1   A      U        10  AE     UY       19  DE     XY       28  ACF    UWZ
2   B      V        11  AF     UZ       20  DF     XZ       29  ADE    UXY
3   C      W        12  BC     VW       21  EF     YZ       30  ADF    UXZ
4   D      X        13  BD     VX       22  ABC    UVW      31  AEF    UYZ
5   E      Y        14  BE     VY       23  ABD    UVX      32  BCD    VWX
6   F      Z        15  BF     VZ       24  ABE    UVY      33  BCE    VWY
7   AB     UV       16  CD     WX       25  ABF    UVZ      34  BCF    VWZ
8   AC     UW       17  CE     WY       26  ACD    UWX      35  BDE    VXY
9   AD     UX       18  CF     WZ       27  ACE    UWY      36  BDF    VXZ
Encoding. Six different tags were used to encode the 36 resin samples at coupling No. 1
(Code1), and another six at coupling No. 2 (Code2). Their stock solutions were labeled A, B,
C, D, E, F and U, V, W, X, Y, Z for Code1 and Code2, respectively.
Attachment of the Code1 tags for the R1s (Figure 4.47.). The resin was divided into 36
portions and placed into reaction vessels RV1 to RV36. Samples of the A, B, C, D, E, F solutions
were added to RV1 to RV6. To the rest of the reaction vessels (RV7 to RV36) mixtures were added
according to Table 4.21. The coupling reagents for the acylations were DIC and HOBt.
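The pattern of Table 4.21 (single tags first, then pairs, then triples, each in alphabetical order) can be reproduced with a short Python sketch. This only illustrates the binary type encoding idea; the function name is hypothetical and the enumeration order is inferred from the table.

```python
from itertools import combinations


def encoding_table(tags, n_vessels):
    """Assign a distinct tag mixture to each reaction vessel (RV1, RV2, ...)."""
    codes = []
    for size in range(1, len(tags) + 1):
        for combo in combinations(tags, size):   # alphabetical order within each size
            codes.append("".join(combo))
    if n_vessels > len(codes):
        raise ValueError("not enough tag combinations for the reaction vessels")
    return {rv: codes[rv - 1] for rv in range(1, n_vessels + 1)}


code1 = encoding_table("ABCDEF", 36)
code2 = encoding_table("UVWXYZ", 36)

# Decoding: the set of tags found on a bead identifies the reaction vessel.
decode1 = {frozenset(code): rv for rv, code in code1.items()}

print(code1[7], code1[22], code1[36])
print(decode1[frozenset("AD")])
```

Six tags offer 63 non-empty combinations, so the 6 single tags, 15 pairs and the first 15 triples are sufficient for the 36 reaction vessels.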
[Figure 4.47: resin carrying Tag1-Alloc on a part of the amino groups; the remaining amino
groups are Boc protected]
Coupling the linkers to the resin. Two linkers (L1 and L2) were used in the synthesis.
The structures of both are shown in Figure 4.48. L1 is an acid-labile linker from which the product
can be cleaved off as an unsubstituted amide. In other words, R1 in the product is H. The other
linker (L2) makes it possible to attach the primary amines of Figure 4.44. to the resin by reductive
amination and then to cleave the product as a substituted amide (R1 ≠ H).
Figure 4.48. The linkers
First the Boc protecting groups were removed in all the 36 reaction vessels with a solution
of 50% TFA (Figure 4.49/A). L1 was coupled only in RV1, using DIC and HOBt. L2 was coupled in
RV2 to RV36, also using DIC and HOBt (Figure 4.49/B).
[Figure 4.49: A, removal of the Boc protecting groups; B, coupling of the linker]
Reductive amination with R1 amines (Figure 4.50). The R1 groups were built into the
products by submitting the contents of RV2 to RV36 to reductive amination with amines selected
from those in Figure 4.44. RV1 was left unchanged since the L1 linker itself holds the amino
group in Fmoc protected form. The reductive amination was carried out by adding solutions of
the 35 amines and NaBH3CN to RVs 2 to 36 and keeping the mixtures at 50 °C for 12 hrs.
[Figure 4.50: reductive amination of the resin bound linker with the R1 amines]
Building in a scaffold. The next synthetic step was the attachment of a substituted
benzene scaffold by acylating the amine nitrogen with 4-fluoro-3-nitrobenzoic acid (Figure 4.51).
[Figure 4.51: acylation of the resin bound amine with 4-fluoro-3-nitrobenzoic acid]
Since the amino group in the L1 linker was protected by an Fmoc group, the content of RV1
was treated with piperidine to remove the protecting group. Then the couplings were carried out
in all the 36 reaction vessels by adding solutions of 4-fluoro-3-nitrobenzoic acid and DIC.
After the acylation the 36 resin samples were combined in solvent, then mixed with
mechanical stirring and nitrogen bubbling. After washing, the resin was dried and then divided
into 36 equal portions.
Nucleophilic displacement of fluorine by R2 amines (Figure 4.52). One of the 36 amines
(in 24x molar excess), solvent and DIEA were added to each of the reaction vessels, which were
kept at 50 °C for 12 hrs and then washed.
[Figure 4.52: displacement of the fluorine of the resin bound scaffold by the R2 amines]
Attachment of the second set of tags for encoding the R2 amines (Figure 4.53). For
encoding in the second diversity position a different set of encoding N-Alloc-Tag monomers
was used. They were labeled U, V, W, X, Y and Z. They were used individually and as mixtures
according to Table 4.21. First the Alloc protecting groups were removed from the R1 encoding
tags. To each reaction vessel a solution of 1 M TBAF and TMSN3 was added, followed by addition
of a solution of Pd(PPh3)4. After rapid mixing, the solution was left to stand at room temperature.
The liberated secondary amines of the Tag1 units were acylated in the presence of DIEA and
HATU. After washing, the 36 resin samples were combined. Before combining, however, usually 5
beads of each sample were decoded to ensure the fidelity of coding.
[Figure 4.53: removal of the Alloc group from Tag1 and attachment of the Tag2-Alloc monomers]
Reduction of the nitro group and benzimidazole formation with the R3 aldehydes. The
combined samples were thoroughly mixed, submitted to reduction with SnCl2 and then divided again
into 36 equal samples. To each of 35 samples a different aldehyde was added in 15 fold molar
excess to facilitate the ring formation. One sample was reacted with trimethyl orthoformate to
make R3=H in the product. The reaction mixtures were heated at 50 °C for 12 hrs. The samples
were finally washed and dried in vacuo without mixing them.
[Figure 4.54: reduction of the nitro group and ring closure with the R3 aldehydes to give the
resin bound product]
Before screening, the products were cleaved from individual beads. The products were
used in the screening process and their identity could be determined by decoding the remaining
beads. The beads were treated with 6M HCl in a glass tube. The released secondary amines were
dansylated and identified by HPLC.45,46
4.6.2. Synthesis of a 10,000 member piperazine 2-carboxamide library by Directed Sorting47
Application of the radiofrequency encoding method and automatic sorting described in
chapter 4.5.1. is exemplified by the synthesis of a large organic library containing 10,000 discrete
components represented by the following formula.
[General formula of the library components: piperazine-2-carboxamide carrying the substituents
R1, R2 and R3]
Representative members of the arrays of the building blocks used in the three diversity
positions are found in Figure 4.55.
Figure 4.55. Building blocks used in the synthesis of the piperazine 2-carboxamide library
The scaffold was built in by coupling with the orthogonally protected
piperazine-2-carboxylic acid.
[Structure of the scaffold with the positions of R1, R2 and R3]
The steps of the synthesis can be followed in Figure 4.56. The procedure was started with
10,000 MicroKans filled with resin and also containing the RF-tag. The linker was already
attached to the resin. After splitting the MicroKans into portions directed by the Synthesis
Manager, the first combinatorial step (Figure 4.56/A) was attachment of the primary amines (R1)
by reductive amination in the presence of NaBH(OAc)3. The MicroKans were pooled then the
previously formed secondary amines were acylated in the presence of HBTU and DIEA with the
protected piperazine-2-carboxylic acid (Figure 4.56/B). After attachment of the scaffold the Fmoc
groups were removed by piperidine (Figure 4.56/C) then MicroKans were sorted again.
Sorting was followed by the second combinatorial step (Figure 4.56/D) in which the
deprotected nitrogen of the ring was reacted with the R2 building blocks (sulfonyl chlorides,
isocyanates, chloroformates and carboxylic acids) using properly selected reagents and solvents.
The MicroKans were pooled again and the Alloc protecting groups were removed (Figure 4.56/E)
with Pd(PPh3)4.
After sorting, the third combinatorial step (Figure 4.56/F) was executed using the R3
building blocks to functionalize the second amino group of the ring. Reaction conditions were
similar to those of the second combinatorial step. The products were removed from the resin
(Figure 4.56/G) by treating the MicroKans with 50% TFA-DCM.
[Figure 4.56: steps A to G of the synthesis of the piperazine-2-carboxamide library]
[Scheme of a support with two branches: branch 1 and branch 2 (with protecting group); 3 steps
with A1-10, B1-10, C1-10, then 3 steps with X1-10, Y1-10, Z1-10]
Both branches have an appropriate functional group for attachment of building blocks. The
functional group on branch 1 is free; that on branch 2 is protected. A three-step split-mix synthesis
is executed using 10 building blocks in each step (building blocks A1-10, B1-10 and C1-10 in steps 1,
2 and 3, respectively). Thus 10x10x10=1,000 different trimers are formed on branch 1. After
mixing in the final combinatorial step, the protecting group is removed from branch 2, then a
second three-step split-mix synthesis is executed, again using 10 building blocks (X1-10, Y1-10 and Z1-10) in
each step. As a result, a 10x10x10=1,000 component library forms on branch 2 of the support.
In the final product, one component of the A, B, C library is attached to branch 1 of each bead.
Similarly, branch 2 of each bead holds one component of the X, Y, Z library. In both split-mix
processes a single compound forms in each bead. As a consequence, if the beads are present in
large excess, all possible combinations of pairs of components are formed. Since both libraries
have 1,000 components, the total number of different pairs is 1,000x1,000=1,000,000. Of course,
to prepare such a library at least 10,000,000 beads are needed.
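The arithmetic of the two-branch library can be checked with a few lines of Python. This is only an illustrative sketch; the function name library_size and the variable names are assumptions introduced here, and the tenfold bead excess is simply the figure quoted in the text.

```python
from math import prod

def library_size(blocks_per_step):
    """Number of products of a split-mix synthesis: the product of the
    numbers of building blocks used in the individual steps."""
    return prod(blocks_per_step)

branch_1 = library_size([10, 10, 10])  # steps with A, B and C building blocks
branch_2 = library_size([10, 10, 10])  # steps with X, Y and Z building blocks
pairs = branch_1 * branch_2            # any trimer of branch 1 with any of branch 2
min_beads = 10 * pairs                 # tenfold bead excess, as quoted in the text

print(branch_1, branch_2, pairs, min_beads)
```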
Such a library can be used for different purposes. If the two molecules are located at
appropriate distances (depending on the length of the branches) the potential interaction of the
two molecules can be studied. If one of the two molecules is a catalyst then the effect of the
catalyst on the other molecule can be tested. If both molecules are catalysts the experiments may
show which combination of the two catalysts is most effective on added substrate.
References
1. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó, In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP, Utrecht, The
Netherlands, 1988, Vol. 5, p 47.
2. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Proceedings of the 10th International
Symposium of Medicinal Chemistry, Budapest, Hungary, 1988, p 288, Abstract P-168.
3. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Int. J. Peptide Protein Res. 1991, 37, 487.
4. A peptide sequencer is an instrument in which the amino acids are cleaved stepwise from
the peptides starting at the N-terminus and the removed amino acids are identified.
5. R. Süßmuth, A. Trautwein, T. Richter, G. Nicholson, G. Jung In G. Jung (Ed)
Combinatorial Chemistry 1999, Wiley-VCH, Weinheim, 499.
6. S. Brenner and R. A. Lerner Proc. Natl. Acad. Sci. USA 1992, 89, 5381.
7. M. C. Needels, D. G. Jones, E. H. Tate, G. L. Heinkel, L. M. Kochersperger, W. J. Dower,
R. W. Barrett, M. A. Gallop Proc. Natl. Acad. Sci. USA 1993, 90, 10700.
8. J. Nielsen, S. Brenner, K. D. Janda J. Am. Chem. Soc. 1993, 115, 9812.
9. V. Nikolaiev, A. Stierandova, V. Krchnak, B. Seligman, K. S. Lam, S. E. Salmon, M.
Lebl Pept. Res. 1993, 6, 161.
10. J. M. Kerr, S. C. Banville, R. N. Zuckermann J. Am. Chem. Soc. 1993, 115, 2529.
11. M. H. J. Ohlmeyer, R. N. Swanson, L. W. Dillard, J. C. Reader, G. Asouline, R.
Kobayashi, M. Wigler, W. C. Still Proc. Natl. Acad. Sci. USA 1993, 90, 10922.
12. Á. Furka, F. Sebestyén, J. Gulyás In Proc. 2nd Int. Conf. Biochem. Separations,
Keszthely, Hungary, 1988, 35.
13. Á. Furka Drug Development Research 1994, 33, 90.
14. S. P. A. Fodor, J. L. Read, M. C. Pirrung, L. Stryer, A. Tsai Lu, D. Solas Science 1991,
251, 767.
15. F. Sebestyén, G. Dibó, A. Kovács, Á. Furka Bioorg. & Med. Chem. Letters 1993, 3, 413.
5. Screening methods
Arrays of individual compounds, combinatorial compound libraries and
arrays of new materials are prepared in order to find among their components new
pharmaceuticals, new insecticides, new fungicides, new plastics, new semiconductors etc. The
new useful compounds or materials can be found by examining the libraries for
components having pre-determined properties. This process is called screening. In order to be
able to do screening we need assays that unequivocally show the presence or absence of
components having the desired property. The development of assay methods is itself an area
of intensive research; dealing with this subject, however, is not within the scope of this book. The
results of the assays often appear as changes in color, fluorescence, radioactivity, conductivity
etc.
Life on earth is largely dependent on pairs of molecules that fit together perfectly, like
enzymes and substrates, antigens and antibodies, hormones and receptors (Figure 5.1). Detection
of binding of a component of a synthesized library (red in the figure) to a large target molecule
(green) is an often applied screening procedure. The binding can be detected by a change of color,
the appearance of fluorescence or radioactivity, etc.
application of his microtiter plates. Improvements in the sensitivity of the assay methods made
it possible to increase productivity further by replacing the 96-well plates with 384-well and even
1,536-well ones and by applying automation. The appearance of automatic high-performance workstations
made possible the screening of well over a hundred thousand compounds per day. The SAGIAN Core
Systems (Figure 5.4) is one example of a standardized integrated system. Both liquid handling
and reading of the results of assays are fully automated. Figure 5.5 shows a plate reader.
The above core systems can be integrated with devices produced by other companies.
Figure 5.5 shows the Analyst GT of Molecular Devices Corp., a microplate reader optimized
for HTS, integrated into Sagian Core Systems.
Figure 5.6. Three features of faces: hair (1), mustache (2) and beard (9)
Each of the features has three variants: hairless, medium hair and long hair; no mustache,
small mustache and large mustache; no beard, small beard and large beard. By applying all variants
of the features to the head of Figure 5.6, the 27 different faces of Figure 5.7 can be
derived. This number is the same as the number of components in a tripeptide library synthesized
by using the same three amino acids in all three coupling positions.
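The correspondence between the faces and the tripeptides can be illustrated by enumerating both sets. The following is a minimal sketch; the list names are chosen here for illustration only, and the three amino acids are referred to by the colors used in the figures.

```python
from itertools import product

hair = ["hairless", "medium hair", "long hair"]
mustache = ["no mustache", "small mustache", "large mustache"]
beard = ["no beard", "small beard", "large beard"]

# Every face is one choice of variant for each of the three features.
faces = list(product(hair, mustache, beard))

# Every tripeptide is one choice of amino acid for each coupling position.
amino_acids = ["yellow", "blue", "red"]
tripeptides = list(product(amino_acids, repeat=3))

print(len(faces), len(tripeptides))
```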
[Figure 5.7: the 27 faces arranged in Group 1, Group 2 and Group 3; image not reproduced.]
In the optimized procedure the synthesis is modified in the following way. In each
coupling step, before mixing, a sample is removed and preserved for later use (Figure 5.8). After removal of the
samples the products are mixed as normal. As an alternative, instead of removing samples
after the last coupling operation, one may choose to leave the final three
products of the last coupling step unmixed.
It is a good practice to determine before beginning the iteration whether or not the library
contains the desired bioactive component. For this reason the full library is cleaved from the final
mixed product of the synthesis and the solution of the peptides is then tested for the presence of
the desired bioactive component. The iteration experiments, of course, are executed only if the
test is positive.
Iteration step No. 1. Determination of the amino acid occupying coupling position 3 in the
bioactive peptide. The iteration procedure begins with determination of the amino terminal
residue of the bioactive peptide. For this reason the samples removed in the final (third) coupling
step are separately submitted to cleavage. The components of the resulting three tripeptide
mixtures are demonstrated in the three columns of Figure 5.9.
[Figure 5.8: products of the coupling operations and the samples removed in Steps 1 to 3; image not reproduced.]
peptide. If the sample marked by + shows activity in screening, the bioactive
peptide has to be among the peptides of the marked column; otherwise there would be no activity.
Figure 5.9. Testing the three sub-libraries cleaved from the three samples removed after the third
coupling. The + sign shows the bioactive sub-library
Since all peptides of this column have the red amino acid at the amino terminus, the N-terminal
amino acid of the bioactive peptide is also the red one. The other two amino acids are, of
course, not yet identified. This is analogous to choosing the medium hair group in Figure 5.7.
Iteration step No.2. Determination of the amino acid occupying coupling position 2 in the
bioactive peptide. The amino acid residue occupying the coupling position 2 in the bioactive
peptide can be determined in three steps outlined in Figure 5.10.
Figure 5.10. Determination of the amino acid occupying coupling position 2 in the bioactive peptide.
The samples removed after the second coupling operation contain dipeptides still attached
to the support. All peptides within a sample contain the same (yellow, blue or red) amino
acid at their terminal position. In the first step the red amino acid, which is known to occupy
coupling position 3 in the bioactive peptide, is separately coupled to the three samples.
In the second step the peptides are separately cleaved from the support. The figure shows
that the three groups of peptides differ from each other only in the amino acid (yellow,
blue or red) occupying coupling position 2.
In the third step the three peptide mixtures are separately tested. If the test shows activity
in the mixture marked by the plus sign, for example, the bioactive peptide has to be in this
mixture (this is similar to selecting column 1 of group 2 in Figure 5.7). Since all components of
the mixture have the blue amino acid in coupling position 2, the bioactive peptide also has the
blue amino acid in this coupling position. As a result of iteration step No. 2 the second amino
acid of the bioactive tripeptide has been identified, leaving only one amino acid to be determined.
Iteration step No. 3. Determination of the amino acid occupying coupling position 1 in the
bioactive peptide.
The so far unknown third amino acid of the bioactive peptide can be
determined in the four steps shown in Figure 5.11. The three samples removed after the first
coupling operation are submitted to a two-step elongation. First the blue amino acid, which is known
to occupy coupling position 2 in the bioactive peptide, is attached. This is followed by
attachment of the red amino acid, identified as the amino-terminal amino acid in the sequence of
the active tripeptide. In the third step the peptides are separately cleaved from the support. As the
figure shows, each of the three product samples contains a single tripeptide. These samples are
finally tested. If the sample marked by the + sign proves to be the active one, then the so far
unknown amino acid is the yellow one. The yellow amino acid occupies the C-terminal
position of the tripeptide. The amino acid sequence of the bioactive peptide is: Red-Blue-Yellow.
This step is analogous to identifying the suspect's face as that of column 1, row 3 of
group 2 in Figure 5.7.
Figure 5.11. Determination of the amino acid occupying coupling position 1 in the bioactive peptide.
In a real case the number of iteration steps depends on the length of the peptides in the
synthesized library. The number of the samples that need to be tested in an iteration step is the
same as the number of amino acids used as building blocks in the corresponding coupling step.
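The iteration procedure can be mimicked in a short simulation. The sketch below is only an illustration under simplifying assumptions: the library contains exactly one active peptide, the function assay is a stand-in for a real screening test, and all names (ACTIVE, assay, deconvolute) are hypothetical. Peptides are written as tuples in coupling order, so the last element is the N-terminal residue.

```python
from itertools import product

AMINO_ACIDS = ["yellow", "blue", "red"]
ACTIVE = ("yellow", "blue", "red")  # assumed hit, listed by coupling position 1, 2, 3

def assay(mixture):
    """Stand-in for a bioassay: positive if the active peptide is present."""
    return ACTIVE in mixture

def deconvolute(amino_acids, length):
    """Identify the active peptide position by position, last coupling first."""
    known = ()        # residues already identified for the later coupling positions
    experiments = 0
    for position in range(length, 0, -1):
        hits = []
        for candidate in amino_acids:
            # Earlier positions are still fully varied, as in the removed sample;
            # later positions carry the residues identified so far.
            varied = product(amino_acids, repeat=position - 1)
            mixture = {v + (candidate,) + known for v in varied}
            experiments += 1
            if assay(mixture):
                hits.append(candidate)
        known = (hits[0],) + known
    return known, experiments

coupling_order, n_tests = deconvolute(AMINO_ACIDS, 3)
sequence = "-".join(reversed(coupling_order))  # N-terminus written first
print(sequence, n_tests)
```

With three amino acids and three coupling positions the simulation performs three tests in each of the three iteration steps, in line with the rule stated above.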
A synthesized library may contain a number of bioactive or other useful components that
can be identified. It may prove useful to take this into account when deciding the quantity of the
library to be prepared. The number and kinds of tests as well as their sensitivity have to play a
definitive role in planning. The quantity of the samples removed in the synthetic process as well
as the quantity of the final mixed full library is also a question that needs to be decided. It is
advisable to remove the samples in each synthetic step in the same molar quantity and leave after the last sample removal the same molar quantity for mixing. Since in the course of the
synthesis the number of components increases in each step their molarity is accordingly reduced.
Table 5.1. The quantity of samples removed in the synthesis of a tetrapeptide library for iteration
experiments
Coupling number     Quantity in %
1                   0.006
2                   0.12
3                   2.44
4                   48.7
Left for mixing     48.7
If 20 amino acids are used in each step, the molarity of the components decreases by a
factor of 20 in each step. As a consequence, in order to keep the molarity constant in the removed samples,
their proportion has to be increased about 20 times in each step (not exactly 20 times, since the
total quantity decreases after each sample removal). This is exemplified by a tetrapeptide library
synthesized using 20 amino acids in each step. The quantities of the removed samples are shown
in Table 5.1.
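The values of Table 5.1 can be reproduced by a short calculation. The sketch below rests on one reading of the rule given above, namely that every removed sample, and the portion left for mixing, should contain the same molar quantity of each individual component; the function name sample_quantities is an assumption made here.

```python
def sample_quantities(n_amino_acids, n_couplings):
    """Percentages of the starting resin removed after each coupling, plus
    the portion left for mixing, for equal moles of every component."""
    # After coupling k a sample holds n**k components, so it must be
    # n**k times larger than a sample holding a single component.
    weights = [n_amino_acids ** k for k in range(1, n_couplings + 1)]
    weights.append(weights[-1])  # left for mixing: same as the last sample
    total = sum(weights)
    return [100 * w / total for w in weights]

quantities = sample_quantities(20, 4)
for label, q in zip(["1", "2", "3", "4", "Left for mixing"], quantities):
    print(label, round(q, 3))
```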
As already mentioned, when dealing with combinatorial libraries made by the split-mix
method, less labor is needed not only in the synthesis but also in screening. This is also the case when
the deconvolution is done by the iteration method. The efficiency can be shown by a simple
example. A tetrapeptide library made from 20 amino acids has 160,000 components. If this
library is prepared by the parallel method, the screening process needs 160,000 experiments, since
all components have to be tested separately. The iteration procedure for the same library needs 20
experiments after each coupling operation plus an experiment with the full library, altogether
only 81 experiments.
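The comparison can be written as two small formulas. This is a minimal sketch with hypothetical function names; it assumes that the same number of amino acids is used in every coupling position.

```python
def parallel_experiments(n_amino_acids, length):
    """Every component of the library is tested separately."""
    return n_amino_acids ** length

def iteration_experiments(n_amino_acids, length):
    """One test per amino acid in every iteration step, plus one test
    of the full library before the iteration is started."""
    return n_amino_acids * length + 1

print(parallel_experiments(20, 4), iteration_experiments(20, 4))
```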
could have been achieved by pre-preparing sets of libraries that, without any further modification,
make identification of the bioactive component possible.
The screening strategy fulfilling the requirement described above was independently
developed in two laboratories. The principles and realization of the strategy, later named
positional scanning, were first described in a patent application filed by Furka et al. in May 1992 [7],
then published and practiced by Pinilla et al. [8]
The sub-libraries described in Chapter 4 and demonstrated in Figures 4.11. and 4.12. can
be considered as candidates for being components of pre-prepared sets of libraries in positional
scanning.
The reason is demonstrated in Figure 5.12. The sub-library B is one of the 9 sub-libraries
of the full library of Figure 4.12. A sub-library is a special partial library of a full library. It is
prepared by using a single amino acid in one coupling position in the synthesis. In all other
coupling positions all the amino acids used in the synthesis of the full
library are varied. As a consequence, a sub-library contains all those components that have the same
(non-varied) amino acid in the non-varied position. The non-varied position in the sub-library
of Figure 5.12 is coupling position 3, and the non-varied amino acid is the red one. In the
sub-library B all those peptides are present which contain the red amino acid in coupling
position 3 (for example trimer A), and no peptide is present that has another amino acid in this
position (compare to Figure 5.13). Trimer C, for example, is not present in the sub-library
because it contains the yellow amino acid in coupling position 3.
If a bioactive peptide
of the full library happens to contain the non-varied amino acid in the non-varied coupling
position (like the red amino acid in coupling position 3 of trimer A), then the bioactive
peptide has to be found among the components of the sub-library. Consequently, the sub-library
gives a positive response in the screening test. On the other hand, if a different amino acid
occupies the non-varied position, a negative result is expected.
In order to be able to do positional scanning, all sub-libraries of a full library have to be
prepared and tested. Figure 5.13 shows the full set of sub-libraries of the
full library A and their compositions. The number of sub-libraries in a set is the same as the total number of couplings
executed in the synthesis of the full library. In the synthesis of library A, three couplings are
executed in each of the three coupling positions, so the number of couplings as well as that of the
sub-libraries is nine.
[Figure 5.13: the nine sub-libraries B1 to B3, C1 to C3 and D1 to D3 of full library A; image not reproduced.]
If a full pentapeptide library is made using 20 amino acids in all coupling positions, the set
of sub-libraries contains 100 sub-libraries. A practical way of denoting the sub-libraries
is to use a figure indicating the non-varied coupling position, followed by the one-letter
symbol of the non-varied amino acid [9], as demonstrated in Figure 5.14.
1A  2A  3A  4A  5A
1C  2C  3C  4C  5C
1D  2D  3D  4D  5D
1E  2E  3E  4E  5E
1F  2F  3F  4F  5F
1G  2G  3G  4G  5G
1H  2H  3H  4H  5H
1I  2I  3I  4I  5I
1K  2K  3K  4K  5K
1L  2L  3L  4L  5L
1M  2M  3M  4M  5M
1N  2N  3N  4N  5N
1P  2P  3P  4P  5P
1Q  2Q  3Q  4Q  5Q
1R  2R  3R  4R  5R
1S  2S  3S  4S  5S
1T  2T  3T  4T  5T
1V  2V  3V  4V  5V
1W  2W  3W  4W  5W
1Y  2Y  3Y  4Y  5Y
[Figure 5.14: the 100 sub-libraries of the positional scanning kit of pentapeptides; the boxes marking individual sub-libraries are not reproduced.]
The first step in positional scanning is testing the full library. If the result is positive then
all components of the kit are tested. From the result, the amino acid sequence of the bioactive
peptide can be deduced. If the assays show that the sub-libraries marked by boxes in Figure 5.14.
are bioactive then the coupling positions 1, 2, 3, 4 and 5 are occupied in the bioactive peptide by
E, L, V, R and T, respectively. Taking into account that the order of amino acids in peptide
sequences is opposite of their coupling order, the sequence of the bioactive peptide is:
T-R-V-L-E
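The deduction of the sequence from the active sub-libraries can be sketched in code. The function name decode_positional_scan is an assumption made here; labels are taken to follow the notation of Figure 5.14 (coupling position followed by the one-letter symbol), and exactly one sub-library per position is assumed to be active.

```python
def decode_positional_scan(active_sublibraries, length):
    """Turn labels such as '3V' into the peptide sequence, N-terminus first."""
    by_position = {}
    for label in active_sublibraries:
        position, residue = int(label[:-1]), label[-1]
        by_position[position] = residue
    coupling_order = [by_position[p] for p in range(1, length + 1)]
    # The peptide sequence runs opposite to the coupling order.
    return "-".join(reversed(coupling_order))

kit_size = 20 * 5  # 20 amino acids in each of 5 coupling positions
sequence = decode_positional_scan(["1E", "2L", "3V", "4R", "5T"], 5)
print(kit_size, sequence)
```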
As will be shown below, preparation of all sub-libraries, for example the 100 sub-libraries
of Figure 5.14, needs too much work and too much time to be worthwhile for screening with a
single target. A company, however, can synthesize them in larger quantities and divide them into
smaller equimolar portions. The collections formed from these smaller quantities could be sold as
kits ready for use by biologists.
The synthesis of a single sub-library of Figure 5.14. needs 81 couplings: one coupling in
the non-varied coupling position plus 20 couplings in each of the remaining 4 positions. Since
there are 100 different sub-libraries in the figure, the total number of the required couplings is
8100!
The synthesis of the 100 sub-libraries, however, can be optimized so that
the preparation needs fewer amino acid couplings. The optimization can be achieved by
doing as many couplings as possible with the combined form of the sub-libraries under
preparation. This is briefly described below.
Step 1. Preparation of the 1A to 1Y sub-libraries. The resin is divided into 20 equal
portions then a different amino acid is coupled to each portion. No mixing. One fifth part of each
portion is removed then four full split-mix cycles (split-couple-combine) are executed on each of
the 20 removed portions to get the 20 1X type sub-libraries. The total number of couplings is:
20 + 4x20x20 = 1620
Step 2. Preparation of the 2A to 2Y sub-libraries. The 20 portions remaining in Step 1 are
mixed, divided into 20 equal samples then each of them is coupled with a different amino acid.
No mixing. One fourth part of each sample is removed. The removed samples are separately
submitted to three full split-mix cycles. The result is the 20 2X type library. The total number of
couplings is:
20 + 3x20x20 = 1220
Step 3. Preparation of the 3A to 3Y sub-libraries. The 20 samples remaining in Step 2 are
mixed, divided into 20 equal portions, then a different amino acid is coupled to each one. No
mixing. After coupling, one third of each sample is removed, then separately submitted to two
full split-mix cycles. The product is the 20 3X type sub-libraries. The number of the executed amino
acid couplings is:
20 + 2x20x20 = 820
Step 4. Preparation of the 4A to 4Y sub-libraries. The 20 samples remaining in Step 3 are
mixed, divided into 20 equal parts, then a different amino acid is coupled to each part. No mixing.
Half of each sample is removed, then one full split-mix cycle is executed on every removed sample. As a
result, the 20 4X type sub-libraries are formed. The total number of the executed amino acid
couplings is:
20 + 20x20 = 420
Step 5. Preparation of the 5A to 5Y sub-libraries. The 20 samples remaining in Step 4 are not
mixed. Each of them is coupled with a different amino acid. The products are the 20 5X type
sub-libraries. The total number of couplings is:
20
The total number of couplings executed in the whole process leading to the 100 sub-libraries
of the positional scanning kit of pentapeptides is 4100. Compare this to the 8100 couplings
needed in the non-optimized process. The number of couplings could be reduced by almost 50%.
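The coupling counts of the two procedures can be recalculated as follows. This is a sketch with hypothetical function names; it assumes the same number of amino acids in every position and follows the step-by-step counts given above.

```python
def naive_couplings(n, length):
    """Each sub-library is made separately: one coupling in the non-varied
    position and n couplings in each of the other positions."""
    per_sublibrary = 1 + n * (length - 1)
    return n * length * per_sublibrary

def optimized_couplings(n, length):
    """Couplings in the optimized process, summed over the steps."""
    total = 0
    for step in range(1, length + 1):
        remaining_cycles = length - step       # split-mix cycles still to be done
        total += n + remaining_cycles * n * n  # n couplings, then n*n per cycle
    return total

naive = naive_couplings(20, 5)
optimized = optimized_couplings(20, 5)
print(naive, optimized, round(100 * (1 - optimized / naive), 1))
```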
The fact that the synthesis of a positional scanning kit is still too laborious prompted us to
think about other possibilities to circumvent the need for preparation of full positional scanning
kits. The result was the development of the omission libraries and the amino acid tester libraries.
A C D E F G   I K L M N P Q R S T V W Y
A C D E F G   I K L M N P Q R S T V W Y
A C D E F G   I K L M N P Q R S T V W Y
A C D E F G   I K L M N P Q R S T V W Y
A C D E F G   I K L M N P Q R S T V W Y
Figure 5.15. Amino acids used in the synthesis of an omission pentapeptide library. Histidine (H)
is omitted in all five coupling positions.
The number of peptides in omission libraries synthesized from 20 amino acids, as well as
the number of peptides missing from them is summarized in Table 5.2.
Table 5.2. Number of peptides in full and omission libraries and the number of peptides missing
from the omission libraries
Length   Full library   Omission library   Missing peptides
2        400            361                39
3        8,000          6,859              1,141
4        160,000        130,321            29,679
5        3,200,000      2,476,099          723,901
6        64,000,000     47,045,881         16,954,119
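The entries of Table 5.2 follow from simple powers: a full library of peptides of length n made from 20 amino acids has 20^n components, while the omission library has 19^n. A minimal sketch (the function name omission_table is chosen here for illustration):

```python
def omission_table(n_amino_acids, lengths):
    """Rows of (length, full library, omission library, missing peptides)."""
    rows = []
    for length in lengths:
        full = n_amino_acids ** length
        omission = (n_amino_acids - 1) ** length
        rows.append((length, full, omission, full - omission))
    return rows

rows = omission_table(20, range(2, 7))
for row in rows:
    print(*row)
```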
Using a very simple example, Figure 5.16 shows how the composition of omission libraries
can be derived from that of the full one. The fact that all the peptides containing the omitted amino acid
are missing from the library makes the omission libraries applicable in the identification of
bioactive peptides. If the bioactive peptide contains the omitted amino acid, for example, the
omission library gives a negative result in testing, since the bioactive peptide is not present in the
omission library. On the other hand, if the bioactive peptide does not contain the omitted amino
acid, the omission library gives a positive test, since all peptides except those containing the omitted amino
acid are present. Based on these properties the omission libraries can be used for determination of
the amino acid composition of the bioactive peptide.
This is demonstrated in Figure 5.16. If two omission libraries (blue and red) give
negative result in screening test this means that the bioactive peptide contains blue and red
amino acids. The yellow amino acid is not present because the test with the yellow omission
library is positive.
By use of omission libraries the amino acid composition of the bioactive peptide can be
determined. Nothing is known, however, about the coupling position of these amino acids. This
has to be determined by additional experiments. This task, however, is much less complicated
than the original one. This can be illustrated by a simple example.
Suppose we deal with a full pentapeptide library and the result of using the 20 omission
libraries is that the bioactive peptide contains the following four amino acids: A, G, R and H.
Then a much less complex library can be defined that is built up using these 4 amino acids in all
coupling positions.
Coupling position:   1      2      3      4      5
Amino acids:         AGRH   AGRH   AGRH   AGRH   AGRH
If this simpler library, which contains only 1,024 components (4x4x4x4x4) instead of the 3.2 million
of the original pentapeptide library, is prepared, the bioactive peptide must be present in it.
This means that the original task is reduced to a much simpler one. This simpler library can be
named occurrence library. The amino acid sequence can be determined by application of the
iteration or the positional scanning method to the occurrence library. A practical example will be
shown in a separate paragraph.
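The reasoning of this section can be sketched in code: the omission libraries that test negative give the amino acid composition, and the composition defines the occurrence library. The outcome of the 20 tests used below is hypothetical and merely mirrors the A, G, R, H example; the function name composition_from_omission is likewise an assumption. Four amino acids varied in five positions give 4x4x4x4x4 = 1,024 sequences.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition_from_omission(results):
    """results maps the omitted amino acid to True if that omission library
    is active; a negative test means the amino acid is in the active peptide."""
    return sorted(aa for aa, active in results.items() if not active)

# Hypothetical test outcome: only the -A, -G, -R and -H libraries are inactive.
results = {aa: aa not in "AGRH" for aa in AMINO_ACIDS}

composition = composition_from_omission(results)
occurrence_library = ["".join(p) for p in product(composition, repeat=5)]
print(composition, len(occurrence_library))
```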
The synthesis of the omission libraries is simple. The number of libraries in a kit is 20 if
the 20 amino acids are used in preparation of the full library. The number of components of the
kit does not depend on the length of the peptides. Preparation of the kit is simpler and less time
consuming than, for example, the synthesis of the 100 components of the positional scanning kit.
the full library. If 20 amino acids are used in the synthesis, the number of components is 20 like
in the case of the omission library kit.
An amino acid tester library gives an opposite result in screening compared to an
omission library. An alanine tester library, for example, that comprises all alanine containing
peptides shows activity in screening only when alanine is present in the sequence of the bioactive
peptide. If the test is carried out with a tester library of an amino acid that is not present in the
bioactive component no or a reduced activity is expected.
Figure 5.17. shows the composition of simple amino acid tester libraries that can be
compared to the full one. If the libraries marked by the plus sign prove to be active then the
amino acid composition of the bioactive peptide is: yellow, red.
total number of peptides in the seven groups is 1141, in accordance with the corresponding figure
of Table 5.2.
Considering the possibilities of the synthesis, the groups 1, 4, 5, 7 and the groups 5, 6 of
Table 5.3 can be amalgamated into groups 1 and 2, respectively, of Table 5.4. Group 3 of Table
5.3 remains alone and is transferred into Table 5.4 as group 3. The total number of peptides of
course remains unchanged: 1141.
The synthesis of the alanine tester library needs the preparation and mixing of the three
partial libraries represented in Table 5.4. as groups 1 to 3. Although not demonstrated in the
tables, the number of component libraries to be prepared in the case of tetrapeptides and
pentapeptides is four and five, respectively.
5.2.1.7. Examples
The applicability of the screening strategies described before is demonstrated by a few
model experiments. The task in these experiments was to determine whether or not a synthesized
tripeptide library has a component that inhibits binding of LHRH [14] to its antibody. The amino
acid sequence [15] of the hormone is shown below:
pGlu-His-Trp-Ser-Tyr-Gly-Leu-Arg-Pro-Gly-NH2
The LHRH polyclonal antibody as well as the radioactively labeled LHRH were products of
Advanced ChemTech. The competitive inhibition of the binding of LHRH to its antibody was determined by
radioimmunoassay [16] (RIA).
Testing the full tripeptide library [11]. Since LHRH is a decapeptide amide, a tripeptide
amide library was prepared and used in screening. In the split-mix synthesis of the tripeptide
amide library 19 amino acids were used in the first and second coupling positions (cysteine was
omitted) and, because LHRH has pyroglutamic acid at the N-terminal position, pyroglutamic
acid was added to this set in coupling position 3. The library was prepared on Rink amide resin
using the Fmoc strategy.
The tripeptide amide library was added in increasing concentrations to the mixture of
radioactive LHRH and its antibody, and their binding was determined by RIA. The result is
shown in Figure 5.18. It can be seen that binding is strongly reduced by increasing the
concentration of the library. This makes it probable that the library has one or more components that inhibit
binding, that is, it is worthwhile to make further experiments in order to identify them.
The result also suggests that the optimal concentration for the binding experiments should be
around 50 microgram/ml. In all the further experiments the libraries were applied in molarities
equivalent to this concentration.
[Figure 5.19 plot: y axis, 100 - LHRH binding % (scale -10 to 100); x axis, sub-libraries 3A to 3Y and 3p; image not reproduced.]
Figure 5.19. Inhibitory effect of sub-libraries used in the first iteration step.
3p denotes pyroglutamic acid in the N-terminal position
Figure 5.19 shows how strong the inhibitory effect of the sub-libraries is. It can be clearly
seen that sub-library 3R exhibits by far the strongest effect. This means that the amino acid
occupying coupling position 3 in the inhibitory tripeptide amide is arginine, R.
Application of omission libraries [10]. The omission libraries were derived from the
tripeptide amide full library described above. Thus in the synthesis of the 19 omission libraries (-A to
-Y) one amino acid was omitted in all three coupling positions. The remaining 18 amino
acids were built into all positions. Pyroglutamic acid was also present in all omission libraries
in coupling position 3.
A pyroglutamic acid omission library of this kind could not be prepared, since this amino acid can be
inserted only into coupling position 3 of the tripeptides. It was nevertheless important to test
whether or not this amino acid is present in the active peptide. For this reason a full tripeptide
amide library was prepared by using 19 amino acids in each coupling position, with
pyroglutamic acid omitted from position 3 (denoted by -p).
The importance of the amide groups of the peptides in inhibition was also tested. In order
to do this, one part of the full library of tripeptides was cleaved from the support in the form of
carboxylic acids instead of amides and tested in this form (denoted by -a).
When the omission libraries were tested, the full tripeptide amide library (denoted by X) was
also included. The result is shown in Figure 5.20. It can be seen that the omission libraries
that reduce the binding least are -G, -P, -R and -a. This means that the amino acid
composition of the inhibitory tripeptide is glycine (G), proline (P) and arginine (R). The peptides
that do not have amide groups are not effective inhibitors. This means that the amide group is
also an essential part of the inhibitory tripeptide.
[Figure 5.20 plot: y axis, binding of LHRH % (scale 0 to 120); x axis, libraries X, -A to -Y, -p and -a; image not reproduced.]
The results of the experiments carried out with omission libraries gave no indication
of the position of the amino acids within the sequence of the active peptide. Despite this, the
information gained by only 21 screening experiments is very valuable. It defines an amino
acid occurrence library that can be synthesized by varying only three amino acids, Gly, Pro and
Arg, in all three coupling positions. The inhibitory tripeptide is expected to be present
among the 27 components of this tripeptide amide library. In other words, by screening with
omission libraries, the complexity of the library in which the active peptide is found could be
reduced from the original 7220 to only 27.
The positions of the identified amino acids could be determined by using one of the
following three possibilities:
1. Preparation by parallel synthesis and screening of the 27 components of the occurrence
library.
2. Application of positional scanning to the occurrence library (preparation and screening of
nine sub-libraries).
3. Positional scanning with nine sub-libraries of the full library (if available).
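The numbers involved can be verified by a short enumeration. The sketch below assumes, as described earlier, 19 amino acids in coupling positions 1 and 2 and 20 building blocks (the same 19 plus pyroglutamic acid) in position 3; the variable names are chosen here for illustration.

```python
from itertools import product

# Full tripeptide amide library: 19 x 19 x 20 components.
full_library_size = 19 * 19 * 20

# Occurrence library: Gly, Pro and Arg varied in all three coupling positions.
occurrence_library = ["".join(p) for p in product("GPR", repeat=3)]

# Positional scanning of the occurrence library: 3 positions x 3 amino acids.
n_sublibraries = 3 * 3

print(full_library_size, len(occurrence_library), n_sublibraries)
```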
Preparation and use of amino acid tester libraries [12]. Amino acid tester libraries offer an
alternative to omission libraries for determination of the amino acid composition of
active peptides. The set of libraries used in the experiments was derived from the full library
prepared from 19 amino acids plus pyroglutamic acid in coupling position 3. Each library of the
set contains 1064 tripeptide amides and can be formed like the alanine tester library in Table 5.5
by mixing the groups 1 to 3. Instead of preparing the three groups separately, however, they were
synthesized then mixed in a single optimized process using the ACT 357 automatic synthesizer.
141
0.380
Group 1
A
18
Group 2
0.360
19
Group 3
0.324
18
Groups 1+2
A
A
20
ATL
Figure 5.21. Flow diagram of the synthesis of the alanine tester library (ATL).
A: Coupling with alanine; 18, 19, 20: Portioning, then coupling with 18, 19 and 20 different
amino acids, respectively, then mixing.
The Groups 1 + 2 resin was distributed into 20 equal parts, each coupled with one of the
20 amino acids (including alanine and pyroglutamic acid) then mixed. Finally, the Groups 1+2
resin was mixed with the Group 3 sample to give the alanine tester library (ATL). All operations
were preprogrammed and at the end the product was accumulated and mixed in the collection
vessel.
The applicability of the synthesized amino acid tester libraries to the determination of the
amino acid composition of active peptides was tested under the reaction conditions described for the
omission libraries. Figure 5.23 shows that the inhibition of binding of LHRH to its antibody is
strongest in the case of the glycine (G), proline (P) and arginine (R) tester libraries. Note that
the y axis shows not LHRH binding% but 100 - LHRH binding%. Consequently, the active
tripeptide amide contains glycine, proline and arginine.
[Figure labels: reaction vessel, 0.38 g resin; collection vessel, 0.684 g resin.]
Figure 5.22. The reaction block of the ACT 357 Synthesizer and the quantities and places of the
resin at the start of the process
Applicability of the amino acid tester libraries is the same as that of the omission libraries.
[Figure: bar chart. Y axis: 100 - LHRH binding (%), scale 0 to 80. X axis: amino acid tester libraries.]
Figure 5.23. Effect of amino acid tester libraries on binding of LHRH to its antibody
Synthesis and use of positional scanning libraries. Screening with both omission and
amino acid tester libraries led to the same result: binding of LHRH to its antibody is inhibited by
a tripeptide amide having composition glycine, proline and arginine. As outlined before, based on
this result an occurrence library can be defined. If this library is synthesized, it contains among its
components the inhibitor tripeptide amide (Table 5.6.).
Coupling position   Amino acids
1                   G P R
2                   G P R
3                   G P R
The position of the three amino acids in the sequence of the inhibitor tripeptide amide is
determined by synthesizing and testing the nine-component omission library kit of the
positional scanning library. The synthesis is optimized to make it possible to prepare the nine
component libraries of the kit (1G, 2G, 3G, 1P, 2P, 3P, 1R, 2R and 3R) in a single run on the
ACT 357 automatic synthesizer. The solid support was again Rink resin. The synthesizer was
pre-programmed, which made it possible to execute the whole process automatically.
The flow diagram is shown in Figure 5.24. The starting resin, placed into the
collection vessel, was first divided into three portions, which were then coupled with glycine, proline and
arginine, respectively.
[Flow diagram labels: start; split; mix; mix and split; coupling with G, P and R; fractions 1/3 and 1/2; products 1G, 1P, 1R, 2G, 2P, 2R, 3G, 3P, 3R.]
Figure 5.24. Flow diagram of the synthesis of the 9 positional scanning sub-libraries
of the occurrence library
Before mixing, a 1/3 part of each reaction product was transferred into a separate reaction
vessel, and each of these was individually submitted to two consecutive portioning-mixing cycles,
coupling in each cycle with glycine, proline and arginine, to yield 1G, 1P and 1R as end
products. The remainder was mixed and divided into three parts, and each part was coupled with one of
the three amino acids. Again, before mixing, a 1/2 part of each sample was transferred to a separate
reaction vessel and individually submitted to a full portioning-mixing cycle, again using glycine,
proline and arginine in the couplings. These operations resulted in the formation of 2G, 2P and 2R. The
remainder was mixed and divided into three portions, and each was coupled with one of the three amino
acids. The three products were 3G, 3P and 3R.
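The fractions 1/3 and 1/2 appear to be chosen so that the three coupling positions receive equal shares of the resin. The following sketch checks this bookkeeping with exact fractions and also lists the content of one sub-library. The function and variable names are illustrative and are not taken from the source.

```python
from fractions import Fraction
from itertools import product

amino_acids = ["G", "P", "R"]
resin = Fraction(1)            # total starting resin, arbitrary units

# After the first coupling, 1/3 of each product is set aside for 1G, 1P, 1R.
for_position_1 = resin * Fraction(1, 3)
remainder = resin - for_position_1

# After the second coupling, 1/2 of each sample is set aside for 2G, 2P, 2R.
for_position_2 = remainder * Fraction(1, 2)

# What is left is carried through the last coupling and gives 3G, 3P, 3R.
for_position_3 = remainder - for_position_2

print(for_position_1, for_position_2, for_position_3)   # 1/3 1/3 1/3

def sub_library(position, fixed):
    """Peptides (written as positions 1, 2, 3) with `fixed` in `position`."""
    return [p for p in product(amino_acids, repeat=3) if p[position - 1] == fixed]

print(len(sub_library(1, "G")))   # 9
```

Each of the nine sub-libraries therefore contains 9 of the 27 peptides of the occurrence library, and every peptide occurs in exactly three sub-libraries, one for each position.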
The synthesized nine first order sub-libraries were used to determine the position of
glycine, proline and arginine in the tripeptide responsible for competitive inhibition of binding of
LH-RH to its antibody (Figure 5.25).
[Figure 5.25: bar chart. Y axis: 100 - LHRH binding (%), scale 0 to 100. X axis: sub-libraries 1R, 2R, 3R, 1G, 2G, 3G, 1P, 2P, 3P.]
Because the y axis shows 100 - LHRH binding% rather than LHRH binding%, the highest values in
Figure 5.25 correspond to the strongest inhibition of binding of LHRH to its antibody, which is found
for the 1G, 2P and 3R sub-libraries. Consequently, arginine, proline and glycine occupy
coupling positions 3, 2 and 1 in the tripeptide, respectively. The sequence of the inhibitor
tripeptide is Arg-Pro-Gly-NH2. This sequence happens to be identical with the C-terminal
sequence of LHRH.
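The way the sequence is read from the positional scanning result can be written out explicitly. The sketch assumes that coupling position 1 is the residue attached to the resin first, that is, the C-terminal residue, which is consistent with the sequence given in the text. The dictionary names are illustrative.

```python
THREE_LETTER = {"G": "Gly", "P": "Pro", "R": "Arg"}

# Most inhibitory sub-library for each coupling position, as reported in the text.
strongest = {1: "G", 2: "P", 3: "R"}

# Coupling position 1 is taken to be the C-terminal residue, so the conventional
# N-to-C sequence lists the positions in the order 3, 2, 1.
residues = [THREE_LETTER[strongest[pos]] for pos in (3, 2, 1)]
sequence = "-".join(residues) + "-NH2"

print(sequence)   # Arg-Pro-Gly-NH2
```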
5.2.2. Deconvolution methods of libraries not cleaved from the solid support
The components of the tethered libraries are found in the beads of the solid support as
individual compounds. Consequently, they can be tested as individual substances. It has to be
taken into account, however, that the structure of the compounds present in any particular bead is
unknown. For this reason the deconvolution process has to solve two problems:
1. Identify the bead that contains the component showing the wanted property.
2. Identify the compound tethered to the bead.
The beads containing the individual components of the combinatorial libraries can be, and
are, tested in two different ways:
Both approaches have to make possible the identification of the bead containing the active
compound and the determination of the structure of the wanted compound. As will be shown, both
deconvolution processes need fewer assays than are needed to determine the
activities of compound arrays prepared by parallel synthesis.
Figure 5.26. Identification of the beads that specifically bind to target molecules. The beads
containing specifically binding peptide are colored
In applying the method, the beads containing the full tethered peptide library, or a
fraction of it, are immersed in a solution of a target molecule. The beads containing a peptide that
binds to the target are identified and separated from the rest of the beads, then the peptide they
contain is sequenced. The binding has to be visualized somehow. This can be achieved by
labeling the target molecule before or after the binding experiment. The target protein can be
labeled by attaching to it a colored, fluorescent or radioactive residue. If the target protein is labeled
with a color and the beads are examined through a microscope, the binding beads can easily be
distinguished from the rest of the beads by their color, as shown in Figure 5.26. The beads are colored
because colored target molecules are attached to the peptide molecules of the bead.
The colored beads are manually separated from the rest of the beads, then the attached labeled
target protein is removed from them by washing with an appropriate solvent. After washing,
the sequence of the peptides is determined using an automatic sequencing machine.
A more effective and faster selection process can be applied if the target molecule is
labeled with a fluorescent group. The beads can then be sorted with a fluorescence-activated cell sorting
instrument. A special automatic machine was developed by Meldal21 for sorting fluorescent
beads.
A different approach, infrared thermography, was applied by Taylor and Morken22 to
identify catalysts in non-peptide tethered libraries synthesized by the split-mix procedure. The
method is based on the heat that is evolved in the beads that contain a catalyst when the tethered
library is immersed in a solution of a substrate. The heat increases the temperature of the
catalyst-containing beads. In the beads that do not contain a catalyst no heat is evolved, so their temperature
remains unchanged. Although the increase in temperature is very small, when the beads are
examined through an infrared microscope the catalyst-containing beads appear as bright spots, as
demonstrated in Figure 5.27.
The screened library was an organic one and the beads were encoded. The bright beads
were separated and the identification process ended with determination of their code.
can be used for structure determination. If encoded libraries are used the synthesized compounds
and the codes can be released separately. It is also a possibility to release the content of the beads
and identify the bead containing the active compound in several stages.
The procedure developed at Pharmacopeia23 uses this latter possibility for screening
libraries of small organic compounds. The libraries are prepared applying the binary encoding
technique and using a photolabile linker which allows a two stage release of the organic
substance. After portions of beads are distributed into small containers (Figure 5.28/A), the first
portions of the substances are released by irradiation. The content of each vessel is then
submitted to screening. If one of them proves to be active (marked by + sign in the figure), the
beads of this container are re-distributed into vessels each containing a single bead (Figure
5.28/B). After releasing the second portion of the substances, a second screening identifies the
bead which carried the active substance (marked by + sign). Finally the encoding molecules are
released from the identified bead and determined by electron capture gas chromatography thus
determining the structure of the organic molecule responsible for the biological activity.
Figure 5.28. Identification of the bead containing the bioactive component in two stages.
It is worthwhile to note that this two-stage process needs fewer screening
experiments than the one-stage process, in which all beads are tested individually. The 11
containers of Figure 5.28/A contain a total of 110 beads. The one-stage process would need 110
screening experiments. The two-stage process needs, as shown in the figure, only 21 assays.
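The assay counts can be reproduced with a small calculation. The sketch assumes that the 110 beads are distributed evenly, 10 per container, and that exactly one container proves to be active. The function names are hypothetical.

```python
import math

def one_stage_assays(total_beads):
    """Every bead is screened individually."""
    return total_beads

def two_stage_assays(containers, beads_per_container, active_containers=1):
    """First screen each container, then each bead of the active container(s)."""
    return containers + active_containers * beads_per_container

print(one_stage_assays(110))        # 110
print(two_stage_assays(11, 10))     # 21

# With a single active bead the total is smallest when the number of containers
# is close to the square root of the number of beads.
best = min(range(1, 111), key=lambda c: c + math.ceil(110 / c))
print(best, best + math.ceil(110 / best))
```

If more than one container is active, the second stage has to be repeated for each of them, so the saving shrinks as the hit rate of the library grows.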
5.2.2.3. Examples
The application of the screening methods developed for libraries not cleaved from the
solid support is demonstrated by examples described in the literature.
Identification of bioactive peptides with enzyme-linked colorimetric assay.24 The
experiment is carried out with tethered peptide libraries synthesized on TentaGel resin. Binding
of the target molecule to the beads containing the active peptide is indicated by the blue color of
an indigo derivative that forms when 5-bromo-4-chloro-3-indolyl phosphate (BCIP) is
dephosphorylated by the enzyme alkaline phosphatase. The target protein is biotinylated before
the binding experiment and the alkaline phosphatase is derivatized with streptavidin. The target
protein therefore strongly binds the reporter enzyme that converts BCIP to the blue indigo. If the protein
binds to a bead containing the active peptide and a solution of BCIP is then added, insoluble
indigo forms on the bead, staining its surface blue. The experiment is carried out with 5,000 to
50,000 beads at room temperature or 4 °C. The goal of the experiment is to identify only those
beads that bind the target protein by strong specific interaction. Weak, non-specific interactions
may also occur. In order to make these interactions invisible, the beads are incubated with gelatin,
which coats the beads by weak, non-specific interaction. The strong specific binding of the target
molecule to the beads is expected to displace the gelatin from the surface. Non-specific binding
may, however, also occur with the alkaline phosphatase. In order to eliminate the misleading
effect of this, the beads are prescreened with streptavidin-alkaline phosphatase. The beads are
incubated with streptavidin-alkaline phosphatase and, after washing, with a BCIP
solution. As a result of the interaction of the streptavidin-alkaline phosphatase with some peptides,
blue beads appear, which are manually removed. Following the prescreening and washings, the
beads are ready for the final screening. The beads are incubated with a solution of the biotinylated
target protein for 1-2 hours, during which the target molecules displace the gelatin from some beads and
bind strongly to their peptides. This is an invisible process. In order to make it visible, the
beads, again after washings, are first incubated with streptavidin-alkaline phosphatase, which
attaches to the immobilized target molecules via the strong biotin-streptavidin interaction.
Finally, the beads are washed with BCIP solution. Blue color develops on some beads. The beads
are examined in a Petri dish under a microscope. The blue beads are removed manually. The target
protein-enzyme complex is washed away from the beads with urea solution. The amino acid
sequences of the peptides that show specific binding can be determined without cleaving
them from the beads. Automatic micro-sequencing machines based on the well-known
Edman degradation can be used for this purpose.
Whole cell binding assay.24 In addition to target proteins, binding experiments can be
carried out with intact cells, too. This approach can be used to study cell surface receptors and
their ligands. Sterilized tethered peptide libraries are used in the experiments. The beads
surrounded by binding cells can be distinguished from the inactive ones, as seen in Figure 5.29.
The active beads are again manually separated and sequenced after removal of the attached cells.
Figure 5.29. Cell binding. The large circles are beads, the smaller ones are cells. The cell binding
beads are marked by arrow
References
1. Gy. Takátsy Acta Microbiologica Acad. Sci. Hung. 1955, 3, 191.
2. Á. Furka (1982) Tanulmány, gyógyászatilag hasznosítható peptidek szisztematikus
felkutatásának lehetőségéről (Study on possibilities of systematic searching for
pharmaceutically useful peptides). Unpublished theoretical study written in Hungarian for
internal use, describing the PM synthesis and an iteration strategy for screening of soluble
libraries. Notarized on June 15, 1982, file number 36237/1982. See also paragraph 1.2.
3. Á. Furka Drug Discovery Today 2002, 7, 1.
4. H. M. Geysen, R. H. Meloen, S. J. Barteling Proc. Natl. Acad. Sci. USA 1984, 81, 3998.
5. R. A. Houghten, C. Pinilla, S. E. Blondelle, J. R. Appel, C. T. Dooley, J. H. Cuervo
Nature 1991, 354, 84.
6. K. D. Janda Proc. Natl. Acad. Sci. 1994, 91, 10779.
7. Á. Furka, F. Sebestyén WO 93/24517.
8. C. Pinilla, J. R. Appel, R. A. Houghten, In C. H. Schneider, A. N. Eberle (Eds) Peptides
1992, 1993, ESCOM, Leiden, 65.
9. Á. Furka Drug Development Research 1994, 33, 90.
10. T. Carell, E. A. Winter, J. Rebek Jr. Angew. Chem. Int. Ed. Engl. 1994, 33, 2061.
11. E. Câmpian, M. Peterson, H. H. Saneii, Á. Furka Bioorg. & Med. Chem. Letters 1998, 8,
2357.
12. E. Câmpian, J. Chou, M. L. Peterson, H. H. Saneii, Á. Furka, In R. Ramage, R. Epton (Eds)
Peptides 1996, 1998, Mayflower Scientific Ltd., England, 131.
13. E. Câmpian, H. H. Saneii, Á. Furka PharmaChem 2003, April, 43.
14. A. V. Schally et al. J. Biol. Chem. 1971, 246, 7230.
15. H. Matsuo et al. Biochem. Biophys. Res. Commun. 1971, 43, 1334.
16. C. Patrono, B. A. Peskar (Eds) Radioimmunoassay in Basic and Clinical Pharmacology,
1987, Springer-Verlag, Heidelberg.
17. E. Câmpian, J. Chou, Á. Furka unpublished results.
18. J. A. Smith, J. G. R. Hurrell, S. J. Leach Immunochemistry 1977, 14, 565.
19. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó, In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP, Utrecht, The
Netherlands, 1988, Vol. 5, p 47.
20. K. S. Lam, S. E. Salmon, E. M. Hersh, V. J. Hruby, W. M. Kazmierski, R. J. Knapp
Nature 1991, 354, 82 and its correction: K. S. Lam, S. E. Salmon, E. M. Hersh, V. J.
Hruby, W. M. Kazmierski, R. J. Knapp Nature 1992, 360, 768.
21. M. Meldal Biopolymers (Peptide Science) 2002, 66, 93.
22. S. J. Taylor, J. P. Morken Science 1998, 280, 267.
23. http://www.pharmacopeia.com
24. K. S. Lam, A. L. Lehman, A. Song, N. Doan, A. M. Enstrom, J. Maxwell, R. Liu, In G. A.
Morales, B. A. Bunin (Eds) Methods in Enzymology, Combinatorial Chemistry Part B,
2003, Elsevier Academic Press, 298.
25. J.-M. Lehn Chem. Eur. J. 1999, 5, 2455.
composition. Additional parameters, such as the temperature, the atmosphere and the pressure, are, however,
also very important in library fabrication and should also be varied.
One of the main methods in thin film library fabrication is the vapor deposition technique.
In its variants, such as sputtering, pulsed laser deposition, electron beam evaporation or laser
molecular epitaxy, the target material is evaporated by a high energy source (ion gun, electron
gun, laser) and is then deposited on the substrate. Another fabrication approach is to deliver the
components of the film into small wells in dissolved form and then to evaporate the solvent.
The physical vapor deposition technique makes possible the formation of two kinds of
libraries. In one kind the components are discrete films, each having a definite
composition that differs from film to film. The other kind of library is formed in a single film in
which the composition varies smoothly across the film, so that the composition
differs from point to point. The delivery of components in dissolved form into wells leads to the
formation of a series of discrete films.
In one of the fabrication methods7 spatially addressable deposition can be performed that
in principle resembles the earlier described light-directed, spatially addressable parallel
chemical synthesis invented by Fodor et al.8 Both methods are based on the use of masks that
cover parts of the solid surface at predetermined places. In the fabrication of thin film libraries the
mask prevents deposition of the vapor on the covered parts of the solid surface on which the
films are made.
Figure 6.1. Quaternary masking system.9 Masks (a to d) and positions of the 1024 library
components (e)
The 5 masks (a through d) of a quaternary masking system are shown in Figure 6.1. The
deposition process begins with the use of mask a in the position shown in the figure. After the first
deposition the mask is rotated by 90° and the deposition is continued. This is followed by two
more rotations with a deposition after each rotation. Then the deposition is continued by using the
remaining 4 masks, each with 3 rotations after the depositions. By proper variation of the
precursors and their quantity (thickness of their deposited layers) in the deposition cycles, a
library of 4^5 = 1024 discrete thin films is formed, and the composition of the films differs in every
position (Figure 6.1/c). Such a thin film library can be composed on a 2.4 x 2.5 cm silicon plate.
Composition of a library by application of the masks is a fully combinatorial process and for this
reason is fast.
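The count of 1024 films follows from simple enumeration. The sketch below does not model the geometry of the masks. It only assumes that every site of the chip is exposed under exactly one of the four orientations of each mask, so that a site can be labeled by five digits, each running from 0 to 3. The constant names are illustrative.

```python
from itertools import product

N_MASKS = 5        # masks in the quaternary set
N_ROTATIONS = 4    # orientations per mask: 0, 90, 180 and 270 degrees

# Each deposition step pairs one mask with one orientation.
deposition_steps = N_MASKS * N_ROTATIONS

# Assumption: every site is exposed under exactly one orientation of each mask,
# so a site is labeled by five digits, each 0 to 3.
site_labels = list(product(range(N_ROTATIONS), repeat=N_MASKS))

print(deposition_steps)    # 20
print(len(site_labels))    # 1024
```

Only 20 deposition steps are needed for 1024 different films, which illustrates why the masking process is described as fully combinatorial.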
Each deposited film is composed of 5 layers that may differ in composition and/or
thickness. The multilayer films are post-annealed at an intermediate temperature (200-500 °C) to
homogenize the composition, then heated at elevated temperatures (800-1000 °C) to promote
reaction of the constituents.
A parallel approach can also be applied to fabricate discrete film libraries: the constituents
of the films are transferred in solution into small wells formatted on plates. The solutions that
have different composition or concentration are delivered by automatic fluid dispensers or ink
jetting.10 The series of liquid samples are first heated to a moderate temperature to evaporate the
solvent then an elevated temperature is applied to bring about the reaction of the constituents and
finish the formation of the films.
[Figure labels: targets 1, 2, 3; laser beam; substrate.]
Figure 6.3. Pulsed-laser deposition system with three rotatable targets (1,2,3)
In this deposition system the constituents of the films are deposited in layers that need to
be homogenized by post-annealing. Other multi-target deposition systems are also in use, in
which deposition from the targets occurs simultaneously. In these co-deposition systems a
homogeneous layer is formed.
As already mentioned, it is also possible to fabricate a multi-component library in a single
multi-layer film. The composition in each layer of the film changes continuously along a
gradient.11 Such a deposition system is outlined in Figure 6.4.
[Figure labels: panels a, b and d; targets CaCO3, SrCO3 and BaCO3, each with a shutter; 120° rotations; Ca, Ba, Sr.]
Figure 6.4. Gradient deposition on a triangle in three steps (a, b, c) using three targets (CaCO3,
SrCO3, BaCO3).
The substrate is a LaAlO3 piece in the shape of an equilateral triangle (height 2.5 cm). The bottom
precursor is TiO2. In the first deposition step (a) the target is CaCO3. A shutter moves at constant
speed across the triangle in the direction of the arrow and gradually covers it. The thickness of
the deposited CaCO3 layer increases continuously (from 0 to 1225 Å) and the maximum is at
corner 1. Before SrCO3 deposition in the second step (b) the triangle is rotated anticlockwise
by 120°; otherwise the process is the same. In the last step (c) BaCO3 is deposited, again after a 120°
rotation. In the final film (d) the composition changes smoothly from point to point and the
maximum thicknesses of the CaCO3 (1225 Å), SrCO3 (1475 Å) and BaCO3 (1647 Å) layers are at
corners 1, 3 and 2, respectively. The chip is heated at 400 °C for 24 hours
for homogenization and at 900 °C for 1.5 hours for crystallization.
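A simple model makes the meaning of the gradient concrete. The sketch below assumes that each layer falls off linearly from its maximum at one corner to zero at the opposite edge, as would be expected for a shutter moving at constant speed, and it takes the thicknesses to be in angstroms. A point of the triangle is described by barycentric coordinates, that is, by three weights that sum to 1. The function name and the linear model are assumptions made for illustration.

```python
# Maximum layer thicknesses (taken to be in angstroms) and the corner at which
# each maximum lies, as given in the text.
MAX_THICKNESS = {"CaCO3": 1225.0, "SrCO3": 1475.0, "BaCO3": 1647.0}
MAX_CORNER = {"CaCO3": 1, "SrCO3": 3, "BaCO3": 2}

def layer_thicknesses(b1, b2, b3):
    """Thickness of each layer at a point given by barycentric coordinates.

    b1, b2, b3 are the weights of corners 1, 2, 3 and must sum to 1.
    Assumes each layer falls off linearly from its corner to the opposite edge.
    """
    if abs(b1 + b2 + b3 - 1.0) > 1e-9:
        raise ValueError("barycentric coordinates must sum to 1")
    weight = {1: b1, 2: b2, 3: b3}
    return {name: MAX_THICKNESS[name] * weight[MAX_CORNER[name]]
            for name in MAX_THICKNESS}

print(layer_thicknesses(1.0, 0.0, 0.0))      # corner 1: only CaCO3
print(layer_thicknesses(1/3, 1/3, 1/3))      # centroid: one third of each maximum
```

Under this model every point of the triangle carries a different combination of the three layers, which is what allows a single chip to serve as a continuous library.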
6.1.2. Screening
The methods used for screening inorganic film libraries vary widely, since very
different properties have to be measured. The very large number of different compositions
present in a single chip, as well as the small size of the films, present special difficulties. Special
screening methods have been developed, and are under development, to deal with the problems
involved in the experiments. Non-destructive physical methods, such as optical
measurements, are preferred. For example, X-ray microbeam techniques available at synchrotron radiation
facilities are used with a spot size of 3 x 20 µm.
Some measurements are executed in serial mode. These are relatively slow since each
film has to be measured separately. Other screening determinations can be carried out in parallel.
These methods are much faster since in parallel mode, a large number of films, often the entire
library, can be measured simultaneously. This technique can be used, for example, in screening
of phosphor libraries.
The new film fabrication and screening methods are subjects of intensive research.
Even better and faster methods are expected to appear in both synthesis and screening. But
according to some opinions, even the existing methods make materials research
about 10,000 times faster than the conventional ones.
Detailed characterization of the different screening technologies falls outside the scope of
this book.
[Figure label: high throughput testing.]
When thinking about preparing new catalysts it has to be taken into account that the
possible number of different compositions of the elements is immensely high. It can easily be
calculated, for example, that if 6 elements are used as catalyst components, each of them at
any of 10 different concentrations, the total number of possible compositions is 10^6, one million,
and this is multiplied if different preparation conditions are applied. For this reason it is absolutely
impossible to consider testing all (50-70) elements in a catalyst search.
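The arithmetic behind this statement is short enough to write out. The first part reproduces the number given in the text. The second part is a hypothetical extension that is not in the source: it assumes a pool of 60 candidate elements from which 6 are chosen.

```python
import math

levels = 10      # concentration levels per element
elements = 6     # elements combined in one catalyst

compositions = levels ** elements
print(compositions)                      # 1000000

# Hypothetical extension: choosing the 6 elements from a pool of 60 candidates
# multiplies the count by the number of possible element subsets.
pool = 60
subsets = math.comb(pool, elements)
print(subsets * compositions)
```

Even before preparation conditions are considered, the number of candidate catalysts exceeds anything that could be synthesized and tested exhaustively.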
There are too many parameters that can be varied. This can be expressed by stating that
the parameter space is usually very large in catalyst research. As a consequence, computational
methods need to be applied in order to reduce the number of experiments to be executed. In
combinatorial catalyst research usually three kinds of activities are amalgamated (Figure 6.5).
Catalyst discovery is a multi-step iterative process. It starts with library design that
involves data-mining from the literature and considers many variables like precursor materials
and their relative concentrations, support materials, mixing conditions, calcination temperatures,
the reactor applied in testing and analytical tools. The designed library is then synthesized and
tested. In these phases automated processes are usually applied. In a single step only a small
fraction of the huge parameter space can be explored. For this reason the experimental data
coming out from the first step are usually considered as preliminary results that are the basis of
further iterations. The further experiments are guided by computational methods that can make
predictions based on the already existing experimental data (Figure 6.6.).
[Figure labels: library design; preparation; testing; activity; information mining.]
Figure 6.6. Iterative scheme of catalyst library design, fabrication and testing
In the informatics platforms used so far the following methods, some of them developed in the
area of artificial intelligence, have been applied:
Artificial neural network13
Genetic algorithm14
Holographic research strategy15
Support vector machines30
Decision trees31
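To make the iterative scheme more tangible, a minimal genetic algorithm is sketched below. It is not the procedure of any cited work. The activity function is a made-up placeholder for an experimental test, and the population size, number of parents kept and mutation rate are arbitrary assumptions. A catalyst is represented by six concentration levels, each between 0 and 9.

```python
import random

N_ELEMENTS, N_LEVELS = 6, 10   # same sizes as the example in the text

def measured_activity(catalyst):
    """Placeholder for an experimental test; a made-up smooth function."""
    target = (3, 7, 1, 5, 9, 0)            # hypothetical optimum
    return -sum((c - t) ** 2 for c, t in zip(catalyst, target))

def next_generation(population, rng, n_keep=4, mutation_rate=0.2):
    """Keep the best catalysts, then fill up with mutated crossovers."""
    ranked = sorted(population, key=measured_activity, reverse=True)
    parents = ranked[:n_keep]
    children = list(parents)               # elitism: the best are carried over
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, N_ELEMENTS)
        child = list(a[:cut] + b[cut:])    # one-point crossover
        for i in range(N_ELEMENTS):
            if rng.random() < mutation_rate:
                child[i] = rng.randrange(N_LEVELS)
        children.append(tuple(child))
    return children

rng = random.Random(0)
population = [tuple(rng.randrange(N_LEVELS) for _ in range(N_ELEMENTS))
              for _ in range(12)]
best_per_generation = []
for _ in range(10):
    best_per_generation.append(max(map(measured_activity, population)))
    population = next_generation(population, rng)
print(best_per_generation)
```

Each pass of the loop corresponds to one cycle of design, preparation and testing. Because the best catalysts are carried over unchanged, the best activity found can never decrease from one generation to the next.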
[Figure labels: catalyst film; heating laser beam.]
Figure 6.7. Testing the activity of thin film catalyst library by analysis of the
reaction products by mass spectrometry
[Figure labels: Ir, Bi, V, Ti, Zn, Rh, Gd, Pd, Cr, Fe, Co, Ni, Cu, Ag, Er, Pt.]
The advantage of using the IR thermography is that the activities of the library members
can be determined in parallel, that is, in a single experiment. This is much faster than the serial
determination. The disadvantage is that nothing is known about the products and the selectivity
of the catalysts.
Microreactors. The traditional devices for testing new catalysts are single laboratory
reactors. In combinatorial catalyst research these single reactors are replaced by a parallel array
of microreactors. The catalysts to be tested are prepared in a series of small vials using automatic
liquid dispensing systems. After evaporation of the solvents the residues are calcined, ground,
then filled into the microreactors.
A possible arrangement of 16 microreactors is shown in Figure 6.9. The reactant
gas mixture is distributed among the 16 heated microreactors. A constant gas stream flows
through all reactors and finally leaves at the outlet. A probe is sequentially positioned at each of
the reactors and directs the reacted gas mixture to the analyzer. The analysis is usually done by
mass spectrometry, gas chromatography or a combination of both.
[Figure labels: heating; positionable probe; inlet; outlet; reactors, end view; side view; to MS, GC.]
This arrangement is advantageous because both the activity and the selectivity of the
catalysts can be determined by MS and/or GC. The effect of other parameters, such as the
temperature, can also be tested. The disadvantage of this experimental setup is that the analysis
can be executed only sequentially, which is relatively slow.
A different experimental arrangement and the application of a different analytical method
make possible the parallel analysis of the products of all reactors (Figure 6.10).19
The device consists of a bundle of 49 catalyst cartridges each attached by a gold plated
nozzle to a 20 cm long analysis tube. The nozzle has at its center a small hole that allows the
gases to pass. At the end part of the bundle of the analysis tubes a CaF2 window, transparent to
IR radiation, is mounted. The distance from the end of the tubes is only 1 mm. Outside, at the
CaF2 window there is a semitransparent mirror that directs the IR radiation into the analysis
tubes.
The gaseous reaction mixture is fed into the catalyst cartridges through a common inlet.
The gas passes through the catalyst cartridges then enters through the nozzles into the analysis
tubes and finally leaves the apparatus in combined form at the outlet. The products are analyzed
while they are in the analysis tubes. The IR radiation reflected by the gold coated nozzles is
simultaneously determined by a focal plane array (FPA) detector.20 The FPA detector contains an
array of small detectors. The density is several thousand elements in a few square millimeters and
each element can record a full IR spectrum.
Gold-coated nozzle
Figure 6.10. Schematic view of 49 parallel reactors analyzing the products with IR measurement
in parallel mode. a: catalyst cartridge (1 cm), b: analysis cell (20 cm, diameter 4 mm), c: inlet of
gaseous reaction mixture, e: bundle of 49 catalyst cartridges, f: heater, g: bundle of analysis
tubes, h: outlet of gaseous products, i: CaF2 window, j: semitransparent mirror, k: IR beam
The parallel microreactor systems can be applied in monolith form, too.21 The catalyst
active layer is deposited on the walls of the monolith by a wash-coat procedure.
Catalysts on beads. As described by Schunk and his colleagues, the catalysts can be
deposited on beads and tested in a new type of microreactor.22,23 As carriers of the catalysts,
uniform -Al2O3, -Al2O3, SiO2 or TiO2 beads are used. Their diameter is 1 mm and each
bead carries a catalyst of different composition. The catalysts are deposited on the beads by the wet
impregnation method using automated liquid dispensing. The beads are dried at 80 °C for 16 h
and calcined at 420 °C for 3 h in air.
The catalysts carried by the beads are tested in a special single bead reactor developed
for this purpose. The reactor has two identical parts: the base and top (Figure 6.11.).
The wells in the upper and lower part of the microreactor are composed of silicon
membranes. The wells are etched into the membranes then the membranes are combined by
silicon fusion bonding. The number of wells in a microreactor is 384 or 625. After filling the
reactors with beads the upper and lower parts of the reactor are pressed together or are
permanently bonded.
[Figure labels: top; bead; base; inlet; positionable sampling capillary; to MS; panels a and b.]
Figure 6.12. The one bead microreactor in the flange. a: top view, b: schematic side view
The reactors are mounted into a stainless steel flange (Figure 6.12). The flange system
provides sealing and heating up to 450 °C. It is connected to the reaction gas feeding system and
provides a continuous flow of the gas through all individual reactor wells. It also contains a
sampling capillary that is sequentially and automatically positioned at the outlet of the reactor
wells and transfers the samples to a scanning mass spectrometer for analysis. The analysis time
per bead is about 25-80 s.
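The quoted analysis time per bead translates into the following total times for one pass over a complete reactor. The estimate assumes that the capillary visits every well once and that positioning adds no extra time. The function name is hypothetical.

```python
def total_analysis_hours(n_wells, seconds_per_bead):
    """Time for one sequential pass of the sampling capillary over all wells."""
    return n_wells * seconds_per_bead / 3600.0

for wells in (384, 625):
    fastest = total_analysis_hours(wells, 25)
    slowest = total_analysis_hours(wells, 80)
    print(f"{wells} wells: {fastest:.1f} to {slowest:.1f} h")
```

Under these assumptions a full reactor is analyzed in a few hours to roughly half a day, which shows the cost of sequential sampling even when the reaction itself runs in parallel.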
The practice of preparing catalyst libraries on inorganic beads offered the possibility
of speeding up the process by using the split-mix method.2-4 The first such approach was patented
in 2000 (ref. 24) and applied for testing the single bead microreactor (ref. 25). A Mo-Bi-Co-Fe-Ni library was
prepared on 3000 -Al2O3 beads using the following solutions: (NH4)6Mo7O24·4H2O,
[Figure labels: BiIII; 0.0, 0.002, 0.1, 1.0; drying; calcination.]
Figure 6.14. Two possibilities for plotting repeating quantities along an axis
The same quantities are arranged in the two plots but in different order. In plot a there are
large changes in the quantities along the horizontal axis. In plot b, however, the quantities are
changing smoothly in a wave-like manner. Any given quantity differs from its neighbors by only
one unit. In HRS this second plotting form is applied.
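The difference between the two arrangements can be illustrated with a short sketch. It is an assumption about how such orderings can be generated and does not reproduce Figure 6.14 itself. In the wave-like ordering every second repeat of the levels is reversed, so neighboring values differ by at most one unit, while the same set of values is used in both arrangements.

```python
def sawtooth(levels, repeats):
    """Levels restart from the lowest value in every repeat (large jumps)."""
    return [lvl for _ in range(repeats) for lvl in levels]

def wave(levels, repeats):
    """Every second repeat is reversed, so neighbors never jump by more than one step."""
    out = []
    for r in range(repeats):
        out.extend(levels if r % 2 == 0 else reversed(levels))
    return out

def largest_step(sequence):
    """Largest difference between two neighboring values."""
    return max(abs(b - a) for a, b in zip(sequence, sequence[1:]))

levels = [0, 1, 2, 3]
print(sawtooth(levels, 3))   # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
print(wave(levels, 3))       # [0, 1, 2, 3, 3, 2, 1, 0, 0, 1, 2, 3]
print(largest_step(sawtooth(levels, 3)), largest_step(wave(levels, 3)))   # 3 1
```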
Let us suppose that in the hypothetical experiment six precursors are used, represented by A,
B, C, D, E and F. Their concentration levels are shown in Table 6.1.
A B
0.0 0.0
0.2 0.2
0.4
C
0.0
0.2
0.4
0.6
D E
0.0 0.0
0.2 0.2
0.5
1.0
F
1.0
0.2
0.5
above a single fixed level of B so forming three full waves. The full combination of the levels
of variables along the X-axis leads to 24 data points along the X-axis.
The remaining D, E and F variables are similarly plotted along the Y-axis leading to 24
data points. Eventually all of the 574 catalyst compositions appear in the 2D holographic
representation.
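The number of data points along each axis can be reproduced by enumerating level combinations. The level counts used below are assumptions: only their products, 24 for each axis, are stated in the text, and the assignment of the counts to the individual variables does not affect the result.

```python
from itertools import product

# Assumed numbers of concentration levels; only their products (24 and 24)
# are stated in the text.
x_levels = {"A": 3, "B": 2, "C": 4}
y_levels = {"D": 4, "E": 2, "F": 3}

def axis_points(level_counts):
    """All level combinations of the variables assigned to one axis."""
    return list(product(*(range(n) for n in level_counts.values())))

x_points = axis_points(x_levels)
y_points = axis_points(y_levels)

print(len(x_points), len(y_points))      # 24 24
print(len(x_points) * len(y_points))     # 576
```

A full 24 by 24 grid has 576 cells, whereas the text quotes 574 catalyst compositions. The excerpt does not show how the two numbers are reconciled.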
[Figure panels, axis-variable orderings: C B A / F E D; B A C / E D F; A B C / D E F; B C A / E F D; C B A / F E D.]
Figure 6.16. The holographic presentations after the two dimensional transformations. Black
rectangle: location of the best catalyst in the transformed representation, large enhanced
rectangle: catalyst compositions selected for the next catalyst generation, small enhanced
rectangle: best experimental catalyst composition within the catalyst generation
6.3. Polymers
Polymers are an important class of materials. Their application is so widespread that our
life today could not be imagined without them. They are used as structural, packaging and
coating materials, they are components of our clothes and they are applied even in
microelectronics and nanotechnology. Their properties depend not only on composition but to a
high degree on the conditions of their processing, which are affected by a large number of
variables. The combinatorial methods introduced in this area help to determine the influence of
these variables faster. In this respect the polymer libraries prepared in the form of
continuous thin films are very important. Properties of the films, like
dewetting, phase behavior, surface morphology and crystallization, and their dependence on
these variables can be studied by optical means. Studying continuous gradient films instead of
using conventional approaches makes the research faster, cheaper and more successful. Below,
the fabrication of some thin film libraries is described.
Figure 6.17. Principle of the flow coating process generating continuous gradient thickness films.
S: substrate, : blade angle, G: height of the blade above the substrate (typically from tens of
microns to hundreds of microns), H: thickness of the wet film, h: thickness of the dry film. The
substrate is moving in the direction of the white arrow.
(Reprinted with permission from C. M. Stafford et al. Rev. Sci. Instrum. 2006, 77, 023908)
beginning the solution is rich in component B and at the end is rich in polymer A. So at the end
the syringe contains a gradient along its length.
In the second step (Figure 6.18/b) the content of the syringe is deposited on a substrate as
a stripe. The composition of the deposited stripe forms a continuous gradient.
The third step (Figure 6.18/c) is spreading the composition-gradient stripe on the substrate
orthogonal to its direction by using a knife-edge coater. After the solvent evaporates a continuous
linear gradient film remains on the substrate. The remaining solvent is removed under vacuum
during annealing.
High-throughput screening. The simplest way to study the one or two dimensional
gradient libraries is optical microscopy. Figure 6.20 exemplifies this. A digital camera coupled
to an optical microscope takes 1024x1024 pixel images and sends them to a computer for analysis.
The computer also controls the x-y movement of the sample stage over a predetermined grid that
divides the sample area into a virtual array of cells. The cells are photographed one after
another and the magnified images are sent to the computer.
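The division of the sample area into a virtual array of cells can be illustrated with a short Python sketch. It is not the control software of the instrument: the function name grid_cells, the sample dimensions and the grid size are hypothetical, and a simple row-by-row visiting order is assumed.

```python
def grid_cells(width_mm, height_mm, n_cols, n_rows):
    """Yield (row, col, x_center, y_center) for each cell, row by row.

    The sample area is divided into n_rows x n_cols equal cells and the
    stage is assumed to stop at the center of each cell.
    """
    dx = width_mm / n_cols
    dy = height_mm / n_rows
    for row in range(n_rows):
        for col in range(n_cols):
            yield row, col, (col + 0.5) * dx, (row + 0.5) * dy


# Hypothetical 40 mm x 20 mm sample area divided into 4 x 2 cells.
for row, col, x, y in grid_cells(40.0, 20.0, 4, 2):
    print(f"cell ({row}, {col}): move stage to x = {x} mm, y = {y} mm")
```

In a real setup each stop would be followed by image capture, and the cell indices would be stored with the image so that every picture can be related to its position in the gradient.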
As an example, Figure 6.21 shows the result of a dewetting experiment carried out with a
thickness-temperature gradient polystyrene library. The thickness range was from 33 to 90 nm.
The endpoint temperatures were 135.0 ± 0.5 and 75.0 ± 0.1 °C over 40 mm (gradient 2 °C/mm).
The figure is a composite picture.
Figure 6.22 shows magnified images of the A, B and D boxed regions of Figure 6.21.
These photos illustrate how the structures within the library depend on thickness and temperature.
The methods demonstrated above represent only a few examples of the numerous
combinatorial approaches applied in the field of polymer research. Even these few examples
demonstrate the applicability of combinatorial methods in this area.
References
1. J. J. Hanak J. Mater. Sci. 1970, 5, 964.
2. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó, In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP. Utrecht, The
Netherlands, 1988, Vol. 5, p 47.
3. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Proceedings of the 10th International
Symposium of Medicinal Chemistry, Budapest, Hungary, 1988, p 288, Abstract P-168.
4. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Int. J. Peptide Protein Res. 1991, 37, 487.
5. H. M. Geysen, R. H. Meloen, S. J. Barteling Proc. Natl. Acad. Sci. USA 1984, 81, 3998.
6. X.-D. Xiang, X. Sun, G. Briceno, Y. Lou, K.-A. Wang, H. Chang, W. G. Wallace-Freedman,
S.-W. Chen, P. G. Schultz Science 1995, 268, 1738.
7. J. Wang, Y. Yoo, C. Gao, I. Takeuchi, X. Sun, H. Chang, X.-D. Xiang, P. G. Schultz
Science 1998, 279, 1712.
8. S. P. A. Fodor, J. L. Read, M. C. Pirrung, L. Stryer, A. T. Lu and D. Solas Science 1991,
251, 767.
9. X.-D. Xiang In I. Sucholeiki (Ed) High Throughput Synthesis, Principles and Practices,
Marcel Dekker Inc. 2000, 231.
10. J. D. Hewes, D. Kaiser, A. Karim, E. Amis Combinatorial Chemistry
http://polymers.msel.nist.gov/.
11. H. Chang, X.-D. Xiang In I. Sucholeiki (Ed) High Throughput Synthesis, Principles and
Practices, Marcel Dekker Inc. 2000, 251.
12. Selim Senkan Angew. Chem. Int. Ed. 2001, 40, 312.
13. Z. Hou, Q. Dai, X. Wu, G. Chen Appl. Catal. A.: General 1997, 161, 183.
14. V. Nissen Evolutionäre Algorithmen, Deutscher Universitäts-Verlag, Bamberg, 1994.
15. L. Végvári, A. Tompos, S. Gőbölös, J. L. Margitfalvi Catal. Today 2003, 81, 517.
16. P. Cong, R. D. Doolen, Q. Fan, D. M. Giaquinta, S. Guan, E. W. McFarland, D. M.
Poojary, K. Self, H. W. Turner, W. H. Weinberg Angew. Chem. Int. Ed. 1999, 38, 484.
17. M. Orschel, J. Klein, H. W. Schmidt, W. F. Maier Angew. Chem. Int. Ed. 1999, 38, 2791.
18. F. C. Moates, M. Somani, J. Annamalai, J. T. Richardson, D. Luss, R. C. Willson Ind.
Eng. Chem. Res. 1996, 35, 4801.
19. P. Kubanek, O. Busch, S. Thomson, H. W. Schmidt, F. Schüth J. Comb. Chem. 2004, 6,
420.
20. C. M. Snively, G. Oskarsdottir, J. Lauterbach Angew. Chem., Int. Ed. Engl. 2001, 40,
3028.
21. M. Lucas, P. Claus Appl. Catal. A.: General 2003, 254, 35.
22. S. A. Schunk, C. Baltes, J. Klein OIL GAS European Magazine 2/2005, 77.
23. T. Zech, G. Bohner, O. Laus, J. Klein Rev. Sci. Instrum. 2005, 76, 062215-1.
24. WO 002002043860A2
25. J. Klein, T. Zech, J. M. Newsam, S. A. Schunk Appl. Catal. A.: General 2003, 254, 121.
26. Y. Sun, B. C. Chan, R. Ramnarayanan, W. M. Leventry, T. E. Mallouk, S. R. Bare, R. R.
Willis J. Comb. Chem., 2002, 4, 569.
27. C. M. Stafford, K. E. Roskov, T. H. Epps III, M. J. Fasolka Rev. Sci. Instrum. 2006, 77,
023908.
28. J. C. Meredith, A. Karim, E. J. Amis MRS Bulletin April 2002.
29. J. C. Meredith, A.P. Smith, A. Karim, E.J. Amis, Macromolecules 2000, 33, 9747.
30. L. A. Baumes, J. M. Serra, P. Serna, A. Corma J. Comb. Chem., 2006, 8, 583.
31. L. A. Baumes, M. Moliner, A. Corma, QSAR & Comb. Sci., 2007, 26, 255.
According to the rule, if any two of the above conditions are satisfied, poor
absorption or permeation of the compound is likely.
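The rule can be expressed as a simple count of the satisfied conditions. The Python sketch below is illustrative only: the four thresholds are the commonly quoted ones of Lipinski's rule of five (molecular weight above 500, log P above 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors), while the function names and the example values are hypothetical.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of the four rule-of-five conditions are satisfied."""
    conditions = [
        mol_weight > 500,   # molecular weight above 500
        log_p > 5,          # calculated log P above 5
        h_donors > 5,       # more than 5 hydrogen bond donors
        h_acceptors > 10,   # more than 10 hydrogen bond acceptors
    ]
    return sum(conditions)


def poor_absorption_likely(mol_weight, log_p, h_donors, h_acceptors):
    """Apply the rule: two or more satisfied conditions flag the compound."""
    return lipinski_violations(mol_weight, log_p, h_donors, h_acceptors) >= 2


# Hypothetical descriptor values for two virtual library members.
print(poor_absorption_likely(350.0, 2.1, 2, 5))
print(poor_absorption_likely(650.0, 6.2, 3, 8))
```

Such a filter is cheap enough to be applied to every member of a virtual library before any synthesis is planned.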
Blood-brain barrier penetration is another property that needs to be considered.
Methods of quantitative structure-activity relationship (QSAR), including artificial neural
networks and genetic algorithm based approaches, are also widely applied.
Other software packages used in library design and drug search are CombiLibMaker,
ChemEnlighten, TOPKAT and ChemSpace.
CombiLibMaker performs virtual combinatorial chemistry. Libraries can be defined and
enumerated with full control of stereochemistry. The generated virtual libraries can be stored in
databases for subsequent searching and retrieval. The libraries can be submitted to virtual
screening including docking. CombiLibMaker reads and writes all common 2D and 3D
structural database formats.
ChemEnlighten is a decision support program for scientists who work with lists of
compounds to set priorities for synthesis, screening and purchase. It provides access to vital
information and analysis tools. Different databases can be searched and standard descriptors can
be calculated, such as molecular weight and the number of hydrogen bond donors and acceptors.
It works with ClogP/CMR to calculate log P and molar refractivity, and it also calculates
molecular connectivity, shape and topology metrics. Different subsets can be quickly selected.
TOPKAT offers quantitative structure-toxicity relationship (QSTR) models that predict
toxicity of a compound solely from its structure.
ChemSpace helps to decide which chemistry should be used in the synthesis of a library
and which chemistry will most likely result in activity. A typical virtual library contains at least
50 million compounds. They can be screened according to their physicochemical properties,
novelty, drug-likeness, diversity, therapeutic relevance and synthetic feasibility.
There are other kinds of software that help the combinatorial chemist in the practical
realization of library synthesis. What synthesis route to choose? Are the starting materials
available? Where to buy them at a reasonable price? These are important questions. Software of
MDL Information Systems, Inc. helps to solve these problems.
The Available Chemicals Directory (ACD) is a very important database in which the starting
materials and reagents can be found. The ACD lists 435,000 chemicals that can be
purchased from 680 suppliers. The database is continually updated. Once the list of starting
compounds and reagents is selected, the ACD helps to identify and locate the commercial sources, and
side-by-side comparisons can be made concerning purity, quantity and price.
With MDL ISIS ACD Finder the compounds can be searched by structure, name and
formula. Figure 7.1 shows 3-ethyl butylacetate as an example. The ACD also links to the Pure
Substance Database, which provides safety, hazard and regulatory information. The structures can
be entered using ISIS Draw. This is a structure drawing program that can also be used in the
construction of the library. It can be downloaded free of charge from the home page of MDL.
Other software of MDL provides fast access to synthetic methodology information by
connecting directly to chemical literature.
Software is also available for searching pharmacology, safety,
metabolism and toxicology information:
MDL Drug Data Report contains current bioactivity findings and newly launched
developmental drugs.
MDL Comprehensive Medicinal Chemistry contains searchable 3D models plus important
biochemical properties including drug class, logP and pKa.
OHS Hazard Communication contains full-service tools for employee safety etc.
MDL Metabolite Database is the world's largest and most comprehensive collection of xenobiotic
transformations compiled from the literature.
MDL Toxicity Database contains the complete content of the Registry of Toxic Effects of
Chemical Substances database.
Odyssey is a high-throughput sample storage and retrieval system. The system is equipped
with an identification system and barcode tracking, enabling programmed access to each
microwell plate, and can process over 400,000 plates a year (Figure 7.2).
An LCD touch-screen provides an intuitive user interface. The systems are fully
automatic and include comprehensive safety features assuring reliable and safe operation. Data is
accessed using a customized database. The Odyssey is available in 3 configurations suitable for
storage and retrieval of 2,500, 5,000 and 10,000 plates.
7.1. Software companies
The companies listed below develop and commercialize software.
The addresses of their home pages are given beside the company names.
Aber Genomic Computing (www.abergc.com).
It is an informatics company based in Wales, UK. AberGC provides novel data mining,
scheduling and predictive modeling solutions based upon evolutionary computing, machine
learning and other supervised learning techniques. Their new product, gmax-bio, is the first
commercial package to fully utilize these techniques and is designed for all aspects of drug
discovery research. Gmax-bio is a novel informatics package based on genomic computing
techniques which uniquely utilize Darwinian methods of natural selection to evolve mathematical
algorithms to rapidly solve complex data-mining and predictive modeling problems.
Accelrys (www.accelrys.com).
It is a leading provider of software for biologists, chemists, and materials scientists. Its products cover
computation, simulation, and the management and mining of scientific data.
Afferent Systems (www.afferent.com).
geneticXchange (www.genetixchange.com).
It is a software product company that produces the K1 System Data Integration Middleware
Platform for any Biotech needing a solution to the biological data integration challenge.
Gensym Corporation (www.gensym.com).
The company is a leading supplier of software products and services for intelligent real-time
systems that help organizations manage and optimize complex dynamic operations. Common
applications include quality management, process optimization, dynamic scheduling, network
fault management, energy and environmental management, and abnormal situation management.
IBM (www-3.ibm.com/solutions/lifesciences/solutions.html).
The company offers technology infrastructure in high-performance computing, data integration,
knowledge management, storage, e-business, and information services. IBM's offerings
include advanced storage management and a world-renowned computational biology
center.
IDBS (www.idbs.co.uk).
Its specialized applications are used to acquire, manage, integrate and visualize chemical and
biological data ranging from the large amounts of data generated in High Throughput Screening
and combi-chem programs, through multiple IC50 determinations and profiling, to the complex
experimental protocols of toxicology studies.
Labtronics (www.labtronics.com).
The company is recognized for laboratory automation and instrument interfacing. Labtronics has
an Innovative Software Solution.
LabVantage Solutions (www.labvantage.com).
The company is a provider of state-of-the-art software, implementation services, and consulting
to leading discovery-oriented, conventional research, and quality control laboratories. It offers
configurable, industry-specific solutions for a variety of industries including high throughput
screening, genomics, proteomics, pharmaceuticals, oil and gas, process chemicals, food and
beverage, environmental, and forensics.
Managed Ventures (www.menagedventures.com).
The company has developed custom application components in Java for High Throughput
Screening (HTS), compound registration, inventory and proteomics. Services include the
implementation of solutions for drug discovery using pre-built components to rapidly deliver
working web-based applications. HTS and other informatics applications have been integrated for
Managed Ventures clients in less than 8 weeks using open source Java, XML, SOAP and any
JDBC-compliant database (Oracle, DB2, SQL Server, mySQL).
MatriCal (http://www.matrical.com/).
The company specializes in microwell plates and automated sample management and storage
solutions. The MatriStore is a compact, economical, climate controlled compound management
system that supports multiple sample formats, including 96 and 384 mini-tubes and 96 to 1536
microplates. Storage capacity ranges from 750,000 to 40 million samples, with automatic
sample retrieval in plate or individual sample format.
Index
223 Sample Changer, 46
A
ACT 357, 67
Analyst GT, 123
anchors, 22
Apex 396, 43
Available Compounds Database, 173-174
B
Benz, 1
C
De Witt, S. H., 33
deconvolution, 60, 124-138
DiverseSolutions, 172
diversity descriptors, 172
fabrication
of films by evaporating solutions, 153
of polymer films, 164
of thin films, 151-154
fluorous separations, 40
Fodor, S. P. A., 80, 81, 84, 152
Ford, H., 1
Frank, R., 32, 36, 37
Furka, Á., 5, 13, 27, 55, 92, 93, 100, 104,
130
G
H
Hanak, J. J., 151, 155
partial, 63-80
piperazine 2-carboxamide, 115-117
polymer, 164-168
library
temperature-gradient, 166
thickness gradient, 164-165
unusual, 79
virtual, 66, 104-108, 172
phage display, 86
thin film, 151-155
light directed synthesis, 84-86
linker, 20
Lipinski, A., 172
Lipinski's Rule, 172
Kéri, Gy., v
I
ISIS Draw, 173
iteration method, 124-129, 139-140
J
Janda, K. D., 124
LabMate, 33
Levassor, 1
library
amino acid tester tester application, 143
amino acid tester tester preparation,
141-143
amino acid tester, 135-137
benzimidazole, 110-115
catalyst prepared by split-mix, 161
catalyst, 156-164
cherry picked, 104-110
combinatorial dynamic, 138
composition-gradient, 165
design product based approach, 172
design reactant based approach, 172
design, 171-176
dynamic combinatorial, 138
inorganic, 151-155
occurrence, 144
omission application of, 140-141
omission, 133-135, 140-141
organic, 61-63
O
Ohlmeyer, 63
OHS Hazard Communication, 175
Olds, Ransome Eli, 1
one bead-one product, 1
ORGSYN Database
Panhard, 1
Parallel evaporation module, 43
PEG, 19
Peugeot, 1
Pinilla, S. E, 130
planning experiments, 71-75
polymers, 164-168
positional scanning, 129-133
application, 145
preparation of catalyst libraries, 157
protecting group, 22
Alloc, 114-117
BOC, 23
Fmoc, 32
nitro, 25
Nvoc, 84
trityl, 24
Z, 23-24
PubChem, 175
pulsed-laser deposition, 153
Pure Substance Database, 173
Q
Quad 3+ system, 47
R
Reference library of Synthetic Methodology,
175
resin
hydroxymethyl, 21
Merrifield, 20-21
Rink amide, 21
trityl chloride, 20
Wang, 21
Tentagel, 19
T
Takátsy, Gy., 29, 31, 122
Tanimoto, 172
thin film catalyst arrays, 157
TOPKAT, 173
Toxicity Database, 175