
Árpád Furka

Combinatorial Chemistry
Principles and Techniques


Published by Árpád Furka in electronic form

Budapest 2007

© Árpád Furka, 2007

Preface
Combinatorial technologies, invented in the 1980s, made it possible to produce new compounds in practically unlimited numbers. New strategies and technologies have also been developed that make it possible to screen very large numbers of compounds and to identify the useful components of mixtures containing millions of different substances. This dramatically changed the drug discovery process in the pharmaceutical industry and the way researchers design their experiments. Instead of preparing and examining a single compound, families of new substances are synthesized and screened. In addition, combinatorial thinking and practice have proved useful in areas outside pharmaceutical research, for example in the search for more effective catalysts and in materials research.
Combinatorial chemistry has become an accepted new branch of chemistry. It is the subject of numerous books, journals, international conferences and university courses. This book is written for university students and young researchers. The author feels it is important to make it freely available to all potential readers. For this reason the book is published exclusively in electronic form, which can be downloaded free of charge from appropriate Web sites.
The author wishes to express his appreciation to Dr. József L. Margitfalvi of the Central Chemical Research Institute of the Hungarian Academy of Sciences and Dr. György Kéri of Gedeon Richter Ltd, Budapest, for reading parts of the manuscript and for their valuable suggestions.
The mother tongue of the author is Hungarian. Despite all efforts, the text certainly contains grammatical errors. Correcting them would of course be important, and the help of readers in this respect would be highly appreciated. If you can help, please contact the author by e-mail: afurka@szerves.chem.elte.hu.
Árpád Furka


Table of Contents
Preface
Table of Contents
1. Introduction
1.1. Birth of the combinatorial approach
1.2. The translated version of the document notarized in 1982
1.3. Publication of the split-mix combinatorial synthesis
References
2. The solid phase synthesis
2.1. Solid supports
2.1.1. Crosslinked polystyrene
2.1.2. Polyethylene glycol (PEG) grafted supports
2.1.3. Inorganic supports
2.1.4. Non-bead form supports
2.2. Linkers, anchors
2.3. Protecting groups
2.3.1. Protection of amino groups
2.3.2. Protection of carboxyl groups
2.3.3. Protection of other functional groups
2.3.4. Coupling reagents for peptide synthesis
2.4. Solid phase synthesis of organic molecules
2.5. Solid phase reagents and scavenger resins in solution phase synthesis
References
3. Parallel synthesis. Synthesis of compound arrays based on saving reaction time
3.1. The parallel synthesis
3.1.1. The multipin method of Geysen
3.1.2. The SPOT technique of Frank
3.1.3. Other devices for parallel synthesis
3.1.4. Parallel synthetic methods with reduced number of operations
3.1.4.1. Synthesis of oligonucleotides on paper discs
3.1.4.2. The tea-bag synthesis
3.2. The Ugi multicomponent reactions
3.3. Solution phase combinatorial synthesis
3.3.1. Dendrimer supported synthesis
3.3.2. Separations using fluorous tags and fluorous solvents
3.3.3. Application of solid phase reagents
3.3.4. The use of scavengers in solution phase reactions
3.4. Automation in parallel synthesis


3.4.1. Automatic parallel synthesizers
3.4.2. Quality control
3.4.3. Parallel purification
3.4.4. Manufacturers of laboratory robots
References
4. Combinatorial synthetic methods
4.1. Combinatorial synthesis on bead-form resin
4.1.1. The split-mix synthesis
4.1.1.1. The key features of the split-mix synthesis
4.1.1.2. Encoding of beads in the synthesis of organic libraries
4.1.1.3. Realization of the split-mix synthesis
4.1.1.4. Automation of the split-mix synthesis
4.1.1.5. Preliminary considerations when planning experiments with peptide libraries
4.1.1.6. Full and partial libraries
4.1.1.7. Unusual partial libraries
4.1.1.8. Binary synthesis using the split-mix procedure
4.1.2. Combinatorial synthesis using amino acid mixtures
4.2. Combinatorial synthesis using soluble support
4.3. Combinatorial synthesis on solid surface
4.4. Combinatorial peptide synthesis by biological methods
4.5. Combinatorial synthesis using macroscopic solid support units
4.5.1. Encoding by attached labels. The radiofrequency and optical encoding methods
4.5.2. Units without labels. Encoding by position in space
4.5.2.1. The Encore technique
4.5.2.2. The String Synthesis
4.5.2.3. String synthesis of cherry picked libraries
4.6. Examples
4.6.1. Split-Mix Synthesis of an encoded benzimidazole library
4.6.2. Synthesis of a 10,000 member piperazine 2-carboxamide library by Directed Sorting
4.6.3. Synthesis of two libraries on one support
References
5. Screening methods
5.1. High throughput screening of arrays of individual compounds
5.2. Screening of combinatorial libraries. Deconvolution methods
5.2.1. Deconvolution methods for dissolved libraries
5.2.1.1. The iteration method
5.2.1.2. Positional scanning
5.2.1.3. Omission libraries


5.2.1.4. The amino acid tester libraries
5.2.1.5. Other methods for identification of the bioactive component of combinatorial libraries
5.2.1.6. Dynamic combinatorial libraries
5.2.1.7. Examples
5.2.2. Deconvolution methods of libraries tethered to the solid support
5.2.2.1. Screening of combinatorial libraries in tethered form
5.2.2.2. Screening of combinatorial libraries by releasing the content of individual beads into solution
5.2.2.3. Examples
References
6. Combinatorial methods in materials and catalyst research
6.1. Inorganic materials
6.1.1. Preparation of thin film libraries
6.1.2. Screening
6.2. Heterogeneous catalysts
6.2.1. Fabrication and testing of catalyst libraries
6.2.2. Catalyst library design
6.3. Polymers
References
7. Computational aspects of library design and synthesis
7.1. Software companies
References
Index


1. Introduction
The discovery of new materials has played an important role in the history of mankind. Many newly discovered materials affected everyday life, and the impact of some was so decisive that long historical eras were named after them: the Bronze Age after bronze, for example, and the Iron Age after iron.
Life today is also largely shaped by the materials we use. Our standard of living could not be the same without semiconductors, insulators, adhesives, synthetic fibers, drugs, pesticides, paints, etc. To improve our lives, more and more useful materials and compounds need to be discovered. The question is how to do that. When we need a new bridge or want to build a skyscraper, these objects are first designed and then built according to the plans. Can we follow this route when we wish to make a new superconductor or a new drug? Certainly not. Our theoretical knowledge may be sufficient for designing a bridge or a skyscraper, but it is definitely not enough for designing a new, more effective drug or a superconductor working at or near room temperature. We do not know exactly how superconductivity or other important properties of materials depend on their structure. Drugs exert their effects through interactions with proteins or other molecules found in living organisms. The rules governing these interactions, however, are largely unknown. The rational design of drugs has had some successes: drug candidates are designed by computer on the basis of the known three-dimensional structures of target proteins. Both the ligand molecule and the protein itself, however, can take up a practically unlimited number of conformations, and that leads to difficulties. As a consequence, the traditional approach is mostly followed: series of compounds are synthesized, and the useful drug candidates are then identified by trial and error. In practice, thousands of compounds have to be prepared and tested in order to find a single drug candidate.
In pharmaceutical research, one of the bottlenecks was the synthesis of the very large number of compounds needed in the discovery process. Before 1980 the traditional approach was used: the compounds were prepared one at a time, and they were also tested one by one. In industry, however, sophisticated methods had been developed and applied to improve productivity in the mass production of goods. It seems worthwhile to compare the production of compounds to that of automobiles. Compounds are mostly prepared step by step from the starting materials. Automobiles are also assembled from parts. Drug candidates are unique substances, all differing from each other. Automobiles are also unique products, since they can differ, for example, in their color, their engines, their transmissions, etc. They certainly differ from each other in their locks and keys.
The first car manufacturers in the world were Panhard & Levassor in 1889 and Peugeot in 1891 [1]. These French manufacturers did not standardize their car models; each car was different from the others. The first standardized car was the Benz Velo: Benz manufactured 134 identical Velos in 1895.
Ransom Eli Olds invented the basic concept of the assembly line in 1901; Henry Ford improved it and installed it in his car factory in 1913. As a result, by 1927, 15 million Ford Model Ts had been manufactured. Following further improvements and the application of automation, today the streets are full of cars. This shows the power of organizing the production process and applying automation. These methods, which proved so successful in industry, were not applied at all in the mass production of compounds.
After 1980 the situation began to change. Several innovative papers were published that radically changed the theory and practice of designing and preparing new substances for pharmaceutical research and other areas of application. The new synthetic and screening procedures and, which is also very important, the new way of thinking introduced in these papers founded a rapidly growing new scientific field, combinatorial chemistry, revolutionized pharmaceutical research, and are gradually expanding to other areas within and outside chemistry. The new methods were developed in several laboratories, and the way of thinking that led to them was probably different in each case. The reasons that led to the development of the combinatorial synthesis of peptides in the author's laboratory are described below.

1.1. Birth of the combinatorial approach


In 1964/65 I was a postdoctoral fellow at the University of Alberta, Canada. I worked on a project led by Professor L. B. Smillie which resulted in the determination of the amino acid sequence of a pro-enzyme, chymotrypsinogen-B [2]. After returning to Budapest, I wondered from how many possible sequences we had chosen the right one. Since the protein comprised 245 amino acid residues and any of these positions could be occupied by any of the 20 natural amino acids, the number of possible sequences, as expressed by a simple formula, amounted to $20^{245}$ ($= 5.65 \times 10^{318}$) combinations. This certainly seemed to be a very big number, much bigger than, for example, the number of molecules in 1 mole of a substance ($6.02 \times 10^{23}$), but in order to really perceive its magnitude it had to be compared to something that was also very big. Finally I found an estimate of the mass of the universe based on an Einstein formula [3]. I also calculated the mass of a protein mixture in which each sequence variant of the 245 amino acids is represented by only a single molecule.
Estimated mass of the universe: ~$10^{53}$ kg
Mass of the protein mixture: ~$10^{295}$ kg
The comparison showed that the mass of the protein mixture would exceed that of the universe by more than two hundred orders of magnitude. The number of possible protein sequences also seems striking if it is compared to the estimated number of elementary particles in the visible universe [4].
Number of sequences in the mixture: $5.65 \times 10^{318}$
Number of elementary particles: ~$10^{88}$
This was my first encounter with the immense diversity of molecules, and the result shocked me.
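The order of magnitude quoted above can be checked with exact integer arithmetic. The following Python sketch is only an illustration; the function and variable names are chosen here for convenience, and the particle estimate of $10^{88}$ is simply the value quoted in the text.

```python
import math

def count_sequences(residues, amino_acids=20):
    """Number of distinct linear sequences of the given length."""
    return amino_acids ** residues

n_sequences = count_sequences(245)

# Python integers are exact, so the power of ten can be read off
# from the number of decimal digits.
exponent = len(str(n_sequences)) - 1
mantissa = 10 ** (245 * math.log10(20) - exponent)
print(f"20^245 is about {mantissa:.2f} x 10^{exponent}")

# Compare with the particle estimate quoted in the text (~10^88).
PARTICLES_EXPONENT = 88
excess = exponent - PARTICLES_EXPONENT
print(f"This exceeds the particle estimate by about {excess} orders of magnitude")
```

The mass comparison is not reproduced here, because it depends on the assumed average mass of an amino acid residue, which the text does not state.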
Many years later I applied the same simple formula ($20^n$, where $n$ is the number of amino acid residues) to calculate the number of theoretically possible sequences in peptide families built up from the 20 natural amino acids. Such collections of compounds are called libraries. The results are listed in the second column of Table 1.1.

Although the figures expressing the number of components in peptide libraries were far from being as frightening as the number of possible protein sequences, they still seemed very large when the possibility of their synthesis was considered. I thought that many useful bioactive peptides could presumably be found among the largely unknown components of the libraries. For this reason the nonexistent peptide libraries reminded me of exceptionally rich gold reefs awaiting exploitation. Gold can be produced by mining out all the gold-containing rock and then separating the gold from the useless stone.
Table 1.1. The number of possible peptide sequences.

Number of residues   Number of sequences   Name
2                    400                   Dipeptides
3                    8,000                 Tripeptides
4                    160,000               Tetrapeptides
5                    3,200,000             Pentapeptides
6                    64,000,000            Hexapeptides
7                    1,280,000,000         Heptapeptides

Exploitation of the peptide libraries could be achieved by synthesizing all possible sequences and then screening them against all potential targets. At that time, however, even the synthesis of all pentapeptides, say, seemed absolutely impossible. We usually prepared one peptide at a time, mostly by solid phase synthesis (the details of this method are described later), with an elongation rate of one amino acid per day.

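[Figure 1.1: scheme in which resin-bound A, E and R are each extended to three resin-bound dipeptides (AA, EA, RA; AE, EE, RE; AR, ER, RR).]
Figure 1.1. The optimized synthesis of peptides from three amino acids (A, E and R). The solid support is represented by a symbol in the scheme.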
At this rate, the synthesis of all the 3.2 million pentapeptides would have taken 3.2 million × 5 = 16 million days, that is, 43.8 thousand years of uninterrupted work.

The synthesis could have been optimized by reducing the number of necessary coupling steps to an absolute minimum. This can be achieved by using the already prepared peptides as starting materials in the synthesis of the longer ones, as illustrated in Figure 1.1. A peptide library is prepared by solid phase synthesis using three amino acids (A, E and R). First the amino acids are attached to the solid support (resin). Then each resin sample carrying one attached amino acid is divided into three portions, and the synthesis is continued by coupling a different amino acid to each portion, and so on. In the first step 3 couplings are carried out, exactly the number of products formed. In the second step 9 couplings are needed and 9 dipeptides are formed on the resin samples. In general, the number of coupling steps in such an optimized synthesis of a peptide library is the same as the total number of products formed in the whole synthetic process. If the 3.2 million pentapeptides are prepared, the number of coupling steps is the sum of the numbers of amino acids, dipeptides, tripeptides, tetrapeptides and pentapeptides:
20 + 400 + 8,000 + 160,000 + 3,200,000 = 3,368,420
Assuming again a rate of one coupling per day, the above figure is divided by 365 to obtain the necessary time in years. The result is 9,228 years. This shows that optimization reduces the time of the synthetic process from 43,800 years to 9,228 years, which is still far too long to be feasible.
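The two estimates can be reproduced in a few lines. The Python sketch below is illustrative only; the function names are introduced here and do not come from the text, and the rate of one coupling per day is the assumption stated above.

```python
AMINO_ACIDS = 20
DAYS_PER_YEAR = 365  # one coupling per day, as assumed in the text

def stepwise_couplings(length, k=AMINO_ACIDS):
    """Couplings needed when every peptide is built separately."""
    return length * k ** length

def optimized_couplings(length, k=AMINO_ACIDS):
    """Couplings needed when shorter peptides are reused as starting
    materials: one coupling per product of every length up to `length`."""
    return sum(k ** i for i in range(1, length + 1))

for label, couplings in [
    ("stepwise", stepwise_couplings(5)),
    ("optimized", optimized_couplings(5)),
]:
    years = couplings // DAYS_PER_YEAR
    print(f"{label}: {couplings:,} couplings, about {years:,} years")
```

With `k=3` and `length=2` the optimized count is 3 + 9 = 12, matching the three-amino-acid example of Figure 1.1.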
I considered the accessibility of all peptide sequences to be very important, and around 1980 I began to think about potential solutions for their synthesis. It took only a short time to find one that would work at least in principle. The idea was based on the method of solid phase synthesis developed by Professor Merrifield [5]. According to this first idea, the single amino acid used in each coupling step of the solid phase preparation of peptides would be replaced by an equimolar mixture of the 20 amino acids. This would lead, at least in principle, to the formation of a rapidly growing number of sequences, and finally a full peptide library could be cleaved from the support in the form of a mixture. It was clear, however, that in such couplings the products would be expected to form in unequal molar quantities as a consequence of the differences in the reactivity of the amino acids. The differences in molar quantities would be amplified in each successive coupling step, leading to a mixture of uncertain composition. I felt that a better solution might exist, and I kept rethinking the problem again and again. In the early spring of 1982 I spent a weekend in a little town in the south-east of Hungary, for once forgetting the whole diversity problem. To my great surprise, however, the next morning I awoke with the perfect solution in my mind. The method based on this idea is nowadays known as the split-mix procedure.
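The amplification of reactivity differences in the mixture approach can be illustrated with a deliberately simple model. The Python sketch below assumes that, in every coupling with an equimolar mixture, each amino acid is incorporated in proportion to a fixed relative rate; both this model and the rate values are hypothetical and serve only to show the trend.

```python
from itertools import product
from math import prod

# Hypothetical relative coupling rates for three amino acids; the values
# are illustrative only and are not taken from the text.
RELATIVE_RATE = {"A": 1.0, "E": 0.8, "R": 0.5}

def mixture_library(length, rates):
    """Mole fraction of every sequence when each coupling step uses an
    equimolar amino acid mixture and incorporation is assumed to be
    proportional to the relative rate."""
    total = sum(rates.values())
    share = {aa: rate / total for aa, rate in rates.items()}
    return {
        "".join(seq): prod(share[aa] for aa in seq)
        for seq in product(rates, repeat=length)
    }

for length in (1, 3, 5):
    fractions = mixture_library(length, RELATIVE_RATE)
    spread = max(fractions.values()) / min(fractions.values())
    print(f"{length} residues: most/least abundant = {spread:.0f}")
```

Under these assumptions a twofold difference in rate between the fastest and slowest amino acid grows to a 32-fold difference in abundance after five couplings.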
The split-mix method opened up the possibility of producing peptide mixtures containing millions of components. Such mixtures, however, seemed unacceptable in conventional drug discovery practice, where single compounds were used in pure form. For this reason there was an urgent need to present, in addition, an efficient strategy for identifying the bioactive substance that may be present in the complex synthetic mixture. This task looked similar to finding the proverbial needle in a huge haystack. Nevertheless, I was able to develop a theoretical solution in a very short time. I called it the synthetic back searching strategy; it later proved to be identical in principle with the "iteration strategy" published by others.
I was fully aware of the importance of the combinatorial approach for pharmaceutical research, but one of the leading Hungarian pharmaceutical companies, which I contacted, showed no interest at all. In addition, the patent attorneys considered the split-mix method only a potential research tool and for this reason judged it not to be patentable. They suggested, however, that I describe the method in a document and have it notarized, in order to give me some support in potential future priority disputes. I did so, and the document, written in Hungarian, in which the principles of combinatorial chemistry including both synthesis and screening were first clearly explained, was notarized in May 1982. A photo of the first and last pages of the document is shown in Figure 1.2.

Figure 1.2. Photo of the first and last pages of the 1982 document.

The 1982 document, as shown in Figure 1.2, was written in Hungarian. It is the first authentic document in which the principles of combinatorial chemistry are described. The translated version follows.

1.2. The translated version of the document notarized in 1982


STUDY ON POSSIBILITIES OF SYSTEMATIC SEARCHING FOR PHARMACEUTICALLY
USEFUL PEPTIDES
Written by Dr. Árpád Furka, university professor
Budapest, May 29, 1982
As exemplified, among others, by the peptide hormones discovered so far, shorter and longer peptides take part in a number of important functions in the living organism. It can be supposed that only a small fraction of these biologically active peptides with potential therapeutic effect is known. This fact motivates the intensive international and domestic research activity in this field.
Two fundamentally different approaches offer themselves in the search for peptides with new biological effects:
1. Isolation of peptides from living organisms based on their previously known biological effects.
2. Preparation of peptides by synthesis, with subsequent determination of their biological effects.
Until now the isolation procedure has proved more effective, even though this method is also very laborious. This may be explained by the fact that the number of possible peptides grows rapidly with the number of residues, so that even the synthesis of all tetrapeptides (160 thousand) seems to be a hopeless task. If we consider the 20 natural amino acids, the dependence of the number ($N_n$) of possible peptides on the number of residues ($n$) is expressed by the following formula:
$$N_n = 20^n$$
If the n-residue peptides are synthesized stepwise and independently, the number of required synthetic steps ($S_n$) can be calculated as follows:
$$S_n = (n-1) \cdot 20^n$$
It is noted that a synthetic step means a complete coupling cycle; that is, in addition to the coupling step itself, it includes the operations connected with the protecting groups.
With good organization, that is, by choosing a systematic synthesis route, the number of synthetic steps can be reduced. The minimum number of synthetic steps is:
$$S_n = \sum_{i=2}^{n} 20^i$$

The synthesized peptides are to be submitted to screening tests. Since several tests have to be done on each peptide, the total number of required screening tests is hopelessly large. If the number of kinds of screening tests is denoted by $t$, the total number of screening tests is expressed by the following equation:
$$T_n = t \cdot 20^n$$
Table 1.2 shows the possible number of peptides as a function of the number of residues, the number of synthetic steps required for their synthesis, and the number of screening tests, calculated with 10 different tests (t = 10). The figures, which are rounded, clearly show that even the synthesis and testing of all tripeptides would be an almost hopeless venture.
Because of the very large number of possible peptides, the stepwise synthesis of all peptides, even in the case of small ones, is an unrealizable task. The large number of screening experiments constitutes a further problem. The proposal outlined on the following pages tries to improve this almost hopeless situation somewhat.
Table 1.2. Possible number of peptides ($N_n$) containing different numbers of residues ($n$), the number of synthetic steps required for their synthesis ($S_n$) in an optimized process, and the number of screening experiments ($T_n$), calculated with 10 different screening tests (t = 10). The figures are rounded.

n     N_n            S_n            T_n
2     4 hundred      4 hundred      4 thousand
3     8 thousand     8 thousand     80 thousand
4     160 thousand   168 thousand   2 million
5     3 million      3 million      30 million
6     64 million     67 million     640 million
7     1 billion      1 billion      13 billion
8     25 billion     26 billion     256 billion
9     512 billion    537 billion    5 trillion
10    10 trillion    10 trillion    102 trillion
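The three formulas behind Table 1.2 are easy to evaluate directly. The Python sketch below is an illustration added for the reader, not part of the 1982 document; the function names are chosen here. It prints exact values, which can differ slightly from the rounded entries of the table.

```python
def n_peptides(n, k=20):
    """N_n = k^n: number of possible n-residue peptides."""
    return k ** n

def min_synthetic_steps(n, k=20):
    """S_n = sum of k^i for i = 2..n: steps in the optimized process."""
    return sum(k ** i for i in range(2, n + 1))

def screening_tests(n, t=10, k=20):
    """T_n = t * k^n: screening tests with t kinds of test."""
    return t * k ** n

print(f"{'n':>2} {'N_n':>20} {'S_n':>20} {'T_n':>22}")
for n in range(2, 11):
    print(
        f"{n:>2} {n_peptides(n):>20,} "
        f"{min_synthetic_steps(n):>20,} {screening_tests(n):>22,}"
    )
```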

Systematic search for biologically active small peptides through synthesis and screening of
peptide mixtures
The proposal outlined here constitutes a research project that makes it possible to search for biologically active peptides with a much greater chance of success than before. As I write down this project, I am fully aware of its potential importance for industry. It is also clear that its realization is possible only through the cooperation of different institutions. Above all, the participation of the pharmaceutical industry is desirable, since the investments can be recovered through the pharmaceutical industry.
The essence of the proposal is that, instead of synthesizing peptides one by one, peptide mixtures should be prepared that contain several hundred or several thousand peptides in an approximately 1 to 1 molar ratio, and these mixtures should be submitted to screening tests. It will be shown that in this way much labor can be saved both in the synthetic work and in the screening experiments. In the first stage one has to determine whether or not the mixture shows any biological effect. If a biological effect is observed, it then has to be determined which component (or components) is responsible for the activity.
Method for synthesis of peptide mixtures
Since mixtures of peptides rather than single peptides are synthesized, post-synthetic purification and removal of by-products are out of the question. Because of this, the classical method of synthesis (in solution) cannot be used either; in the synthesis of peptide mixtures the solid phase method has to be applied. It is noted here that the syntheses do not necessarily use exactly the 20 amino acids. In some cases more than 20 amino acids may be used, for example if non-common amino acids are also intended as building blocks. Fewer than 20 amino acids may be used, for example, in decapeptides, since the synthesis of all decapeptides seems unrealistic and one has to compromise by using fewer kinds of amino acids. Let $k_i$ denote the number of amino acids intended to be varied in the i-th position. The numbers of amino acids varied in the C-terminal and N-terminal positions are $k_1$ and $k_n$, respectively.
Realization of the synthesis
The resin is divided into k1 equal portions (that is, into as many portions as there are
amino acids to be varied at the C-terminus of the peptides). Each portion of resin is coupled
with one of the k1 kinds of amino acids, then the amino-protecting group is removed from every
sample. A small quantity is removed from every sample and set aside for later use, then the
samples are thoroughly mixed. The mixture of aminoacyl resins is then divided into k2 equal
portions, each of them is coupled with one of the k2 kinds of protected amino acids, and the
amino-protecting groups are removed from each sample. Before mixing, small samples are again
removed and set aside. The mixture of dipeptides is cleaved from a small portion of the mixed
resin for use in biological tests. The rest of the mixed resin is divided into k3 equal parts
and the amino acids intended to occupy the third position are coupled to them. The synthesis is
continued likewise until the mixture of n-residue peptides is reached.
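The divide, couple and mix cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the original proposal: the function name split_mix, the one-letter residue codes and the small building-block sets are assumptions chosen for the example, and each chain is written in the order of coupling (C-terminal residue first).

```python
from itertools import product


def split_mix(blocks_per_position):
    """Model the portioning-mixing cycle.

    blocks_per_position[i] holds the residues varied in position i+1
    (C-terminal position first). Returns the pooled chains after each cycle.
    """
    pool = [""]            # bare resin: one kind of (empty) chain
    history = []
    for blocks in blocks_per_position:
        # Divide: every portion contains the whole current pool.
        # Couple: each portion is elongated with exactly one building block.
        portions = {b: [chain + b for chain in pool] for b in blocks}
        # Mix: pooling the portions gives the mixture for the next cycle.
        pool = [chain for b in blocks for chain in portions[b]]
        history.append(list(pool))
    return history


# Hypothetical example: k1 = k2 = k3 = 2
history = split_mix(["AG", "FL", "KR"])
for stage, mixture in enumerate(history, start=1):
    print(stage, len(mixture), mixture)
```

In this model every combination of building blocks appears exactly once in the final pool, which mirrors the claim that all sequence combinations are formed.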
It is worthwhile to add some notes. As in an ordinary solid phase synthesis, one has to
make an effort to achieve good conversion by applying the reagents in excess. Fortunately,
however, conversions lower than 100% or minor unwanted splitting reactions do not cause
problems as serious as in ordinary syntheses. The labour requirement could be significantly
reduced by using mixtures of properly protected amino acids in the acylation reactions. This,
however, does not seem to be an acceptable solution because of the differences in the reactivity
of the activated amino acids, which would lead to the formation of peptides in significantly
different concentrations and thus cause problems in the screening experiments. Formation of
peptides in equal concentrations can only be assured by mechanical mixing of the samples
followed by dividing them into equal portions. This also makes complete conversion possible for
every amino acid component. The possibility of acylation with mixtures of several amino acids
of identical reactivity might be a matter for further consideration. Smaller differences in
reactivity could be compensated by properly selected molar ratios of the amino acid derivatives
in the mixture. In the following calculations, however, the possibility of acylation with
mixtures of amino acid derivatives will be left out of consideration.
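The effect of unequal reactivities can be illustrated with a simple competition model. The sketch below assumes that the activated amino acids are present in large excess, so that each resin-bound amino group is acylated by amino acid i with a probability proportional to (rate constant x concentration). The model, the function names and the relative rate values are all assumptions made for illustration; they are not data from the text.

```python
def product_fractions(rate_constants, concentrations):
    """Fraction of each product under the assumed competition model."""
    weights = {aa: rate_constants[aa] * concentrations[aa] for aa in rate_constants}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


def compensating_concentrations(rate_constants):
    """Relative concentrations chosen inversely proportional to reactivity."""
    return {aa: 1.0 / k for aa, k in rate_constants.items()}


# Hypothetical relative reactivities of three activated amino acids
relative_rates = {"Gly": 1.0, "Ala": 0.6, "Val": 0.2}

equimolar = {aa: 1.0 for aa in relative_rates}
print(product_fractions(relative_rates, equimolar))

adjusted = compensating_concentrations(relative_rates)
print(product_fractions(relative_rates, adjusted))
```

With equimolar reagents the products form in unequal amounts; with the adjusted molar ratios the model gives equal fractions, which corresponds to the compensation idea mentioned above.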
The number of peptides formed in the synthesis, that is, the number of components in the
peptide mixtures - in a general case - can be calculated by the following formula:
$$N_n = k_1 \cdot k_2 \cdots k_{n-1} \cdot k_n$$
If the same number (k) of amino acids is varied in every position,
$$N_n = k^n$$
The number of synthetic steps in the synthesis of a peptide mixture containing Nn peptides
(considering the attachment of the first amino acid to the resin as a separate step) is:
$$S_n = k_1 + k_2 + \cdots + k_{n-1} + k_n$$
If the same number (k) of amino acids is varied in each position,
$$S_n = nk$$
The formulae show the advantage of the synthesis of peptide mixtures: the number of
synthetic steps is obtained by summing the numbers of the varied amino acids, while the
number of peptides is given by the product of the numbers of the varied amino acids.
One example: the synthesis of the mixture of tetrapeptides prepared by varying the 20 kinds
of amino acids needs only 80 synthetic steps! It is noted that in the same run all shorter
peptides, that is, the 400 dipeptides and the 8000 tripeptides, are formed, too. The traditional
synthesis of these peptides would need 168 400 synthetic steps. A different comparison: with
80 steps of the traditional method only about 30 tetrapeptides can be synthesized.
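The numbers quoted in this example can be checked with a short calculation. The sketch below is an illustration only; the function names are chosen here, and the count for the traditional method assumes one way of arriving at the figure of 168 400, namely that every di-, tri- and tetrapeptide costs one coupling step onto its shorter precursor.

```python
from math import prod


def mixture_size(ks):
    """Number of peptides: product of the numbers of varied amino acids."""
    return prod(ks)


def split_mix_steps(ks):
    """Number of synthetic steps: sum of the numbers of varied amino acids."""
    return sum(ks)


def one_by_one_steps(ks):
    """Assumed count for the traditional route: one coupling step for every
    peptide of two or more residues (400 + 8000 + 160 000 for k = 20, n = 4)."""
    total, count = 0, ks[0]
    for k in ks[1:]:
        count *= k
        total += count
    return total


ks = [20] * 4   # tetrapeptides, 20 amino acids varied in every position
print(mixture_size(ks), split_mix_steps(ks), one_by_one_steps(ks))
```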
Screening of peptide mixtures
Peptide mixtures are synthesized, in the first approximation, to determine whether or
not they contain a biologically active component. It is supposed, although it needs experimental
verification, that screening experiments can be made with mixtures, too. This offers a great
advantage over the traditional method since the number of screening tests is reduced by a factor
equal to the number of components of the mixture. For example, the mixture of the 8000
tripeptides can be examined by a single series of tests. If there is an active peptide among
them, one of the t executable tests gives a positive result. If the number of active peptides is
more than one, then, of course, more tests may give a positive result. In the synthesis of the
mixture of n-residue peptides it is worthwhile to test the shorter peptides, too. The synthesis
is designed to allow for this. Taking this requirement into account, and with t being the number
of kinds of tests, the total number of executable tests is:
$$T_n = t(n-1)$$
Although this equation certainly holds, its realization in practice deserves some notes.
There is, without any doubt, an upper limit to the number of components of the peptide mixtures
to be submitted to screening tests. It is difficult to estimate this number without experiments.
The mixtures may probably contain many thousands of components and, as can be judged today,
the method outlined above is limited by the possibilities of the screening tests rather than by
the number of the required synthetic steps. If there are too many components in the mixture, too
large samples have to be applied in the screening experiments to achieve an observable effect
for a single component. The mixture supposedly contains a number of more or less active analogs
and their effects probably add up. Nevertheless, an unsurpassable limit to the number of
components certainly exists. Therefore in certain cases it may prove useful to examine the effect
of the n-residue mixtures without final mixing. In other cases the synthesis should be designed
so as not to surpass the optimal number of components.
"Back-searching" for the active peptide
If the peptide mixture is found to contain an active component, that is, if the mixture
shows a new type of biological effect, then the further task is the isolation and structure
determination of the active peptide, followed by its synthesis. Once the mixture containing the
active component or components is in our hands, the isolation can be carried out using effective
separation methods, since these make it possible to separate the active compound even from
thousands of inactive components. It is possible, however, to follow a different method, too.
This will be outlined here. This approach to the identification of the active peptides is
supposed to be less tedious than the isolation methods; moreover, it supplies additional
information concerning the structure-effect relationship. Applicability of the method requires a
procedure for quantitative determination of activity. For the sake of simplicity let's suppose
that the mixture contains a single effective component (besides analogs having the same kind of
effect but smaller activity).
Back-searching step No. 1
The experiments are started with the kn samples set aside in the synthesis of the n-residue
peptides before the final mixing. The mixtures of n-residue peptides are cleaved from each
resin sample. These mixtures differ from each other only in the n-th (that is, the N-terminal)
residue of their component peptides. Each peptide mixture is submitted to a quantitative
activity determination. This shows how the activity depends on the terminal amino acid residue;
that is, in this way we can determine the N-terminal residue of the active peptide and, in
addition, see the effect of its replacement by other amino acid residues. Let's suppose, for
example, that the N-terminal residue in the sample showing the highest activity (as well as in
the active peptide) is Phe (phenylalanine). It is noted here that if several samples show
equally high activity, it is practical to choose as the N-terminal residue of the active peptide
the cheapest or the synthetically least problematic amino acid. This note holds for the
subsequent back-searching steps, too.
Back-searching step No. 2
The experiment is continued with the kn-1 samples set aside in the synthetic stage of the
(n-1)-residue peptides. The amino acid determined before, that is, Phe in our example, is coupled
to each sample. Cleavage of the peptides from the support gives kn-1 different peptide mixtures.
Their common feature is that every peptide has Phe in the N-terminal position. By submitting the
peptide mixtures to quantitative screening experiments one can determine the amino acid residue
occupying position n-1 (that is, the pre-N-terminal position) in the active peptide. This
experiment also shows the effect on activity of substituting this amino acid with other ones.
Let's suppose that the pre-N-terminal amino acid is Arg (arginine). It should be noted that in
this back-searching step Phe is coupled to kn-1 samples and the same number (kn-1) of screening
experiments has to be done. Not all of the t kinds of tests are required, only the one that
proved positive before. Consequently the number of synthetic steps and the number of
screening experiments are the same: kn-1. It is also noted that in the previous back-searching
step only screening tests are done (their number is kn); synthetic steps are not needed.
Back-searching step No. 3
This and the subsequent back-searching steps may be realized using two different
approaches. The peptides in the samples set aside during the synthesis have to be elongated to
contain n residues in such a way that their N-terminal section carries the amino acid residues
assuring activity. This can be realized in two ways: either by stepwise coupling with amino acids
(in our example with protected Arg, then Phe) or by coupling in a single step with a previously
synthesized oligopeptide having the required sequence (in our example Phe.Arg). The numbers of
synthetic steps required by the two approaches differ significantly. The number of screening
experiments, however, is the same in both cases. Let's turn now to back-searching step No. 3.
Stepwise elongation
Let's take the kn-2 samples set aside in the synthesis of the (n-2)-residue peptides. Each
sample is coupled first with protected Arg, then with protected Phe. After cleaving the peptides
from the support, each of the kn-2 peptide mixtures is submitted to activity tests to determine
the amino acid residue occupying the third position counting from the N-terminal end. The number
of screening tests to be executed is kn-2. The number of the required synthetic steps is 2kn-2.
The factor multiplying k is larger the shorter the peptides to be elongated are. Its
numerical value is equal to the number of amino acids to be coupled in the
elongation process.
Elongation with oligopeptide
A previously synthesized dipeptide (in our example Phe.Arg) is coupled to each of the kn-2
samples set aside, and the process is continued as described above. The number of screening
tests is again kn-2. The number of synthetic steps (leaving out of consideration the synthesis of
the oligopeptide) is also kn-2. This procedure seems to be more economical. In practice it means
that the active peptide is synthesized in parallel with the screening tests using the classical
method, starting from the N-terminus. Small fractions of the growing peptide are sacrificed in
the back-searching steps. This back-searching method has the great advantage (in addition to the
fact that it needs fewer synthetic steps) that when the back-searching procedure is finished the
active peptide has been synthesized, too.
Back-searching of more than one active peptide
In the synthetic peptide mixtures several active peptides may be present, showing
different effects. In these cases the number of back-searching steps will be bigger by a factor
equal to the number of the differing active peptides. That is, if the number of the active
peptides is "a", the values deduced above are multiplied by a. It is noted that the presence in
the mixture of peptides having different effects may complicate the back-searching process,
especially in the case of peptides with opposing effects. This, however, is not treated in detail.
The back-searching process ends when the sequences of all active peptides have been
determined by applying either the oligopeptide or the stepwise elongation method.
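The logic of the back-searching steps can be illustrated with a small simulation. Everything specific in this sketch is an assumption made for the example: the target sequence, the toy activity function (full activity for the target, small activity for analogs), and the read-out of a mixture as the mean activity of its components. Peptides are written as tuples with the C-terminal residue first.

```python
from itertools import product

TARGET = ("K", "R", "F")   # hypothetical active peptide, C-terminal residue first


def toy_activity(peptide):
    """Hypothetical activity: 1.0 for the target, 0.1 per matching position otherwise."""
    matches = sum(x == y for x, y in zip(peptide, TARGET))
    return 1.0 if matches == len(TARGET) else 0.1 * matches


def mixture_activity(peptides, activity):
    """Assumed read-out of a mixture: mean activity of its components."""
    peptides = list(peptides)
    return sum(activity(p) for p in peptides) / len(peptides)


def back_search(blocks, activity):
    """blocks[i] lists the residues varied in position i+1 (C-terminal first).
    Returns the deduced sequence and the number of mixtures assayed."""
    known_tail = ()                              # residues already identified
    assays = 0
    for i in range(len(blocks) - 1, -1, -1):     # N-terminal position first
        scores = {}
        for residue in blocks[i]:
            # Sample set aside before mixing at this stage, elongated with
            # the residues identified in the previous back-searching steps.
            sample = [head + (residue,) + known_tail
                      for head in product(*blocks[:i])]
            scores[residue] = mixture_activity(sample, activity)
            assays += 1
        best = max(scores, key=scores.get)
        known_tail = (best,) + known_tail
    return known_tail, assays


blocks = [("A", "G", "K"), ("A", "L", "R"), ("F", "W", "Y")]
sequence, assays = back_search(blocks, toy_activity)
print(sequence, assays)
```

In this toy case the number of assayed mixtures equals the sum of the numbers of varied residues, in line with the count of screening experiments given for one active peptide.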
Total number of synthetic steps and screening tests summarized
for the whole synthesis and back-searching process

Number of synthetic steps using oligopeptide elongation

In synthesis:
$$S_n = \sum_{i=1}^{n} k_i$$
In back-searching:
$$S_n = a \sum_{i=1}^{n-1} k_i$$
Total in synthesis and back-searching:
$$S_n = (a+1) \sum_{i=1}^{n-1} k_i + k_n$$
If k amino acids are varied in each step:
$$S_n = [n(a+1) - a]k$$

Number of synthetic steps using stepwise elongation

In synthesis:
$$S_n = \sum_{i=1}^{n} k_i$$
In back-searching:
$$S_n = a \sum_{i=1}^{n-1} (n-i) k_i$$
Total in synthesis and back-searching:
$$S_n = \sum_{i=1}^{n} k_i + a \sum_{i=1}^{n-1} (n-i) k_i$$
If k amino acids are varied in each step:
$$S_n = nk + ak \sum_{i=1}^{n-1} i$$

Number of screening tests (equally valid for oligopeptide and stepwise elongation)

In synthesis:
$$T_n = t(n-1)$$
In back-searching:
$$T_n = a \sum_{i=1}^{n} k_i$$
Total in synthesis and back-searching:
$$T_n = t(n-1) + a \sum_{i=1}^{n} k_i$$
If k amino acids are varied in each step:
$$T_n = t(n-1) + ank$$

An example: preparation and screening of all pentapeptides

n = 5, k = 20, t = 10, a = 1, N5 = 3 200 000

Total number of synthetic steps
    Oligopeptide elongation    180
    Stepwise elongation        300
Number of tests                140
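The figures of this example can be reproduced from the step counts described in the back-searching sections. The sketch below is an illustration with function names chosen here: with oligopeptide elongation each of the samples set aside at stages 1 to n-1 needs one coupling, with stepwise elongation a sample holding i residues needs n - i couplings, and every back-searching step needs as many screening tests as there are samples.

```python
def steps_oligopeptide(ks, a=1):
    """Synthetic steps in synthesis plus back-searching, oligopeptide elongation."""
    return sum(ks) + a * sum(ks[:-1])


def steps_stepwise(ks, a=1):
    """Synthetic steps in synthesis plus back-searching, stepwise elongation."""
    n = len(ks)
    # A sample holding i residues (i = 1 .. n-1) needs n - i further couplings.
    elongation = sum((n - i) * k for i, k in enumerate(ks[:-1], start=1))
    return sum(ks) + a * elongation


def screening_tests(ks, t, a=1):
    """Tests during synthesis, t(n-1), plus tests during back-searching."""
    n = len(ks)
    return t * (n - 1) + a * sum(ks)


ks = [20] * 5   # pentapeptides, k = 20 in every position
print(steps_oligopeptide(ks), steps_stepwise(ks), screening_tests(ks, t=10))
```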

Extension of the method to other types of compounds


The applicability of the method outlined above is not restricted to the systematic
search for active peptides. The same principle applies to all other sequential types of
compounds, that is, to cases where the compounds of a given type differ from each
other only in their building blocks or in the sequences of these building blocks. Among them
are natural compounds like oligosaccharides or oligonucleotides, but synthetic products may be
taken into account, too. Among the latter one may think of sequential copolymers or
sequential polycondensates.
Dr. Árpád Furka
university professor

File number 36237/1982


I certify this stitched document comprising 14, that is, fourteen pages was subscribed in my
presence by Dr. Arpad Furka, university professor, with his own hands.
Budapest, 1982. Nineteen hundred and eighty two, June 15, (fifteen).
Dr. Judit Bokai
state notary public

1.3. Publication of the split-mix combinatorial synthesis


The split-mix method was published in 1988 as posters at two international congresses:
first at the 14th International Congress of Biochemistry in Prague [6], then at the 10th
International Symposium of Medicinal Chemistry, Budapest, Hungary [7]. The manuscript for
publication in print was submitted in February 1990 to the International Journal of Peptide and
Protein Research. The paper appeared in June 1991 [8]. It was unusual, however, that within the
16 months that passed between the submission of the manuscript and the appearance of the paper
four patents were filed on the subject of the split-mix synthesis, and the author of one of the
patents was the Editor in Chief of the journal to which the paper was submitted. Also, soon
after our paper appeared, two other papers were published (in September 1991) on the same
topic [9,10], again with the Editor in Chief among the authors of one of the publications [10].
More can be read on this subject in a paper published in 2004 in Periodica Polytechnica Ser.
Chem. [11]
http://www.pp.bme.hu/ch/index.html
and on the home page of the author of this book [12].
http://www.szerves.chem.elte.hu/furka

References
1. http://inventors.about.com/library/weekly/aacarsassemblya.htm
2. L. B. Smillie, Á. Furka, N. Nagabhushan, K. J. Stevenson, C. O. Parkes Nature 1968, 218,
343.
3. A. Einstein The meaning of relativity, Princeton University Press, 1955, 5th Ed., Princeton,
NJ, p. 107.
4. A. Linde Scientific American 1994, November, p. 48.
5. R. B. Merrifield J. Am. Chem. Soc. 1963, 85, 2149.
6. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó, In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP, Utrecht, The
Netherlands, 1988, Vol. 5, p. 47.
7. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Proceedings of the 10th International
Symposium of Medicinal Chemistry, Budapest, Hungary, 1988, p. 288, Abstract P-168.
8. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Int. J. Peptide Protein Res. 1991, 37, 487.
9. R. A. Houghten, C. Pinilla, S. E. Blondelle, J. R. Appel, C. T. Dooley, J. H. Cuervo Nature
1991, 354, 84.
10. K. S. Lam, S. E. Salmon, E. M. Hersh, V. J. Hruby, W. M. Kazmierski, R. J. Knapp Nature
1991, 354, 82.
11. Á. Furka, I. Hargittai Periodica Polytechnica Ser. Chem. 2004, 48, No. 1, p. 13.


2. The solid phase synthesis


The solid phase synthesis is very important in combinatorial chemistry since most of the
combinatorial synthetic procedures are based on this method. The solid phase synthesis was
developed by Merrifield [1] and demonstrated in the synthesis of peptides.
Peptides and proteins are built up from α-amino acids. The structure of the α-amino acids
is expressed by the following general formula, where R is the side chain by which the α-amino
acids differ from each other:
H2N-CH(R)-COOH

Table 2.1. The natural α-amino acids

Name            Side chain (-R)            Three letter symbol   One letter symbol
Alanine         -CH3                       Ala                   A
Arginine        -(CH2)3NH(C=NH)NH2         Arg                   R
Asparagine      -CH2CONH2                  Asn                   N
Aspartic acid   -CH2COOH                   Asp                   D
Cysteine        -CH2SH                     Cys                   C
Glutamine       -(CH2)2CONH2               Gln                   Q
Glutamic acid   -(CH2)2COOH                Glu                   E
Glycine         -H                         Gly                   G
Histidine       -CH2(4-imidazolyl)         His                   H
Isoleucine      -CH(CH3)CH2CH3             Ile                   I
Leucine         -CH2CH(CH3)2               Leu                   L
Lysine          -(CH2)4NH2                 Lys                   K
Methionine      -(CH2)2SCH3                Met                   M
Phenylalanine   -(benzyl)                  Phe                   F
Proline         (cyclic, see text)         Pro                   P
Serine          -CH2OH                     Ser                   S
Threonine       -CH(CH3)OH                 Thr                   T
Tryptophan      -CH2(3-indolyl)            Trp                   W
Tyrosine        -(4-hydroxybenzyl)         Tyr                   Y
Valine          -CH(CH3)2                  Val                   V

The only exception is proline, in which the side chain and the amino group form a ring.

[Structural formula of proline]

The twenty α-amino acids that are components of proteins are listed in Table 2.1.
In the traditional way, peptides are synthesized in solution from properly protected amino
acids.
Z-NH-CH(R1)-COOX + H2N-CH(R2)-COOB → Z-NH-CH(R1)-CO-NH-CH(R2)-COOB

The carboxyl group of one amino acid is protected (by protecting group B) while its
amino group is free. The amino group of the other amino acid is protected (Z) and its carboxyl
group is activated (X) to make it capable of acylating the other amino acid.
[Figure 2.1: reaction scheme showing steps 1 to 4]

Figure 2.1. Solid phase synthesis of a dipeptide.
1: Attachment of the first N-protected amino acid to the solid support. 2: Removal of the
protecting group (Z) from the amino group of the attached amino acid. 3: Coupling the second
N-protected amino acid to the attached one. 4: Cleaving the dipeptide from the solid support and
removing the protecting group.

In continuation of the synthesis, the amino-protecting group of the dipeptide is removed
and the dipeptide is then acylated with another amino-protected and activated amino acid. The
product of every step of the synthesis is usually isolated from the solution and then purified.
This is mostly a tedious job.
In the solid phase synthesis the first amino acid is attached by its carboxyl group to a
polymer support, then the next amino acid is coupled to the already attached one, and the rest
of the amino acids are coupled sequentially to the attached peptide chain. The solid support
used by Merrifield was a styrene-divinylbenzene copolymer in fine bead form. The resin was
functionalized by the introduction of chloromethyl groups, which made it possible to attach the
first amino acid to the resin.
The solid phase method is demonstrated in Figure 2.1 using the synthesis of a dipeptide as
an example. In synthetic step No. 1 the first amino acid is attached to the resin. All functional
groups of the amino acid are protected (Z) except the α-carboxyl group. In step No. 2 the
protecting group is removed from the α-amino group of the attached amino acid so that the
second amino acid can be coupled to the first one. The second amino acid is also fully
protected, again except for the α-carboxyl group, which is properly activated (by group X). The
coupling is realized in step No. 3. Steps No. 2 and No. 3 form a cycle that is repeated
until the last amino acid is attached. The closing step of the synthesis
(step No. 4 in our example) is the cleavage of the product from the resin, which usually also
involves cleavage of the protecting groups from the functional groups of the peptide. It seems
worthwhile to note that the synthesis can be started with a resin that already contains an
attached amino acid. Such resins are commercially available.
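The repetitive structure of the procedure (one attachment, then a deprotection and coupling cycle for every further residue, then cleavage) can be written down as a short sketch. The function name and the wording of the operations are illustrative assumptions, not a laboratory protocol.

```python
def solid_phase_protocol(residues_c_to_n):
    """List the operations for a peptide given C-terminal residue first."""
    first, *rest = residues_c_to_n
    steps = [f"attach protected {first} to the resin"]
    for residue in rest:
        # Steps No. 2 and No. 3 of Figure 2.1 form the repeated cycle.
        steps.append("remove the alpha-amino protecting group")
        steps.append(f"couple protected, activated {residue}")
    steps.append("cleave the peptide from the resin and remove the protecting groups")
    return steps


# Dipeptide example: four operations, as in Figure 2.1
for number, operation in enumerate(solid_phase_protocol(["Arg", "Phe"]), start=1):
    print(number, operation)
```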

Figure 2.2. Reaction vessel for solid phase synthesis (labelled parts: resin, frit)


The solid phase reactions can be carried out in the reaction vessel shown in Figure
2.2. The glass or plastic tube has a frit at the bottom that keeps the resin in the vessel when
the tap is open. The tube is usually mounted on a laboratory shaker. The resin (containing the
attached amino acid) is placed in the reaction vessel, then all the multi-step operations are
carried out in the tube. First the resin is swelled by adding solvent, usually a mixture of
dimethylformamide (DMF) and dichloromethane (DCM), then the group protecting the α-amino group
is removed by adding a proper reagent. Following this operation the protected amino acid and the
coupling reagent are added in dissolved form. During the coupling reaction, the reaction vessel
is shaken. The solid phase reactions are generally slower than their solution phase counterparts
and shaking shortens the reaction time. After each of the described operations the resin is
thoroughly washed with solvent. The peptide can be elongated by repeating the above cycle. After
finishing the coupling reactions a mixture of reagents is added which cleaves the peptide from
the resin and removes the protecting groups. The synthesized peptide can be recovered from the
filtrate.
In the solid phase synthesis the amino acids and the reagents can be added in excess to
drive the reactions to completion. The excess of the amino acids and reagents can easily be
removed by filtration. The coupling step can even be repeated to ensure complete conversion.
The traces of the reagents are removed by repeated washings and the product of coupling remains
on the filter in pure form.
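Why it matters to drive every coupling to completion can be seen from a simple calculation: if each step proceeds with the same fractional conversion, the overall yield of the full-length product is that conversion raised to the number of steps. The sketch below only illustrates this arithmetic; the per-step values are hypothetical.

```python
def overall_yield(step_yield, n_steps):
    """Overall yield after n_steps, each with the same fractional yield."""
    return step_yield ** n_steps


# Hypothetical per-step conversions over 20 consecutive steps
for step_yield in (0.90, 0.99, 0.999):
    print(step_yield, round(overall_yield(step_yield, 20), 3))
```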
As outlined above, the elongation of the peptide chain on the support is realized in
identical coupling cycles (of course the added protected amino acid may vary from cycle to
cycle). This opens the possibility of automation. In fact Professor Merrifield and his colleagues
constructed and published an automatic peptide synthesizer [2]. Today many kinds of solid phase
peptide synthesizers are commercially available. In addition, solid phase automatic synthesizers
have also been developed for the preparation of other kinds of organic compounds (see later).

2.1. Solid supports


Since the seminal publication of Merrifield in 1963 [1] different types of solid supports have
been developed. The solid support, of course, has to be insoluble in the solvents used in the
solid phase synthesis, and it is also a requirement that it does not react with the reagents
applied in the synthesis. In the case of most solid supports the reactions take place both inside
and at the surface of the solid particles. These supports are mostly used in the form of small
resin beads that swell in the solvents applied in the synthesis. The reactions on other kinds of
supports take place only at the surface. These supports are used as polymer or glass beads, rods,
sheets etc. and (except for their surface layer) do not swell in the solvents.
The solid supports are usually composed of two parts: the core and the linker. The starting
compound of the synthesis is attached to the support via the linker.

Core - Linker - Start compound

The core ensures the insolubility of the support and determines the swelling properties, while
the linker provides the functional group for attachment of the start compound and determines the
reaction conditions for the cleavage of the product. The linker itself and the covalent bond
formed with the start compound must be stable under the reaction conditions of the synthesis.

2.1.1. Crosslinked polystyrene


Crosslinked polystyrene resins are the most commonly used supports for solid phase
synthesis. The polystyrene resins are synthesized from styrene and divinylbenzene by suspension
polymerization in the form of small beads. The ratio of divinylbenzene to styrene determines the
density of crosslinks. Higher crosslink density increases the mechanical stability of the beads.
Lowering the crosslink density, on the other hand, increases swelling and increases the
accessibility of the functional groups buried inside the beads. In practice, mostly 1-2%
divinylbenzene is used. Crosslinked polystyrene is very hydrophobic so it swells only in apolar
solvents. Table 2.2 shows the swelling factor (ml/g) of 1% crosslinked polystyrene in different
solvents.

Table 2.2. Swelling factor of 1% crosslinked polystyrene

Solvent              Swelling factor (ml/g)
Tetrahydrofuran      5.5
Toluene              5.3
Dichloromethane      5.2
Dioxane              4.9
Acetonitrile         4.7
Dimethylformamide    3.5
Methanol             1.8
Water                -

Functional groups can be introduced into the resin by two approaches: either by
post-functionalization of the aromatic rings of polystyrene, or by using functionalized styrene
in the polymerization.
The bead size of the resin is an important factor that has to be considered in solid phase
synthesis. The reactions are faster when small beads are used, but the application of very small
beads may cause problems in filtration. The bead size is characterized either by the diameter of
the beads or by the inversely proportional mesh size. In practice the 200-400 mesh (35-75
micron) or the 100-200 mesh (75-150 micron) bead sizes are used most often. The bead size
distribution also deserves consideration. A narrow bead size distribution is advantageous. The
capacity of the polystyrene beads is around 0.5 mmol/g.
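The capacity and the swelling factors can be combined into a rough planning calculation. The sketch below is illustrative only: it takes the capacity of about 0.5 mmol/g and the swelling factors of Table 2.2 as given, and it assumes that the swelling factor can be read as the volume (ml) occupied by one gram of dry resin after swelling. The function and variable names are chosen here.

```python
# Swelling factors (ml/g) of 1% crosslinked polystyrene, taken from Table 2.2
SWELLING_ML_PER_G = {
    "tetrahydrofuran": 5.5,
    "toluene": 5.3,
    "dichloromethane": 5.2,
    "dioxane": 4.9,
    "acetonitrile": 4.7,
    "dimethylformamide": 3.5,
    "methanol": 1.8,
}


def resin_mass_g(scale_mmol, capacity_mmol_per_g=0.5):
    """Mass of dry resin needed to carry scale_mmol of attached substrate."""
    return scale_mmol / capacity_mmol_per_g


def swollen_volume_ml(mass_g, solvent):
    """Assumed swollen volume: dry mass times the swelling factor."""
    return mass_g * SWELLING_ML_PER_G[solvent]


mass = resin_mass_g(0.25)    # hypothetical 0.25 mmol synthesis
print(mass, swollen_volume_ml(mass, "dichloromethane"))
```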

2.1.2. Polyethylene glycol (PEG) grafted supports


PEG-grafted polystyrene has a 1-2% crosslinked polystyrene core, to the aromatic rings of
which polyethylene glycol chains are covalently attached. Its commercial name is Tentagel [3]
(Figure 2.3).

[Figure 2.3: structural formula]

Figure 2.3. Structure of PEG-grafted polystyrene (n ~ 70). X is a functional group (Br,
OH, SH or NH2) for attachment of the substrate

The advantage of the PEG-grafted polystyrene is that the substrate at the end of a flexible
chain is more accessible to reagents. It behaves as if it were in a solution-like environment.
The PEG chains give a hydrophilic character to the resin, which swells well in water and
methanol but poorly in ether or ethanol.

2.1.3. Inorganic supports


Glass beads with controlled pore size can be manufactured and are commercially
available. The glass beads can be functionalized and used as supports in solid phase
synthesis. The mechanical stability of such glass beads surpasses that of the resin beads, but
they do not swell in solvents.
Functionalized ceramics can also be used as supports.

2.1.4. Non-bead form supports


Polymers can be used as supports for solid phase synthesis not only as microscopic beads
but also in the form of macroscopic objects, provided their surface can be functionalized with
groups that can serve as anchors to hold the substrate in a reasonable quantity. Using
appropriate monomers like styrene or others, polyolefin chains can be grafted by radiation onto
the surface of the objects and the chains can be functionalized [4]. Such grafted macroscopic
solid support units, SynPhase crowns and SynPhase lanterns, are commercially available in
different sizes from Mimotopes, Australia [5].

2.2. Linkers, anchors


The initial building block of the compound to be prepared by solid phase synthesis is
covalently attached to the solid support via the linker. The linker is a bifunctional molecule.
It has one functional group for irreversible attachment to the core resin and a second
functional group for forming a reversible covalent bond with the initial building block of the
product. The linker that is bound to the resin is called the anchor.

[Diagram: the linker bound to the resin forms the anchor]

The anchor can also be considered as a protecting group of one of the functional groups of
the final product and, as such, it determines the reaction conditions by which the product can be
cleaved from the support. Many of the commercially available resins already contain a built-in
anchor. A series of selected examples is given below.
Merrifield resin.
The Merrifield resin can be used to attach carboxylic acids to the resin. The product can
be cleaved from the resin in carboxylic acid form using HF.
CH2-Cl
Trityl chloride resin.
The trityl chloride resin is much more reactive than the Merrifield resin. It can be used for
attachment of a wide variety of compounds such as carboxylic acids, alcohols, phenols, amines
and thiols. The products can be cleaved under mild conditions using a solution of
trifluoroacetic acid (TFA) in varying concentrations (2-50%).
[Structural formula of the trityl chloride resin]

Hydroxymethyl resin.
The resin can be applied for attachment of activated carboxylic acids and the cleavage
conditions resemble those of the Merrifield resin.
CH2-OH
Wang resin.
The resin is used to bind carboxylic acids. The ester linkage formed has good stability
during the solid phase reactions but its cleavage conditions are milder than those of the
Merrifield resin. Usually 95% TFA is applied. It is frequently used in peptide synthesis.
[Structural formula of the Wang resin]

Aminomethyl resin.
Carboxylic acids in their activated form can be attached to the resin. Since the formed
amide bond is resistant to cleavage, the resin is used when the synthesized products are not
cleaved from the support; they are tested in bound form.
CH2-NH2

Rink amide resin.
The Rink resin is designed to bind carboxylic acids and to release the product in
carboxamide form under mild conditions. The amino group in the resin is usually present in
protected form. For attachment of the substrate, the protecting group is first removed, then the
resin is reacted with the activated carboxylic acid compound. Cleavage of the product in
carboxamide form can be performed with dilute (~1%) TFA.
[Structural formula of the Rink amide resin]
Photolabile anchors.
Photolabile anchors have been developed that allow cleavage of the product from the
support by irradiation without using any chemical reagents. Such anchors, like the
2-nitrobenzhydrylamine resin below, usually contain a nitro group that absorbs UV light.
[Structural formula of the 2-nitrobenzhydrylamine resin]

Traceless anchors.
The initial building block of a multi-step solid phase synthesis needs to have one
functional group (in addition to others) for its attachment to the solid support. It may happen that
in the end product this group is unnecessary and needs to be removed. For this reason anchors
have been developed that can be cleaved without leaving any functionality in the end product at
the cleavage site. These traceless anchors usually contain silicon based linkers.
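The properties of the anchors listed above can be collected in a small lookup structure, which makes it easy to ask which resins accept a given class of substrate. The sketch below only restates what the text says about each resin; the class and function names are chosen here for illustration, and entries are left as "not stated" where the text gives no information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    name: str
    attaches: tuple        # substrate classes named in the text
    cleavage: str          # cleavage conditions as given in the text
    product_form: str = "not stated"


ANCHORS = (
    Anchor("Merrifield", ("carboxylic acid",), "HF", "carboxylic acid"),
    Anchor("Trityl chloride",
           ("carboxylic acid", "alcohol", "phenol", "amine", "thiol"),
           "2-50% TFA"),
    Anchor("Hydroxymethyl", ("activated carboxylic acid",),
           "resembles that of the Merrifield resin"),
    Anchor("Wang", ("carboxylic acid",), "usually 95% TFA"),
    Anchor("Aminomethyl", ("activated carboxylic acid",),
           "not cleaved", "tested in bound form"),
    Anchor("Rink amide", ("carboxylic acid",), "dilute (~1%) TFA", "carboxamide"),
)


def anchors_for(substrate):
    """Names of the listed resins that accept the given substrate class."""
    return [anchor.name for anchor in ANCHORS if substrate in anchor.attaches]


print(anchors_for("thiol"))
print(anchors_for("carboxylic acid"))
```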

2.3. Protecting groups


If a chemist wants to carry out a reaction on only one functional group of a multifunctional group compound, the reactivity of the rest of the functional groups needs to be
suppressed. This can be achieved by application of protecting groups. A protecting group is
reversibly attached to the functional group to convert it to a less reactive form. When the
protection is no longer needed, the protecting group is cleaved and the original functionality is
restored. A large number of protecting groups were developed for use in peptide synthesis since
the amino acids are multi-functional compounds. It is an important requirement for a protecting
group to be stable under the expected reaction conditions and to be cleavable - if possible - under
mild reaction conditions. The stability/cleavage conditions of a protecting group are considered
relative to those of the others. Two protecting groups are said to be orthogonal if either of them
can be removed without affecting the stability of the other one. Some of the protecting groups
most widely used in peptide synthesis are described below.


2.3.1. Protection of amino groups


The benzyloxycarbonyl (Z) group.
Bergmann and Zervas suggested the benzyloxycarbonyl group for amino-protection in
peptide synthesis in 1932 and this important protection type is still in use. The Z group can be
introduced by the reaction of the amino group containing compound with benzyl chloroformate
under Schotten-Baumann conditions.

C6H5-CH2-O-CO-Cl + H2N~ → C6H5-CH2-O-CO-NH~

The Z protection is stable under mildly basic conditions and towards nucleophilic reagents at ambient
temperature. Cleavage can be brought about by HBr/AcOH, HBr/TFA or catalytic
hydrogenolysis.
The t-butoxycarbonyl (Boc) group.
An alternative choice for amino group protection is the Boc group. Its advantage is that
it can be removed under milder conditions than the Z group.
Me3C-O-CO-NH~
The Boc group is completely stable to catalytic hydrogenolysis and as such is orthogonal to the Z
group. Basic and nucleophilic reagents have no effect on the Boc group and its removal can be
carried out by TFA at room temperature. The most convenient reagent that can be used in the
protection reaction is Boc anhydride (Boc2O).
The 9-fluorenylmethoxycarbonyl (Fmoc) group.
The Fmoc group differs from both Z and Boc groups since it is very stable to acidic
reagents.


[Structure: 9-fluorenyl-CH2-O-CO-NH~]

The Fmoc group can be removed under basic conditions. Usually 20% piperidine dissolved in
DMF is used as reagent. One of the reagents for introducing the Fmoc group is Fmoc-Cl.
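
The notion of orthogonality can be illustrated with a minimal sketch. The table below encodes only the cleavage conditions named above for the Z, Boc and Fmoc groups; the condition labels and function names are illustrative choices, and it is an assumption of the sketch that the acidic reagents cleaving the Z group (HBr/AcOH, HBr/TFA) also remove the Boc group, since Boc is removed under milder conditions than Z. Two groups are treated as orthogonal if each of them has at least one removal condition that is not listed for the other.

```python
# Cleavage conditions taken from the descriptions above.
# Assumption: HBr/AcOH and HBr/TFA also remove Boc (Boc is the more acid labile group).
CLEAVED_BY = {
    "Z": {"HBr/AcOH", "HBr/TFA", "catalytic hydrogenolysis"},
    "Boc": {"TFA", "HBr/AcOH", "HBr/TFA"},
    "Fmoc": {"20% piperidine/DMF"},
}


def selective_conditions(group, other):
    """Conditions listed for removing `group` that are not listed for `other`."""
    return CLEAVED_BY[group] - CLEAVED_BY[other]


def are_orthogonal(a, b):
    """Simplified test: either group can be removed while the other one stays."""
    return bool(selective_conditions(a, b)) and bool(selective_conditions(b, a))


for pair in [("Z", "Boc"), ("Boc", "Fmoc"), ("Z", "Fmoc")]:
    print(pair, are_orthogonal(*pair), sorted(selective_conditions(*pair)))
```

In this simplified model the only listed condition that removes Z while leaving Boc intact is catalytic hydrogenolysis, in line with the statement above. Real stability data depend on concentration, temperature and reaction time, which the sketch ignores.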

2.3.2. Protection of carboxyl groups


Carboxyl groups are most often protected by converting them to benzyl esters or t-butyl
esters.
~COO-CH2-C6H5          ~COO-C(CH3)3
benzyl ester           t-butyl ester

The benzyl esters are cleaved by saponification, HBr/AcOH, HF and catalytic
hydrogenation but not by TFA. Their response to acids is similar to that of the Z groups but
somewhat less sensitive.
The t-butyl esters, unlike benzyl esters, are stable to bases or nucleophilic attack. The
properties of t-butyl esters are somewhat similar to those of the Boc groups although they are less
sensitive to acidolysis. They can be cleaved by TFA.

2.3.3. Protection of other functional groups


The alcoholic and phenolic hydroxyl groups are protected by converting them to benzyl
ether or t-butyl ether. The former protecting group can be cleaved by HF, HBr/AcOH or by
catalytic hydrogenolysis and the latter one by TFA.
Thiol groups can also be protected by benzyl thioether formation or by tritylation.
The guanidino group (present in arginine) can be protected by nitration or by
arylsulphonyl groups. The nitro group resists HBr/AcOH and can be cleaved by liquid HF.
Among the arylsulphonyl groups the tosyl (Tos) group can be cleaved by liquid HF or sodium
in liquid ammonia. Two other arylsulphonyl groups are more sensitive to acidic conditions. The
2,2,5,7,8-pentamethylchroman-6-sulphonyl (Pmc) group can be cleaved by TFA under conditions
similar to the removal of the Boc group. The 4-methoxy-2,3,6-trimethylbenzenesulphonyl (Mtr)
group is also cleaved by TFA but is less sensitive and requires a few hours for cleavage.
[Structures: Pmc protection, Mtr protection and protection by nitro group]


The amide groups (in side chains of asparagine and glutamine) can be protected by
tritylation. The trityl protecting group is stable to base, catalytic hydrogenolysis and very mild acid
but is cleaved with TFA. It is used in conjunction with the Fmoc amino group protection strategy.
The NH group of the imidazole ring (in the side chain of histidine) is protected in
conjunction with the Fmoc strategy by tritylation. The trityl protecting group can be removed by
TFA at room temperature.
[Structure: imidazole ring protected by trityl (CPh3) group]


The indole ring (in tryptophan) can be protected by a Boc group that can be removed by TFA.

[Structure: indole ring protected by Boc group]


2.3.4. Coupling reagents for peptide synthesis
In the coupling reactions of peptide synthesis the carboxyl group of the acylating amino
acid is activated. Care should be taken in selecting the activation method to avoid racemization.
One of the choices is 1,3-diisopropylcarbodiimide (DIC) with addition of N-hydroxybenzotriazole
(HOBt) in order to reduce racemization.


(CH3)2CH-N=C=N-CH(CH3)2
1,3-Diisopropylcarbodiimide (DIC)

[Structure: N-hydroxybenzotriazole (HOBt)]

Another very often used coupling reagent is O-(benzotriazol-1-yl)-N,N,N',N'-tetramethyluronium hexafluorophosphate (HBTU) that is known not to cause racemization.

[Structure: HBTU]
One of the bases applied in the coupling reactions is N,N-diisopropylethylamine
(DIPEA).

[Structure: N,N-diisopropylethylamine (DIPEA)]

2.4. Solid phase synthesis of organic molecules


The vast number of reactions developed for the synthesis of organic molecules were
optimized for solution phase. In the decades following its introduction by Merrifield,1 the solid
phase method was mainly used in peptide chemistry. Except for the studies of Leznoff,6-9 Camps,10
Frechet,11-13 and Crowley and Rapoport,14 little experience had been accumulated in its application in
the synthesis of organic molecules. The advent of combinatorial chemistry, however, induced
radical changes and initiated fast expanding research in the field. The classes of chemical
reactions developed for solid phase as a result of this research are shown below:
Anchoring reactions
Amide bond forming reactions
Aromatic substitutions
Condensation reactions
Cycloaddition reactions
Organometallic reactions
Michael additions
Heterocycle forming reactions
Multi-component reactions
Olefin forming reactions
Oxidation reactions
Reduction reactions
Substitution reactions
Protection/deprotection reactions
Cleavage from supports
Other types of solid phase reactions
Excellent compilations of these reactions were prepared by Hermkens et al.,15,16
Á. Furka17 and W. M. Bennett.18

2.5. Solid phase reagents and scavenger resins in solution phase synthesis
Solid phase additives are successfully applied in many solution phase synthetic reactions.
In solid phase reactions the substrate is bound to the solid phase carrier and the reagents are in
solution. In solution phase reactions both the substrates and the reagents are in solution. In some
solution phase reactions, however, the reagent is bound to resin. The advantage of such reagents
is that the by-products of the reagent remain bound to the resin and can be easily removed from
the reaction mixture by filtration. One example is the polymer bound HOBt that is used in amide
forming reactions.
[Structure: polymer bound HOBt]

More solid phase reagents and examples of their applications are found in the already
mentioned compilations.15-18
Different types of resins can also be used in solution phase reactions for removal of the
excess of reagents, substrates or by-products. Those resins that can be used for such purposes are
named scavenger resins. One example is formylpolystyrene19,20 which is used for removal of
primary amines from reaction mixtures.


[Structure: formylpolystyrene]
Other examples of scavenger resins and their applications are found in the above
mentioned compilations.15-18

References
1. R. B. Merrifield J. Am. Chem. Soc. 1963, 85, 2149.
2. R. B. Merrifield, J. M. Stewart, N. Jernberg Anal. Chem. 1966, 38, 1905.
3. W. Rapp In G. Jung (Ed) Combinatorial Peptide and Nonpeptide Libraries 1996, VCH,
New York, 425.
4. H. M. Geysen, R. H. Meloen, S. Barteling Proc Natl Acad Sci USA 1984, 81, 3998.
5. http://www.mimotopes.com
6. C. C. Leznoff Acc. Chem. Res. 1978, 11, 327.
7. P. M. Worster, C. R. McArthur, C. C. Leznoff Angew. Chem. 1979, 91, 255.
8. C. C. Leznoff, V. Yedidia Can. J. Chem. 1980, 58, 287.
9. V. Yedidia, C. C. Leznoff Can. J. Chem. 1980, 58, 1144.
10. E. Camps, J. Cartells, J. Pi Anales de Quimica 1974, 70, 848.
11. J. M. J. Frechet Tetrahedron 1981, 37, 663.
12. M. J. Farrall, J. M. J. Frechet J. Org. Chem. 1976, 46, 3877.
13. J. M. J. Frechet, C. Schuerch J. Am. Chem. Soc. 1971, 93, 492.
14. J. I. Crowley, H. Rapoport Acc. Chem. Res. 1976, 9, 135.
15. P. H. H. Hermkens, H. C. J. Ottenheijm, D. Rees Tetrahedron 1996, 52, 4527.
16. P. H. H. Hermkens, H. C. J. Ottenheijm, D. Rees Tetrahedron 1997, 53, 5643.
17. Á. Furka In Combinatorial & Solid Phase Organic Chemistry 1998, Advanced
ChemTech Handbook, Louisville, 35.
18. W. M. Bennett In H. Fenniri (Ed) Combinatorial Chemistry 2000, Oxford University
Press, Oxford, New York, 139.
19. K. G. Dendrinos, A. G. Kalivretenos J. Chem. Soc. Perkin Trans. 1998, 1, 1463.
20. M.V. Creswell, G. L. Bolton, J. C. Hodges, M. Meppen Tetrahedron 1998, 54, 3983.


3. Parallel synthesis. Synthesis of compound arrays based on saving reaction time
Execution of the chemical reactions used in the synthesis of compounds or in
analytical methods takes time, like the cooking processes applied in the kitchen. Housewives
have long known how to organize their work economically. A typical housewife, for
example, begins cooking the food then starts the washing machine and, while the soup is boiling
and the washing machine runs its program, she occupies herself with vacuum cleaning the carpet.
That is, she is doing different activities in parallel in order to be more effective.
The idea of making our activities more efficient by doing operations in parallel is really
old. The Tibetan monks, for example, mechanized their praying by applying praying mills and
they operate these praying machines in parallel, which makes praying very efficient.

Figure 3.1. Praying mills1


In chemistry, the concept of parallel work was introduced relatively late. The Hungarian
microbiologist, Gy. Takátsy was the pioneer of this type of activity. Takátsy organized his
microbiological analytical work into parallel operations.2 He developed a radically new method
for serological titrations in 1955. Among the new tools that made the parallel work very easy he
introduced the microtiter plates that gained wide application in both biology and chemistry. The
microtiter plates are plastic plates with holes drilled into them. The holes serve as reaction
vessels. The standard plate has 96 reaction vessels and their arrangement is shown in Figure 3.2.


Side view

Top view

Figure 3.2. Standard microtiter plate: 8 rows and 12 columns

3.1. The parallel synthesis


The principle of parallel synthesis is the same as that applied by the housewives in the
kitchen and the Tibetan monks in praying. Execution of the chemical reactions takes time and
during that time not only one but a series of reactions can be realized. Each synthetic reaction is
started in a different reaction vessel and all the necessary operations are executed in parallel.
Figure 3.3 shows the principle of the synthesis of five different trimers, for example tripeptides,
in parallel.
Figure 3.3. Parallel synthesis of five trimers in five (numbered) reaction vessels. The black, gray
and white circles represent building blocks, for example amino acids.
The five trimers are synthesized on solid support (P) in reaction vessels 1 to 5. At the end
of the synthesis, each trimer is individually cleaved from the support and collected in one of the
five vessels designated for storing the end products. The figure demonstrates that in parallel
synthesis the number of reaction vessels is the same as the number of compounds to be prepared.
The number of operations is practically the same as in the one by one synthesis of the same
compounds since the solvents and reagents have to be serially transported into each reaction
vessel. The real advantage is that the reaction time needed for synthesizing the 5 compounds is
about the same as for preparing a single one. The series of compounds prepared by the parallel and
the other combinatorial methods are called compound libraries.
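
The comparison between the one by one and the parallel synthesis can be put into numbers with a minimal sketch. The function name and the value of two hours per step are hypothetical, and the sketch assumes that every step takes the same reaction time and that the handling time is negligible.

```python
def synthesis_cost(n_compounds, n_steps, hours_per_step):
    """Return (operations, serial_hours, parallel_hours) for n compounds.

    Assumption: each compound needs n_steps reaction steps of equal duration.
    """
    # Reagents must be delivered to every vessel in every step,
    # so the number of operations is the same in both approaches.
    operations = n_compounds * n_steps
    # One by one synthesis: the reaction times add up.
    serial_hours = n_compounds * n_steps * hours_per_step
    # Parallel synthesis: all vessels react at the same time.
    parallel_hours = n_steps * hours_per_step
    return operations, serial_hours, parallel_hours


# Five trimers (three coupling steps each), 2 hours per step (hypothetical value).
print(synthesis_cost(n_compounds=5, n_steps=3, hours_per_step=2))  # (15, 30, 6)
```

The number of operations stays at 15 in both cases while the reaction time drops from 30 to 6 hours, which is the time needed for a single trimer.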

3.1.1. The multipin method of Geysen


The first example of parallel synthesis was published by Geysen and his colleagues.3 They
synthesized series of peptide epitopes in an apparatus developed for this purpose (Figure 3.4). In
the multipin apparatus the authors used the microtiter plate introduced by Takátsy for reaction
vessels (Figure 3.4/b) and a cover plate with mounted polyethylene rods fitting into the wells
(Figure 3.4/a). The ends of the polyethylene rods (pins) were coated with derivatized polyacrylic acid
(marked by black).
Figure 3.4. The multipin apparatus: (a) cover plate with pins, (b) microtiter plate with solution


The amino acids used in building the peptides and the coupling reagents were dissolved
and added to the wells. The coated ends of the pins were immersed into the solutions and kept there
until the coupling reactions ended. Washings and deprotection of the peptides were also executed
in the apparatus. The formed peptides were attached to the pins. The peptides prepared in
Geysen's experiment were screened - after deprotection - without cleaving them from the pins.
Figure 3.5. The well row 4/column 9 (rows are numbered 1 to 8, columns 1 to 12)



The sequence of the peptide formed on a pin depended on the order of the amino acids
added to the particular well. The amino acids or their order (or both) were different for each well
so a different peptide formed on each pin. The wells, as well as the pins, were characterized by
their coordinates: rows and columns. By recording the order of the added amino acids into each
well, the expected sequences of the peptides could be determined from the position occupied in
the plate.
If the order of added amino acids in the well row 4/column 9 is Gly, Gly, Arg, Phe, for
example (Figure 3.5), then the sequence of the tetrapeptide formed on the pin row 4/column 9 is
Phe.Arg.Gly.Gly (taking into account that the first amino acid added ends up at the C-terminus
while numbering of the amino acids in peptides starts at the N-terminus).
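
The bookkeeping described above can be sketched in a few lines. The names `wells`, `addition_log` and `peptide_on_pin` are illustrative; the only data used are the 8 x 12 plate layout and the addition order given for the well row 4/column 9.

```python
# Coordinates (row, column) of the 96 wells of a standard microtiter plate.
wells = [(row, col) for row in range(1, 9) for col in range(1, 13)]

# Record of the amino acids added to each well, in the order of addition.
# Only the example well from the text is filled in here.
addition_log = {(4, 9): ["Gly", "Gly", "Arg", "Phe"]}


def peptide_on_pin(added_in_order):
    """Sequence written from the N-terminus.

    The first amino acid added is bound to the pin and becomes the
    C-terminal residue, so the addition order has to be reversed.
    """
    return ".".join(reversed(added_in_order))


print(len(wells))                            # 96
print(peptide_on_pin(addition_log[(4, 9)]))  # Phe.Arg.Gly.Gly
```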
The multipin method is still used and the multipin apparatus is a commercially available
product. In such apparatus, however, the pins are not coated plastic rods. The coated head of
the rods is replaced by SynPhase crowns or SynPhase lanterns (Figure 3.6) mentioned in
chapter 2.

Figure 3.6. Pins with crown (a) and lantern (b)


The multipin procedure was applied by Ellman and his colleagues4 in pioneering the
preparation of organic libraries by parallel synthesis.

Scheme 3.1. Synthesis of 1,4-benzodiazepine derivatives on pins (steps 1 to 4)
Derivatives of 1,4-benzodiazepines were constructed from 2-aminobenzophenones, amino
acids and alkylating agents (Scheme 3.1). The Fmoc protected 2-aminobenzophenones were first
attached to an acid labile linker (L) then through the linker to the pins (P). After removal of the
protecting group the amine was coupled with a protected amino acid (1). This was followed by the
removal of the Fmoc protecting group and cyclization (2), then by alkylation of the ring nitrogen
to introduce R4 (3). Finally the product was cleaved from the support (4).

3.1.2. The SPOT technique of Frank


The SPOT method introduced by Frank5,6 and his group was also developed for preparing
peptide arrays. The synthesis is carried out on cellulose paper membranes derivatized to serve as
anchors for the first amino acids of the sequences to be prepared. Small droplets of solutions of
protected amino acids dissolved in low volatility solvents and coupling reagents are pipetted onto
predefined positions of the membrane (Figure 3.6). The spots thus formed can be considered as
reaction vessels where the conversion reactions of the solid phase synthesis take place. An array
of as many as 2000 peptides can be made on an 8x12 cm paper sheet. The peptides can be
screened on the paper after removing the protecting groups.

3.1.3. Other devices for parallel synthesis


De Witt and co-workers7 also developed an apparatus for parallel synthesis. It was
designed for the synthesis of small organic molecules. The solid support was placed in porous
tubes immersed into vials containing solutions of reagents which diffused into the tubes. The
temperature of the reaction mixtures could be controlled by heating or cooling the reaction block.

Figure 3.6. The spot synthesis


Another inexpensive device was described by Meyers et al.8 Beckman polypropylene
deepwell plates were modified by drilling a small hole in the bottom of each well. A porous
polyethylene frit was fitted into the bottom of the wells to allow removal of the solutions and
solvents by vacuum. At the bottom, a rubber gasket prevented the leakage of the wells during the
reactions.

Figure 3.7. The LabMate manual synthesizer (photo: www.aapptec.com)


A simple commercially available parallel device, offered by the firm aapptec, is
demonstrated in Figure 3.7. It is a manual synthesizer allowing the parallel synthesis of 25
different compounds in 0.05 to 2 mmol quantity.
The content of the reaction vessels can be mixed and heated in 4 different zones. Cooling
is also possible if connected to a circulating chiller.
Another parallel synthesizer is offered by BÜCHI. The Syncore Reactor (Figure 3.8)
can accommodate 24 to 96 reaction vessels. It can be used for both solid phase and liquid phase
parallel synthesis. Its features include vortex mixing, heating and cooling, and parallel
evaporation of the samples.

Figure 3.8. The Syncore Reactor of BÜCHI (photo: www.buchi.com)

Preparation of many organic compounds needs heating. Since the mid-1980s the use of
microwave heating began to spread in chemical laboratory practice.9 This kind of heating
considerably raises the speed of chemical reactions in both solution and solid phase. The reaction times
are typically reduced from days or hours to minutes or seconds, often accompanied by increased yields,
too. This type of heating is also applied in the synthesis of combinatorial libraries in order to save
time. In microwave heating the energy is not transferred by conduction or convection, so the
reaction vessel is not heated, only the solvent and the reactants. The energy is absorbed by dipolar
molecules. Molecules that have a larger dipole moment absorb better. For this reason solvents
with large dielectric constant are preferred.
Although examples for application of domestic microwave ovens in parallel synthesis of
combinatorial libraries are found in the literature,10 in practice specially constructed heaters
are applied instead. The experiments carried out in domestic ovens are often difficult to reproduce
because of the uneven electromagnetic field distribution, the pulsed irradiation and the
unpredictable formation of hot spots. Two kinds of specially constructed commercial reactors are
available that work either in multimode or in monomode operation.11,12


Figure 3.9. In the XP-1500 Plus system of CEM up to 12 samples can be heated
simultaneously (photo: courtesy of CEM)

The multimode reactor has a large cavity like the domestic oven but reflection by the
walls and a mode stirrer ensure a nearly homogeneous distribution of the electromagnetic field. In
this reactor the samples can be heated in parallel. A parallel reactor for heating up to 12 samples
is demonstrated in Figure 3.9.

Figure 3.10. The software controlled ExplorerPLS system of CEM (photo: courtesy of CEM)

The ovens operating in monomode, on the other hand, heat only one sample at a time. The
vials containing the reactants are delivered serially into the oven. Such a system, the
ExplorerPLS of CEM, is demonstrated in Figure 3.10. The Explorer handles all of the routine
tasks necessary to execute a large number of reactions each day. The system has a sample deck
with interchangeable racks.


3.1.4. Parallel synthetic methods with reduced number of operations


As already mentioned the parallel synthetic methods are based on simultaneous execution
of a series of reactions and, as a consequence, are saving a lot of time in the synthesis of
compound arrays. The number of operations that are executed in these reactions, however, is
more or less the same as in the one by one preparation of the same compounds. In some parallel
methods attempts were made to reduce the number of operations, too.

3.1.4.1. Synthesis of oligonucleotides on paper discs


Ronald Frank and his colleagues who introduced the spot synthesis also developed a
method for the parallel preparation of oligonucleotides on paper discs which involves reduction
of the number of operations necessary in the synthesis.13 The method was demonstrated by
simultaneous preparation of two octamers using, as solid support, Whatman 3MM paper discs
(diameter 2 cm) labeled by pencil. The synthesis was carried out according to the principle outlined
in Figure 3.11: whenever the two growing chains had to be elongated with the same nucleotide
the two discs were transferred into the same reaction vessel and the elongation was realized in a
single coupling cycle.
The sequences of the two oligonucleotides are seen at the bottom of Figure 3.11 together with the
coupling order of the nucleotides. The sequences differ only at coupling positions 3 and 6. At
these positions the couplings were carried out separately in two reaction vessels while in the
remaining six coupling positions (1, 2, 4, 5, 7 and 8) the two discs were placed into the same
reaction vessel and each of these elongations (including the attachment of the first nucleotide to
the support) was realized in a single coupling cycle.
Sequences (coupling positions 8 7 6 5 4 3 2 1, from left to right):
1: TAATATTA
2: TAGTACTA

Figure 3.11. Flow diagram of the synthesis of two oligonucleotides

At coupling positions 3 and 6 the discs were placed in separate reaction vessels and the
couplings were executed separately. The total number of the executed coupling cycles in the
synthesis of the two octamers was 10. In the case of normal parallel synthesis of the same two
compounds 16 coupling cycles would have been needed. This shows that the method is capable
of significantly reducing the number of the necessary operations. It seems worthwhile to note that
in a single coupling cycle 18 different operations had to be executed including washings and
dryings.
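
The count of coupling cycles can be reproduced with a short sketch. The function names are illustrative; the two octamer sequences are those of Figure 3.11. The sketch assumes that all sequences have the same length and that discs needing the same nucleotide at a given position always share one reaction vessel.

```python
def coupling_cycles_shared(sequences):
    """Coupling cycles when discs needing the same monomer share a vessel.

    At each coupling position one cycle is needed per distinct monomer.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    return sum(len({seq[i] for seq in sequences}) for i in range(length))


def coupling_cycles_separate(sequences):
    """Coupling cycles in a normal parallel synthesis: one per monomer per chain."""
    return sum(len(seq) for seq in sequences)


octamers = ["TAATATTA", "TAGTACTA"]
print(coupling_cycles_shared(octamers))    # 10
print(coupling_cycles_separate(octamers))  # 16
```

The saving grows with the number of discs processed together, because the number of distinct nucleotides at any position cannot exceed four.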

3.1.4.2. The tea-bag synthesis


A parallel synthetic method based on the principle suggested by Frank but using a
different solid support was developed by R. A. Houghten.14 Series of peptides were synthesized
on bead form polymer support. The polymer beads were enclosed into visibly labeled permeable
plastic bags (tea bags, Figure 3.12). A different peptide - according to pre-decided sequences -
was prepared in each bag. In a coupling step the bags were grouped according to the amino acid
appearing in their assigned sequence at that coupling position then placed into the same reaction
vessel for coupling with the same amino acid. For example in coupling step 3 all bags in which
the assigned sequence contained Ala in position 3 were grouped and transferred into the reaction
vessel where Ala was used in the elongation reaction. Before the next coupling step the bags were
manually regrouped after reading their label - again according to the sequences assigned to the
bags. All operations, including removal of protecting groups, couplings, washings and even the
cleavages were performed on the solid supports enclosed into the same bags. This procedure has
the same advantage that was pointed out for the Frank method: fewer operations are
needed than in a normal parallel synthesis and the number of reaction vessels is also smaller than the
number of the synthesized compounds.
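
The regrouping of the bags before each coupling step can be sketched as follows. The bag labels and the assigned sequences are hypothetical, and the sequences are listed in coupling order (position 1 is coupled first); `group_bags` is an illustrative name, not part of any published procedure.

```python
from collections import defaultdict

# Hypothetical bag labels with their assigned sequences in coupling order.
assigned = {
    "bag1": ["Gly", "Ala", "Ala"],
    "bag2": ["Gly", "Leu", "Ala"],
    "bag3": ["Phe", "Ala", "Ala"],
}


def group_bags(assigned_sequences, step):
    """Map each amino acid to the bags that need it in coupling step `step` (1-based)."""
    groups = defaultdict(list)
    for label, order in assigned_sequences.items():
        groups[order[step - 1]].append(label)
    return dict(groups)


# One reaction vessel is needed per group in each coupling step.
for step in (1, 2, 3):
    print(step, group_bags(assigned, step))

total_couplings = sum(len(group_bags(assigned, s)) for s in (1, 2, 3))
print(total_couplings)  # 5 couplings instead of 9 in a normal parallel synthesis
```

In step 3 all three bags contain Ala in their assigned sequence, so they are placed into a single reaction vessel.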

Figure 3.12. The tea bag method

3.2. The Ugi multicomponent reactions


One family of organic reactions is particularly suitable for parallel execution. These are
the Ugi multicomponent reactions. In multicomponent reactions several starting compounds can
be combined in a one step reaction to give a complex product. Among the most extensively used
such reactions are the Ugi four component reactions15,16 symbolized by U-4CR. In this reaction
an acid, very often a carboxylic acid, a primary or secondary amine, an aldehyde or ketone and an
isonitrile transform into a single product (Figure 3.13). By varying R1, R2, R3 and R4 large
series of compounds can very easily be prepared. The reaction is driven by the high reactivity of
the isonitrile component.
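
How quickly the number of products grows when the four components are varied can be shown with a minimal sketch. The component names are placeholders, not real compounds, and it is assumed that every combination of components gives one product.

```python
from itertools import product

# Placeholder component sets (hypothetical names).
acids = ["acid_1", "acid_2"]
amines = ["amine_1", "amine_2", "amine_3"]
carbonyls = ["aldehyde_1", "aldehyde_2"]
isonitriles = ["isonitrile_1", "isonitrile_2"]

# Each U-4CR uses one member of each set, so the library is the
# Cartesian product of the four component sets.
library = list(product(acids, amines, carbonyls, isonitriles))

print(len(library))  # 2 * 3 * 2 * 2 = 24
print(library[0])
```

With only nine starting compounds 24 different four component combinations are obtained; with ten members in each set the number would be 10 000.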


[Scheme: a carboxylic acid, an amine, an aldehyde and an isonitrile, carrying the substituents R1 to R4, combine into a single product]

Figure 3.13. The Ugi four component condensation


Although the multicomponent reactions were first realized in 193817 their practical
importance was recognized only after the advent of combinatorial chemistry. A vast number of
compounds can be and are prepared by the Ugi and other multicomponent reactions. This can be
exemplified by Figure 3.14 that shows how the products can be varied by replacing the
carboxylic acid component by other acids.18

[Scheme: HX + R1-CO-R2 + R3-NH-R4 + R-NC; the acids shown on the arrows are H2O, H2S2O3, R5-COOH, HSCN, HOCN and HN3]

Figure 3.14. Products of U-4CRs carried out with different acids. HX is replaced by acids seen on
the arrows

The variability of the products can be further increased by post condensation
transformations. An acid catalysed cyclization of the condensation product19 is demonstrated in
Figure 3.15. The cyclization is brought about by the removal of the protecting Boc group.
The Ugi reactions can be realized in both solution and solid phase. Any of the four
reactants can be linked to the resin. Mostly good conversions can be achieved.
[Scheme: Ugi condensation followed by cyclization of the Boc protected product with 10% TFA]

Figure 3.15. Post cyclization of the primary Ugi product


3.3. Solution phase combinatorial synthesis


Before the introduction of solid phase synthesis by Merrifield,20 organic compounds
were generally prepared in solution. The use of the solution phase synthesis in combinatorial
chemistry has some advantages and also serious disadvantages. The main advantage is that the
overwhelming majority of synthetic procedures recorded in the literature are realized in solution
phase. The disadvantage, on the other hand, is that in a multi-step reaction the products need to
be isolated and purified in each step, which is often tedious and time consuming. Nevertheless,
solution phase synthetic methods are applied in combinatorial chemistry, too. There are approaches,
however, that make it possible to reduce the disadvantages and so to make the solution phase
procedures competitive and applicable besides the solid phase methods.
3.3.1. Dendrimer supported synthesis.21
Dendrimers are branching oligomers. They are built up in stepwise manner from
monomers that result in branching at every coupling position.

These oligomers are soluble, relatively large molecules; their size considerably exceeds
that of the building blocks and reagents used in combinatorial syntheses. To the ends of their
branches linkers can be attached so they can serve as soluble supports for combinatorial synthesis
(Figure 3.16).

Figure 3.16. Two step synthesis on dendrimer (the symbols in the figure mark the linker and the building blocks)

Because of the large size of the dendrimer molecules, the products of each coupling step can
easily be separated from the excess of reagents by size exclusion chromatography. After cleaving
the small molecule products from the support, the size exclusion chromatography also makes
it possible to separate them from the dendrimer molecules. This kind of separation is usually much
faster than the conventional separation and purification processes but it is much slower than the
simple filtration in the solid phase procedures.


3.3.2. Separations using fluorous tags and fluorous solvents.


The fluorous solvents are immiscible with most organic solvents and water.22 This fact is
exploited in using fluorous-organic liquid-liquid extraction for separation of products of solution
phase combinatorial syntheses from reagents.23 This separation works if fluorous tags are
attached to the reagents or to the products. The attachment of the fluorous tags may occur before
the combinatorial reaction step or after it.
An example is demonstrated in Figure 3.17. A Stille coupling is carried out using a
fluorous reactant and a fluorous solvent, the commercially available FC-72, consisting mainly of
C6F14 isomers.24 At the end of coupling, the product was extracted into dichloromethane and
the by-product (Cl-Sn(CH2CH2C6F13)3) was found in FC-72.

[Scheme: R-Sn(CH2CH2C6F13)3 + R'-Br, with PdCl2(PPh3)2 and LiCl in DMF/THF, give the coupled product + Cl-Sn(CH2CH2C6F13)3]

Figure 3.17. Stille coupling using fluorous reactant

A second example is outlined in Figure 3.18. An aromatic urea derivative is
synthesized from an amine and an isocyanate that was applied in excess. After completion of the
reaction, a fluorous quencher was added also in excess by which the excess of the isocyanate was
transformed into a fluorous adduct. Finally an organic/fluorous liquid-liquid extraction was
carried out. The fluorous adduct and the excess of the fluorous quencher separated into the
fluorous solvent (FC-72) and the product was isolated by evaporation of the organic solvent
(THF). This approach was applied in the synthesis of a 9 member library.25

3.3.3. Application of solid phase reagents.


In organic reactions the reagents are often used in excess in order to drive the
transformations to completion. If the reaction is carried out in solvent with reagents attached to
resin, the excess of the reagents can easily be removed by filtration at the end.
[Scheme: an amine reacts with an isocyanate used in excess; the excess is captured by a fluorous quencher bearing Si(CH2CH2C6F13)3 groups, also used in excess]

Figure 3.18. Fluorous quenching


In Figure 3.19 an example is found showing acylation of a secondary amine with a solid
phase active ester.26 The product can be isolated from the filtrate while the by-product and the
excess of the active ester remain on the filter.
[Scheme: solid phase active ester + amine, in CH3CN with Et3N, give the product]

Figure 3.19. Acylation with solid phase active ester

In another example shown in Figure 3.20 the coupling reagent, 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide (EDC), is used in insoluble form (P-EDC) for
coupling an acid (R1COOH) with a secondary amine (R2R3NH). The transformed form of the
reagent can be filtered out and the product can be isolated from the filtrate. Not only reagents but
catalysts can also be used in solid phase form.25

[Scheme: R1COOH + R2R3NH with P-EDC give the product; the transformed reagent stays on the resin]

Figure 3.20. Amide bond formation with solid phase carbodiimide

3.3.4. The use of scavengers in solution phase reactions.


Solid phase scavengers are often applied in reactions carried out in solution in order to
remove the remains of reagents used in excess. In Figure 3.21 a reaction is outlined in which a
solid phase catalyst (borohydride) and a solid phase scavenger (aldehyde) are used.25 In this
reaction an aldehyde (R2CHO) is reacted with an excess of a primary amine (R1NH2) to form an
imine which is reduced with the solid phase borohydride catalyst to a secondary amine (product).
Although the catalyst can be filtered out, the product is still contaminated with the excess of the
primary amine. This reactant is then removed from the reaction mixture in the form of solid phase
imine by the added aldehyde scavenger. The clean product can be isolated from the filtrate.


[Reaction scheme; labels: "excess", "product and remaining amine", "filtered out", "clean product"]

Figure 3.21. The use of solid phase catalyst and scavenger in one reaction

3.4. Automation in parallel synthesis
Like many other important developments in combinatorial chemistry, automation also
began in the field of peptide chemistry. The introduction of the solid phase peptide synthesis
procedure by Merrifield20 in 1963 opened the possibility for automation. Merrifield not only
invented the new synthetic technology but also built the first solid phase peptide synthesizer.27
In the last half century numerous companies have developed and commercialized automatic
synthesizers.

3.4.1. Automatic parallel synthesizers


Automatic parallel synthesizers have been developed for both peptide and organic synthesis.
Peptide synthesizers generally do not need heating or cooling. Organic syntheses, on the other
hand, often need heating or cooling, a special atmosphere or elevated pressure, so the organic
synthesizers are more complex than the peptide synthesizers.
Figure 3.22. shows a peptide synthesizer made and sold by the firm aapptec. It can be
used to prepare up to 96 peptides simultaneously by solid phase synthesis. A Teflon reaction
block that can be shaken contains the reaction vessels, which are closed by septa. The reactors
have a frit at the bottom to hold the resin and to make it possible to remove the liquids by
applying vacuum. The resin can be filled into the reaction vessels by volume using a simple tool.
The solutions of the reagents and amino acids are stored in containers closed by septa.
The solvents are stored in bottles. The solvents and solutions are automatically transferred into
the reaction vessels by a needle-like probe that can penetrate the septa. The probe is fitted to an
arm that can be moved in the x, y and z directions.
The Apex 396 is available with one or two arms and also with two kinds of reactors. The
synthesized peptides can be cleaved automatically from the resin. The quantity of the synthesized
peptides may vary from 0.005 to 1 mmol.
The synthesizer, like other automatic machines, runs under computer control. Once the
resin is placed into the reaction vessels, the solutions of the protected amino acids and reagents
are filled into their containers and the solvent bottles are filled, the machine, when started,
executes the synthesis unattended, following a previously prepared program step by step. The
software supplied with the machine allows easy programming.


Figure 3.22. The Apex 396, a solid phase peptide synthesizer (photo: www.aaptec.com)

The peptides cleaved from the resin are in dissolved form. The solvent needs to be
evaporated or the solution lyophilized. A simple module developed for this purpose is seen in
Figure 3.23. It is designed to prevent liquid bumping. It can be connected to a vacuum pump
equipped with a cold trap. The module can also be used to evaporate or concentrate fractions
after purification.

Figure 3.23. Parallel evaporation/lyophilization module (photo: www.aaptec.com)

Another automatic synthesizer is shown in Figure 3.24. This Model 384 Ultra
High Throughput Synthesizer is also manufactured and sold by aapptec.


Figure 3.24. The Model 384, Ultra High Throughput Synthesizer for solid phase applications
(photo: www.aaptec.com)

Figure 3.25. The Solution, a solution phase parallel synthesizer (photo: www.aaptec.com)

The Model 384 synthesizer has four reactor blocks, each containing a maximum of 96 reaction
vessels, so a maximum of 384 different peptides or other compounds can be prepared
simultaneously. The maximum number of compounds that can be prepared depends on the volume of
the reaction vessels, which can vary from 3 to 35 ml. Heating and cooling are optional.
An automatic machine designed for solution phase synthesis, the Solution, is shown in
Figure 3.25. The instrument can be used for solid phase synthesis, too. It can accommodate
fifteen 80 ml reactors or ninety-six smaller reaction vessels (3 or 10 ml). The machine
automatically performs liquid-liquid extractions and can transfer the products to titer plates.


Another machine, the Sophas HTC (Figure 3.26), is manufactured by Zinsser Analytic. It
can be used for both solution phase and solid phase parallel preparations. In one parallel
procedure, 600 compounds can be synthesized.
The reaction vessels can be heated up to 150 °C and can be cooled to -40 or -80 °C. Inert
reaction conditions are assured. Slurry distribution and sample picking for HPLC during the
synthesis are also possible.

Figure 3.26. The Sophas HTC automatic parallel synthesizer (photo: www.zinsser-analytic.com)

3.4.2. Quality control


The components of compound libraries prepared by parallel synthesis may contain
by-products. For this reason they are usually submitted to quality control and, depending on the
result, they may need purification before screening. The quality control usually involves HPLC
separation, mass spectrometry or both.
After their synthesis, the library components are usually available in evaporated, dry form.
Before submitting them to quality control the dry samples need to be dissolved. Samples are then
taken from the solutions and serially submitted to liquid chromatography (LC). The
chromatogram usually shows whether impurities are present in the samples. The synthesized
products as well as their impurities can be identified by taking samples from the LC eluents and
analyzing them by mass spectrometry (MS). In this process automatic liquid handlers do the job,
which ends at sample loading.
One of the available liquid handlers is the 223 Sample Changer, the product of Gilson
(Figure 3.27.). This is a programmable sampler for automated sample preparation and transfer. It
is used for serial dilutions, sampling into vials and tube to tube transfers.


Figure 3.27. The 223 Sample Changer (photo: www.gilson.com)

Another liquid handler is also a Gilson product: the Multiple Probe 215 Liquid
Handler/Injector (Figure 3.28.). This device is a large-capacity multiple-probe liquid handler that
processes four or eight samples simultaneously. It also performs injection into four or eight
parallel systems simultaneously. The instrument is ideal for parallel injection into HPLC and
LC/MS systems.

Figure 3.28. The Multiple Probe 215 Liquid Handler/Injector (photo: www.gilson.com)

The MALDILC System of Gilson is designed to perform microbore HPLC with
fraction collection and simultaneous matrix addition onto MALDI plates. The plated fractions can
then be analyzed repeatedly by MALDI-TOF MS.


Figure 3.29. The MALDILC System (photo: www.gilson.com)

3.4.3. Parallel purification


The automatic parallel chemical synthesizers produce a large number of products. In order
to purify them in reasonable time the parallel approach needs to be applied in the purification
stage, too.
An example of parallel purification instruments is the Quad 3+ system (Figure 3.30.) that
is suitable to purify 4, 6 or 12 samples in parallel while recording their UV spectra.

Figure 3.30. The Quad 3+ system of Biotage (photo: www.biotage.com)


3.4.4. Manufacturers of laboratory robots


The companies listed below are engaged in laboratory automation: they manufacture and
commercialize such products. In addition to the companies that offer automatic compound
synthesizers, other companies manufacture automatic chemical analyzers that are also important
in laboratories applying combinatorial chemistry. Besides the hardware, most companies offer
software, too.
aapptec (www.peptideprotein.com)
Advanced Automated Peptide Protein Technologies (aapptec) is the successor of Advanced
ChemTech and offers a family of automatic peptide, protein and other chemical synthesizers.
accelab (www.accelab.de).
The company focuses on integrated solutions for productivity enhancement in research laboratories.
The accelab automation platforms are highly compact and flexible and allow real unattended 24-hour
operation. They use SCARA and gantry robots which handle the major part of the required
automation tasks, leaving the parallel routine sample processing to the integrated standard laboratory
instruments.
AutoDose (www.autodose.ch).
The company develops leading edge technologies for high precision powder dispensing.
POWDERNIUM automates the very traditional yet tedious weighing process of
solids/powders. POWDERNIUM is available in several models to accommodate quantities
ranging from 0.2 mg to a few hundred grams and can run unattended. The accuracy can be as high
as 0.1 mg.
Beckman Coulter (www.beckman-coulter.com).
The company offers a complete range of automation tools for research applications, from
modular liquid handling systems to integrated robotic systems in genetic analysis or drug
discovery.
Bio-Automation (www.bio-automation.com).
The company provides automation solutions to the Life Science Industry. Its Bio-Bot is a
laboratory robot that is designed to provide walk away time to single instrument or multiple
instrument workstations. The software provides a simple, yet powerful control.
BioDot, Inc. (www.biodot.com).
The company manufactures a wide line of material handling, dispensing, and processing modules
that provide engineering solutions which can be customized to fit precise requirements.
Biotage (www.biotage.com).
The company is currently producing automated systems for the parallel purification and
screening of the high number of compounds generated by combinatorial chemistry. Automated
solutions range from the basic Quad3, which can purify twelve fractions with twelve separate
solvents in less than thirty minutes, to the flagship Parallex and FLEX systems, which purify and
collect combinatorial arrays using an intelligent fraction collection system based on UV peak
characteristics. Biotage also offers automated and semi-automated systems for production-scale
HPLC and FLASH chromatography processes.
BUCHI Corporation (www.buchi.com)
Buchi produces and offers a large array of laboratory equipment.
Caliper Technologies (www.caliperls.com).
The company designs and manufactures LabChip devices and systems that enable high-throughput
screening. The LabChip systems replace entire chemical laboratories. Caliper's
microfluidic LabChip devices function like "liquid integrated circuits." They process fluids
containing DNA, proteins, or cells the way semiconductors process electrons, executing biological
tests in seconds. Genes can be analyzed within minutes. Promising drug compounds can be tested
within days instead of months.
Carl Zeiss Jena (www.zeiss.de).
The company has developed a screening system for ultra high throughput screening (UHTS) for
pharmaceutical drug research. The high-performance multimode readers offer 96-channel
parallel detection of fluorescence, absorption and luminescence in 96, 384 and 1,536-well
microtiter plates. The compact workstations and systems contain a new robust technology for
the transport of microtiter plates with a throughput of > 100,000 specimens a day. The
user-friendly software offers simple assay programming.
Cartesian Technologies, Inc. (www.cartesiantech.com).
The company manufactures equipment for pharmaceutical and agricultural research. The
equipment helps automate and increase the process efficiencies in areas such as drug screening,
genomics, and combinatorial chemistry.
Cellomics, Inc. (www.cellomics.com).
Cellomics Inc.'s mission is to improve the efficiency of the drug discovery process by delivering
a cell-based screening platform that automates target validation and lead optimization using
fluorescence-based assays. Today, the company's integrated platform consists of proprietary
fluorescence assays, a proprietary, cell-based High Content Screening (HCS) system, and
bioinformatics software.
CyBio (www.cybio-ag.com).
The company offers modular technology platforms for automated drug research, high throughput
screening and liquid handling, as well as luminescence readers.
Genetix (www.genetix.co.uk).
The company offers a multi-tasking robot, offering Colony Picking, Gridding and Liquid
Handling. The 'Q' BOT is an invaluable addition to any laboratory engaged in high throughput
Pharmaceutical, Genomic or Bioresearch screening.


Genomic Solutions (www.genomicsolutions.com).
The company provides instrumentation, software and related products and services. It supplies
life science researchers in biopharmaceutical companies, universities and government institutions
with integrated, flexible systems.
Gilson (www.gilson.com).
The company is specialized in analytical instrumentation for scientific research and industrial
markets. It has developed software and instrumentation to keep pace with the most sophisticated
HPLC, LC and sample preparation technology available for the laboratory.
Gyros (www.gyros.com).
Gyros AB offers pharmaceutical, biotechnology and diagnostic companies access to a unique,
proprietary technology platform. Routine or non-routine laboratory processes are miniaturized
and integrated into application-specific CD microlaboratories. Hundreds of samples can be
processed in parallel on the disposable CDs. Integrating different laboratory steps onto a single
CD microlaboratory offers the potential to reassess and redesign traditional working procedures.
Hamilton Company (www.hamiltoncomp.com).
Hamilton offers robotic instruments. MICROLAB automated precision liquid handling systems
increase the speed, throughput and productivity of sample preparation procedures.
HEL (www.helgroup.co.uk).
The company manufactures reactor and calorimeter systems for process screening and
optimization. The systems run on proprietary software and hardware and incorporate robotic
work-stations for liquid handling and automated sampling to HPLCs and other analytical tools.
The systems perform chemistry on a 5 ml to 100 ml scale with up to 32 reactions in parallel.
The systems are available in glass or metal and have wide operating temperature and pressure
ranges.
IRORI (www.irori.com).
The company provides combinatorial chemistry technology to the pharmaceutical industry. The
AccuTag-100 Combinatorial Chemistry System is used at large and small pharmaceutical and
drug discovery companies. It offers microreactors, miniature electronic tags, and an automated
sorting instrument for combinatorial directed sorting.
KBiosystems (www.kbiosystems.com).
The company manufactures robots for medium to high throughput laboratory automation
including the Duncan high throughput PCR thermal cyclers, the K-Core 2D Gel cutter, the
Preptide proteomics robot and the K2 and K3 colony pickers, arrayers, re-arrayers and
replicators.
Labman Automation (www.labman.co.uk).
Labman designs and builds instruments for high-throughput screening and microarrayers. It also
deals with powder feeding and dispensing and weighing/labelling.


LEAP Technologies, Inc. (www.leaptec.com).
The company provides front-end automation for chromatography, mass spectrometry, elemental
analysis, and other analytical techniques. It specializes in applications that demand reliability,
flexibility, precision, and high throughput. Custom solutions are available for non-standard
sample preparation and loading problems. The company works with major chromatography and mass
spectrometry companies to provide total integrated solutions for critical automation applications.
Mitsubishi (www.mitsubishitoday.com).
The company offers the PA-10, a powerful, super-lightweight 10 kg robotic arm with 7-axis
redundancy control that can be controlled by a PC or a built-in control unit. The PA-10 is a useful
addition to any research lab. Important features that make the PA-10 useful are its innovative
open system and its flexibility, with human-arm-like maneuverability.
PerkinElmer Life Sciences (www.lifesciences.perkinelmer.com).
The company supplies products, services and technologies for functional genomics, high
throughput screening and drug discovery as well as for clinical screening.
Personal Chemistry (www.personalchemistry.com).
The company developed the Coherent Synthesis used in pharmaceutical research.
Protedyne (www.protedyne.com).
The company manufactures the Computer Integrated Laboratory Automation (CILA) systems for
the biotechnology and pharmaceutical markets.
REMP, Switzerland (www.remp.com).
The company is specialized in laboratory automation and robotics. REMP supplies solutions for
compound storage and retrieval, compound cherry-picking, plate replication and re-formatting,
automated powder dosing, environmental conditioning, and plate heat sealing and piercing
applications.
Robocon Inc. (www.robocon.com).
The company specializes in laboratory automation for pharmaceutical research, biotechnology
and medical/veterinarian diagnostics.
Sias AG (www.sias.ch).
The company is producing robotic laboratory automation for liquid handling and analysis that is
available as a free standing workstation.
ST Robotics (www.strobotics.com).
The company produces "bench top robots" R16 and R17 and the bench top version of the
Cartesian R15.


TECAN (www.tecan.com).
The company produces a large portfolio of instruments and systems, including the Robotic Sample
Processors for automated liquid handling and microplate equipment such as Readers and
Washers.
TekCel (www.tekcel.com).
The company manufactures a family of robotic workbenches, called TekBenches. These
products include liquid handling and assay development, automated microplate sealing/resealing,
storage and retrieval system.
Tomtec (www.tomtec.com).
The company manufactures a complete line of liquid handling systems including harvesters,
96-well pipetters, 384-well pipetters, plate washers, and robotic components and systems.
Zinsser Analytic (www.zinsser-analytic.com)
The company is specialized in developing, producing and distributing innovative laboratory
solutions for liquid handling and automation including systems for combinatorial chemistry and
tools for drug discovery.
Zymark Corporation (www.zymark.com).
The company is a designer and installer of workstation-based laboratory automation
products.

References
1. J. Kehnscherper, G. Kehnscherper, A. Hausen, W. Mochmann A világ vallásai, Tessloff
& Babilon, 1999.
2. Gy. Takátsy Acta Microbiologica Acad. Sci. Hung. 1955, 3, 191.
3. H. M. Geysen, R. H. Meloen, S. J. Barteling Proc. Natl. Acad. Sci. USA 1984, 81, 3998.
4. B. A. Bunin, J. A. Ellman J. Am. Chem. Soc. 1992, 114, 11997.
5. R. Frank, S. Güler, S. Krause, W. Lindenmayer, In Peptides 1990, E. Giralt, D. Andreu
(Eds), 1991, ESCOM, Leiden, 151
6. R. Frank, Tetrahedron 1992, 48, 9217.
7. S. H. De Witt, J. S. Kiely, C. J. Stankovic, M. C. Schroeder, D. M. R. Cody, M. R. Pavia
Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 6909.
8. H. V. Meyers, G. J. Dilley, T. L. Durgin, T. S. Powers, N. A. Winssinger, H. Zhu, M. R.
Pavia, Molecular Diversity, 1995, 1, 13.
9. P. Lidström, J. Tierney, B. Wathey, J. Westman Tetrahedron, 2001, 57, 9225.
10. B. M. Glass, A. P. Combs In I. Sucholeiki (Ed) High-Throughput Synthesis. Principles
and Practices, Marcel Dekker Inc. 2001, 123.
11. O. Kappe, A. Stadler In G. A. Morales, B. A. Bunin (Eds) Methods in Enzymology,
Combinatorial Chemistry Part B, 2003, Elsevier Academic Press, 197.
12. B. M. Glass, A. P. Combs, S. A. Jackson In G. A. Morales, B. A. Bunin (Eds) Methods in
Enzymology, Combinatorial Chemistry Part B, 2003, Elsevier Academic Press, 223.
13. R. Frank, W. Heikens, G. Heisenberg-Moutsis, H. Blöcker Nucleic Acids Research 1983,
11, 4365.

14. R. A. Houghten Proc. Natl. Acad. Sci. USA 1985, 82, 5131.
15. I. Ugi Isonitrile chemistry, Academic Press, 1971, 1.
16. I. Ugi Proc Estonian Acad Sc. Chem. 1995, 44, 237.
17. A. Laurent, C. F. Gerhardt Liebigs Ann. Chem. 1838, 28, 265.
18. I. Ugi, A. Dömling, B. Ebert In G. Jung (Ed) Combinatorial Chemistry. Synthesis,
Analysis, Screening, Wiley-VCH, 1999, 125.
19. C. Hulme, H. Bienaymé, T. Nixey, B. Chenera, W. Jones, P. Tempest, A. L. Smith In G. A.
Morales and B. A. Bunin (Eds) Methods in Enzymology, Combinatorial Chemistry
Elsevier Academic Press, 2003, 369, 469.
20. R. B. Merrifield J. Am. Chem. Soc. 1963, 85, 2149.
21. N. K. Terrett Combinatorial Chemistry, Oxford University Press, 1998, 64.
22. I. T. Horváth, J. Rábai Science, 1994, 266, 72.
23. D. P. Curran Angew. Chem. Int. Ed. Engl. 1998, 37, 1174.
24. D. P. Curran, M. Hoshino J. Org. Chem. 1996, 61, 6480
25. B. Linclau, D. P. Curran In I. Sucholeiki (Ed) High-Throughput Synthesis. Principles and
Practices, Marcel Dekker Inc. 2001, 135.
26. S. W. Kaldor, M. G. Siegel Current Opinion in Chem. Biol. 1997, 1, 101.
27. R. B. Merrifield, J. M. Stewart, N. Jernberg Anal. Chem. 1966, 38, 1905.


4. Combinatorial synthetic methods


The parallel synthetic methods described in the previous chapter considerably speed up
the preparation of compounds used for different purposes. The advent of the real combinatorial
processes accelerated the preparation of new compounds to speeds never dreamed of before.
Although there are compounds that can be prepared in a single reaction step, most
substances are synthesized in a multi-step process. Such substances are built up stepwise from
building blocks. The building blocks of a peptide, for example, are the amino acids, which are
linked together step by step. The real combinatorial methods can be used in multi-step processes,
and their most important characteristic feature is that they make it possible to prepare in a
single run all structural derivatives that can be theoretically deduced from the structures of the
building blocks. The number of operations and the number of reaction vessels relative to the
number of synthesized compounds are drastically reduced.
Most combinatorial synthetic methods are based on the solid phase approach, but there are
versions that can be realized in solution, too. Among the solid phase carriers, bead-form resin
is applied most often, but surfaces of solid materials are also used.

4.1. Combinatorial synthesis on bead-form resin


4.1.1. The split-mix synthesis
The split-mix method introduced by Furka and his colleagues1-3 is based on Merrifield's
solid phase procedure and originally it was demonstrated by synthesis of peptides. The principle
is described here in a simplified version, using only three different protected amino acids as
building blocks that are represented in Figure 4.1 by red, yellow and blue circles. The same
concept is valid regardless of the number or types of monomer units or other kinds of blocks
involved. The synthesis is executed by repetition of the following three simple operations that
form a cycle:
1. Dividing the solid support into equal portions;
2. Coupling each portion individually with only one of the different amino acids;
3. Mixing and homogenizing the portions.
In the first round (Figure 4.1.a) the amino acids are coupled to equal portions of the resin
and the final product - after recombining and mixing the portions - is the mixture of the three
amino acids bound to resin.
In the second cycle, this mixture is again divided into three equal portions and the amino
acids are individually coupled to these portions. In each coupling step, three different resin
bound dipeptides are formed, so the end product is a mixture of 9 dipeptides. In Figure 4.1.a. the
divergent, vertical and convergent arrows indicate dividing, coupling (with one kind of amino
acid) and mixing, respectively.

Figure 4.1. The split-mix synthesis.
a: Preparation of a library of nine dipeptides on solid support. Divergent arrows: dividing into
equal portions; vertical arrows: coupling; convergent arrows: mixing and homogenizing. b: 27
tripeptides. c, d and e: 81 tetrapeptides. Green circles represent the resin; red, yellow and blue
circles are amino acids or other organic monomers.

A third dividing, coupling and mixing cycle, whose operations are not shown in the figure,
would lead to the formation of a mixture of 27 resin bound tripeptides (Figure 4.1.b.), and a
fourth cycle would produce 81 tetramers (c, d and e in Figure 4.1.).
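
The divide, couple and mix cycle can be sketched in a few lines of Python. This is only an illustrative model, not a laboratory protocol: the function name split_mix and the color names standing for the three building blocks are assumptions made for the example, every coupling is assumed to go to completion, and sequences are stored as tuples in the order of coupling.

```python
from itertools import product

def split_mix(building_blocks, cycles):
    """Return the set of sequences present on the resin after the given cycles."""
    pool = {()}  # bare resin: one "empty" sequence
    for _ in range(cycles):
        # 1. Divide: every portion is a sample of the whole homogeneous mixture.
        portions = {block: set(pool) for block in building_blocks}
        # 2. Couple: each portion reacts with exactly one building block.
        coupled = [{seq + (block,) for seq in portion}
                   for block, portion in portions.items()]
        # 3. Mix: recombine all portions into one pool.
        pool = set().union(*coupled)
    return pool

blocks = ["red", "yellow", "blue"]
for n in range(1, 5):
    library = split_mix(blocks, n)
    # Every sequence that can be deduced from the building blocks is present.
    assert library == set(product(blocks, repeat=n))
    print(n, len(library))
```

Run for one to four cycles, the sketch prints library sizes of 3, 9, 27 and 81, the numbers quoted for Figure 4.1.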

4.1.1.1. The key features of the split-mix synthesis


The split-mix synthesis has several key features that are crucial to the method's utility in
the pharmaceutical discovery process.
Efficiency. Figure 4.1. shows that, starting with a single substance (the resin, used as the
solid support), the number of compounds is tripled after each coupling step: first 3x1=3 resin
bound amino acids, then 3x3=9 resin bound dipeptides and, if the process is continued, 3x9=27
resin bound tripeptides in the third cycle and 3x27=81 tetrapeptides in the fourth. This means
that the number of peptides increases exponentially with the number of coupling steps (Table 4.1.).
This is the reason why the split-mix method is so productive. Table 4.1. also shows that
while the number of products increases exponentially, the number of coupling cycles executed in
each step, which can be considered a measure of the invested labor, remains constant.

Table 4.1. Exponential increase of the number of peptides in split-mix synthesis

Step     Number of     Number of          Number of cycles   Total number   Number of
number   amino acids   reaction vessels   in one step        of cycles      peptides
1        3             3                  3                  3              3^1 = 3
2        3             3                  3                  6              3^2 = 9
3        3             3                  3                  9              3^3 = 27
4        3             3                  3                  12             3^4 = 81

As the synthesis proceeds, the invested labor increases only linearly. Linear increase of
labor and exponential growth of the number of products: this is the reason for the exceptionally
high efficiency of the method.
Table 4.1. also shows the number of reaction vessels that are needed to execute the
synthesis. This is also only three in each step: one reaction vessel for each amino acid or other
kind of building block. The number of reaction vessels does not depend on the number of
compounds formed in the process.
If 20 different amino acids are used in the synthesis, the number of peptides is increased
by a factor of 20 in each coupling step. The number of peptides ($N_p$) can be expressed by the
following formula, where $n$ is the number of coupling steps, that is, the number of amino acids
forming the peptides.
$$N_p = 20^n$$
After executing 5 coupling steps with each of the 20 amino acids, for example, more
than 3 million peptides ($20^5 = 3{,}200{,}000$) are present in the mixture. Such a synthesis
needs neither millions of reaction vessels nor thousands of years. It is enough to use 20 reaction
vessels, one for each amino acid, and the pentapeptide library can be prepared in a couple of
days. The number of executed coupling cycles ($N_c$), as expressed by the following formula,
increases only linearly with the length of the peptides.
$$N_c = 20n$$
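
The contrast between the two formulas is easy to tabulate. The short sketch below simply evaluates them; the function names library_size and coupling_cycles are chosen for this example only.

```python
def library_size(n_blocks, n_steps):
    """Number of different sequences: grows exponentially with the number of steps."""
    return n_blocks ** n_steps

def coupling_cycles(n_blocks, n_steps):
    """Number of coupling operations: grows linearly with the number of steps."""
    return n_blocks * n_steps

print("steps  peptides  coupling cycles")
for n in range(1, 6):
    print(n, library_size(20, n), coupling_cycles(20, n))
```

For the pentapeptide library (n = 5) this gives 3,200,000 peptides from only 100 coupling operations.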
Formation of all possible sequences. Another feature of the split-mix synthesis is that all
possible combinations of amino acid building blocks are represented in the synthesized peptides.
This is clearly shown by the simple example outlined in Figure 4.1. No more sequential orders of
the red, yellow and blue circles can be deduced than those present in the dimers, trimers and
tetramers demonstrated in Figure 4.1. This combinatorial nature of the composition of the
mixtures synthesized by the split-mix method is reflected in their name: "combinatorial libraries."
This combinatorial feature of the split-mix synthesis holds for preparations of non-sequential
(e.g. cyclic and other) libraries, too.
Formation of all possible sequences is the consequence of equally dividing the resin
mixtures into the reaction vessels of the next coupling step. As a result of this operation
all products formed in any reaction vessel are evenly distributed among the
reaction vessels of the next reaction step.
This can be considered as the combinatorial distribution rule that governs the product formation
in the combinatorial process.
Formation of the products in one to one molar quantities. Peptides are considered to be
natural compounds although certainly not all peptide sequences are found in nature. Peptide
libraries are most often prepared in order to find biologically active substances among them.
Other kinds of organic libraries are also synthesized for the same purpose. In the identification
process, or in screening of the libraries, the goal is to find the biologically most effective
component of the mixture. Serious problems may arise in screening if the peptides are not present
in equal quantities in the mixture. A low activity component, for example, if it is present in a
large amount, may show a stronger effect than a highly active component present in a much
lower quantity. Therefore, it is important to prepare libraries in which the constituents are present
in equal molar quantities. The split-mix method was designed to comply with this requirement.
After each round of couplings, the resin is thoroughly mixed. This ensures that before
dividing the resin the mixture is nearly homogeneous. If the resin is divided into equal portions,
the previously formed peptides can be supposed to be present in equal number and in equal molar
quantities in each portion. The coupling of any portion of the resin with an amino acid does not
alter the number or the molar ratio of the peptides originally present in the mixture; it simply
adds the same amino acid to each sequence. Consequently, the molar ratio of the newly formed
peptides is expected to be the same as the molar ratio of the originally present ones. That is, the
new sequences are formed in equal molar ratio.
In addition to the execution of couplings with equal portions of resin samples, it is also
important that the couplings are carried out on spatially separated samples, adding a single amino
acid to each sample. This makes it possible to use appropriate chemistry to drive each coupling
reaction to completion regardless of the reactivity of the amino acids. As a result, both the
number of peptides and their equimolar ratio are preserved in every portion and in each step.
It is worthwhile to note that the equimolarity could be altered at will if for some reason
this were advantageous: unequal portions would simply be used in some couplings. Applying a
larger portion of resin in one of the couplings, for example, would result in the formation of a
subgroup of products in larger molar quantity.
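
The effect of equal and unequal portions on the molar ratios can be followed with a small bookkeeping sketch. It rests on idealizations stated in the text (a perfectly homogeneous mixture before each division and complete coupling in every vessel); the function names and the 1/2, 1/4, 1/4 division used in the second example are assumptions made for illustration. Sequences are stored as tuples in the order of coupling.

```python
from fractions import Fraction

def equal_shares(blocks):
    """Divide the resin into equal portions, one per building block."""
    return {block: Fraction(1, len(blocks)) for block in blocks}

def split_mix_amounts(shares_per_cycle):
    """Return {sequence: molar fraction of the whole library}.

    shares_per_cycle holds one dict per cycle, mapping each building block
    to the fraction of the mixed resin that is coupled with it.
    """
    amounts = {(): Fraction(1)}  # bare resin
    for shares in shares_per_cycle:
        assert sum(shares.values()) == 1, "portions must add up to the whole resin"
        amounts = {
            seq + (block,): amount * share
            for seq, amount in amounts.items()
            for block, share in shares.items()
        }
    return amounts

blocks = ["red", "yellow", "blue"]

# Equal portions in both cycles: every dipeptide has the same molar fraction.
equal = split_mix_amounts([equal_shares(blocks)] * 2)
print(set(equal.values()))

# Hypothetical unequal first division: half of the resin goes to "red".
skewed_first = {"red": Fraction(1, 2), "yellow": Fraction(1, 4), "blue": Fraction(1, 4)}
skewed = split_mix_amounts([skewed_first, equal_shares(blocks)])
for seq, amount in sorted(skewed.items()):
    print("-".join(seq), amount)
```

In the second run the three dipeptides whose first coupled residue is "red" form a subgroup present in twice the molar quantity of the others.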
The parallel nature of the split-mix synthesis and formation of individual compounds. The
split-mix procedure has another intrinsic feature which plays an important role in screening and
gives a unique character to the method: in any individual bead of the solid support, only one kind
of peptide is formed. This may seem surprising at first glance, but becomes quite understandable
upon closer examination. In Figure 4.2. the fate of a randomly selected bead is followed in a three
step coupling process.

Let's suppose that in the first reaction step the bead randomly finds itself in the reaction
vessel where the coupling is done with the red amino acid. Consequently, the red amino acid is
attached to all of the functional groups of this bead and to those of all of the other beads in
the same vessel. In the next step the bead is in the vessel where the yellow amino
acid is added. This amino acid is coupled to all coupling sites. In the third step, for similar
reasons, all dipeptides are elongated with the blue amino acid. Thus all peptide molecules that
form in the bead are the same. The sequence of all peptides is blue-yellow-red (the reversed
order of couplings).
The beads behave in the process like independent reaction vessels. The content of these
reaction vessels is not interchanged with those of the other ones. Any selected bead randomly
travels through the successive reaction vessels and the final sequence stores the information about
the route the bead traveled in the course of the synthesis.
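
The fate of a single bead can be mimicked with a short sketch. The number of coupling sites per bead (1000), the function names and the random seed are arbitrary choices made for this illustration; complete coupling at every site is assumed, as in the text.

```python
import random

def couple_on_bead(route, n_sites=1000):
    """Grow a chain on every coupling site of one bead along the given route."""
    chains = [[] for _ in range(n_sites)]
    for amino_acid in route:      # the bead visits one reaction vessel per step
        for chain in chains:      # the same amino acid is coupled to every site
            chain.append(amino_acid)
    return chains

def written_sequence(chain):
    """Sequence in the notation of the text: the reversed order of couplings."""
    return "-".join(reversed(chain))

# The route followed in Figure 4.2: red vessel, then yellow, then blue.
route = ["red", "yellow", "blue"]
sequences = {written_sequence(chain) for chain in couple_on_bead(route)}
print(sequences)  # a single sequence, however many sites the bead carries

# A randomly traveling bead: its sequence still records the route it took.
rng = random.Random(2007)
random_route = [rng.choice(route) for _ in range(3)]
(sequence,) = {written_sequence(c) for c in couple_on_bead(random_route)}
print(random_route, sequence)
```

Because each vessel adds the same residue to every site, the set of sequences on a bead always has exactly one member, and reading that sequence backwards recovers the route.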

Figure 4.2. Formation of a single substance in each bead.


Small yellow, red and blue circles: amino acids or other kinds of organic monomers; the large
green circle: an arbitrarily selected bead of the solid support randomly appearing in different
reaction vessels in the three coupling steps.

The formation of one substance in each bead is a very important feature of the split-mix
synthesis. If the products are cleaved individually from separated beads, they can be
examined as individual substances like those produced by parallel synthesis. Furthermore, if
the compounds formed are not cleaved from the beads and only the protecting groups are removed, the
products can also be tested as individual substances. The libraries in which the products remain
bound to the beads are called tethered libraries. Such libraries can be prepared by attaching
the first amino acid to the resin through a cleavage-resistant bond. The possibility of screening
the products as individual compounds, like those produced in parallel synthesis, is an enormous
advantage in applications.
When comparing the products of the split-mix synthesis to those produced in a parallel
process, attention has to be called to an important difference. In parallel synthesis not only
do individual substances form in the reaction vessels, but the position of the reaction vessel in the
reaction block unambiguously determines the identity of the product. The coordinates (row and
column) identify the expected products since the synthetic history of each well, that is, the added
reagents and their order, is exactly known. The situation in split-mix synthesis is different.
Although each bead contains a single product, it is not possible to identify its content easily. All
beads look the same and the synthetic history of the beads is unknown. This means that if we
determine in some way or other that the content of a selected bead, say a peptide, has a useful
property, all that we know is the length of the peptide, and this is the same for all components
of the library. Neither the amino acid composition nor the sequence of the amino acids is known.
If we want to know these data, they have to be determined in a separate process that is called
deconvolution. In the case of peptide libraries this can be done by sacrificing at least a part of the
content of the bead for sequence determination.
If a component of a tethered peptide library is examined, the best choice is to submit the
bead to sequence determination using a peptide sequencer.4 On the other hand, if the peptide is
examined in cleaved form the appropriate choice is to determine the sequence by mass
spectrometry.5
After carrying out the synthesis of a peptide library, all peptides can be cleaved from the
support, and in this way a mixture of free peptides is formed. These libraries are called soluble
peptide libraries. In such a library millions of peptides may be present, and finding a bioactive
peptide among them seems, at first glance, like finding a needle in a haystack. Nevertheless,
appropriate strategies have been developed to solve the problem. These strategies will be
described later.
Applicability of the split-mix method in the synthesis of organic libraries. Although the
split-mix method was developed with the intention of preparing large numbers of peptides, it was clear
from the beginning that the method would also be applicable to the synthesis of families of
other kinds of organic compounds. Such series of non-peptide compounds are usually called
organic libraries. Since most organic compounds are prepared by multi-step synthesis, it is quite
obvious that the split-mix synthesis can be used for the preparation of organic libraries. It has to be
made clear, however, that the split-mix method can only be applied in the synthesis of organic libraries if
the chemistry of the process is well developed. The advent of the combinatorial era brought to
light the importance of solid phase organic reactions and, as a result of intensive
development, a large number of previously described solution phase organic reactions have been
adapted to and optimized for the solid phase (see the second chapter). These reactions are applied in both the parallel and
the split-mix approach. From the point of view of pharmaceutical research and many other
applications, the organic libraries are very important. Peptides are not the most preferred drug
candidates because of their high susceptibility to enzymatic degradation. The ideal drug leads are
small organic compounds due to their, in general, more favorable pharmacodynamic properties.
The use of organic libraries prepared by the split-mix method brings about a problem that
does not occur with peptide libraries. Identification of a library component formed in a bead is
not as easy as in the case of peptide libraries. Determination of the structure of an organic
compound in most cases is more complicated than that of a peptide. In order to circumvent this
problem, different encoding methods have been developed.

4.1.1.2. Encoding of beads in the synthesis of organic libraries


As already explained in the previous section, when peptide libraries are prepared the
sequence of the peptide formed in a bead depends on the synthetic history of the bead (Figure
4.2.). The structure of an organic compound formed in a bead is also determined by the synthetic
history of the bead, that is, by the route the bead traveled through the reaction vessels during the
synthetic process. Methods have been developed in order to chemically record, in parallel with
the synthesis of the library components, the route of all beads. Encoding chemical tags are
attached to the beads in processes different from those that are applied for coupling of the
building blocks (the two reactions need to be orthogonal). Similarly, at the end of the synthesis
the tags have to be cleavable from the beads separately from the products. The chemical tags that
carry the information on the route of the beads also need to be easier to analyze than
the structures of the synthesized compounds. Two different approaches were suggested for
chemical encoding:
Encoding with sequences
Binary encoding
When encoding by sequences, the encoding tags are either oligonucleotides6-8 or
peptides.9,10 Their sequences encode both the identity of the organic building blocks coupled to the
bead and the order of their coupling. Figure 4.3a shows a bead with organic molecules and the
encoding tags at the surface. The white-black-gray-white squares encode the yellow, blue
and red organic monomers and their yellow-blue-red-yellow coupling order. The code can be
read by determining the amino acid sequence of the peptide or the nucleotide sequence of the
oligonucleotide tag. The use of oligonucleotide tags has advantages when compared to the
peptide ones. Because of the possibility of amplification, their sequence determination needs
a much smaller quantity than that of peptides.

Figure 4.3. Beads encoded by sequence (a) and binary code (b)

The encoded synthesis of organic libraries follows the general route of the split-mix
method with one exception. A fourth operation is added to the usual three of the coupling
cycle: coupling the units of the code to the beads.
1. Dividing the solid support into equal portions;
2,3. Coupling each portion individually with only one of the organic building blocks;
2,3. Coupling each portion with the encoding unit;
4. Mixing and homogenizing the portions.
The encoding unit can be coupled either before (2) or after (3) the organic building block.
Figure 4.4. shows the first cycle of such synthesis.

Figure 4.4. First cycle of an encoded synthesis. Green circles: support; yellow, blue and red
circles: organic building blocks; white, grey and black squares: units of the code.

In the binary encoding system the coding units are different organic molecules and their
combination forms the code. In one of the binary encoding methods the encoding molecules are
halobenzenes carrying a hydrocarbon chain of varying length (pink structures in Figure 4.3.b)
attached to the beads through a cleavable spacer.

[Structural formula not reproduced; labelled parts: Linker, Electrophoretic Tag.]

Figure 4.5. Structure of a binary encoding molecule

The structures of some aryl groups that appear in the electrophoretic tags are shown below.

[Structural formulas of the chlorinated aryl groups not reproduced.]
It is characteristic of the binary encoding technique that the coding units do not form a
sequence. It is simply their presence that codes for the organic building blocks and their
position. In their original paper, Ohlmeyer and his colleagues11 demonstrated the method by
encoding peptide sequences. Using 18 different coding units arranged according to a binary coding
format, the authors were able to code all sequences in a 117,649 member peptide library formed
by varying 7 amino acids (D, E, I, K, L, Q and S) in six positions. The presence of the coding
units could be determined after cleavage in a single step by electron capture gas chromatography.
Table 4.2. shows a simple example constructed in order to demonstrate the principle of
binary encoding. Nine different tags (T1-T9) are used to encode the structures of 343 organic
molecules synthesized by using the building blocks A1-A7, B1-B7 and C1-C7 in the first, second
and third coupling step, respectively.

Table 4.2. Binary encoding

Coupling #1          Coupling #2          Coupling #3
Block   Code         Block   Code         Block   Code
A1      T1           B1      T4           C1      T7
A2      T2           B2      T5           C2      T8
A3      T3           B3      T6           C3      T9
A4      T2T1         B4      T5T4         C4      T8T7
A5      T3T1         B5      T6T4         C5      T9T7
A6      T3T2         B6      T6T5         C6      T9T8
A7      T3T2T1       B7      T6T5T4       C7      T9T8T7

It can be seen that the tags T1, T2 and T3 and the mixtures formed from them encode
A1 to A7. Similarly, T4, T5 and T6 encode B1 to B7, and T7, T8 and T9 encode C1 to C7. It can
be read from the table that the code for the compound formed from the building blocks A1B1C3,
for example, is T9T4T1. Similarly, A2B3C4 and A7B7C7 are encoded by T8T7T6T2 and
T9T8T7T6T5T4T3T2T1, respectively.
The table shows that a building block is in most cases coded by more than one tag. These
tags are attached to the beads in a single operation using mixtures of the tags as reagents. The
binary encoding system proved to be very successful in practice.
Encoding tags other than halobenzenes have also been proposed and successfully used in
practice.
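
The coding scheme of Table 4.2. can be expressed in a few lines of code. The following is a minimal sketch, not the procedure of the original authors; the function names `encode`, `decode` and `label` are hypothetical. It assumes three tags per coupling step, with the tag subsets taken in the order in which they appear in the table (single tags first, then pairs, then all three).

```python
from itertools import combinations, product

TAGS_PER_STEP = 3
# Non-empty subsets of three tags in the order used in Table 4.2:
# (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)
SUBSETS = [c for r in (1, 2, 3) for c in combinations((1, 2, 3), r)]

def encode(blocks):
    """blocks: building-block indices (1-7), one per coupling step.
    Returns the set of tag numbers attached to the bead."""
    tags = set()
    for step, block in enumerate(blocks):
        offset = TAGS_PER_STEP * step
        tags.update(offset + t for t in SUBSETS[block - 1])
    return tags

def decode(tags, n_steps=3):
    """Recover the building-block indices from the set of detected tags."""
    blocks = []
    for step in range(n_steps):
        offset = TAGS_PER_STEP * step
        local = tuple(sorted(t - offset for t in tags
                             if offset < t <= offset + TAGS_PER_STEP))
        blocks.append(SUBSETS.index(local) + 1)
    return blocks

def label(tags):
    """Write a tag set in the notation of the table, highest tag first."""
    return "".join(f"T{t}" for t in sorted(tags, reverse=True))

print(label(encode([1, 1, 3])))   # A1B1C3
print(label(encode([2, 3, 4])))   # A2B3C4
print(decode({2, 6, 7, 8}))       # back to [2, 3, 4]
```

Only the presence or absence of each tag matters, so a set is a natural data structure here: nine tags are enough to distinguish all 7 x 7 x 7 = 343 compounds.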

4.1.1.3. Realization of the split-mix synthesis


In order to realize the split-mix synthesis experimentally, a simple device was
constructed in the author's laboratory that is still in use. A photo of this device is shown in Figure
4.6.

Figure 4.6. Manual device for split-mix synthesis. Vertical and tilted position.

The device is an aluminum tube mounted on a laboratory shaker. On one side of the tube
there are two rows of altogether 20 holes to which reaction vessels can be attached. The reaction
vessels that are normally used for solid phase synthesis (Figure 2.3.) were inserted into the holes
and tightened by applying plastic rings. The unused holes were stopped. One end of the
aluminum tube was attached to a waste container and the system could be evacuated by a water
pump. The tube could be twisted around its axis. The figure shows the tube in two positions: the
left and right photos show the reaction vessels in nearly vertical and tilted positions, respectively.
The vertical position of the reaction vessels was used when the resin was portioned into
them, when reagents or solvents were added and when solutions were removed. The reaction
vessels stayed in tilted position and shaking was applied during the coupling reactions and the
removal of protecting groups.
In the synthesis of peptide libraries 200-400 mesh resin (capacity 0.5 mmol/g) was used
and swollen in DMF prior to portioning. This resin contains about 10 million beads per gram. The
following operations were typical in a coupling cycle.
Portioning. The resin was suspended in DMF/DCM (2:1 v/v) in a round bottom container
and was continuously mixed by bubbling nitrogen through it. The density of the solvent mixture
was very near to that of the resin, so the slurry could be kept homogeneous during the portioning
operation, which was carried out by pipetting equal volumes of the slurry into the reaction vessels.
After the first round of pipetting a small volume remained in the flask. This was diluted with the
solvent and the pipetting was repeated in order to transfer all of the resin into the reaction
vessels.

Removal of the terminal protecting group. The protecting group was removed by shaking
with a solution of TFA (Boc strategy) or piperidine (Fmoc strategy); the resin was then washed
several times with solvents.
Coupling. A DMF solution of the protected amino acid containing HOBt and a solution of
DIC were added and the vessels were shaken for about one hour. After removing the solution, the
resin was washed several times with DMF and DCM.
Mixing. After addition of DCM/DMF (2:1 v/v) the reaction vessels were removed from
their places and their contents were poured into the round bottom container. The remainder was also
washed into the container, where the slurry was mixed by nitrogen bubbling.
At the end of the synthesis a deprotecting cocktail was added to the thoroughly washed
resin and, after shaking, the solution was separated from the resin by filtration and dried.
Productivity. In the synthesis of peptide libraries from 20 amino acids (although cysteine
was usually omitted) using the manual device (Figure 4.6.), one elongation step, that is, one
coupling with each of the 20 amino acids, could easily be realized in one day. Taking this speed
as standard, the number of synthesized peptides is shown in Table 4.3.

Table 4.3. Productivity in peptide synthesis

Peptides         Number of peptides   Number of days
Dipeptides                      400   2
Tripeptides                   8,000   3
Tetrapeptides               160,000   4
Pentapeptides             3,200,000   5
Hexapeptides             64,000,000   6
Heptapeptides         1,280,000,000   7

The data of Table 4.3. are striking. In as short a time as a week, more than 1 billion peptides
can be prepared. This really shows the exceptionally high productivity of the method. Before
using the method, we did not even dream about anything comparable to this.
It is worthwhile to note that during the synthesis of a peptide library of a given length all
the libraries of the shorter peptides are also formed. Of course, if needed, samples of all these
libraries can be separated from the resin mixtures. If a pentapeptide library is synthesized on 5 g
of resin, and in the tripeptide and tetrapeptide phases 12.5 mg and 250 mg of resin are removed,
respectively, we end up with a tripeptide, a tetrapeptide and a pentapeptide library, and in these
libraries the components are present in practically the same molar quantities.
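
The sample sizes quoted above follow from the ratio of the numbers of components: a shorter library has fewer members, so proportionally less resin carries each member in the same molar amount. A minimal sketch of this arithmetic follows; the function name `sample_weight_mg` is hypothetical, and the calculation neglects the weight gain of the resin during synthesis and the small amounts already withdrawn.

```python
from fractions import Fraction

def sample_weight_mg(final_resin_mg, n_monomers, final_length, sample_length):
    """Resin to withdraw at an intermediate length so that each component of
    the sample is present in the same molar amount as each component of the
    final library (simplified: resin weight taken as constant)."""
    ratio = Fraction(n_monomers ** sample_length, n_monomers ** final_length)
    return final_resin_mg * ratio

tri = sample_weight_mg(5000, 20, final_length=5, sample_length=3)
tetra = sample_weight_mg(5000, 20, final_length=5, sample_length=4)
print(float(tri), "mg at the tripeptide stage")
print(float(tetra), "mg at the tetrapeptide stage")
```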
Identification of the components in mixtures containing thousands or millions of peptides
is impossible. Therefore, when the split-mix synthesis was first tried experimentally, very simple
libraries were prepared containing only 9 to 180 peptides. The components of the synthesized
peptide mixtures were identified by two dimensional high voltage paper electrophoresis. In order
to facilitate the identification, software was developed. Using this software, the sequences of
the expected peptides could be generated by computer. Based on the sequences, the computer also
calculated the molecular weights and the electric charges of the peptides in two different (pH 2 and
pH 6.5) buffers. Based on these data, the relative electrophoretic mobilities were derived and
transformed into two dimensional electrophoretic maps. The computer predicted maps were
compared with the experimental ones, so the products of the synthesis could be identified.
The software made it possible to generate all components of huge peptide libraries. These
were the first examples of what are today called virtual libraries. Figure 4.7. shows the predicted
electrophoretic map of the hexapeptide library containing 64 million components. Migrations in
the horizontal and vertical directions are supposed to occur at pH 6.5 and pH 2, respectively.
YYYYYY is the last generated sequence.

Figure 4.7. Predicted two dimensional electrophoretic map of 64 million hexapeptides
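
Generating a virtual library amounts to enumerating every possible sequence. The sketch below is not the original software; it assumes that the sequences are produced in alphabetical order of the one-letter amino acid codes, which is consistent with YYYYYY being the last hexapeptide. The name `virtual_library` is hypothetical.

```python
from itertools import islice, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 one-letter codes, alphabetical

def virtual_library(length, alphabet=AMINO_ACIDS):
    """Lazily generate every sequence of the given length."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)

# A small library can be listed completely ...
tripeptides = list(virtual_library(3))
print(len(tripeptides), tripeptides[0], tripeptides[-1])

# ... while a large one is better consumed lazily, a few members at a time.
first_hexapeptides = list(islice(virtual_library(6), 3))
print(first_hexapeptides, "of", len(AMINO_ACIDS) ** 6)
```

The generator form avoids holding all 64 million hexapeptide sequences in memory at once; properties such as charge or molecular weight could be computed for each sequence as it is produced.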

4.1.1.4. Automation of the split-mix synthesis


Quite shortly after the split-mix method was published in 1991, an American company,
Advanced ChemTech Inc., constructed and manufactured an automatic machine that was capable
of carrying out all the operations of the split-mix synthesis automatically under computer control.
This ACT 357 machine is so far the only commercialized automatic split-mix synthesizer. At
present the synthesizer is produced and commercialized by aapptec. The device was designed to
be used for the preparation of peptide libraries but can also be applied to the synthesis of organic
libraries if the reactions can be run at room temperature and at atmospheric pressure. A photo
of the synthesizer is shown in Figure 4.8. The machine has:
(i) A Teflon reaction block (1) with 36 reaction vessels and one collection vessel for
combining and mixing the resin.
(ii) A rack (2) for the bottles holding the solutions of monomers, which can be either protected
amino acids or other kinds of building blocks.
(iii) Two arms, each moving in x,y directions and holding a probe. The needle-like probe of
Arm 1 (3) transfers solvents, solutions of the monomers and solutions of reagents into the
reaction vessels. This probe is able to spray the solvent, so it can be used to wash
the walls of the reaction vessels and that of the collection vessel. The probe of Arm 2
(4) has a wide tip and transfers slurries of resin from the reaction vessels to the
collection vessel and back. This probe is also capable of transferring solvent into the
collection and reaction vessels.
(iv) Five small bottles to hold solutions of reagents (5).
(v) Three bottles (6) for storing solvents and a waste container (7).
(vi) A computer, seen in the photo, that controls all operations.

The computer can easily be programmed to control the synthesis of different kinds of
libraries using different kinds and different number of monomers in each step, applying reagents
in different molar concentrations. Double or triple coupling and different coupling times etc. can
also be programmed.

Figure 4.8. The ACT 357 automatic split-mix synthesizer


The arrangement of the tabletop of the machine can be better viewed in Figure 4.9. The
Teflon reaction block (1) contains the conic collection vessel (2) and the reaction vessels (3).
Figure 4.9. The tabletop of the ACT 357 synthesizer


Both the collection vessel and the reaction vessels are equipped with frits at the bottom, so
their liquid content can be removed by applying vacuum. The whole reaction block can be shaken
at adjustable speeds by an orbital shaker.
The solutions of the monomers, usually protected amino acids, are placed into bottles (4)
that are stored in places defined by a rack. The bottles are closed by a septum. A group of 5 bottles
is also found in defined places on the tabletop. The coupling reagents are stored in these bottles
(R1-R5), which are also closed by a septum. There are also two cleaning-waste stations on the
tabletop: one for Arm 1 (5) and another one for Arm 2 (6). The standby position of the arms is
above the center of these stations. In the left cleaning station (5) the tip of the probe of Arm 1 can
be cleaned by solvent. This is very important. When Arm 1 transfers the solution of a building block
into a reaction vessel, the needle-like probe penetrates the septum of the container, is
immersed into the solution, removes the programmed volume of solution and transfers it into the
programmed reaction vessel. In order to avoid cross contamination, both the inside and the
outside of the probe must be washed. This happens at the station (5).
Before starting to work with the machine, both arms must be calibrated: Arm 1 for the
reaction vessels, collection vessel, monomer bottles, reagent bottles and the left cleaning station
(5), and Arm 2 for the reaction vessels, collection vessel and the right cleaning station (6). These are
the places that are visited by the two arms in the course of the synthesis. As a result of calibration
the exact x,y coordinates of the reaction vessels, collection vessel, reagent bottles, monomer
bottles and the cleaning-waste stations are stored in the memory of the computer. The z
coordinates of the arms also need to be calibrated.
The content of each monomer container must also be defined, as well as the content of the
5 reagent bottles and the 3 system fluids placed in the 3 solvent bottles.
The computer can be instructed to carry out the stepwise operations of the synthetic
procedure by commands that are entered into the software (ChemFiles). Sequential execution of
all commands of the ChemFile results in the fully automatic realization of the synthetic process.
Commands.
Flush. The command is used to clean and prime the system fluid lines at the beginning of
a synthesis or when a system fluid bottle is changed.
Split. The probe of Arm 2 removes equal volumes from the resin slurry present in the
collection vessel and transfers them into the defined reaction vessels. The volumes and the
repetition of the whole process can be specified.
Combine. By this command the probe of Arm 2 removes a defined volume of resin slurry
from a defined group of reaction vessels and delivers it into the collection vessel for mixing. The
command can also be used to combine the liquid from several reaction vessels into a single
reaction vessel. In this case the number of the destination vessel also needs to be entered.
Mix. The command is used to shake the whole reaction block. It allows mechanical
(vortex) mixing and in addition nitrogen bubble mixing for the collection vessel.
Dispense sequence. A defined volume of liquid is dispensed from the containers of a
source rack into the containers of a destination rack.
Dispense system fluid. The command allows a specified amount of a selected system fluid
to be delivered to a range of vessels defined as destination rack.
Transfer. The command provides for the transfer of a specified volume of reagent (from a
defined reagent bottle) to a range of reaction vessels. Liquid can also be moved between any
calibrated positions on the tabletop.
Empty. The empty command establishes a time value in hours, minutes or seconds for
applying vacuum for emptying the liquid from the reaction vessels or the collection vessel.
Wash. The wash command is usually applied to wash down the resin from the walls of the
collection and reaction vessels after mixing. The liquid is sprayed to the walls and the vessel is
simultaneously emptied.
Wait for. The command offers a timer with a range from seconds to hours, during which
all operations are paused.
Repeat. The repeat command allows the user to develop loops within the ChemFiles in
order to repeat an operation or a sequence of operations.
Besides construction of the ChemFile, preparation for the synthesis of a peptide library
involves the assignment of the amino acids to the bottles of the rack and filling the bottles with the
solutions of the protected amino acids containing HOBt. The reagents also need to be assigned to
their bottles and filled into them. An example of the assignment is shown in Table 4.4.

Table 4.4. Assignment of reagents to bottles

Bottle    R1, R2, R3, R4, R5
Reagent   DIC, HBTU, DIEA, MeOH, Piperidine
Solvent   NMP, DMF, NMP, DMF

The system fluids (solvents) also have to be assigned and filled into the bottles. A possible
assignment is shown in Table 4.5.

Table 4.5. Assignment of solvents to their bottles

Bottle number   Solvent (SF)
1               DMF
2               DMF/DCM 2:1
3               DCM

A peptide library is built up from amino acids. In construction of the ChemFile the
Library Builder function of the software can be used to assign the amino acids to the different
coupling positions and to define the reaction vessels to which they are delivered for coupling.
Coupling position 1 means the C-terminus of the peptides. The amino acids occupying this
position are coupled first to the support. In order to prepare a full library the same amino acids
need to be assigned to all coupling positions. Table 4.6. shows the data of the Library Builder
when a partial pentapeptide library is prepared. It can be seen that the number of amino acids
assigned to the different coupling positions is different. In coupling positions 1 and 4, for
example, 18 and 14 amino acids are used, respectively. Although it is not seen in Table 4.6., the
Library Builder calculates the expected number of peptides in the library (884,520) and, based on
the quantity and mesh size of the resin, also gives the number of beads per peptide.
The ChemFile is a series of commands arranged in the order of execution. Table 4.7.
shows the first ten rows of a ChemFile. The execution involves the swelling and washing of the
resin, diluting the resin with solvent and the first round of splitting the resin, transferring 2.5 ml of
slurry into each of 18 reaction vessels.

Table 4.6. The Library Builder

Reaction vessels: 1 to 20

Coupling    Number of     Amino acids assigned
position    amino acids
8, 7, 6     0             (not used)
5           13            A R N D K L M F P S T V W
4           14            A R N D Q E G M F P S T V W
3           18            A R N D Q E G H I K L M F P S T V W
2           15            A R N D Q E G H I K L S T V W
1           18            A R N D Q E G H I K L M F P S T V W
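
The two quantities reported by the Library Builder can be checked with a short calculation. This is a sketch, not the vendor's software: the assignment of the amino acid lists to coupling positions 5 to 1 is read from Table 4.6., the resin quantity of 5 g is a hypothetical example, and the figure of about 10 million beads per gram is the one quoted in the text for 200-400 mesh resin.

```python
from math import prod

# Amino acids per coupling position, as read from Table 4.6
POSITIONS = {
    1: "ARNDQEGHIKLMFPSTVW",
    2: "ARNDQEGHIKLSTVW",
    3: "ARNDQEGHIKLMFPSTVW",
    4: "ARNDQEGMFPSTVW",
    5: "ARNDKLMFPSTVW",
}
BEADS_PER_GRAM = 10_000_000   # approximate value for 200-400 mesh resin

def library_size(positions):
    """Number of peptides: product of the numbers of amino acids per position."""
    return prod(len(amino_acids) for amino_acids in positions.values())

def beads_per_peptide(resin_g, positions, beads_per_gram=BEADS_PER_GRAM):
    """Average number of beads available for each library member."""
    return resin_g * beads_per_gram / library_size(positions)

size = library_size(POSITIONS)
ratio = beads_per_peptide(5.0, POSITIONS)   # 5 g is an assumed resin quantity
print(size, "peptides;", round(ratio, 1), "beads per peptide on 5 g of resin")
```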

The synthesis normally ends with the resin combined in the collection vessel, but
there are other choices, too.

Table 4.7. Example of commands in a ChemFile
CV: collection vessel; SF: system fluid

Step   Command
1      Fill CV with 75.0 ml of SF1
2      Mix for 3 minutes on speed 2
3      Empty collection vessel
4      Fill CV with 75 ml of SF2
5      Mix for 10 minutes on speed 2
6      Empty CV
7      Wash CV
8      Fill CV with 63.0 ml of SF2
9      Mix for 3 minutes on speed 2
10     Split 1 thru 18 using 2500 µl
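
The liquid bookkeeping behind these ten commands can be followed with a toy model. The tuple representation and the function `run` below are purely illustrative and are not the ChemFile format or any part of the synthesizer software; the model also ignores the volume of the resin itself and treats "Empty" as removing all liquid.

```python
def run(commands, cv_ml=0.0):
    """Track the liquid volume (ml) in the collection vessel (CV)
    through a list of (name, argument) commands."""
    for name, arg in commands:
        if name == "fill":
            cv_ml += arg                      # solvent added to the CV
        elif name == "empty":
            cv_ml = 0.0                       # liquid removed by vacuum
        elif name == "split":
            n_vessels, microliters_each = arg
            cv_ml -= n_vessels * microliters_each / 1000
        # "mix" and "wash" leave the volume unchanged in this model
    return cv_ml

commands = [
    ("fill", 75.0), ("mix", 3), ("empty", None),
    ("fill", 75.0), ("mix", 10), ("empty", None), ("wash", None),
    ("fill", 63.0), ("mix", 3), ("split", (18, 2500)),
]
remaining = run(commands)
print(remaining, "ml left in the collection vessel after the first split")
```

In this simplified picture a residue of slurry stays in the collection vessel after the first round of splitting, which is why a further dilution and splitting round is needed to transfer all of the resin.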

4.1.1.5. Preliminary considerations when planning experiments with peptide libraries


Full peptide libraries are often prepared from 19 amino acids, leaving out cysteine. The
forthcoming considerations are based on such peptide libraries. The peptide libraries have an
intrinsic feature that it is advisable to take into account when planning experiments with them: the
number of their components increases exponentially with the number of the varied positions and,
as a consequence, both the weight of the libraries and the weight of the solid support needed for
their preparation also increase exponentially. The effect of the number of the varied positions on
the number of components of the full libraries is shown in Table 4.8.

Table 4.8. Number of peptides in libraries depending on the number of varied positions

Number of varied positions   Number of peptides
2                                           361
3                                         6,859
4                                       130,321
5                                     2,476,099
6                                    47,045,881
7                                   893,871,739
8                                16,983,563,041
9                               322,687,697,779
10                            6,131,066,257,801

In order to be able to calculate the weight of full peptide libraries, let's suppose that only
one peptide is responsible for the biological activity. Let's also arbitrarily fix the quantity of this
peptide (and therefore all peptides in the mixture) to 1 pmol. The real quantity requirement,
depending on the sensitivity of the screening experiment and other factors, can easily be deduced
from this quantity.


Table 4.9 shows the weight of full libraries depending on the number of their varied
positions. It can be seen that if the number of the varied positions is near 10, the weight of the
libraries is so high that solubility problems may arise in screening. As a consequence, the weights
of the libraries are certainly one of the factors that need consideration in planning.

Table 4.9. Approximate weight of libraries containing each peptide in 1 pmol quantity

Number of varied positions   Weight
2                            92 ng
3                            3 µg
4                            65 µg
5                            2 mg
6                            35 mg
7                            765 mg
8                            17 g
9                            353 g
10                           7 kg
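
The order of magnitude of these weights can be estimated from the number of peptides and an average molecular weight. The sketch below assumes a mean residue mass of about 119.7 g/mol (the average over the 19 amino acid residues without cysteine) plus one water per peptide; the text does not state which value was used for the table, so small differences from the tabulated figures are to be expected. The function name `library_weight_g` is hypothetical.

```python
MEAN_RESIDUE_MASS = 119.7   # g/mol, assumed average over 19 residues
WATER_MASS = 18.0           # g/mol, terminal H and OH of each peptide

def library_weight_g(n_positions, n_amino_acids=19, pmol_each=1.0):
    """Approximate weight (g) of a full library with every peptide
    present in `pmol_each` picomoles."""
    n_peptides = n_amino_acids ** n_positions
    mean_mw = n_positions * MEAN_RESIDUE_MASS + WATER_MASS
    return n_peptides * pmol_each * 1e-12 * mean_mw

for n in range(2, 11):
    print(n, f"{library_weight_g(n):.3g} g")
```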

The quantity of the resin that is needed for the synthesis is expected to be, and really is,
an even bigger problem. Table 4.10. shows the weight of the resin needed to prepare all peptides in
1 pmol quantity. In practice, these quantities are expected to be even higher than indicated in
Table 4.10. because the libraries are usually prepared not for a single experiment but for a series of
experiments, and the screening tests may also have lower sensitivity. Problems may occur in
handling such large quantities of resin and, if the number of the varied positions is high enough,
it is practically impossible to carry out the synthesis. Consequently, the weight of the resin needs
to be considered carefully.

Table 4.10. Approximate weight of the resin needed to prepare libraries
containing each peptide in 1 pmol quantity

Number of varied positions   Sum of moles   Weight of resin
2                            361 pmol       720 ng
3                            7 nmol         14 µg
4                            130 nmol       261 µg
5                            2 µmol         5 mg
6                            47 µmol        94 mg
7                            894 µmol       2 g
8                            17 mmol        34 g
9                            323 mmol       645 g
10                           6 mol          12 kg

Another problem which deserves consideration before beginning a synthesis is the ratio
of the number of beads of the resin to the number of the expected peptides. Since only one
peptide forms in each bead, the maximum number of peptides is limited by the number of beads.
Furthermore, two essential operations of the method, mixing and portioning, are governed by
probability. As a consequence, if the number of beads is equal to the number of peptides, not
all peptides are expected to form, and deviations from equimolarity are also expected. For this
reason, formation of all expected peptides, as well as their near equimolarity, is ensured only if
the number of beads well exceeds the number of peptides. A tenfold excess of beads can be
considered quite safe.
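
The effect of the bead excess can be estimated with a simple probabilistic model. As a simplification that is not taken from the text, each bead is treated as an independent random draw from the library, so that with B beads and N peptides the chance that a given peptide is missing is (1 - 1/N)^B, which is close to exp(-B/N) for large N. The function name `expected_coverage` is hypothetical.

```python
import math

def expected_coverage(beads_per_peptide):
    """Expected fraction of library members present on at least one bead,
    treating every bead as an independent random draw (large-library limit)."""
    return 1.0 - math.exp(-beads_per_peptide)

for ratio in (1, 3, 10):
    print(f"{ratio:>2} beads per peptide: "
          f"{100 * expected_coverage(ratio):.3f} % of the peptides expected")
```

Under this model roughly a third of the peptides would be missing when the numbers of beads and peptides are equal, while with a tenfold excess practically all of them are expected to be present.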
For the reasons outlined above, when very complex libraries are prepared, it is desirable to
choose as small a bead size as possible, for example, 200-400 mesh (diameter: 38-75 µm) resin.
Each gram of this resin contains about 10 million beads. Table 4.11. shows the quantity of resin
needed if the number of beads is equal to, or 10 times, the number of peptides. The data in
Table 4.11. clearly demonstrate that, for practical reasons, the number of varied positions in
full libraries is limited to about 6 or 7.
The difficulties arising from the overwhelmingly large number of peptides in some full
libraries can be circumvented by preparing their partial libraries. One may follow two different
approaches for doing this:
1. Reducing the number of the varied amino acids;
2. Reducing the number of the varied positions.
It is, of course, possible to combine the two approaches. It seems worthwhile to consider in some
detail both possibilities.

Table 4.11. Approximate weight of the resin
if 1 or 10 beads are assigned to each peptide

Number of varied positions   Weight (1 bead)   Weight (10 beads)
2                            36 µg             361 µg
3                            686 µg            7 mg
4                            13 mg             130 mg
5                            248 mg            2 g
6                            5 g               47 g
7                            89 g              894 g
8                            2 kg              17 kg
9                            32 kg             323 kg
10                           613 kg            6 t
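
The two resin requirements discussed in this section, one set by the molar quantity of each peptide and one set by the number of beads, can be compared directly. The sketch below uses the figures quoted in the text (capacity 0.5 mmol/g, about 10 million beads per gram) and assumes that every functional site yields product; the function names are hypothetical.

```python
CAPACITY_MMOL_PER_G = 0.5      # resin capacity quoted in the text
BEADS_PER_GRAM = 10_000_000    # approximate value for 200-400 mesh resin

def resin_for_amount_g(n_positions, n_amino_acids=19, pmol_each=1.0):
    """Resin (g) needed to carry every peptide in `pmol_each` picomoles."""
    total_mol = n_amino_acids ** n_positions * pmol_each * 1e-12
    return total_mol / (CAPACITY_MMOL_PER_G * 1e-3)

def resin_for_beads_g(n_positions, beads_per_peptide=1, n_amino_acids=19):
    """Resin (g) needed to provide the given number of beads per peptide."""
    return n_amino_acids ** n_positions * beads_per_peptide / BEADS_PER_GRAM

for n in (4, 6, 8):
    print(n, f"by amount: {resin_for_amount_g(n):.3g} g,",
          f"by 10 beads each: {resin_for_beads_g(n, 10):.3g} g")
```

At 1 pmol per peptide the bead count is the more demanding of the two constraints, which is why the bead-to-peptide ratio effectively sets the practical limit on the number of varied positions.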

Table 4.12. shows that the number of components in the libraries can effectively be
reduced by reducing the number of the varied amino acids.

Of course the chemist is not restricted to use the same number of amino acids in all
positions. An example of an octapeptide library is demonstrated in Table 4.13. that is constructed
by varying different number of amino acids in different positions.

Table 4.12. Number of peptides in partial libraries depending
on the number of varied amino acids

Number of          5 amino acids   10 amino acids    15 amino acids
varied positions
2                             25              100               225
3                            125            1,000             3,375
4                            625           10,000            50,625
5                          3,125          100,000           759,375
6                         15,625        1,000,000        11,390,625
7                         78,125       10,000,000       170,859,375
8                        390,625      100,000,000     2,562,890,625
9                      1,953,125    1,000,000,000    38,443,359,375
10                     9,765,625   10,000,000,000   576,650,390,625

Table 4.13. Partial octapeptide library constructed by varying a different number of amino acids in different positions

Position                    Number of varied amino acids
1                           10
2                           8
3                           12
4                           9
5                           4
6                           19
7                           4
8                           12
Total number of peptides    31,518,720

Intuition plays an important role when one decides which amino acids can be omitted in
the synthesis. One has to be aware, however, that if a partial library is prepared and an amino acid
critical to the activity of the potential active peptide happens to be among the omitted ones, the
active peptide, and with it the activity of the whole library, is lost.
It is very convenient to prepare less complex libraries by reducing the number of the
varied positions by one, two or even more. Each fixed position reduces the number of peptides by a
factor of 19. The partial heptapeptide library of Table 4.14., for example, which has three non-varied
positions, has only 130,321 components.
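The totals quoted for Tables 4.13. and 4.14. are simply the products of the numbers of amino acids used in the individual positions. A minimal sketch of this calculation follows; the function and variable names are illustrative only.

```python
from math import prod


def library_size(varied_per_position):
    """Number of peptides when position i is occupied by varied_per_position[i] amino acids."""
    return prod(varied_per_position)


# Numbers of varied amino acids per position, copied from the two tables.
table_4_13 = [10, 8, 12, 9, 4, 19, 4, 12]
table_4_14 = [19, 1, 19, 1, 1, 19, 19]  # a non-varied position counts as 1

print(library_size(table_4_13))
print(library_size(table_4_14))
```

A non-varied position contributes a factor of 1, which is why each fixed position of a 19 amino acid library divides the number of components by 19.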

Considering the above-mentioned features of peptide libraries may, perhaps, help the
potential user to be aware of the limitations of the library method, to formulate a realistic research
plan and, when possible, to circumvent the difficulties.

Table 4.14. Partial heptapeptide library with 3 non-varied positions

Position                        1    2    3    4    5    6    7    Number of peptides
Number of varied amino acids    19   1    19   1    1    19   19   130,321

The examples considered in this section were peptide libraries. The
conclusions, however, hold for organic libraries, too.

4.1.1.6. Full and partial libraries


Although peptide libraries that are constructed from 20 or 19 amino acids in all coupling
positions are usually considered full libraries, a precise and generally accepted definition of
such libraries does not yet exist. One may also consider any library synthesized by the split-mix method to
be a full library. Any other library that contains a smaller or larger fraction of the components of
the full library and no extra components is a partial library.
It is not easy to give a description of a library that is exact and at the same time short. For an exact
description all building blocks used in all coupling positions have to be indicated. This can be
done in the form of a table, as shown in Table 4.15. for a simple peptide library containing
400 components.

Table 4.15. Description of a tetrapeptide library

Coupling position    1                2             3             4
Amino acids          A, F, G, H, R    H, I, K, L    D, E, F, T    A, G, K, T, W

If a full pentapeptide library, composed in every step from the twenty natural amino acids,
is represented as shown in Figure 4.10., the sequences of the peptides can be read along lines
drawn through one amino acid in each of the five rows of the figure. In the case of
pentapeptides 3.2 million different lines can be drawn, each representing one of the theoretically
possible 3.2 million pentapeptide sequences.
Both libraries represented in Table 4.15. and Figure 4.10. can be considered full libraries.
Any library in construction of which the same amino acids are used but their number is reduced
in one or more coupling positions relative to those found in Table 4.15. or Figure 4.10. can be
considered a partial library.

[Figure 4.10: five rows, one for each coupling position, each listing the 20 amino acids A C D E F G H I K L M N P Q R S T V W Y; lines drawn through one amino acid per row represent the sequences]

Figure 4.10. Pentapeptide sequences represented by lines

The library of Table 4.16., for example, is a partial library of the full library of Table 4.15.
In the synthesis of the partial library of Table 4.16. R and F are omitted in coupling positions 1
and 3, respectively, so the number of components is reduced from 400 to 240.

Table 4.16. Partial library of the full library of Table 4.15.

Coupling position    1             2             3          4
Amino acids          A, F, G, H    H, I, K, L    D, E, T    A, G, K, T, W
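The relation between a full and a partial library can be checked mechanically once both are written as tables of building blocks per coupling position. The sketch below enumerates the components of Tables 4.15. and 4.16. and tests that the second is a proper subset of the first. The function names are hypothetical, and writing each sequence in coupling order (position 1 first) is a choice made only for this illustration.

```python
from itertools import product

# Building blocks by coupling position 1..4, copied from Tables 4.15. and 4.16.
table_4_15 = ["AFGHR", "HIKL", "DEFT", "AGKTW"]
table_4_16 = ["AFGH", "HIKL", "DET", "AGKTW"]


def components(blocks):
    """All sequences of a library, written in coupling order (position 1 first)."""
    return {"".join(seq) for seq in product(*blocks)}


def is_partial_library(candidate, reference):
    """True if the candidate holds only some of the reference components and no extra ones."""
    return components(candidate) < components(reference)


print(len(components(table_4_15)), len(components(table_4_16)))
print(is_partial_library(table_4_16, table_4_15))
```

The strict subset test mirrors the definition given above: a partial library contains a fraction of the components of the full library and no extra components.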

The library of Figure 4.11. is a partial library of the full library represented in Figure 4.10.
In the synthesis of this library all of the 20 amino acids are varied in coupling steps 1, 3, 4 and 5.
In coupling step 2, however, a single amino acid, glycine, is used, which is coupled to the
resin without previous portioning. Coupling position 2 is a non-varied position and glycine is the
amino acid occupying coupling position 2 in all peptides. All sequence lines cross glycine. The
number of sequences, and consequently the number of possible sequence lines, is only 160,000.
This is the number of the components of the full library divided by 20.
As will be shown later, partial libraries that have a single non-varied position play an
important role in screening. They are often called sub-libraries.13
The simplest and synthetically most easily accessible sub-libraries are
those that form at the end of a split-mix synthesis when the final mixing is omitted. Their non-varied
position is the last coupling position. If 20 amino acids are varied in the synthesis, a single
split-mix run needs to be carried out (without the last mixing) and the process ends up with 20
sub-libraries. These sub-libraries have another interesting feature: if they are mixed, a full library
is formed. As Figure 4.12. shows, this feature is shared by any full set of sub-libraries having the
same non-varied position. The non-varied positions in sets a, b and c are coupling positions 1, 2
and 3, respectively, and it can be seen that each of the three sets forms a full library.

[Figure 4.11: the rows of Figure 4.10. with all 20 amino acids in coupling positions 1, 3, 4 and 5 and glycine alone in coupling position 2; every sequence line crosses glycine]

Figure 4.11. A partial pentapeptide library with one non-varied position

Figure 4.12. Full and sub-libraries of a 27 component tripeptide library. The non-varied positions are at coupling position 1 (a), coupling position 2 (b) and coupling position 3 (c), respectively.

As was pointed out, the sets of sub-libraries in which the non-varied position is the last
coupling position, like set c of Figure 4.12., are obtained in a single synthetic run. In sets like
a and b the non-varied positions are the first and an intermediate coupling position, respectively. The
simplest way to prepare the components of these sets is to synthesize them separately, one by
one.
There are sub-library sets that do not form a full library. An example is shown in
Figure 4.13. In sub-libraries c, b and a the blue amino acid occupies the first, second and third
coupling position, respectively. The all yellow and the all red sequences (marked by an
arrow), for example, as well as some others, would be missing from the mixture of the three
sub-libraries, while other trimers are present in duplicate or triplicate (all blue).
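Both situations, a set of sub-libraries that mixes into a full library and one that does not, can be illustrated by enumerating a 27 component trimer library. In the sketch below the letters B, Y and R are hypothetical labels standing for the blue, yellow and red building blocks, and the positions are counted from 0 purely for programming convenience.

```python
from collections import Counter
from itertools import product

BLOCKS = "BYR"  # hypothetical labels: blue, yellow, red


def sub_library(fixed_position, fixed_block, length=3):
    """Trimers in which one position (0-based) is non-varied and the others are fully varied."""
    pools = [BLOCKS] * length
    pools[fixed_position] = fixed_block
    return ["".join(seq) for seq in product(*pools)]


full = {"".join(seq) for seq in product(BLOCKS, repeat=3)}

# A full set of sub-libraries: same non-varied position, each block used there once.
same_position = Counter()
for block in BLOCKS:
    same_position.update(sub_library(0, block))

# A set like that of Figure 4.13.: the same block fixed in a different position each time.
moving_position = Counter()
for position in range(3):
    moving_position.update(sub_library(position, "B"))

missing = sorted(full - set(moving_position))
print(len(same_position), len(moving_position), missing)
```

In the first set every trimer occurs exactly once. In the second set the trimers without the blue block are missing, those with two blue blocks occur twice and the all blue trimer occurs three times.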

[Figure 4.13: three sub-libraries of colored trimers, drawn with coupling positions 3, 2 and 1]

Figure 4.13. A set of sub-libraries that do not form a full library

As already pointed out, when the number of components in a full library is too large to
synthesize in a single run, it is practical to prepare the library in portions. It has to be taken into account,
however, that some partial libraries are impractical to prepare because their completion to a full
library requires too much work.13 It is impractical, for example, to prepare L2 as a portion of the
full library L1 because L3 does not complete it to L1. The total number of components in L2 and
L3 is only 32 while L1 contains 256 tetramers. Several other libraries would have to be
prepared in order to complete L2+L3 to L1. L4 and L5, however, are two reasonable choices
for portions of L1. Both L4 and L5 have 128 components, which add up to 256.

                        L1         L2     L3     L4         L5
                        A,D,E,F    A,D    E,F    A,D        E,F
                        A,D,E,F    A,D    E,F    A,D,E,F    A,D,E,F
                        A,D,E,F    A,D    E,F    A,D,E,F    A,D,E,F
                        A,D,E,F    A,D    E,F    A,D,E,F    A,D,E,F
Number of components    256        16     16     128        128
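Whether a group of partial libraries completes a full library can be tested by enumeration. The following sketch does this for the libraries L1 to L5; the helper names are hypothetical, and the order in which the four positions are listed has no influence on the result.

```python
from itertools import product


def components(blocks):
    """All tetramers that can be formed from the building blocks listed per position."""
    return {"".join(seq) for seq in product(*blocks)}


L1 = ["ADEF"] * 4
L2 = ["AD"] * 4
L3 = ["EF"] * 4
L4 = ["AD", "ADEF", "ADEF", "ADEF"]
L5 = ["EF", "ADEF", "ADEF", "ADEF"]


def completes(portions, full):
    """True if the portions do not overlap and together give every component of the full library."""
    parts = [components(p) for p in portions]
    union = set().union(*parts)
    disjoint = sum(len(p) for p in parts) == len(union)
    return disjoint and union == components(full)


print(completes([L2, L3], L1), completes([L4, L5], L1))
```

Splitting the building blocks in a single position, as in L4 and L5, gives portions that add up to the full library, whereas splitting them in every position, as in L2 and L3, leaves most of the components unprepared.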

4.1.1.7. Unusual partial libraries


As previously shown, a split-mix synthesis produces all the compounds that can be
derived by combining the building blocks applied in their preparation. This is an important
advantage of the method. It may occur, however, that not the whole library is needed, only an
arbitrarily selected subset. Such a series of compounds is, of course, not a combinatorial library and
cannot be prepared directly by the method. This is a disadvantage. It is possible, however, to
design and synthesize a combinatorial library that contains all the wanted compounds and, in
addition, other components.
Construction of a combinatorial library that contains an arbitrary set of compounds is
demonstrated by a simple example. Suppose that the components of the set are the following five
pentapeptides.
Coupling position:  5 4 3 2 1
                    A D K L L
                    A D M L G
                    G I F G P
                    G I F L P
                    G D M G L
Next, the amino acids appearing in the five coupling positions are recorded.
Pos. 1     Pos. 2    Pos. 3     Pos. 4    Pos. 5
L, G, P    L, G      K, M, F    D, I      A, G

A combinatorial synthesis using these amino acids as building blocks produces a library that
contains all of the five pentapeptides. The number of components of the library is 3×2×3×2×2 = 72.
This means that in addition to the 5 wanted peptides 67 extra peptides are formed. The number of
coupling cycles in this synthesis is 3+2+3+2+2 = 12. If the 5 peptides are prepared by parallel
synthesis the number of the coupling cycles is 25.
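The bookkeeping of this example can be sketched in a few lines. The code collects the amino acids per coupling position, counting position 1 from the right-hand end of each sequence as in the listing above, and then derives the numbers quoted in the text. It is only an illustration of the simple covering library, not of the optimization software discussed below, and all names are hypothetical.

```python
from itertools import product
from math import prod

wanted = ["ADKLL", "ADMLG", "GIFGP", "GIFLP", "GDMGL"]


def blocks_by_position(peptides):
    """Amino acids found in coupling positions 1..n (position 1 is the last letter)."""
    length = len(peptides[0])
    positions = []
    for pos in range(1, length + 1):
        blocks = []
        for peptide in peptides:
            amino_acid = peptide[-pos]
            if amino_acid not in blocks:
                blocks.append(amino_acid)
        positions.append(blocks)
    return positions


positions = blocks_by_position(wanted)
library_size = prod(len(b) for b in positions)         # components of the covering library
combinatorial_cycles = sum(len(b) for b in positions)  # one coupling per block per position
parallel_cycles = sum(len(p) for p in wanted)          # one coupling per residue per peptide

# Enumerate the library; sequences are written with position 5 first, as in the text.
library = {"".join(seq) for seq in product(*reversed(positions))}

print(positions)
print(library_size, combinatorial_cycles, parallel_cycles, library_size - len(wanted))
```

The enumeration confirms that all five wanted peptides are among the 72 components.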
Cohen and Skiena showed49 that the total number of components of the libraries can be
reduced by increasing the number of the coupling cycles and by properly designing the synthesis.
They developed software that makes it possible to optimize the total number of components of the
libraries that contain the wanted compounds against the number of the coupling cycles needed in the
synthesis.
One of their examples is the synthesis of an arbitrary set of 496 pentapeptides. One may
ask whether the parallel method or the split-mix synthesis is more advantageous for the
preparation of such a set. The parallel synthesis needs 2480 coupling cycles, which involves
much work. They showed that, by applying their optimization method, the synthesis of a
20,000 member pentapeptide library that contains all of the 496 arbitrarily selected pentapeptides
needs only 324 coupling cycles. One may decide which is better: reducing the number of
the coupling cycles by a factor of ca. 8 and accepting the presence of ca. 19,500 additional
components in the mixture, or preparing the individual compounds in parallel by investing
several times more labor. The choice certainly depends on additional factors, too. Modified
versions of the split-mix synthesis that apply macroscopic solid support units (see
later) offer a much better choice for the preparation of arbitrarily selected series of compounds.

4.1.1.8. Binary synthesis using the split-mix procedure


The concept of the binary synthesis was introduced by Fodor and his colleagues in a
paper describing a new technique for combinatorial synthesis14 that will be discussed later. The
split-mix method also proved to be applicable for carrying out binary synthesis.15
In order to realize a binary peptide synthesis the operations of a cycle of the split-mix
procedure need to be modified as follows:

1. Divide the resin into two equal parts;
2. Couple an amino acid to one part of the resin;
3. Mix the two parts.

[Figure 4.14: flow diagram; in each of the four steps the resin is divided into two halves (1/2, 1/2), one half is coupled with an amino acid (E, R, G or L) while the other half is not coupled, and the two halves are mixed]

Figure 4.14. Flow diagram of a four step binary peptide synthesis

This cycle of operations is repeated a pre-determined number of times, using
a different amino acid in each coupling step. It has to be emphasized that only one of the two
portions of the resin is submitted to coupling and nothing is done with the second part.
Figure 4.14. shows the scheme of a simple binary peptide synthesis coupling successively
with the following amino acids: E, R, G and L.
In the steps of the binary synthesis one part of the resin, and all the peptides already
formed on it, remain unchanged. In each synthetic step only the second part of the resin
undergoes coupling and, in this process, all the peptides formed in the previous steps are
elongated by one amino acid. Table 4.17. shows the products formed in the process shown in
Figure 4.14.

Table 4.17. Peptides formed in a four step binary synthesis

Coupl.  Peptides in no coupling part    Coupl.  Peptides in coupling part after coupling
step                                    with
1       0                               L       L
2       0, L                            G       G, GL
3       0, L, G, GL                     R       R, RL, RG, RGL
4       0, L, G, GL, R, RL, RG, RGL     E       E, EL, EG, EGL, ER, ERL, ERG, ERGL

The zero in the table means unchanged empty resin. The final mixture contains the
products found in the last row of the table including a fraction of the resin that remains
unchanged. One of the products is ERGL, that is, the peptide formed from the four amino acids
used in the synthesis and its sequence reflects the coupling order of the amino acids. The other
products can be derived from the sequence of this tetrapeptide. All the sequences and amino
acids that can theoretically be derived from this root tetrapeptide sequence by deletion of partial
sequences or amino acids are found in the mixture. The components of the mixture are present in
equimolar quantities.
As was pointed out by Fodor and his colleagues, the number of peptides formed in the
binary synthesis (N) can be calculated from the number of coupling cycles (c) according to a
simple formula.
N = 2^c
This indicates a very high efficiency since N grows exponentially with the number of
coupling cycles. In 22 coupling cycles, for example, more than 4 million components form. The
numbers of components in the groups of peptides of different lengths follow a binomial
distribution. When c=10 the total number of components is 1024. The length of the peptides varies
from 0 to 10. The number of peptides of each length is indicated in brackets: 0(1),
1(10), 2(45), 3(120), 4(210), 5(252), 6(210), 7(120), 8(45), 9(10) and 10(1).
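The binary procedure is easy to simulate, which gives a quick check of the formula and of the binomial distribution of the lengths. In the sketch below the empty string stands for the unchanged resin, the coupling order L, G, R, E follows Table 4.17., and the ten-letter root sequence used for the second check is an arbitrary choice of ten different amino acids. The function name is hypothetical.

```python
from collections import Counter
from math import comb


def binary_synthesis(coupling_order):
    """Simulate the cycle: split in two, couple one half, mix. '' is the empty resin."""
    pool = [""]
    for amino_acid in coupling_order:
        # The coupled half carries every earlier product elongated by the new amino acid.
        coupled = [amino_acid + peptide for peptide in pool]
        pool = pool + coupled
    return pool


products = binary_synthesis("LGRE")
print(products)

# Length distribution for ten coupling cycles with ten different amino acids.
lengths = Counter(len(p) for p in binary_synthesis("ABCDEFGHIK"))
matches_binomial = all(lengths[k] == comb(10, k) for k in range(11))
print(sum(lengths.values()), matches_binomial, 2 ** 22)
```

The number of products doubles in every cycle, so 22 cycles give 2^22 = 4,194,304 components.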

One has to note that the N = 2^c formula gives the maximum number of components,
which is reached only when every amino acid is represented only once in the root sequence. In
many cases this requirement cannot be met, for example, when the 20 L-amino
acids are used as building blocks in more than 20 coupling cycles. Multiple occurrences of amino
acids in the root sequence have two consequences:
(i) the number of components in the synthesized library is less than calculated from
the formula and,
(ii) the equimolarity of components is no longer maintained since some compounds
form in multiple molar quantities.

If an amino acid appears more than once in the root sequence, some peptides may form
from more than one source. This happens, for example, if the binary synthesis is carried out on
the basis of the following root sequence: EGGL. The products derived from this sequence are:
0, L, G, GL, G, GL, GG, GGL, E, EL, EG, EGL, EG, EGL, EGG, EGGL
The products G, GL, EG and EGL appear twice in the list, so their quantity relative to the
other components of the mixture is doubled. The paper of Sebestyén et al.15 describes tricks
for avoiding the formation of products in non-equal quantities.
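The effect of a repeated amino acid can be shown by counting how often each product is generated for the root sequence EGGL. The sketch simulates the split, couple and mix cycle with the coupling order L, G, G, E; the function name is hypothetical and the empty string again stands for the unchanged resin.

```python
from collections import Counter


def binary_synthesis(coupling_order):
    """Split in two, couple one half, mix; repeated for each amino acid in turn."""
    pool = [""]
    for amino_acid in coupling_order:
        pool = pool + [amino_acid + peptide for peptide in pool]
    return pool


counts = Counter(binary_synthesis("LGGE"))  # root sequence EGGL
doubled = sorted(p for p, n in counts.items() if n == 2)

print(len(counts), doubled)
```

Sixteen portions of resin are formed, as the formula predicts, but they carry only twelve different products, four of which are present in double quantity.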
The binary synthesis may prove useful for exploration whether or not deletions in a region
of a longer peptide lead to bioactive fragment(s).

4.1.2. Combinatorial synthesis using amino acid mixtures


As pointed out in Chapter 1 Section 2. replacement of the single amino acids in the
coupling cycles of the Merrifield solid phase peptide synthesis by a mixture of the 20 amino acids
would, in principle, lead to the formation of a peptide library containing all the theoretically
expected components. Geysen and his colleagues16 published such synthesis in 1986. They used
amino acid mixtures in every coupling step of their solid phase synthesis and successfully applied
the Iteration method (see details in Chapter 5) in the analysis of the library.
The method is even more efficient than the split-mix procedure. In every coupling step of
the synthesis only a single coupling operation is executed in a single reaction vessel in contrast
with the 20 reaction vessels and 20 coupling operations needed in the split-mix procedure. In the
split-mix procedure a total of 100 couplings are needed to prepare the 3.2 million pentapeptides.
The amino acid mixture method needs only 5 coupling steps.
There are, however, disadvantages too. One of the disadvantages was pointed out in
Chapter 1 Section 2. It is known that the coupling rates of the amino acids differ from each other.
As a consequence, formation of the peptides in a 1 to 1 molar ratio cannot be assured. Some
peptides form in significantly higher molar quantity than others and some peptides do not even
form. Efforts have been made to compensate for these differences. Rutter and Santi described in their
patent17 that the differences can in part be compensated by proper adjustment of the
concentrations of amino acids in the coupling mixtures. The amino acids showing a slower
coupling rate were represented in higher concentrations in the mixture. Full compensation,
however, cannot be achieved because, in addition to the acylating components, the coupling rates
are also affected by the acylated peptides themselves.

Another disadvantage relative to the split-mix method is that the one bead one product
feature is completely lost. Since mixtures of amino acids are used in every coupling step, instead
of single products, mixtures are formed in the beads. The synthetic history of every bead is the
same. As a consequence, within the limits of statistics, the content of the beads is the same. This
means that every bead contains all components of the library. The loss of the one bead one
product feature is a very significant disadvantage. The library can be analyzed only as a mixture.
The individual components of the library are absolutely not accessible.
The method has been applied mainly in preparation of peptide libraries but non-peptide
libraries have also been synthesized.19

4.2. Combinatorial synthesis using soluble support


The great majority of the classical methods developed for preparing organic
compounds work in the solution phase. The advent of combinatorial methods induced a fast
development in the area of solid phase synthetic procedures. Nevertheless, most organic synthetic
methods found in the literature are still applicable only in the solution phase. In addition,
solid phase reactions are significantly slower than those in solution. A dissolved reagent molecule
that is outside the resin beads can react with a molecule attached to the solid support inside a resin
bead only after diffusing into the solvent bound within the particle. Diffusion is a slow
process, so solid phase reactions take considerably longer than those in the solution phase.
There were attempts by Shemyakin and others20-22 to substitute the cross linked
polystyrene support introduced by Merrifield (Chapter 2) by linear soluble polystyrene to achieve
homogeneous phase in reactions while preserving some advantages of the solid phase approach.
Han et al.23 applied polyethyleneglycol (PEG) as support in synthesis of peptide libraries
following the split-mix approach.
One of the two hydroxyl groups of the polymer is blocked by a methyl group. The first
amino acid or other building block can be attached directly or via a linker to the remaining free
hydroxyl group at the other end of the polymer chain.
MeO-CH2-CH2-O-(CH2-CH2-O)n-CH2-CH2-OH
PEG proved very suitable for this purpose since it is soluble in a wide variety of aqueous
and organic solvents and its solubility provides homogeneous reaction conditions even when the
attached molecule itself is insoluble in the reaction medium. Separation from the reaction
medium of the polymer and the synthesized compounds bound to it can be achieved by
precipitation and filtration. The precipitation requires concentrating the reaction solutions then
diluting with diethyl ether or tert-butyl methyl ether. Under carefully controlled precipitation
conditions the polymer precipitates in crystalline form.
The above cited authors prepared 1024 pentapeptides in order to show the applicability of
their method. Other types of compounds like polysaccharides and oligonucleotides have also
been synthesized on PEG.24
In addition to providing conditions for faster and smoother reactions, the application of
soluble supports has another advantage. The support does not have a bead form; it is rather
a collection of individual molecules. As a consequence, the reactions are
absolutely unaffected by statistics.

The method has disadvantages, too. The very important feature of the split-mix method
that a single compound forms in one bead, is completely lost. In addition, the separation of the
support from the reaction mixture is not as simple as filtering out the bead form resin.

4.3. Combinatorial synthesis on solid surface


A very remarkable combinatorial synthesis was developed by Fodor and his colleagues by
combining solid phase synthesis with the photolithographic procedure applied in the
fabrication of computer chips. The method was published in 1991 under the title Light-directed,
spatially addressable parallel chemical synthesis.25
The method makes it possible to prepare an array of peptides or other kinds of
molecules on the surface of a small glass slide. At the beginning the full surface is functionalized
with aminoalkyl groups that are protected by photo-labile 6-nitroveratryloxycarbonyl (Nvoc)
groups. These protecting groups can be removed from defined regions of the surface by
irradiation. The deprotected amino groups can be acylated with N-protected amino acids. The
amino groups of the amino acids are also protected by photo-labile Nvoc groups. The
principle of the method is demonstrated in Figure 4.15.
The example shown in Figure 4.15. demonstrates the synthesis of nine dipeptides from
the amino acids A, G and K. Before each coupling step one or more areas of the slide are
irradiated through a mask in order to remove the protecting groups from those areas. Then the
slide is submitted to coupling with the indicated amino acid. This can be done by immersing the
slide into a solution containing the protected amino acid and the coupling reagent. Although the
full slide is submitted to the coupling reaction, coupling occurs only in the irradiated areas where
free amino groups are found. When a coupling cycle is completed, the full area of the slide is
protected again. Before the next coupling cycle a new area has to be irradiated in order to
produce free amino groups.

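[Figure 4.15: panels a to f show the mask positions and the amino acid coupled in each cycle; panel g shows the locations of the nine dipeptides on the slide:
AA AG AK
GA GG GK
KA KG KK
panels h to j show finer masks for the next elongation step]

Figure 4.15. Formation of nine dipeptides in the light directed synthesis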

In Figure 4.15. the irradiated areas are white and those shadowed by the mask are gray.
The synthesis of the 9 dipeptides is completed in 6 cycles of irradiation and coupling (a to f). It is
remarkable that more peptides (3^2 = 9) form than the number of the executed coupling cycles (6),
which is characteristic of combinatorial processes. The dipeptides are formed in the locations
shown in Figure 4.15/g.
It is worthwhile to note that in fact only two different masks are needed in the synthesis:
the two shown in Figures 4.15/a and 4.15/b. Mask positions d, c and f can be produced by
rotation of mask a by 90°, 180° and 270°, respectively.
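The six cycles can be imitated on a 3 × 3 grid. In the sketch below each cell of the grid is one region of the slide, a mask is the set of cells exposed to light, and coupling elongates only the exposed cells. The assignment of columns to cycles a to c and of rows to cycles d to f is an assumption made for the illustration, chosen so that the result matches the arrangement of Figure 4.15/g.

```python
AMINO_ACIDS = "AGK"
SIZE = 3

# Each cell holds the peptide grown so far; every region starts protected and empty.
slide = [["" for _ in range(SIZE)] for _ in range(SIZE)]


def irradiate_and_couple(slide, mask, amino_acid):
    """Deprotect the cells in the mask, then couple; shadowed cells stay unchanged."""
    for row, col in mask:
        slide[row][col] = amino_acid + slide[row][col]


cycles = 0
# Assumed cycles a to c: one column per cycle defines coupling position 1.
for col, amino_acid in enumerate(AMINO_ACIDS):
    irradiate_and_couple(slide, {(row, col) for row in range(SIZE)}, amino_acid)
    cycles += 1
# Assumed cycles d to f: one row per cycle defines coupling position 2.
for row, amino_acid in enumerate(AMINO_ACIDS):
    irradiate_and_couple(slide, {(row, col) for col in range(SIZE)}, amino_acid)
    cycles += 1

for row in slide:
    print(" ".join(row))
print(cycles)
```

Because the position of every mask and the order of the couplings are recorded, the sequence in each cell is known without any analysis.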
If the synthesis is continued, finer masks need to be used in each elongation step, and in
each elongation step the number of the components of the library increases exponentially, as in
the split-mix synthesis. In the next elongation step, for example, the masks 4.15/h and 4.15/i
would be needed and the other mask positions could be produced by rotation of these two. Mask
position j, for example, could be brought about by rotation of mask 4.15/h by 180°. After
completing the next 6 coupling cycles with the amino acids A, G and K (at mask positions h, i, j,
and at those positions brought about by rotating these by 90°), 3^4 = 81 tetrapeptides would form. As
expected in a combinatorial synthesis, among these 81 tetrapeptides all sequences would be
represented that can be derived by inserting the three amino acids (A, G and K) into
coupling positions 3 and 4. After completing the couplings, before the library undergoes testing,
the protecting groups, of course, have to be removed. In the testing experiments the products
remain attached to the slide.
The light directed synthesis in some respects is very similar to the split-mix method. In
the synthesis of peptide libraries the invested work, that is the number of the executed coupling
steps (Nc), linearly increases with the lengths of the peptides (n, the number of amino acids in the
peptides), while the number of the components of the library (Np) increases exponentially with
the length. If 20 amino acids are used in every step of the synthesis, the following formulae
express the invested work and the number of the peptides formed in the process.
Nc = 20n
Np = 20^n
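The contrast between the linear growth of the work and the exponential growth of the library is easy to tabulate. The following lines are a small sketch of the two formulae, with hypothetical function names and the number of building blocks left as a parameter.

```python
def coupling_steps(n, building_blocks=20):
    """Nc: coupling steps needed for peptides of length n."""
    return building_blocks * n


def number_of_peptides(n, building_blocks=20):
    """Np: number of peptides of length n formed."""
    return building_blocks ** n


for n in range(1, 7):
    print(n, coupling_steps(n), number_of_peptides(n))
```

For pentapeptides this gives 100 coupling steps for 3.2 million peptides. With the three amino acids of Figure 4.15. and n = 2 it gives 6 coupling cycles for 9 dipeptides.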
There are also differences relative to the split-mix method. One of the differences is that
in the light directed method the couplings of one elongation step cannot be executed in parallel,
as in the split-mix procedure. The couplings with the single amino acids need to be carried out
serially, one after the other. Furthermore, the light directed method is, of course, unsuitable for
preparing libraries in large quantities.
Another difference provides a very significant advantage for the light directed method.
The identity of every product formed on the surface of the slide is exactly known. There is no
need for a separate analytical process in order to identify the products. If the masks, their
positions and the order of their application as well as the coupling order of the amino acids are
known, the identity of the products in every location of the slide can be deduced. This makes
application of the libraries in the testing experiments very simple.
It has already been mentioned that the binary synthesis was first introduced and
demonstrated in conjunction with the light directed synthesis. Its principle is that in each
elongation step only half of the slide is submitted to coupling; the other half remains unchanged.
Figure 4.16. demonstrates the synthesis based on the same ERGL root sequence that was used in
demonstrating the binary synthesis realized by the split-mix method. The white regions of the
slide are irradiated and then coupled with the indicated amino acid. The gray regions remain
unchanged. Four elongation steps are executed (a, b, c and d) as shown in Figure 4.16. The
products are found in Figure 4.16/e. It can be seen that the products are the same as those formed
in the split-mix binary synthesis (row 4 in Table 4.17.).

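[Figure 4.16: panels a to d show the half of the slide that is irradiated and coupled in each of the four elongation steps; panel e shows the locations of the products on the slide]

Figure 4.16. Binary synthesis on the basis of the root sequence ERGL using the light directed method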

In the introductory publication around one thousand peptides were synthesized on the
surface of the slide. Since that time the number of substances produced on a slide has
increased very significantly. In practice, the method is applied for making oligonucleotide chips that
are extensively used in nucleic acid analysis.26 On the surface of a chip of less than 1.5 cm² about
500,000 different oligonucleotides can be synthesized, and a single silicon wafer may contain 49
to 400 different oligomer arrays. The light directed synthesis was developed at an American
company, Affymax, and the chips are manufactured and commercialized by Affymetrix. More
details of the method can be found on the home page of the company.27

4.4. Combinatorial peptide synthesis by biological methods


In 1990 three different research groups introduced a new biological approach for
producing peptide sequence libraries.28-30 This approach is briefly exemplified by phage display
libraries.31 First, an oligonucleotide library is synthesized chemically by a series of couplings
with equimolar nucleotide mixtures. The oligonucleotides formed are then inserted into the DNA
of phages. In the next stage the phages infect the host bacterium (usually Escherichia coli) and
replicate together with the inserted foreign DNA segment. A library of phage clones forms.
Each clone carries in its DNA a different foreign sequence segment, which is expressed as a
partial sequence of its coat protein. Every phage particle carries many identical coat protein
molecules with the same (foreign) peptide sequence fused to the outer end. In this respect the
phages resemble the beads in PM synthesis, each containing an individual compound. The
DNA of the phage can be considered an encoding tag since the sequence of the peptide can be
determined (after amplification) by sequencing the proper portion of the DNA.

4.5. Combinatorial synthesis using macroscopic solid support units


The split-mix combinatorial synthesis, as was shown, is a very efficient procedure and,
in addition, it produces individual compounds since each resin bead contains a single compound.
The quantity of the compound that forms in a bead (it is in the range of nanogram-microgram
quantities), however, may prove too low when compared to the needs. The total quantity of a
compound that is formed in a split-mix synthesis is not necessarily low because usually a large
number of beads contain the same compound. Since, however, the content of the individual beads
is not known, it is impossible to select from millions of beads those that contain the same product,
cleave the product from the selected group of beads and so produce a
larger quantity of the desired substance. In a split-mix synthesis a very large number of
compounds are formed. We know exactly what compounds are present in the library but, again,
it is not possible to pick out a desired compound or a pre-determined set of compounds for
testing. These disadvantages of the split-mix method stimulated efforts to modify the
procedure. The goal of the modifications was to preserve the high productivity of the method and,
besides that, to
(i) produce the individual compounds with known identity and
(ii) produce the compounds in large (multi-milligram) quantities.

In order to fulfill this goal the microscopic solid support units (beads) applied in the
original method had to be replaced by macroscopic ones. In addition, these macroscopic units had
to be labeled somehow in order to enable the experimenter to identify the product formed in each
unit. This means that the number of the support units applied in the synthesis needs to be the
same as the number of the products. Before starting the synthesis a product has to be assigned to
each unit and the units have to be properly labeled. Assignment of the product involves assignment of the
building blocks and their coupling order. In addition, during the synthetic process the units have
to be distributed one by one into the reaction vessels according to the structure assigned to the
products.
The simplest way to label the units is to assign numbers to them from 1 to n, where n is the
number of compounds to be prepared. The building blocks and their order can be recorded in a list
or in a computer. Once the number of the unit is known, the operator can read from the list or
from the computer which building block needs to be coupled to the unit in a given reaction step.
In other words, the operator can identify the reaction vessel into which the unit has to be
transferred in the given phase of the synthesis. This is demonstrated in Figure 4.17.
Box a in the figure shows 9 dipeptides assigned to the units 1 to 9. The sequences to be
synthesized in the units 1, 2 and 3, for example, contain G in the first coupling position
consequently they have to be placed into reaction vessel b where in the first synthetic step G is
coupled into all units. The amino acid G coupled into the units is indicated in bold in the
sequences. The arrows show where the units need to be transferred for the second coupling. The
numbers on the arrows show the numbers of the units. So the units 1, 2 and 3 are transferred into
the reaction vessels e, f and g, respectively. Similarly, the units 4, 5 and 6 from reaction vessel c
are transferred into reaction vessels e, f and g, respectively.
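
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not software referred to in the text: the dictionary holds the nine dipeptides of Figure 4.17 keyed by unit number, and the helper name units_by_building_block is hypothetical. The sequences are read as in the figure, with the first coupling position at the right-hand end.

```python
# Products assigned to the numbered units (box a of Figure 4.17).
ASSIGNMENTS = {1: "GG", 2: "LG", 3: "AG",
               4: "GL", 5: "LL", 6: "AL",
               7: "GA", 8: "LA", 9: "AA"}

def units_by_building_block(assignments, step):
    """Group unit numbers by the building block coupled in the given step.

    Step 1 is the rightmost letter of the sequence, step 2 the next one
    to the left, and so on. Each group shares one reaction vessel.
    """
    vessels = {}
    for unit in sorted(assignments):
        block = assignments[unit][-step]
        vessels.setdefault(block, []).append(unit)
    return vessels

first = units_by_building_block(ASSIGNMENTS, 1)
second = units_by_building_block(ASSIGNMENTS, 2)
print(first)   # units sharing a vessel in the first coupling
print(second)  # units sharing a vessel in the second coupling
```

For the first coupling the units 1, 2 and 3 fall into the G vessel; for the second coupling they are spread over the G, L and A vessels, as the arrows of the figure indicate.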

Two essentially different approaches have been developed to solve the problem of the
identification of the units. In one of the approaches physical labels are attached to the units. In the
other approach no labels are used. The units are encoded by the position they occupy in space.

Figure 4.17. Sorting of macroscopic units. Box a: the nine dipeptides with their unit numbers
(GG-1, LG-2, AG-3, GL-4, LL-5, AL-6, GA-7, LA-8, AA-9).

4.5.1. Encoding by attached labels. The radiofrequency and optical encoding methods
The first approach for the modified split-mix synthesis using macroscopic solid support
units was developed independently in two laboratories.32,33 In the suggested methods the solid
support units are permeable capsules containing resin, and the labels are radiofrequency tags that
are also enclosed in the capsules. The method was commercialized by IRORI.34
A capsule enclosing the resin and the radiofrequency (Rf) tag is shown in Figure
4.18. The capsules are made of polypropylene and are named MicroKans at IRORI. Their length and
diameter are 18 mm and 7 mm, respectively. They can enclose 25-30 mg resin, making it possible to
produce around 25 µmol of compound in them. The Rf tag is a small microelectronic device in a glass
cover; its length is 13 mm and its diameter is 3 mm. The tags have a permanent 40 bit code etched
into their memory and can receive and emit radiofrequency signals. When placed in a
radiofrequency field they re-emit their code.35 The MicroKans are available in 5 different sizes.
Their volume varies from 250 µL to 660 µL.
Figure 4.18. Permeable capsule enclosing resin and a radiofrequency tag.


Other kinds of support units have also been introduced. One type is the MicroTube.
Figure 4.19 shows a MicroTube, a plastic tube that also contains an Rf tag. The
surface of the plastic tube is covered with a radiolytically grafted and functionalized polystyrene
layer. The length of a MicroTube is 15 mm and its diameter is 6 mm. The capacity is about 30
µmol.
Figure 4.19. MicroTube: a plastic tube covered with grafted polystyrene and containing an Rf tag

A third kind of support unit carries an optical coding system: the "Laser Optical Synthesis
Chips". The supports are 1x1 cm polystyrene grafted square plates. The medium carrying the
code is a 3x3 mm ceramic plate in the center of the synthesis support (Figure 4.20.). The code is
etched into the ceramic support by a CO2 laser in the form of a two dimensional bar code that can
be read by a special scanner.36

Figure 4.20. Support units labeled by a two dimensional bar code


Sorting of the units can be done either manually or by using an automatic sorting
machine. The sorting process named Directed Sorting is guided by the software named
Synthesis Manager that tracks all the units during the synthesis.
In manual sorting the reading of the code and transferring the units into the proper
reaction vessel are done manually. Figure 4.21. demonstrates a manual sorting step. During the
chemical synthetic step the units are in the reaction vessels A-E. After finishing this step the units
are pooled for washing.
After washing each unit is scanned. The Synthesis Manager identifies one of the reaction
vessels F-J where the unit has to be delivered. In the reaction vessels F-J the units are coupled
with a different building block identified by the Synthesis Manager. After sorting all units, the
next chemical step can be started. The Synthesis Manager continues tracking the units even after
the synthesis is completed. It determines the place of the units in the cleavage station, too.


Cleavage stations are also available at IRORI together with other items that are needed in the
synthesis including, for example, a device that makes it possible to easily fill the MicroKans with
exact quantities of dry resin.

Figure 4.21. Directed Sorting manually. The units are pooled and washed, then each unit is
scanned and delivered into a reaction vessel.

The key operation in the synthesis is sorting. Since every unit has to be scanned and
delivered separately, the manual sorting process is relatively slow. Only several hundred or a
maximum of 1000 compounds are usually prepared using this method. It is definitely not suitable
for preparing thousands of compounds in a single run. The automatic sorting machine
developed at IRORI solves this problem.
The principle of the automatic sorting is outlined in Figure 4.22. After each chemical step
the capsules are transferred from the reaction vessels into a larger vessel and thoroughly washed.
The pooled and washed units are then further transferred into the vibratory bowl (D) of the sorter.
Vibration of the bowl then forces the units into a tube (E). At the solenoid gate (B) the antenna
(C) reads the code and the computer (A) determines the destination of the unit. The destination is
one of the containers (F) that represents a reaction vessel, into which the capsule needs to be
delivered for the next synthetic step. The delivery is executed by the X-Y movement of the
delivery mechanism. After sorting, of course, the capsules collected in each vessel are reacted
with a different monomer.
The automatic sorter can accommodate up to 10,000 units that can be sorted into a
maximum of 48 containers. The sorting speed is 1000 units per hour.
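
The quoted figures allow a rough estimate of the time spent on sorting. The sketch below is only back-of-the-envelope arithmetic, assuming that the sorter runs at its nominal speed without interruption; the function name sorting_hours is hypothetical.

```python
SORTER_CAPACITY = 10_000   # units, as quoted in the text
SORTER_SPEED = 1_000       # units per hour, as quoted in the text

def sorting_hours(units, units_per_hour=SORTER_SPEED):
    """Hours needed for one sorting pass over all units."""
    if units > SORTER_CAPACITY:
        raise ValueError("more units than the sorter can accommodate")
    return units / units_per_hour

print(sorting_hours(10_000))  # one pass over a full load
print(sorting_hours(1_000))   # one pass over 1000 units
```

Under this assumption a full load of 10,000 units takes about 10 hours per sorting pass.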


Figure 4.22. Automatic sorting machine.
A: computer, B: solenoid gate, C: antenna, D: vibratory bowl, E: tube, F: containers

The radiofrequency tagging and the visual coding of the support units are used in manual
sorting systems developed by other companies, too. The Australian company Mimotopes offers
two kinds of solid support units, shown in Figure 4.23: SynPhase Crowns (a) and SynPhase
Lanterns (b). Their surfaces are grafted and functionalized. Both kinds of units can be coded by
attaching Rf tags to them (c and d). One end of the Rf tag fits into the holes in the crowns and
lanterns and so it can be firmly attached to them. Scanning of the units proceeds as described for the
IRORI method. A color tagging system has also been developed at the company that uses 8
different colors in the form of colored rings. The color system can be applied to both crowns and
lanterns. Stems are firmly attached to the crowns and lanterns by inserting them into their
holes. These stems hold the code-forming rings (Figure 4.23 e and f).

Figure 4.23. Radiofrequency and color coding of crowns and lanterns.
a: SynPhase Crown, b: SynPhase Lantern, c: SynPhase Crown with Rf tag, d: SynPhase Lantern
with Rf tag, e: color coded SynPhase Crown, f: color coded SynPhase Lantern

The position of the ring on the stem encodes the reaction step and its color encodes the
building block. A list needs to be prepared in advance in which building blocks are assigned to
the positions and colors of the rings. The codes of the units are read visually and the units are distributed
manually among the reaction vessels of the next reaction step according to the data of the list. The
8 colors in 4 reaction steps allow encoding 8^4 = 4096 units.
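
The counting behind the ring code can be illustrated with a short sketch. The color names and the lookup list below are hypothetical (the text states only that 8 colors are used); the ring position stands for the reaction step and the ring color for the building block of that step.

```python
from itertools import product

# Hypothetical color names; the text only states that 8 colors exist.
COLORS = ["red", "orange", "yellow", "green",
          "blue", "violet", "black", "white"]
STEPS = 4

def ring_codes(colors, steps):
    """All distinct ring sequences: one colored ring per reaction step."""
    return list(product(colors, repeat=steps))

def decode(code, blocks_by_step):
    """Translate a ring sequence into the building blocks it encodes."""
    return [blocks_by_step[step][color] for step, color in enumerate(code)]

# Illustrative lookup list: in step k the color c stands for block "k-c".
blocks_by_step = [{c: f"{k + 1}-{c}" for c in COLORS} for k in range(STEPS)]

codes = ring_codes(COLORS, STEPS)
print(len(codes))                       # number of units that can be encoded
print(decode(codes[9], blocks_by_step)) # building blocks of one example unit
```

The number of codes is the number of colors raised to the number of steps, which is why 8 colors and 4 steps give 4096 distinguishable units.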


4.5.2. Units without labels. Encoding by position in space


Methods developed independently in two laboratories by Smith et al.37 and Furka et al.38
made the use of physical labels on the solid support units unnecessary. In both methods the
macroscopic solid support units were stringed and the position of the units on the string encoded
the identity of the unit.
In the method of Smith et al. in addition to the position on the strings, colors and reaction
vessels were also parts of the code and the authors termed their technique Encore for Encoding
by Necklace, Color and Reaction vessel. The method of Furka et al. is termed String Synthesis.
The units are identified by the string number and their position on the strings.

4.5.2.1. The Encore technique


The described version of the Encore technique39 is suitable for preparing up to 960
compounds using 10, 8 and 12 building blocks in the first, second and third synthetic step, respectively. The
solid support units are SynPhase Lanterns.

Figure 4.24. The Encore technique.
a: reaction vessels of the first coupling step, b and d: strings labeled with color rings, c: the
stringing tool, e: reaction vessels of the second coupling step showing the color of the strings, f:
one of the 12 reaction vessels of the third coupling step.


The Encore technique is demonstrated in Figure 4.24. The 960 lanterns are evenly
distributed among 10 reaction vessels (a). The content of each reaction vessel is reacted with
one of the 10 building blocks of the first reaction step. At the end of this step all
the 96 lanterns placed in a given reaction vessel carry the same product.
Before coupling with the second building block the lanterns are stringed on stainless steel
or polyethylene stringing tools (c). Each string contains 10 lanterns and a labeling color ring. The
positions of the lanterns are counted from the ring. Each lantern of a string comes from a
different reaction vessel (b). This way 96 identical strings are formed. Each string contains 10
lanterns and the content of each lantern is different. The strings are labeled by color rings. There
are 8 different colors.
The 96 strings are distributed into 8 groups of 12 strings labeled with the same color. The
figure shows two examples (b and d). The 12 strings having the same color are transferred into
the same reaction vessel in order to react in step 2 with the same building block. As a result, each
of the 8 reaction vessels contains 12 strings (e). After the second synthetic step the strings are
rearranged into 12 numbered reaction vessels (f). One string from each of the 8 reaction vessels is
transferred into one numbered vessel. This way each of the 8 strings of one reaction vessel carries
a different color. After the third synthetic step the lanterns are transferred into ten 96 well plates
for cleaving the formed compounds from the support. One lantern goes into each well of the
plate. The content of each well is identified from three recorded data: position of the lantern on
the string, color of the string and the number of the third step reaction vessel. The position on the
string, the color of the string and the number of the reaction vessel identify the first, second and
third building block, respectively.
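
The decoding rule of the Encore technique can be sketched as follows. The building block names A1 to A10, B1 to B8 and C1 to C12 and the color names are hypothetical placeholders; only the counts (10, 8 and 12) and the meaning of position, color and vessel number are taken from the text.

```python
# Hypothetical building-block names; only the counts (10, 8 and 12)
# and the use of 8 ring colors come from the text.
STEP1 = [f"A{i}" for i in range(1, 11)]   # indexed by position on the string
COLORS = ["red", "orange", "yellow", "green",
          "blue", "violet", "black", "white"]
STEP2 = {color: f"B{i}" for i, color in enumerate(COLORS, start=1)}
STEP3 = [f"C{i}" for i in range(1, 13)]   # indexed by vessel number

def encore_decode(position, color, vessel):
    """Return the three building blocks of one lantern.

    position: 1-10, counted from the color ring
    color:    color of the ring on the string
    vessel:   1-12, number of the third-step reaction vessel
    """
    return STEP1[position - 1], STEP2[color], STEP3[vessel - 1]

library = {encore_decode(p, c, v)
           for p in range(1, 11) for c in COLORS for v in range(1, 13)}
print(len(library))                   # number of distinct compounds
print(encore_decode(3, "green", 12))  # one example lantern
```

Every combination of position, color and vessel number yields a different triple, so the three recorded data identify each of the 10 x 8 x 12 = 960 compounds.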
The Encore technology has been commercialized and a number of tools were developed
and made available for simplifying the operations.

4.5.2.2. The String Synthesis


The String Synthesis introduced by Furka et al.38,40,41, like the Encore technology, also
uses stringed macroscopic solid support units and the units are identified by their position
occupied on the string. The other aspects, however, are entirely different. First of all only one
string is assigned for every building block in the synthesis. Consequently, the content of the
strings coming out from a synthetic step must be redistributed into the strings of the next step.
The units are not pooled. The redistribution follows the combinatorial distribution rule (see
4.1.1.1.): all products formed in a synthetic step are equally divided among all reaction vessels of
the next synthetic step. This means that the units of any string that contain the same product have
to be evenly distributed among the strings of the next step. This can be done without pooling the
units, which would result in losing the information embodied in the strings. The units are directly
transferred from the old (source) strings into the new (destination) ones. This has three important
consequences:
(i) If the redistribution follows a predetermined pattern, the information stored in the
sequences of the units is preserved during the whole synthetic process and the
route of every unit can be tracked by computer.
(ii) The units can be transferred in groups (except in the last redistribution), which makes the
process much faster than one-by-one sorting.
(iii) The computer controlled redistribution offers a possibility for automation of the
sorting process.

The inventors of the method considered different shapes for the units, different patterns
for their redistribution and different sorting devices.40 In the following pages the use of two kinds
of units and two kinds of manual sorting devices is described, applying a fast redistribution
pattern named Semi-Parallel Sorting.
The support units are Mimotopes SynPhase Crowns and SynPhase Lanterns shown
in Figure 4.25 (see also Figure 4.23 a and b). The crowns are used attached to stems (a and c),
which makes stringing possible. The stems are available at Mimotopes in different colors. Lanterns
can also be used attached to stems (d), but they can also be used by themselves (g) since they have
a hole in their center that makes stringing possible. The commercial stems are modified:
they have a drilled hole to allow the string to be passed through, and they are carved to keep the holes
parallel and to facilitate threading while they are in the sorting device (b). The modified stems can
be used repeatedly. An empty full stem (e) is used to label the head of the strings and a half stem (f)
to label the tail.

Figure 4.25. Solid supports used in String Synthesis.


a: stem, b: carved stem, c and d: crown and lantern attached to carved stem, respectively,
e: full stem with scratches to label the head and the number of the string, f: half stem for labeling
the tail of the strings, g: lantern used without stem.
Figure 4.26. Stringed crowns and lanterns. Their position is counted from the head of the strings

Figure 4.26 shows stringed crowns and lanterns with full stem labels at the heads and
half stems at the tails. The two ends of the string must be distinguishable. Labeling at least one
end of the strings makes it possible to unequivocally define the position of the units. The position of
the units on the string is counted from head to tail.
The number of strings that are formed in a synthetic step depends on the number of
building blocks used in that step since every building block needs a different string. Each string is
placed into a different reaction vessel for coupling. The strings themselves must be numbered or
otherwise labeled. The simplest way to label the strings is to make visible scratches on the stem
marking the head (Figure 4.25./e). Using colored stems is also a possibility.
The string itself must be resistant to solvents and other reaction conditions occurring in
the synthesis. In preparation of peptide libraries a polyethylene fishing line proved applicable.
The use of the String Synthesis is demonstrated with preparation of a tripeptide library on
SynPhase Crowns. Five amino acids are used as building blocks in each coupling step.
Consequently 5 strings need to be formed in each step and the number of the expected tripeptides
is 5x5x5=125. The number of crowns on one string is 25. After threading the crowns, each string
is placed into a reaction vessel carrying the same number as the string (Figure 4.27.). In each
reaction vessel the coupling is done with a different amino acid.

Figure 4.27. Strings in reaction vessels showing below them scratched full stems that indicate
the number of the strings

Manual devices for redistribution of the units. The strings coming from the reaction
vessels after completed couplings are named source strings. Their support units, crowns or
lanterns, are then redistributed into the strings of the next reaction step denoted as destination
strings.
Figure 4.28. Devices for sorting crowns and lanterns. Both the crown sorter and the lantern
sorter consist of a source tray and a destination tray.



Redistributions can be carried out using very simple devices that can be easily made by a
machine shop. Two different devices are constructed for sorting: one for crowns and another one
for lanterns. Both devices operate on the same principle and both contain two identical pieces: a
source tray and a destination tray (Figure 4.28.). Both pieces of the crown sorter are metal plates
with several numbered parallel slots and bent at the two edges. The pieces of the lantern sorter
are polymer plates with numbered grooves. Before sorting, the crowns hang in the slots and the
lanterns stay in the grooves of the source tray as shown in Figure 4.29.

Figure 4.29. Crowns and lanterns in the sorting device

In the sorting process the crowns or lanterns are pushed into the slots and grooves,
respectively, of the destination tray. It is important to note that in this operation the units preserve
their positions relative to each other when they are redistributed in groups. Figure 4.30 shows the
top view of the crown sorter after the delivery of a group of 5 crowns from the slot 5 of the
source tray into the slot 1 of the destination tray.

Figure 4.30. Top view of the crown sorter. The slots of the source tray and of the destination
tray are numbered 1 to 5, and the head and tail ends of the slots are marked.

The slots and grooves of the source and destination trays of the sorter are numbered. It is
important to place each source string into the slot of the source tray carrying the same number.
The heads and tails of the source strings must be positioned in the slots of the source tray as
indicated in the figure, otherwise the software (see later) cannot be used. After sorting, the units
are restrung.
The destination strings must be numbered according to the numbers of the destination
slots or grooves, and their heads and tails must correspond to the heads and tails of the destination slots or
grooves.


Figure 4.31. Crowns in the slot of the source plate. a: after loading, b: after removing the
string

The units are loaded into the sorter in stringed form. Figure 4.31. shows the crowns
loaded into the slots while still attached to the string (a). When the units are in their place in the
source plate the string is cut and removed (b).
The units are sorted in a string free form. After sorting they are found in the destination
plate. Before the delivery into the reaction vessels they are restrung (Figure 4.32.).

Figure 4.32. Stringing the crowns in the slot of the destination tray

Semi-Parallel Sorting (SPS). In the string synthesis one solid support unit is assigned to
each product. Except in the last elongation step, the products form in groups (product groups) that
occupy a defined region on the string. The number of units in each product group is the same.
The units of each group need to be evenly distributed among the strings of the next step. Except in
the last distribution step the units are also transferred in groups (delivery groups). In order to get
the number of units in a delivery group, the number of units in a product group is divided by
the number of destination strings. This calculation is done by computer.
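
The group sizes follow from simple arithmetic, which can be sketched as below. This is an illustration of the rule stated above and not the author's program; the function name sorting_plan and the dictionary keys are hypothetical.

```python
from math import prod

def sorting_plan(blocks_per_step):
    """Group sizes for every redistribution of a String Synthesis.

    blocks_per_step: number of building blocks in each coupling step.
    Returns one dict per sorting with the number of source and
    destination strings and the product and delivery group sizes.
    """
    total = prod(blocks_per_step)          # one unit per product
    plan = []
    for k in range(len(blocks_per_step) - 1):
        products_so_far = prod(blocks_per_step[:k + 1])
        product_group = total // products_so_far
        destinations = blocks_per_step[k + 1]
        plan.append({"sort": k + 1,
                     "source_strings": blocks_per_step[k],
                     "destination_strings": destinations,
                     "product_group": product_group,
                     "delivery_group": product_group // destinations})
    return plan

for row in sorting_plan([5, 5, 5]):
    print(row)
```

For three coupling steps with five building blocks each, the product groups contain 25 and 5 units and the delivery groups 5 and 1 units in the first and second sorting, respectively.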
Figure 4.33. Semi-Parallel Sorting. Sorting the units of 3 source strings into 3 destination
ones in 5 relative positions of the source and destination trays.

The simplest way of the distribution would be to first transfer all the units of one source
string to the destination strings and then follow with the next source string. Semi-Parallel
Sorting, outlined in Figure 4.33, is faster, however.
In positions 1, 2, 3, 4 and 5 of the figure, one, two, three, two and one slots of the source
and destination trays are in alignment (indicated by enhanced lines). From each aligned slot of
the source tray one delivery group of units is transferred into the corresponding slot of the
destination tray. The deliveries in these positions are repeated until all units are transferred.
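
The sequence of tray positions can be generated by a short routine. The sketch below assumes equal numbers of source and destination slots and assumes that the trays are shifted by one slot between positions, starting with the highest numbered source slot facing the first destination slot; the function name sps_stages is hypothetical.

```python
def sps_stages(n):
    """Aligned (source slot, destination slot) pairs for each tray position.

    Assumes n source and n destination slots and that the trays move
    past each other one slot at a time, starting with the last source
    slot facing the first destination slot.
    """
    stages = []
    for s in range(1, 2 * n):
        first, last = max(1, s - n + 1), min(n, s)
        stages.append([(d + n - s, d) for d in range(first, last + 1)])
    return stages

for number, pairs in enumerate(sps_stages(3), start=1):
    print(number, pairs)
```

With three slots the five positions contain one, two, three, two and one aligned pairs, and every source slot meets every destination slot exactly once.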
Starting Data for Semi-Parallel Sorting

SORTING: Delivery in every cycle starts from the highest number (rightmost) source
slot into the first (leftmost) destination slot.
Run: Ctrl + S
The number of monomers and their symbols (A,C,L etc.) must be entered
The calculated sequences are reversed, they reflect the coupling positions from left to right
Use the red numbers in the sorting process
Do not delete blue cells! Enter data only into yellow cells!

Columns: coupling position (CP); number of building blocks in coupling steps (maximum number
is 52); sort number; number of slots (source, destination); crowns in slots (source,
destination); identical crowns; crowns to move.

         Building   Sort     Slots     Slots     Crowns    Crowns    Identical   Crowns
         blocks     number   Source    Destin.   Source    Destin.   crowns      to move
CP 1     5          1        5         5         25        25        25          5
CP 2     5          2        5         5         25        25        5           1
CP 3     5          3        0         0         0         0         1           0
CP 4                4        0         0         0         0         0           0
CP 5                5        0         0         0         0         0           0
CP 6                6        0         0         0         0         0           0
CP 7                7        0         0         0         0         0           0
CP 8                8        0         0         0         0         0           0
CP 9                9        0         0         0         0         0           0
CP 10

Maximum number of building blocks in one step: 20
Maximum number of sorter slots: 20
Maximum number of crowns: 1,000
Total number of crowns: 125
Number of coupling steps: 3
Pause (in seconds):

MONOMERS IN COUPLINGS
String number   1   2   3   4   5   6   7   8
CP 1            I   F   L   V   G
CP 2            E   F   W   Y   S
CP 3            E   F   W   Y   S
CP 4 to CP 10   (empty)

Figure 4.34. Datasheet of the Excel Book where the starting data are entered.


The software. The software is written in Visual Basic and the data appear in Microsoft
Excel sheets. It can handle up to 1000 crowns, up to 20 reagents (building blocks) and up to 9
reaction steps. Figure 4.34. shows the datasheet of the Excel book where the starting data are
entered. Among the starting data are the number and symbols of the building blocks (monomers)
used in the coupling steps. The symbols are single letter abbreviations. In the case of peptide
synthesis the symbols correspond to the respective amino acids.
In the Excel sheets the areas of data entry are yellow. Several data are instantly
calculated and appear in the blue regions of the screen.

Figure 4.35. The flow diagram of the synthesis: first coupling (strings 1 to 5), first sorting,
second coupling (strings 1 to 5), second sorting, third coupling (strings 1 to 5), products
(strings 1 to 5).

Among the instantly calculated data are the total number of crowns (or lanterns) needed in
the synthesis and the number of coupling steps (column B), the number of source and destination
slots (or grooves) used in the first and subsequent sorting steps (D and E) and the number of
crowns occupying these slots (F and G). The number of crowns in product groups, that is the
number of units that contain the same product appear in H. The number of crowns in a delivery
group that have to be moved in every sorting cycle from a source to a destination slot can be seen
in column I.
The program can be started by pressing together Ctrl and S. The result of calculations
appears (depending on the number of redistributions) in sheets Sort #1 through Sort #9. The
sheets show a block containing the products present in the crowns of the source slots and, below
these, a second block showing the content of the crowns sorted into the destination slots.
Positions of the crowns are counted downward from the top. The number of sheets
showing the results of couplings and sortings is equal to the number of sortings plus one. The last
sheet contains the predicted product distribution on the final strings.
The software is free and can be downloaded via the Internet from the following Web site:
http://szerves.chem.elte.hu/furka
by clicking on the title Excel Book appearing in the lower part of the main page. The software
can be used only by those who have Excel installed on their computer.
Experimental example. Synthesis of a library of 125 tripeptides. The synthesis was carried
out using 125 Mimotopes SynPhase Crowns (capacity 5.3 µmol each) derivatized with Fmoc-Rink
amide linker. The procedure was started with the formation of 5 strings by threading 25
crown units each on Berkley Fire Line fishing line. Five Fmoc protected amino acids were used in
each coupling position. The flow diagram is shown in Figure 4.35. The symbols of the amino
acids used in the couplings are those found in the Datasheet shown in Figure 4.34; they
are also indicated in Figure 4.35 below the reaction vessels.
Coupling. Couplings were carried out with the strings placed in 100 ml flasks. The protecting
groups were removed by adding 10 ml 1:1 v/v DMF-piperidine and mixing on an orbital mixer
for 30 minutes. After cleaving the protecting groups the solutions were decanted from the
strings, which were then washed with 3x15 ml DMF, 15 ml DCM, 15 ml DMF, 15 ml DCM and 2x15 ml
DMF. The deprotection operation was repeated once more and the strings were finally washed with 2x15 ml
DCM. The strings were dried, then 10 mmol Fmoc amino acid, 10 mmol HOBt and 15 mmol DIC
were added in 10 ml NMP solution and mixed on an orbital mixer for 2 hours. The solution was then
decanted and the strings were washed with 3x40 ml DMF, 40 ml DCM and 2x40 ml DMF. The above coupling
operation was repeated once more. The strings were finally washed with 2x40 ml DCM. The
crowns, still on the strings, were dried in an oven and then submitted to sorting.
Sorting. After entering the starting data into the Datasheet (Figure 4.34), it can be read
from the last column, that in the first sorting the crowns need to be moved from each slot in
groups of 5. In the second sorting the crowns are moved one by one. The first sorting is
demonstrated in Figure 4.36.
The redistribution of the 125 crowns was completed in nine stages, each representing
a different relative position of the source and destination trays. In stages 1, 2, 3, 4, 5, 6, 7, 8
and 9 the number of transferred groups (of 5 crowns) was 1, 2, 3, 4, 5, 4, 3, 2 and 1,
respectively.


Figure 4.36. First sorting. The stages show the relative positions of the source and
destination trays, each with slots numbered 1 to 5.


The second sorting is demonstrated in Figure 4.37. In this case the delivery groups
contained only a single crown, so one crown was transferred from each slot. The redistribution
was completed in 5 cycles each containing 9 stages. Figure 4.37. shows only the first cycle. In the
9 stages of the first sorting cycle altogether 25 crowns were delivered into the destination tray.
The rest of the crowns were redistributed in additional 4 cycles not shown in the figure.
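
The number of cycles, stages and transfers quoted for the two sortings can be checked with a few lines of arithmetic. The sketch assumes equal numbers of source and destination strings, so that one cycle consists of 2n - 1 tray positions; the function name sorting_effort and the dictionary keys are hypothetical.

```python
def sorting_effort(strings, crowns_per_string, delivery_group):
    """Cycles, stages and group transfers of one Semi-Parallel Sorting.

    Assumes equal numbers of source and destination strings, so that
    one cycle consists of 2 * strings - 1 tray positions (stages).
    """
    # In one cycle every source slot gives one delivery group
    # to every destination slot.
    crowns_leaving_one_slot = delivery_group * strings
    cycles = crowns_per_string // crowns_leaving_one_slot
    transfers_per_cycle = strings * strings
    return {"cycles": cycles,
            "stages": cycles * (2 * strings - 1),
            "crowns_per_cycle": transfers_per_cycle * delivery_group,
            "transfers": cycles * transfers_per_cycle}

print(sorting_effort(5, 25, 5))  # first sorting of the example
print(sorting_effort(5, 25, 1))  # second sorting of the example
```

The first sorting needs a single cycle of 9 stages, while the second one needs 5 cycles with 25 crowns delivered in each, in agreement with the description above.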
Cleavage. The crowns were placed separately in numbered test tubes. The string numbers,
positions on the strings and the numbers of the test tubes were recorded at the same time. 1 ml 1:1
v/v piperidine-DMF was added to each test tube and left to stand for 30 minutes. The crowns
were then washed with 3x2 ml DMF, 2 ml DCM, 2 ml DMF and 2x2 ml DCM. After adding 1 ml
95% TFA/H2O the tubes were allowed to stand for 30 minutes. The solutions were decanted into
vials numbered according to the numbers of the test tubes. The crowns were washed with 1 ml
95% TFA/H2O and the washings were added to the same vials, which were then dried in a rotavap.
Product distribution. The product distribution on the strings during the synthesis
predicted by the computer appeared in sheets Sort #1, Sort #2 and Sort #3 of the Excel Book.
Some of the predictions concerning the String No. 1 are summarized in Table 4.18. It can be seen
that after coupling 1 on String 1 - as expected - all units contained I. After the first redistribution,
String 1 contained five products in groups of five crowns. The product distribution in the rest of
the strings (not shown in the table) was exactly the same.


Figure 4.37. Second sorting. The first cycle

After coupling 2, String 1 contained five dipeptides in groups of five crowns. The rest of
the strings (not shown) differed from the first one since different amino acids were coupled into
them. After the second sorting, as the table shows, all products in the crowns of String 1 were
different. Again, the product distribution in the rest of the strings (not shown) was exactly the
same. It is typical in String Synthesis that after redistributions the product distribution on all
strings is the same.
It is also typical that after couplings the strings are different. After the third that is the
last coupling not only the strings differ from each other but the content of the crowns within the
strings is also different. Positions of the formed tripeptides on the five strings after the third
coupling are shown in Table 4.19.
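
The whole example can be followed with a small simulation. This is a sketch written for illustration and is not the author's Visual Basic program. It rests on two assumptions that are not stated explicitly in the text: the trays are shifted by one slot between stages, starting with source slot 5 facing destination slot 1, and every delivery group leaves the tail end of its source slot and enters the head end of the destination slot. Under these assumptions the simulation gives the product positions listed in Tables 4.18 and 4.19. All function names are hypothetical.

```python
def sps_stages(n):
    """Aligned (source, destination) slot pairs for each tray position."""
    stages = []
    for s in range(1, 2 * n):
        first, last = max(1, s - n + 1), min(n, s)
        stages.append([(d + n - s, d) for d in range(first, last + 1)])
    return stages

def redistribute(sources, group):
    """Semi-Parallel Sorting of strings given as lists ordered head to tail.

    Assumption: each delivery group leaves the tail end of its source slot
    and enters the head end of the destination slot, keeping its order.
    """
    n = len(sources)
    src = [list(s) for s in sources]
    dst = [[] for _ in range(n)]
    while any(src):                        # one pass of the loop is one cycle
        for pairs in sps_stages(n):
            for s, d in pairs:
                moved = src[s - 1][-group:]
                del src[s - 1][-group:]
                dst[d - 1] = moved + dst[d - 1]
    return dst

def couple(strings, monomers):
    """Couple one monomer per string; new letters are written on the left."""
    return [[m + unit for unit in string]
            for string, m in zip(strings, monomers)]

strings = couple([[""] * 25 for _ in range(5)], "IFLVG")  # coupling 1
strings = redistribute(strings, 5)                         # sorting 1
strings = couple(strings, "EFWYS")                         # coupling 2
strings = redistribute(strings, 1)                         # sorting 2
after_sort_2 = [list(s) for s in strings]
strings = couple(strings, "EFWYS")                         # coupling 3

print(after_sort_2[0][:6])   # head of String 1 after the second sorting
print(strings[0][:6])        # head of String 1 after the third coupling
```

The final strings carry 125 different tripeptides, one per crown, and each crown can be identified from its string number and its position on the string.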
Since the redistribution process is directed by computer, the String Synthesis is suitable
for automation. Although no automatic machine has yet been constructed, an automatic sorter
designed according to Figure 4.38 would be capable of very rapidly sorting tens of thousands of support
units placed in vertical source tubes (a) arranged circularly.42 In the sorting process the units
would be dropped through computer controlled electronic gates into the destination tubes (b),
which are rotated stepwise. This arrangement of the tubes would make it possible to transfer the units
simultaneously from all tubes (parallel sorting).

Table 4.18. Content of string No. 1 (Str. 1)
after first and second coupling (Cpl. 1 and Cpl. 2) and first and second sorting (Sort 1 and Sort 2)

Position   Cpl. 1   Sort 1   Cpl. 2   Sort 2
           Str. 1   Str. 1   Str. 1   Str. 1
1          I        I        EI       EI
2          I        I        EI       FI
3          I        I        EI       WI
4          I        I        EI       YI
5          I        I        EI       SI
6          I        F        EF       EF
7          I        F        EF       FF
8          I        F        EF       WF
9          I        F        EF       YF
10         I        F        EF       SF
11         I        L        EL       EL
12         I        L        EL       FL
13         I        L        EL       WL
14         I        L        EL       YL
15         I        L        EL       SL
16         I        V        EV       EV
17         I        V        EV       FV
18         I        V        EV       WV
19         I        V        EV       YV
20         I        V        EV       SV
21         I        G        EG       EG
22         I        G        EG       FG
23         I        G        EG       WG
24         I        G        EG       YG
25         I        G        EG       SG

Figure 4.38. Parallel sorting. a: source tubes, b: destination tubes

Table 4.19. Position of products on the final strings

Position   Str. 1   Str. 2   Str. 3   Str. 4   Str. 5
1          EEI      FEI      WEI      YEI      SEI
2          EFI      FFI      WFI      YFI      SFI
3          EWI      FWI      WWI      YWI      SWI
4          EYI      FYI      WYI      YYI      SYI
5          ESI      FSI      WSI      YSI      SSI
6          EEF      FEF      WEF      YEF      SEF
7          EFF      FFF      WFF      YFF      SFF
8          EWF      FWF      WWF      YWF      SWF
9          EYF      FYF      WYF      YYF      SYF
10         ESF      FSF      WSF      YSF      SSF
11         EEL      FEL      WEL      YEL      SEL
12         EFL      FFL      WFL      YFL      SFL
13         EWL      FWL      WWL      YWL      SWL
14         EYL      FYL      WYL      YYL      SYL
15         ESL      FSL      WSL      YSL      SSL
16         EEV      FEV      WEV      YEV      SEV
17         EFV      FFV      WFV      YFV      SFV
18         EWV      FWV      WWV      YWV      SWV
19         EYV      FYV      WYV      YYV      SYV
20         ESV      FSV      WSV      YSV      SSV
21         EEG      FEG      WEG      YEG      SEG
22         EFG      FFG      WFG      YFG      SFG
23         EWG      FWG      WWG      YWG      SWG
24         EYG      FYG      WYG      YYG      SYG
25         ESG      FSG      WSG      YSG      SSG

4.5.2.3. String synthesis of cherry picked libraries43


The software developed to guide sorting in String Synthesis can be used only when
complete combinatorial libraries are prepared. Very often, however, only selected components of
the full libraries are needed. These non-complete combinatorial libraries are often called cherry
picked libraries. In order to make possible preparation of such libraries by String Synthesis,
modified software has been constructed. Like the software described in the previous section this
software is also written in Visual Basic and can be downloaded free of charge via the Internet
from the same address: http://szerves.chem.elte.hu/furka by clicking on the title Excel Book 2
appearing in the lower part of the main page.
When using the software, first of all the sequences of the cherry picked (input) library
have to be entered into the computer (e.g. by copying the sequences into column A of the Input
Sheet). The software then analyses the sequences and generates a virtual library, which is in fact a
full combinatorial library that contains all the actual members of the input library. This is
followed by rearranging the components of the input library according to their order in the virtual
library; the rearranged components are then distributed into the starting source strings. Table 4.20
shows a part of an input library, the generated virtual library and the rearranged input library.

Table 4.20. Sequences in the input, virtual and the rearranged input libraries

No.   Cherry P.   Virtual    Cherry P.
      Input       Library    Rearranged
1     CITW        CITW       CITW
2     CITA        CITA       CITA
3     CITF        CITF       CITF
4     DGPV        CITV       CITV
5     DGPW        CITG       CIPW
6     DGPA        CIPW       CIPF
7     DGPG        CIPA       CIPV
8     DGPF        CIPF       CIPG
9     DGRV        CIPV       CIRW
10    DGRW        CIPG       CIRF
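
The rearrangement shown in Table 4.20 can be illustrated by a short sketch. It is written in Python rather than in the Visual Basic of the original workbook, the function names are hypothetical, and it assumes that the building blocks of each coupling position are taken in the order of their first appearance in the input and that the last position varies fastest. This assumption is consistent with the part of the table shown above, but the actual program may proceed differently.

```python
from itertools import product


def monomers_by_position(sequences):
    """Collect the building blocks of each coupling position, in order of first appearance."""
    length = len(sequences[0])
    per_position = [[] for _ in range(length)]
    for seq in sequences:
        for i, symbol in enumerate(seq):
            if symbol not in per_position[i]:
                per_position[i].append(symbol)
    return per_position


def virtual_library(sequences):
    """Full combinatorial library containing every member of the input library."""
    return ["".join(combo) for combo in product(*monomers_by_position(sequences))]


def rearrange(sequences):
    """Input library listed in the order its members appear in the virtual library."""
    wanted = set(sequences)
    return [seq for seq in virtual_library(sequences) if seq in wanted]


if __name__ == "__main__":
    cherry_picked = ["CITW", "CITA", "CITF", "DGPV", "DGPW",
                     "DGPA", "DGPG", "DGPF", "DGRV", "DGRW"]
    print(virtual_library(cherry_picked)[:10])
    print(rearrange(cherry_picked))
```

Only the ten input sequences listed in Table 4.20 are used here, so the rearranged list differs from the third column of the table, which was produced from the complete input library.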

Since the library to be synthesized is not a complete combinatorial library, the delivery of
the support units from the source strings into the destination ones cannot occur in equal groups.
The software generates tables that guide the redistribution operations in every phase of the
synthetic process. They also make it possible to check for potential errors of the operator.
The building blocks of the library are coded using both the lower case and the capital
letters of the English alphabet (altogether 52 symbols). The sequences of these letters encode
the compounds to be synthesized. The order of coupling positions, that is the order of the
characters in the sequences, runs from left to right. In practice these are inverted peptide
sequences.
When the input sequences are entered into the computer there are no restrictions concerning
the order of library members, but no gaps are allowed in column A. Column A can accept a
maximum of 15,000 sequences. The order of characters in the input sequences can be reversed (if,
for example, peptide sequences are used) by pressing Ctrl + i (Figure 4.39). The maximum
number of characters in the sequences, that is the maximum number of building blocks, is 10.
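
The input rules listed above can be summarized in a small validation sketch. It is an illustration in Python, not the code of the workbook; the function names are hypothetical, and the requirement that all sequences have the same length is an assumption that the text does not state explicitly.

```python
import string

MAX_SEQUENCES = 15_000                 # capacity of column A
MAX_LENGTH = 10                        # maximum number of building blocks
ALLOWED = set(string.ascii_letters)    # 26 lower case + 26 capital letters = 52 symbols


def validate_library(sequences):
    """Check the input rules and return the common sequence length."""
    if not sequences:
        raise ValueError("the input library is empty")
    if len(sequences) > MAX_SEQUENCES:
        raise ValueError("too many sequences for column A")
    length = len(sequences[0])
    for row, seq in enumerate(sequences, start=1):
        if seq == "":
            raise ValueError(f"gap in column A at row {row}")
        if len(seq) > MAX_LENGTH:
            raise ValueError(f"row {row}: more than {MAX_LENGTH} building blocks")
        if len(seq) != length:  # assumption: one common length per library
            raise ValueError(f"row {row}: length differs from the first sequence")
        if not set(seq) <= ALLOWED:
            raise ValueError(f"row {row}: symbol outside the 52 letter codes")
    return length


def invert(sequences):
    """Reverse the order of characters, as Ctrl + i does for peptide sequences."""
    return [seq[::-1] for seq in sequences]


if __name__ == "__main__":
    print(validate_library(["CITW", "CITA", "CITF"]))
    print(invert(["CITW", "CITA", "CITF"]))
```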
Execution of the sorting program can be started by pressing Ctrl + e (Figure 4.39). The
execution time depends on the size of the library and on the speed of the computer. During
execution the sequences are assigned to support units, then the units are grouped into
strings. The tables guiding the redistributions are then calculated and displayed. Execution of the
program stops at Sheet 13, which shows the position of the products on the final strings.
Figure 4.40 shows a part of Sheet 13.
Rearranging the order of the components of the input library. The order of the
components of the input library in column A is usually arbitrary. For this reason they cannot be
directly arranged into strings that are submitted to coupling with the same building blocks.

INPUT SHEET

Column A (library sequences): 1 CITW, 2 CITA, 3 CITF, 4 CITV, 5 CIPW, 6 CIPF, 7 CIPV, 8 CIPG,
9 CIRW, 10 CIRF, 11 CIRV, 12 CISW, 13 CISA, 14 CISV, 15 CISG

Commands: Copy the library sequences into column A; Execute sorting: Ctrl+e; Clear input
column A: Ctrl+j; Invert sequence: Ctrl+i; Save original library: Ctrl+b; Sort the virtual
library: Ctrl+s

Number of compounds: 206
Number of monomers in sequences: 4

MONOMERS (BLOCKS) OF THE LIBRARY
       Num   Monomers, each followed by its number of units
CP1    5     C 49   D 46   E 25   F 32   A 54
CP2    3     I 61   G 76   H 69
CP3    4     T 50   P 45   R 62   S 49
CP4    5     W 52   A 33   F 48   V 46   G 27

Figure 4.39. Part of the Input Sheet.

The strings that need to be formed usually do not even contain the same number of units.
As a consequence, the order of the components of the input library has to be rearranged into a
form that allows regular redistributions.

PRODUCTS ON THE STRINGS

Unit number   Str. 1   Str. 2   Str. 3   Str. 4   Str. 5
1             AHSW     AHSA     AHSF     AHSV     AHSG
2             AHRW     AHRA     AHRF     AHRV     AHRG
3             AHPW     AHTA     AHPF     AHPV     AGSG
4             AHTW     AGSA     AHTF     AHTV     AGRG
5             AGSW     AGRA     AGSF     AGSV     AGPG
6             AGRW     AGPA     AGRF     AGRV     AISG
7             AGPW     AGTA     AGPF     AGPV     AIRG
8             AGTW     AISA     AGTF     AGTV     AIPG
9             AISW     AIRA     AISF     AISV     FHRG
10            AIRW     AITA     AIRF     AIRV     FGPG

Figure 4.40. Products on the strings

A partial tetrapeptide library is used to illustrate the operations executed by the program.
By analyzing the input library the program first determines the crucial starting data and displays
them in the Input Sheet (Figure 4.39):

(i) The number of input sequences (F5)
(ii) The length of the sequences (N5)
(iii) The number of amino acids used in the different coupling positions (CP1 to CP4)
(iv) Codes of the amino acids (section Monomers of the library)
(v) The number of units into which a particular amino acid has to be coupled
(displayed below the code of the amino acid).
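
The starting data (i) to (v) can be derived from the input sequences by simple counting. The following Python sketch illustrates this; the function name and the layout of the result are hypothetical and do not reproduce the cells of the Input Sheet.

```python
from collections import Counter


def starting_data(sequences):
    """Derive the starting data (i) to (v) from a list of equal-length sequences."""
    length = len(sequences[0])
    # One counter per coupling position: building block code -> number of units.
    units = [Counter(seq[i] for seq in sequences) for i in range(length)]
    return {
        "n_sequences": len(sequences),                       # (i)
        "length": length,                                    # (ii)
        "monomers_per_position": [len(c) for c in units],    # (iii)
        "codes": [list(c) for c in units],                   # (iv)
        "units_per_monomer": [dict(c) for c in units],       # (v)
    }


if __name__ == "__main__":
    cherry_picked = ["CITW", "CITA", "CITF", "DGPV", "DGPW",
                     "DGPA", "DGPG", "DGPF", "DGRV", "DGRW"]
    for key, value in starting_data(cherry_picked).items():
        print(key, value)
```

For every coupling position the unit counts add up to the number of input sequences, which gives a quick consistency check of the input.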

Based on the above data, a full (virtual) peptide library is generated in which all
components of the input library are present. The number of components in the virtual library is
limited to 30,000. The sequences of the input library are then arranged into the order in which
they appear in the virtual library. Sheet 13 shows the sequences of the virtual library and those
of the input cherry picked and the rearranged cherry picked libraries. The first 10 sequences of
these libraries are reproduced in Table 4.20. The components of the original cherry picked library
in column A of the Input Sheet are replaced by the rearranged library. All further manipulations
are based on this rearranged library: first the sequences are assigned to support units and then
the units are distributed into the starting strings. The occupancy of the starting strings appears
in Sheet 3. The same sheet, in its lower part, contains the guiding tables for the first
redistribution. In the experimental realization of the synthesis the starting strings need to be
formed manually by placing the indicated number of support units on the strings and then
submitting them to the first coupling step. The symbols of the amino acids that need to be coupled
to the strings appear in red. Those of the other amino acids in the sequences are black. Each set
of destination strings formed in a redistribution step occupies one of the Sheets 4 to 12. The
products appear in Sheet 13.

Third sorting

Third coupling with monomers:      T       P       R       S
Number of units on the strings:    50      45      62      49

Str 1   Str 2   Str 3   Str 4   Unit
CITV    CIPG    CIRV    CISG    1
CITF    CIPV    CIRF    CISV    2
CITA    CIPF    CIRW    CISA    3
CITW    CIPW    CGRG    CISW    4
CGTV    CGPG    CGRV    CGSG    5
CGTF    CGPV    CGRF    CGSV    6
CGTA    CGPF    CGRA    CGSF    7
CGTW    CGPW    CGRW    CGSA    8
CHTV    CHPV    CHRG    CGSW    9
CHTF    CHPF    CHRV    CHSG    10

Figure 4.41. Sequences on the strings undergoing the third coupling step

As an example, Figure 4.41 shows a part of Sheet 5. This sheet shows the 4 strings
that undergo the third coupling. The figure includes the first 10 tetrapeptide sequences from each
string. The codes of the amino acids that need to be coupled to the respective strings appear at
the top, and the number of units in each string is found below them. The numbers in the last column
show the position of the units on the strings. The sequences on the strings are printed in three
different colors: the codes of amino acids already coupled to the units in the previous coupling
steps are blue, the amino acids of the actual coupling step appear in red, and the codes of the
amino acids that need to be coupled to the units in the forthcoming coupling steps are black.
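
Forming the strings for a given coupling step amounts to grouping the units by the building block they must receive in that step. The sketch below illustrates only this grouping; the function name is hypothetical, and the order of the units within a real string depends on the preceding redistributions, which are not modeled here.

```python
def strings_for_coupling(sequences, position):
    """Group sequences into strings by the building block at a coupling position.

    position is zero-based: 2 selects the third coupling position.
    Returns a dict mapping the building block code to the units of its string.
    """
    strings = {}
    for seq in sequences:
        strings.setdefault(seq[position], []).append(seq)
    return strings


if __name__ == "__main__":
    # First ten sequences of the rearranged library in Table 4.20.
    rearranged = ["CITW", "CITA", "CITF", "CITV", "CIPW",
                  "CIPF", "CIPV", "CIPG", "CIRW", "CIRF"]
    for code, units in strings_for_coupling(rearranged, 2).items():
        print(code, len(units), units)
```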
The products of the synthesis and their positions on the strings appear in Sheet 13 (Figure
4.40). The sequences of the products are also shown in reversed form in the lower part of the
sheet (not shown in the figure). This makes it possible to read the order of the building blocks
as peptide sequences, too.
Guiding tables for redistribution experiments. As already mentioned, in the case of the
cherry picked libraries some components that are present in a full (or virtual) library are missing.

Total number of units: 206. The guiding numbers are listed for cycles 4 and 5; a dash means that
no unit is delivered from that source at that stop.

Data for guiding redistribution          Data for checking potential errors in sorting

No. of units to deliver   Cycle/stop     Units in source      Units in destination
from source               position       troughs              troughs
  1    2    3                              1    2    3          1    2    3    4
  4    -    -             5/6              0    0    0         50   45   62   49
  3    5    -             5/5              4    0    0         50   45   62   45
  4    5    5             5/4              7    5    0         50   45   59   40
  4    4    4             5/3             11   10    5         50   41   54   35
  -    4    3             5/2             15   14    9         46   37   50   35
  -    -    4             5/1             15   18   12         42   34   50   35
  2    -    -             4/6             15   18   16         38   34   50   35
  5    5    -             4/5             17   18   16         38   34   50   33
  2    5    3             4/4             22   23   16         38   34   45   28
  3    5    5             4/3             24   28   19         38   32   40   25
  -    4    3             4/2             27   33   24         35   27   35   25
  -    -    4             4/1             27   37   27         31   24   35   25
                          3/6             27   37   31         27   24   35   25
                          3/5             27   37   31         27   24   35   25
                          3/4             30   37   31         27   24   32   25
                          3/3             30   39   31         27   24   30   25
                          3/2             32   43   35         25   20   26   25
                          3/1             32   47   38         21   17   26   25

Figure 4.42. Part of an Experiment guiding table

This has two consequences:

(i) The numbers of support units belonging to the strings usually differ. This is
clearly seen, for example, in Figure 4.41 (number of units on the strings).
(ii) The number of units within the groups of identical products may also differ.

For these reasons the transfers in redistributions cannot be realized in equal groups. Even
empty groups may occur. As a consequence, in order to be able to execute the redistributions, the
number of units of every delivered group has to be calculated and displayed by the computer. The
Experiment guiding tables are presented in the lower parts of Sheets 3 to 12. A part of the table
found in Sheet 4 is shown in Figure 4.42.
The table guides the redistribution after the second coupling. This is the second sorting
process, in which the units of three source strings are redistributed into four destination
strings. The guiding data are found in the columns below the title: Data for guiding
redistribution. This sorting is realized in 5 cycles. The figure shows only those guiding numbers
that belong to cycles 4 and 5. In each cycle the deliveries occur at 6 different relative positions
(stops) of the two trays of the manual sorter. The relative tray positions at the six stops of a
cycle are shown in Figure 4.43. Taking cycle 5 as an example, the cycle/stop positions change from
5/1 to 5/6. The same relative tray positions are repeated in all cycles. The support units (crowns
or lanterns) are delivered from the slots/troughs of the upper source tray into those of the lower
destination tray. The slots/troughs of the trays appear as vertical lines. The enhanced lines
represent slots/troughs from which and to which the units are delivered in a particular stop
position.

[Diagram: upper source tray and lower destination tray, each with slots/troughs numbered 1 to 7,
drawn at the cycle/stop positions 5/1, 5/2, 5/3, 5/4, 5/5 and 5/6]

Figure 4.43. The 6 relative tray positions in the 6 stop positions of cycle 5

Figure 4.42 shows one column for each of the three source strings from which the units
need to be transferred. The cycle/stop numbers are found in the fourth column. The start position
(stop position 1) is at the bottom in all cycles. The numbers of support units that have to be
transferred at a stop position from the source slots/troughs into the destination ones are found in
the columns of the strings, in the same row as the cycle/stop number. For example, in stop
position 2 of cycle 5 (5/2), 4 and 3 units are transferred from the source slots/troughs 2 and 3
into the destination slots/troughs 1 and 2, respectively (see Figure 4.43).
The program also makes it possible to check the accuracy of the redistribution in every
phase of execution and to discover a potential error made by the operator. For this purpose it
displays the number of units that have to remain in the source slots and appear in the
destination slots after successful transfers. These numbers are found in Figure 4.42 below the
title: Data for checking potential errors in sorting, in the same row as the cycle/stop position.
After the mentioned transfers in stop position 5/2, for example, 14 and 9 units remain in the
source slots/troughs 2 and 3, respectively, and 46 and 37 units appear in the destination
slots/troughs 1 and 2, respectively.
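
The bookkeeping behind this check can be sketched in a few lines of Python. The function names are hypothetical and the sketch is not the code of the workbook. The occupancies before stop 5/2 (18 and 12 units in the source troughs, 42 and 34 in the destination troughs) are obtained by arithmetic from the numbers quoted above.

```python
def apply_stop(source, destination, transfers):
    """Apply the deliveries of one stop position.

    source, destination: dicts mapping trough number to number of units.
    transfers: list of (source_trough, destination_trough, n_units).
    Returns the updated (source, destination) occupancies as new dicts.
    """
    src, dst = dict(source), dict(destination)
    for s, d, n in transfers:
        if src[s] < n:
            raise ValueError(f"source trough {s} holds fewer than {n} units")
        src[s] -= n
        dst[d] += n
    return src, dst


def stop_is_correct(source, destination, expected_source, expected_destination):
    """Compare the counted occupancies with the check data of the guiding table."""
    return source == expected_source and destination == expected_destination


if __name__ == "__main__":
    # Stop 5/2: 4 units from source 2 to destination 1, 3 units from source 3 to destination 2.
    src, dst = apply_stop({2: 18, 3: 12}, {1: 42, 2: 34}, [(2, 1, 4), (3, 2, 3)])
    print(src, dst)
    print(stop_is_correct(src, dst, {2: 14, 3: 9}, {1: 46, 2: 37}))
```

If the operator miscounts a group, the occupancies counted after the stop no longer match the check data, and the error is caught before the next stop.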
The software developed to guide sorting in the synthesis of cherry picked libraries also
provides a possibility to automate the redistribution process.

4.6. Examples
4.6.1. Split-Mix Synthesis of an encoded benzimidazole library.44
The library synthesized at Affymax (an American company) had three diversity positions
using 36 building blocks in each position. The structure of the components can be described by
the following general formula.
[General formula: benzimidazole core carrying the substituents R1, R2 and R3]

Since 36 building blocks were used in three positions the number of components was
36x36x36=46,656. R1 and R2 were built into the structure by using amines as building blocks;
their structure is demonstrated in Figure 4.44.

Figure 4.44. Structure of amines used in the synthesis of the benzimidazole library. Total
number: 71

R3 was introduced by aldehyde building blocks. Their structures are represented in Figure
4.45.

Figure 4.45. The aldehyde building blocks. Total number: 35

The beads were encoded by a special binary-type encoding developed at Affymax. The
encoding tags were secondary amines that were built in using the Alloc (allyloxycarbonyl)
protected monomers shown below. R and R' are alkyl chains of various lengths.

[Structure: Alloc protected tag monomer bearing the alkyl groups R and R']

Encoding tags were used only at the first and second diversity positions. At the end of the
synthesis the samples were not mixed, so encoding at this stage was unnecessary.
When a peptide library is prepared, amino acids are used as building blocks. Their
reactivity is well known, as are the optimal reaction conditions. This is not the case when a
non-peptide library is synthesized. The reactivity of all building blocks has to be carefully
checked and the reaction conditions also need to be optimized. Some otherwise attractive building
blocks have to be excluded because of their poor reactivity. It is not uncommon to spend much more
time on the pre-synthetic studies than on the synthesis itself. In the case of the benzimidazole
library the reaction conditions were also carefully optimized and the selected building
blocks showed good reactivity.
The synthesis was carried out using Tentagel HL NH2 resin as solid support. In the first step
the amino groups of each bead were differentiated: a part of them was prepared for attachment of
the coding tags and the rest for acceptance of the first building block of the product. For this
purpose a part of the amino groups was protected by Fmoc groups and the rest was blocked by Boc
protecting groups (Figure 4.46).

Figure 4.46. Introduction of Fmoc and Boc protecting groups

This reaction was carried out with 72 g resin (27.4 mmol amine) in DCM in the presence of
DIEA. The reagent was a mixture of 22.7 g (104 mmol) Boc2O and 0.79 g (3.1 mmol) Fmoc-Cl.
The resin was finally treated with piperidine, which removed the Fmoc protecting groups and made
a part (about 1/9) of the amino groups available for attachment of the first coding tags.
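
The quantities quoted above can be checked by a short calculation. The sketch below is an illustration only: the molar masses (218.25 g/mol for Boc2O and 258.70 g/mol for Fmoc-Cl) are standard values that do not appear in the text, and the estimate of the Fmoc protected fraction assumes that all of the Fmoc-Cl reacts with resin amino groups.

```python
BOC2O_MOLAR_MASS = 218.25    # g/mol, di-tert-butyl dicarbonate (not given in the text)
FMOC_CL_MOLAR_MASS = 258.70  # g/mol, Fmoc chloride (not given in the text)


def mmol(mass_g, molar_mass):
    """Amount of substance in mmol for a mass in grams."""
    return mass_g / molar_mass * 1000.0


resin_g = 72.0
amine_mmol = 27.4

boc2o_mmol = mmol(22.7, BOC2O_MOLAR_MASS)
fmoc_cl_mmol = mmol(0.79, FMOC_CL_MOLAR_MASS)
loading_mmol_per_g = amine_mmol / resin_g
# Assumption: every Fmoc-Cl molecule protects one amino group of the resin.
fmoc_fraction = fmoc_cl_mmol / amine_mmol

if __name__ == "__main__":
    print(f"Boc2O: {boc2o_mmol:.1f} mmol, Fmoc-Cl: {fmoc_cl_mmol:.2f} mmol")
    print(f"resin loading: {loading_mmol_per_g:.2f} mmol/g")
    print(f"Fmoc protected fraction: {fmoc_fraction:.3f} (1/9 = {1 / 9:.3f})")
```

Under these assumptions the Fmoc protected fraction comes out close to the "about 1/9" stated in the text, while Boc2O is present in a several-fold excess over the amino groups.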

Table 4.21. Encoding mixtures for the first and second diversity position
RV   Code1  Code2     RV   Code1  Code2     RV   Code1  Code2     RV   Code1  Code2
1    A      U         10   AE     UY        19   DE     XY        28   ACF    UWZ
2    B      V         11   AF     UZ        20   DF     XZ        29   ADE    UXY
3    C      W         12   BC     VW        21   EF     YZ        30   ADF    UXZ
4    D      X         13   BD     VX        22   ABC    UVW       31   AEF    UYZ
5    E      Y         14   BE     VY        23   ABD    UVX       32   BCD    VWX
6    F      Z         15   BF     VZ        24   ABE    UVY       33   BCE    VWY
7    AB     UV        16   CD     WX        25   ABF    UVZ       34   BCF    VWZ
8    AC     UW        17   CE     WY        26   ACD    UWX       35   BDE    VXY
9    AD     UX        18   CF     WZ        27   ACE    UWY       36   BDF    VXZ

Encoding. Six different tags were used to encode the 36 resin samples at coupling No. 1
(Code1), and another six at coupling No. 2 (Code2). Their stock solutions were labeled A, B,
C, D, E, F and U, V, W, X, Y, Z for Code1 and Code2, respectively.
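
The codes of Table 4.21 follow a regular pattern: the tag combinations are listed by size (single tags, pairs, triples) and alphabetically within each size. The Python sketch below generates such a code list and decodes a set of detected tags. Whether the original assignment was generated in this way is an assumption, and the function names are hypothetical.

```python
from itertools import combinations


def tag_codes(tags, n_vessels):
    """Assign one combination of tags to each reaction vessel (RV1, RV2, ...).

    Combinations are listed by increasing size and alphabetically within a size.
    """
    codes = []
    for size in range(1, len(tags) + 1):
        for combo in combinations(tags, size):
            codes.append("".join(combo))
            if len(codes) == n_vessels:
                return codes
    raise ValueError("not enough tag combinations for the requested number of vessels")


def decode(found_tags, codes):
    """Return the reaction vessel number encoded by the set of tags found on a bead."""
    return codes.index("".join(sorted(found_tags))) + 1


if __name__ == "__main__":
    code1 = tag_codes("ABCDEF", 36)
    print(code1)
    print(decode({"B", "D", "F"}, code1))
```

Six tags allow 2^6 - 1 = 63 non-empty combinations, so 36 vessels can be encoded without using more than three tags on any bead.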
Attachment of the Code1 tags for the R1s (Figure 4.47). The resin was divided into 36
portions and placed into reaction vessels RV1 to RV36. Samples of the A, B, C, D, E and F solutions
were added to RV1 to RV6. To the rest of the reaction vessels (RV7 to RV36) mixtures were added
according to Table 4.21. The coupling reagents for the acylations were DIC and HOBt.

Figure 4.47. Encoding at the first diversity position

Coupling the linkers to the resin. Two linkers (L1 and L2) were used in the synthesis.
The structures of both are shown in Figure 4.48. L1 is an acid-labile linker from which the product
can be cleaved off as an unsubstituted amide; in other words, R1 in the product is H. The other
linker (L2) makes it possible to attach the primary amines of Figure 4.44 to the resin by reductive
amination and then to cleave the product as a substituted amide (R1 ≠ H).

Figure 4.48. The linkers

First the Boc protecting groups were removed in all 36 reaction vessels with a solution
of 50% TFA (Figure 4.49/A). L1 was coupled only to RV1, using DIC and HOBt. L2 was coupled to
RV2 to RV36, also using DIC and HOBt (Figure 4.49/B).

Figure 4.49. Coupling with the linkers. A: removal of the Boc protecting group, B: coupling with
the linker

Reductive amination with R1 amines (Figure 4.50). The R1 groups were built into the
products by submitting the contents of RV2 to RV36 to reductive amination with amines selected
from those in Figure 4.44. RV1 was left unchanged since the L1 linker itself holds the amino
group in Fmoc protected form. The reductive amination was carried out by adding solutions of
the 35 amines and NaCNBH3 to RVs 2 to 36 and keeping the mixtures at 50 °C for 12 hrs.

Figure 4.50. Introduction of the R1 groups by reductive amination

Building in a scaffold. The next synthetic step was the attachment of a substituted
benzene scaffold by acylating the amine nitrogen with 4-fluoro-3-nitrobenzoic acid (Figure 4.51).

Figure 4.51. Acylation with 4-fluoro-3-nitrobenzoic acid

Since the amino group in the L1 linker was protected by an Fmoc group, the content of RV1
was treated with piperidine to remove the protecting group. Then the couplings were carried out
in all 36 reaction vessels by adding solutions of 4-fluoro-3-nitrobenzoic acid and DIC.
After the acylation the 36 resin samples were combined in solvent and mixed by
mechanical stirring and nitrogen bubbling. After washing, the resin was dried and then divided
into 36 equal portions.
Nucleophilic displacement of fluorine by R2 amines (Figure 4.52). One of the 36 amines
(in 24-fold molar excess), solvent and DIEA were added to each of the reaction vessels; the
mixtures were kept at 50 °C for 12 hrs and then washed.

Figure 4.52. Displacement of fluorine by R2 amines

Attachment of the second set of tags for encoding the R2 amines (Figure 4.53). For
encoding in the second diversity position a different set of encoding N-Alloc-Tag monomers
was used. They were labeled U, V, W, X, Y and Z and were used individually and as mixtures
according to Table 4.21. First the Alloc protecting groups were removed from the R1 encoding tags.
To each reaction vessel a solution of 1 M TBAF and TMSN3 was added, followed by a solution of
Pd(PPh3)4. After rapid mixing, the solution was left to stand at room temperature. The
liberated secondary amines of the Tag1 units were acylated in the presence of DIEA and HATU. After
washing, the 36 resin samples were combined. Before combining, however, usually 5 beads of
each sample were decoded to ensure the fidelity of coding.

Figure 4.53. Second encoding. A: removal of the Alloc protecting group, B: coupling the encoding
tags for the second diversity position

Reduction of the nitro group and benzimidazole formation with the R3 aldehydes. The
combined samples were thoroughly mixed, submitted to reduction with SnCl2 and then divided again
into 36 equal samples. To each of 35 samples a different aldehyde was added in 15-fold molar
excess to facilitate the ring formation. One sample was reacted with trimethyl orthoformate to
make R3 = H in the product. The reaction mixtures were heated at 50 °C for 12 hrs. The samples
were finally washed and dried in vacuo without mixing them.

Figure 4.54. Reduction and ring formation. A: reduction, B: reaction with the aldehydes of
Figure 4.45 and ring formation

Before screening the products were cleaved from individual beads. The products were
used in the screening process and their identity could be determined by decoding the remaining
beads. The beads were treated with 6M HCl in a glass tube. The released secondary amines were
dansylated and identified by HPLC.45,46
4.6.2. Synthesis of a 10,000 member piperazine 2-carboxamide library by Directed Sorting47
Application of the radiofrequency encoding method and automatic sorting described in
Section 4.5.1 is exemplified by the synthesis of a large organic library containing 10,000 discrete
components represented by the following formula.

[General formula: piperazine-2-carboxamide carrying the substituents R1, R2 and R3]

Representative members of the arrays of the building blocks used in the three diversity
positions are found in Figure 4.55.

Figure 4.55. Building blocks used in the synthesis of the piperazine 2-carboxamide library

The scaffold was built in by coupling with the orthogonally protected
piperazine-2-carboxylic acid.

The steps of the synthesis can be followed in Figure 4.56. The procedure was started with
10,000 MicroKans, each filled with resin and containing an RF tag. The linker was already
attached to the resin. After the MicroKans had been split into portions as directed by the
Synthesis Manager, the first combinatorial step (Figure 4.56/A) was the attachment of the primary
amines (R1) by reductive amination in the presence of NaBH(OAc)3. The MicroKans were pooled, then
the previously formed secondary amines were acylated with the protected piperazine-2-carboxylic
acid in the presence of HBTU and DIEA (Figure 4.56/B). After attachment of the scaffold the Fmoc
groups were removed by piperidine (Figure 4.56/C), then the MicroKans were sorted again.
Sorting was followed by the second combinatorial step (Figure 4.56/D), in which the
deprotected nitrogen of the ring was reacted with the R2 building blocks (sulfonyl chlorides,
isocyanates, chloroformates and carboxylic acids) using properly selected reagents and solvents.
The MicroKans were pooled again and the Alloc protecting groups were removed (Figure 4.56/E)
with Pd(PPh3)4.
After sorting, the third combinatorial step (Figure 4.56/F) was executed using the R3
building blocks to functionalize the second amino group of the ring. Reaction conditions were
similar to those of the second combinatorial step. The products were removed from the resin
(Figure 4.56/G) by treating the MicroKans with 50% TFA-DCM.
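
The sorting logic of such a synthesis can be sketched as follows. The sketch is a simplified illustration, not the Synthesis Manager software: the function names are hypothetical, and the split of the 10,000 compounds into 10 x 20 x 50 building blocks is an invented example, since the text does not give the number of building blocks used in each position.

```python
from collections import defaultdict
from itertools import product


def assign_compounds(n_r1, n_r2, n_r3):
    """Assign one (R1, R2, R3) combination to each MicroKan.

    The integer key stands for the code read from the RF tag of the MicroKan.
    """
    return dict(enumerate(product(range(n_r1), range(n_r2), range(n_r3))))


def sort_for_step(assignment, step):
    """Group the MicroKans by the building block they need in a combinatorial step.

    step is zero-based: 0 for R1, 1 for R2, 2 for R3.
    """
    vessels = defaultdict(list)
    for kan_id, blocks in assignment.items():
        vessels[blocks[step]].append(kan_id)
    return dict(vessels)


if __name__ == "__main__":
    # Hypothetical numbers of building blocks: 10 x 20 x 50 = 10,000 compounds.
    assignment = assign_compounds(10, 20, 50)
    for step in range(3):
        vessels = sort_for_step(assignment, step)
        print(step + 1, len(vessels), len(vessels[0]))
```

Because every MicroKan carries its own identifier, each compound is made exactly once, and the identity of the product in a MicroKan is known from its tag at the end of the synthesis.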

Figure 4.56. Scheme of the synthesis of the piperazine 2-carboxamide library

4.6.3. Synthesis of two libraries on one support


A synthesis of a very interesting library has been outlined in a patent application of
Geysen.48 According to the patent application, the library is prepared on a resin that has a
built-in arm with two branches (Figure 4.57, 1 and 2).

Figure 4.57. Synthesis of two libraries on one support

Both branches have an appropriate functional group for attachment of building blocks. The
functional group on branch 1 is free, that on branch 2 is protected. A three-step split-mix
synthesis is executed using 10 building blocks in each step (building blocks A1-10, B1-10 and C1-10
in steps 1, 2 and 3, respectively). Thus 10x10x10=1,000 different trimers are formed on branch 1.
After mixing in the final combinatorial step, the protecting group is removed from branch 2 and a
second three-step split-mix synthesis is executed, using again 10 blocks (X1-10, Y1-10 and Z1-10)
in each step. As a result a 10x10x10=1,000 component library forms on branch 2 of the support.
In the final product one component of the A, B, C library is attached to branch 1 of each
bead. Similarly, branch 2 holds one component of the X, Y, Z library. In both split-mix
processes a single compound forms on each bead. As a consequence, if the beads are present in
large excess, all possible combinations of pairs of components are formed. Since both libraries
have 1,000 components, the total number of different pairs is 1,000x1,000=1,000,000. Of course,
at least 10,000,000 beads are needed to prepare such a library.
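
The need for an excess of beads can be made plausible by a simple estimate. The sketch below assumes that every bead ends up with a pair chosen independently and uniformly at random from all possible pairs; this statistical model is an assumption introduced here and is not taken from the patent application. Under it, the expected fraction of pairs present on at least one bead is approximately 1 - exp(-N/M) for N beads and M pairs.

```python
import math


def pair_count(n_first, n_second):
    """Number of different pairs formed from two libraries."""
    return n_first * n_second


def expected_coverage(n_pairs, n_beads):
    """Expected fraction of pairs found on at least one bead (random model, see above)."""
    return 1.0 - math.exp(-n_beads / n_pairs)


if __name__ == "__main__":
    pairs = pair_count(10 ** 3, 10 ** 3)
    for beads in (10 ** 6, 3 * 10 ** 6, 10 ** 7):
        print(beads, f"{expected_coverage(pairs, beads):.5f}")
```

With as many beads as pairs, roughly a third of the pairs would be missing in this model, whereas a tenfold excess of beads leaves practically no pair unrepresented.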
Such a library can be used for different purposes. If the two molecules are located at
appropriate distances (depending on the length of the branches) the potential interaction of the
two molecules can be studied. If one of the two molecules is a catalyst then the effect of the
catalyst on the other molecule can be tested. If both molecules are catalysts the experiments may
show which combination of the two catalysts is most effective on an added substrate.

References
1. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó, In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP. Utrecht, The
Netherlands, 1988, Vol. 5, p 47.
2. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Proceedings of the 10th International
Symposium of Medicinal Chemistry, Budapest, Hungary, 1988, p 288, Abstract P-168.
3. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Int. J. Peptide Protein Res. 1991, 37, 487.
4. Peptide sequencer is an instrument in which the amino acids are cleaved stepwise from
the peptides starting at the N-terminus and the removed amino acids are identified.
5. R. Süßmuth, A. Trautwein, T. Richter, G. Nicholson, G. Jung In G. Jung (Ed)
Combinatorial Chemistry 1999, Wiley-VCH, Weinheim, 499.
6. S. Brenner and R. A. Lerner Proc. Natl. Acad. Sci. USA 1992, 89, 5381.
7. M. C. Needels, D. G. Jones, E. H. Tate, G. L. Heinkel, L. M. Kochersperger, W. J. Dower,
R. W. Barrett, M. A. Gallop Proc. Natl. Acad. Sci. USA 1993, 90, 10700.
8. J. Nielsen, S. Brenner, K. D. Janda J. Am. Chem. Soc. 1993, 115, 9812.
9. V. Nikolaiev, A. Stierandova, V. Krchnak, B. Seligman, K. S. Lam, S. E. Salmon, M.
Lebl Pept. Res. 1993, 6, 161.
10. J. M. Kerr, S. C. Banville, R. N. Zuckermann J. Am. Chem. Soc. 1993, 115, 2529.
11. M. H. J. Ohlmeyer, R. N. Swanson, L. W. Dillard, J. C. Reader, G. Asouline, R.
Kobayashi, M. Wigler, W. C. Still Proc. Natl. Acad. Sci. USA 1993, 90, 10922.
12. Á. Furka, F. Sebestyén, J. Gulyás In Proc. 2nd Int. Conf. Biochem. Separations,
Keszthely, Hungary, 1988, 35.
13. Á. Furka Drug Development Research 1994, 33, 90.
14. S. P. A. Fodor, J. L. Read, M. C. Pirrung, L. Stryer, A. Tsai Lu, D. Solas Science, 1991,
251, 767.
15. F. Sebestyén, G. Dibó, A. Kovács, Á. Furka Bioorg. & Med. Chem. Letters 1993, 3, 413.
16. H. M. Geysen, S. J. Rodda, T. J. Mason Mol. Immunol. 1986, 23, 709.
17. W. J. Rutter, D. V. Santi U.S. Pat. 5,010,175 (1991).
18. J. M. Ostresh, J. M. Winkle, V. T. Hamashin and R. A. Houghten Biopolymers 1994, 34,
1681.
19. T. Carell, E. A. Winter and J. Rebek, Jr., Angew. Chem. Int. Ed. Eng. 1994, 33, 2061.
20. M. M. Shemyakin, Yu. A. Ovchinnikov, A. A. Kiryushkin, I. V. Kozhevnikova
Tetrahedron Letters 1965, 2323.
21. F. Cramer, R. Helbig, H. Hettler, K. H. Scheit, H. Seliger Angew. Chem. Int. Ed. 1966, 5,
601.
22. H. Hayatsu, H. G. Khorana J. Am. Chem. Soc. 1966, 88, 3182-3183.
23. H. Han, M. M. Wolfe, S. Brenner, K. D. Janda Proc. Natl. Acad. Sci. USA 1995, 92, 6419.
24. D. J. Gravert, K. D. Janda Trends Biotechnol 1996, 14, 110.
25. S. P. A. Fodor, J. L. Read, M. C. Pirrung, L. Stryer, A. T. Lu and D. Solas Science 1991,
251, 767.
26. S. P. A. Fodor Laboratory Automation News 1997, 2, 50.
27. http://www.affymetrix.com
28. J. K. Scott and G. P. Smith Science 1990, 249, 404.
29. S. Cwirla, E. A. Peters, R. W. Barrett and W. J. Dower Proc. Natl. Acad. Sci. USA 1990,
87, 6378.
30. J. J. Devlin, L. C. Panganiban and P. E. Devlin Science 1990, 249, 404.
31. G. P. Smith and V. A. Petrenko Chem. Rev. 1997, 97, 391.
32. E. J. Moran, S. Sarshar, J. F. Cargill, M. Shahbaz, A Lio, A. M. M. Mjalli, R. W.
Armstrong J. Am. Chem. Soc. 1995, 117, 10787.
33. K. C. Nicolaou, X.-Y. Xiao, Z. Parandoosh, A. Senyei, M. P. Nova Angew. Chem. Int.
Ed. Engl. 1995, 36, 2289.
34. http://www.irori.com
35. Xiao-Yi Xiao, K. C. Nicolaou In H. Fenniri (Ed) Combinatorial Chemistry 2000, Oxford
University Press, Oxford, 75.
36. X.-Y. Xiao, C. Zhao, H. Potash, M. P. Nova Angew. Chem. Int. Ed. Engl., 1997, 36, 780.
37. J. Smith, J. Gard, W. Cummings, A. Kaniszai, V. Krchňák J. Comb. Chem. 1999, 2, 368.
38. Á. Furka, J. W. Christensen, E. Healy, H. R. Tanner, H. Saneii J. Comb. Chem. 2000, 2,
220.
39. V. Krchňák, V. Paděra In G. A. Morales and B. A. Bunin (Eds) Methods in Enzymology,
Combinatorial Chemistry Elsevier Academic Press, 2003, 369, 112.
40. Á. Furka, Comb. Chem. & High Throughput Screening 2000, 3, 197.
41. Á. Furka, J. W. Christensen, E. Healy In G. A. Morales and B. A. Bunin (Eds) Methods in
Enzymology, Combinatorial Chemistry Elsevier Academic Press, 2003, 369, 99.
42. Á. Furka US Patent 7/16/2002.
43. Á. Furka, G. Dibó, N. Gombosuren Drug Discovery Technologies 2005, 2, 23.
44. D. Tumelty, L-C. Dong, K. Cao, L. Lee, M. C. Needels In I. Sucholeiki (Ed) High
Throughput Synthesis, Principles and Practices, Marcel Dekker Inc. 2000, 93.
45. D. Maclean, J. R. Schullek, M. M. Murphy, Zhi-Jie Ni, E. M. Gordon, M. A. Gallop Proc.
Natl. Acad. Sci. 1997, 94, 2805.
46. Z. J. Ni, D. Maclean, C. P. Holmes, B. Ruhland, M. M. Murphy, J. W. Jacobs, E. M.
Gordon, M. A. Gallop J. Med. Chem. 1996, 39, 1601-1608.
47. F. Herpin, G. C. Morton In G. A. Morales and B. A. Bunin (Eds) Methods in Enzymology,
Combinatorial Chemistry Elsevier Academic Press, 2003, 369, 75.
48. H. M. Geysen WO 01/40148 A2.
49. B. Cohen, S. Skiena J. Comb. Chem., 2000, 2, 10.


5. Screening methods
Arrays of individual compounds, combinatorial compound libraries and arrays of new
materials are prepared in order to find among their components new pharmaceuticals,
insecticides, fungicides, plastics, semiconductors etc. The useful compounds or materials can be
found by examining the libraries for components having pre-determined properties. This process
is called screening. Screening requires assays that unequivocally show the presence or absence
of components having the desired property. The development of assay methods is itself an area
of intensive research, but dealing with this subject is not within the scope of this book. The
results of the assays often appear as changes in color, fluorescence, radioactivity, conductivity
etc.
Life on earth is largely dependent on pairs of molecules that fit together perfectly, like
enzymes and substrates, antigens and antibodies, hormones and receptors (Figure 5.1). Detecting
the binding of a component of a synthesized library (red in the figure) to a large target molecule
(green) is an often applied screening procedure. The binding can be detected by a change in color,
the appearance of fluorescence or radioactivity etc.

Figure 5.1. Fitting and non-fitting

There is an important difference between screening arrays of compounds produced by
parallel synthetic methods and screening combinatorial libraries prepared by the split-mix
procedure. Components of the compound arrays are known compounds that are tested individually in
parallel format, each component occupying a different test tube (Figure 5.2).
In combinatorial libraries individual compounds also form, but their identity is not known.
So in the screening process the identity of the active compound also has to be determined. There
is another difference: the whole library can be tested in a single experiment. It is the target
molecule itself that selects from the mixture the active component that fits it (marked by
an arrow in Figure 5.3).

Figure 5.2. Parallel screening of components of compound arrays

Figure 5.3. Screening a combinatorial library

5.1. High throughput screening of arrays of individual compounds


The introduction of the parallel approach by Takátsy [1] in his microbiological experiments
was a very important development in the history of analytical methods. The use of his
microtiter plates made it possible to carry out analytical assays in parallel format and to exploit
its advantages in improving efficiency. The later huge increases in the speed of the experiments
and the appearance of the high throughput screening (HTS) methods are also based on the
application of his microtiter plates. Improvements in the sensitivity of the assay methods made
it possible to further increase productivity by replacing the 96-well plates with 384-well and even
1,536-well ones and by applying automation. The appearance of automated high performance
work stations made possible the screening of well over a hundred thousand compounds per day.
The SAGIAN Core Systems (Figure 5.4.) is one example of a standardized integrated system. Both
liquid handling and reading of the results of assays are fully automated. Figure 5.5 shows a plate
reader.

Figure 5.4. A high performance automatic work station. The SAGIAN Core Systems
(Photo: www.beckman-coulter.com)

The above core systems can be integrated with devices produced by other companies.
Figure 5.5. shows the Analyst GT of Molecular Devices Corp., a microplate reader optimized
for HTS, integrated into Sagian Core Systems.

Figure 5.5. Analyst GT integrated into Sagian Core Systems
(Photo: www.moleculardevices.com)


5.2. Screening of combinatorial libraries. Deconvolution methods


The methods applied in screening of combinatorial libraries substantially differ from
those applied when dealing with libraries prepared by parallel synthesis. Parallel synthesis
produces arrays of individual compounds, so their components, as described in the previous
paragraph, can be examined either individually or as individual components of arrays submitted
to high throughput screening processes. Combinatorial libraries, on the other hand, are mixtures
of a large number of compounds. When the concept of combinatorial synthesis was born, finding
a single useful component in a mixture of millions of compounds seemed an unrealizable task. As
it later turned out, the problem could be solved by different experimental strategies that apply a
logically devised series of operations in order to identify the wanted components of the mixtures.
The first such strategy was described by the author in 1982 (see paragraph 1.2.). The strategies by
which the useful components of multi-component mixtures can be identified are called
deconvolution methods.
The combinatorial libraries are prepared in two different forms. In one form the
synthesized libraries are cleaved from the support. In this case the components form real mixtures
that are examined in solution. The combinatorial libraries can also be prepared in tethered form.
The libraries are not cleaved from the support. The components remain on the beads of the resin
and can be examined as individual compounds. The deconvolution methods applicable for
tethered libraries substantially differ from those developed for dissolved libraries.

5.2.1. Deconvolution methods for dissolved libraries


The libraries cleaved from the support and then dissolved may contain thousands or even
millions of compounds. The deconvolution methods developed for dissolved libraries make it
possible to unequivocally identify the library component that is responsible for a biological
property. Of course, these deconvolution methods work only if an appropriate biological or
biochemical assay method is at hand for detection of the biologically active component in the
mixture.

5.2.1.1. The iteration method


The iteration method is a strategy that was developed for screening combinatorial libraries
cleaved from the support. These libraries are real mixtures and are screened in solution. The
strategy was first conceived by the author in 1982 [2, 3] and the same concept was published by
Geysen et al. in 1984 [4], by Houghten et al. in 1991 [5], and by Janda in 1994 [6].
The task of identifying a compound having a certain property in a mixture of millions of
other compounds can be compared to the one the police face when they have to identify a criminal
among millions of individuals who, at least in principle, can be considered potential suspects. Using
the data supplied by an eyewitness, for example, the police can gradually reduce the list of
suspects to the one who committed the crime. On the basis of a single piece of information, for
example, that the criminal is a man, the list of suspects can be reduced by 50%.
This kind of approach is demonstrated by a simple example. Let's suppose that three
features of the faces are used for identification: hair, mustache and beard (Figure 5.6.).

Figure 5.6. Three features of faces: hair (1), mustache (2) and beard (3)

Each of the features has three variants: hairless, medium hair and long hair; no mustache,
small mustache and large mustache; no beard, small beard and large beard. From all the variants
of the features applied to the head of Figure 5.6, the 27 different faces of Figure 5.7. can be
derived. This number is the same as the number of components in a tripeptide library synthesized
by using the same three amino acids in all three coupling positions.

Figure 5.7. The 27 different faces, arranged in Group 1, Group 2 and Group 3


Let's see now how a suspect can be identified. Suppose the witness says that the criminal
has medium hair. Based on this, all suspects of groups 1 and 3 can be omitted from
the list. The reason: everybody in group 1 is hairless and nobody in group 3 has medium hair.
The perpetrator has to be found among those in group 2. If the witness saw no mustache, the
faces of columns 2 and 3 of group 2 can also be excluded, leaving the three faces
of column 1 of group 2. If the witness remembers a man with a large beard, the criminal can
unequivocally be identified as the suspect occupying row 3 in column 1 of group 2.
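
The stepwise narrowing of the list of suspects can be sketched in a few lines of Python. This is only an illustration: the feature names are taken from the text, while the tuple representation of a face and the function name `narrow` are assumptions made for the example.

```python
from itertools import product

HAIR = ["hairless", "medium hair", "long hair"]
MUSTACHE = ["no mustache", "small mustache", "large mustache"]
BEARD = ["no beard", "small beard", "large beard"]

# All 3 x 3 x 3 = 27 faces, each represented as a (hair, mustache, beard) tuple.
faces = list(product(HAIR, MUSTACHE, BEARD))

def narrow(suspects, feature_index, value):
    """Keep only the suspects whose given feature matches the testimony."""
    return [face for face in suspects if face[feature_index] == value]

# The three statements of the witness, one per feature.
testimony = [(0, "medium hair"), (1, "no mustache"), (2, "large beard")]

suspects = faces
counts = [len(suspects)]
for feature_index, value in testimony:
    suspects = narrow(suspects, feature_index, value)
    counts.append(len(suspects))

print(counts)    # number of suspects left after each statement
print(suspects)  # the single remaining face
```

Each statement removes two thirds of the remaining faces, so three statements are enough to single out one face among 27.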
The iteration strategy follows essentially the same approach that was described above and
is also demonstrated with a simple example: finding a bioactive component in a tripeptide library
of 27 peptides synthesized from the same three amino acids in all of the three coupling positions.
In the optimized procedure the synthesis is modified in the following way. In each coupling
step, before mixing, a sample is removed and preserved for later use (Figure 5.8). After removal of
the samples the products are mixed as normal. As an alternative approach, instead of removing
samples after the last coupling operation, one may choose to leave the final three
products of the last coupling step unmixed.
It is a good practice to determine before beginning the iteration whether or not the library
contains the desired bioactive component. For this reason the full library is cleaved from the final
mixed product of the synthesis and the solution of the peptides is then tested for the presence of
the desired bioactive component. The iteration experiments, of course, are executed only if the
test is positive.
Iteration step No. 1. Determination of the amino acid occupying coupling position 3 in the
bioactive peptide. The iteration procedure begins with determination of the amino terminal
residue of the bioactive peptide. For this reason the samples removed in the final (third) coupling
step are separately submitted to cleavage. The components of the resulting three tripeptide
mixtures are demonstrated in the three columns of Figure 5.9.

Figure 5.8. Unmixed products in the split-mix synthesis of a tripeptide library
and the samples removed from them (Steps 1 to 3).
Large circles: resin, small circles: amino acids. The color of the circles in the boxes of removed
samples illustrates the terminal amino acid of the peptides in the samples.

The peptides occupying the same row in the columns differ only in the amino acid
occupying coupling position 3 (the N-terminal position). All the peptides of a column have the
same N-terminal amino acid and this makes it possible to identify the terminal residue of the active
peptide. If the sample marked by + shows activity in screening, it means that the bioactive
peptide has to be among the peptides of the marked column, otherwise there would be no activity.

Figure 5.9. Testing the three sub-libraries cleaved from the three samples removed after the third
coupling. The + sign shows the bioactive sub-library

Since all peptides of this column have the red amino acid at the amino terminal, the N-terminal
amino acid of the bioactive peptide is also the red one. The other two amino acids are, of
course, not yet identified. This is analogous to choosing the medium hair group in Figure 5.7.
Iteration step No.2. Determination of the amino acid occupying coupling position 2 in the
bioactive peptide. The amino acid residue occupying the coupling position 2 in the bioactive
peptide can be determined in three steps outlined in Figure 5.10.

Figure 5.10. Determination of the amino acid occupying coupling position 2 in the bioactive
peptide (Step 1: coupling to the samples removed after the second coupling, Step 2: cleavage,
Step 3: testing).

The samples removed after the second coupling operation contain dipeptides still attached
to the support. All peptides within a sample contain the same (yellow, blue or red) amino
acid at their terminal position. In the first step the red amino acid, which is known to occupy
coupling position 3 in the bioactive peptide, is separately coupled to the three samples.
In the second step the peptides are separately cleaved from the support. The figure shows
that the three groups of peptides differ from each other only in the amino acid (yellow,
blue or red) occupying coupling position 2.
In the third step the three peptide mixtures are separately tested. If the test shows activity
in the mixture marked by the plus sign, for example, the bioactive peptide has to be in this
mixture (this is similar to selecting column 1 of group 2 in Figure 5.7.). Since all components of
the mixture have the blue amino acid in coupling position 2, the bioactive peptide also has the
blue amino acid in this coupling position. As a result of iteration step No. 2 the second amino
acid of the bioactive tripeptide has been identified, leaving only one amino acid to be determined.
Iteration step No. 3. Determination of the amino acid occupying coupling position 1 in the
bioactive peptide. The so far unknown third amino acid of the bioactive peptide can be
determined in four steps demonstrated in Figure 5.11. The three samples removed after the first
coupling operation are submitted to a two-step elongation. First the blue amino acid, which is
known to occupy coupling position 2 in the bioactive peptide, is attached. This is followed by
attachment of the red amino acid identified as the amino terminal amino acid in the sequence of
the active tripeptide. In the third step the peptides are separately cleaved from the support. As the
figure shows, each of the three product samples contains a single tripeptide. These samples are
finally tested. If the sample marked by the + sign proves to be the active one, then the so far
unknown amino acid is the yellow one. The yellow amino acid occupies the C-terminal
position of the tripeptide. The amino acid sequence of the bioactive peptide is: Red-Blue-Yellow.
This step is analogous to identifying the suspect's face as that of column 1, row 3 of
group 2 in Figure 5.7.

Figure 5.11. Determination of the amino acid occupying coupling position 1 in the bioactive
peptide (Steps 1 and 2: coupling, Step 3: cleavage, Step 4: testing).

In a real case the number of iteration steps depends on the length of the peptides in the
synthesized library. The number of the samples that need to be tested in an iteration step is the
same as the number of amino acids used as building blocks in the corresponding coupling step.
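
The three iteration steps can be imitated by a short simulation. The sketch below is not a description of laboratory practice: the assay is replaced by a hypothetical function `is_active` that simply checks whether a hidden peptide is present in a mixture, peptides are stored in coupling order (position 1 first), and all names are chosen only for this illustration.

```python
from itertools import product

AMINO_ACIDS = ["yellow", "blue", "red"]
LENGTH = 3

# Hypothetical bioactive peptide in coupling order (position 1 first).
HIDDEN = ("yellow", "blue", "red")

def is_active(mixture):
    """Stand-in for the assay: positive if the hidden peptide is present."""
    return HIDDEN in mixture

def sub_library(position, residue, known_tail):
    """Mixture cleaved from the sample removed after coupling `position`.

    The sample carries `residue` in that position, all combinations in the
    earlier positions, and is elongated with the residues already identified
    for the later positions (`known_tail`).
    """
    heads = product(AMINO_ACIDS, repeat=position - 1)
    return {head + (residue,) + known_tail for head in heads}

def iterate():
    tests = 1                      # the preliminary test of the full library
    known_tail = ()                # residues identified so far
    for position in range(LENGTH, 0, -1):
        active = [r for r in AMINO_ACIDS
                  if is_active(sub_library(position, r, known_tail))]
        tests += len(AMINO_ACIDS)  # every sample of the step is tested
        known_tail = (active[0],) + known_tail
    return known_tail, tests

found, tests = iterate()
# Peptide sequences are written in the opposite order of the couplings.
sequence = "-".join(reversed(found))
print(sequence, tests)
```

The mixtures tested in the three steps contain 9, 3 and 1 peptides, respectively, which corresponds to Figures 5.9 to 5.11. The sketch assumes that exactly one component of the library is active.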
A synthesized library may contain a number of bioactive or other useful components that
can be identified. It may prove useful to take this into account when deciding the quantity of the
library to be prepared. The number and kinds of tests as well as their sensitivity have to play a
definitive role in planning. The quantity of the samples removed in the synthetic process as well
as the quantity of the final mixed full library is also a question that needs to be decided. It is
advisable to remove the samples in each synthetic step in the same molar quantity and leave after the last sample removal the same molar quantity for mixing. Since in the course of the
synthesis the number of components increases in each step their molarity is accordingly reduced.

Table 5.1. The quantity of samples removed in the synthesis of a tetrapeptide library for iteration
experiments

Coupling number     Quantity in %
1                   0.006
2                   0.12
3                   2.44
4                   48.7
Left for mixing     48.7

If 20 amino acids are used in each step, the molarity of the components decreases by a
factor of 20 in each step. As a consequence, in order to keep the molarity constant in the removed
samples, their proportion has to be increased about 20 times in each step (not exactly 20 times
since the total quantity decreases after each sample removal). This is exemplified by a tetrapeptide
library synthesized using 20 amino acids in each step. The quantities of the removed samples are
shown in Table 5.1.
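
The figures of Table 5.1 can be reproduced with a short calculation. The sketch assumes that the percentages refer to the total starting quantity, that each removed sample is 20 times larger than the previous one, and that the portion left for mixing equals the last removed sample; the function name is hypothetical.

```python
def sample_percentages(n_building_blocks, n_couplings):
    """Percent of the starting quantity removed after each coupling,
    followed by the percent left for mixing after the last removal."""
    # Each later sample holds n times more components, so it has to be
    # n times larger to keep the molar quantity per component constant.
    weights = [n_building_blocks ** k for k in range(n_couplings)]
    weights.append(weights[-1])  # left for mixing: same as the last sample
    total = sum(weights)
    return [100 * w / total for w in weights]

percentages = sample_percentages(20, 4)
labels = ["1", "2", "3", "4", "Left for mixing"]
for label, value in zip(labels, percentages):
    print(f"{label:<16} {value:.3g}")
```

Under these assumptions the smallest sample is 100/16421 percent of the starting quantity, and the values agree with Table 5.1 to the precision given there.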
As already mentioned, when dealing with combinatorial libraries made by the split-mix
method, less labor is needed not only in the synthesis but also in screening. This is also the case
when the deconvolution is done by the iteration method. The efficiency can be shown by a simple
example. A tetrapeptide library made from 20 amino acids has 160,000 components. If this
library is prepared by the parallel method the screening process needs 160,000 experiments since
all components have to be tested separately. The iteration procedure for the same library needs 20
experiments after each coupling operation plus an experiment with the full library, altogether
only 81 experiments.
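
The comparison can be written as two simple formulas, n^k tests for the parallel approach and n*k + 1 tests for iteration, where n is the number of amino acids and k the length of the peptides. A minimal sketch with hypothetical function names:

```python
def parallel_tests(n_amino_acids, length):
    """Every component of the array is tested separately."""
    return n_amino_acids ** length

def iteration_tests(n_amino_acids, length):
    """n tests per coupling position plus one test of the full library."""
    return n_amino_acids * length + 1

print(parallel_tests(20, 4))   # 160000
print(iteration_tests(20, 4))  # 81
```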

5.2.1.2. Positional scanning


When using the iteration strategy, besides the biologist who makes the screening tests, a
chemist is also needed for doing the chemical elongation steps on the removed samples according
to the intermediate results of the screening experiments. There was a need to develop, if possible,
a different screening strategy that could be used by biologists without the help of chemists. This
could be achieved by pre-preparing sets of libraries that, without any further modification,
make identification of the bioactive component possible.
The screening strategy fulfilling the requirement described above was independently
developed in two laboratories. The principles and realization of the strategy, later named
positional scanning, were first described in a patent application filed by Furka et al. in May 1992 [7],
then published and practiced by Pinilla et al. [8]
The sub-libraries described in Chapter 4 and demonstrated in Figures 4.11. and 4.12. can
be considered as candidates for being components of pre-prepared sets of libraries in positional
scanning.

Figure 5.12. Sub-library (B) and two components (A and C) of a full trimer library (Figure 4.12.)

The reason is demonstrated in Figure 5.12. The sub-library B is one of the 9 sub-libraries
of the full library of Figure 4.12. A sub-library is a special partial library of a full library. It is
prepared by using a single amino acid in one coupling position in the synthesis. In all other
coupling positions all those amino acids are varied that are used in the synthesis of the full
library. As a consequence, a sub-library contains all those components that have the same
(non-varied) amino acid in the non-varied position. The non-varied position in the sub-library
of Figure 5.12. is coupling position 3, and the non-varied amino acid is the red one. In the
sub-library B all those peptides are present which contain the red amino acid in coupling
position 3 (for example trimer A) and no peptide is present that has another amino acid in this
position (compare to Figure 5.13.). Trimer C, for example, is not present in the sub-library
because it contains the yellow amino acid in coupling position 3.
If a bioactive peptide of the full library happens to contain the non-varied amino acid in the
non-varied coupling position (like the red amino acid in coupling position 3 of the trimer A) then
the bioactive peptide has to be found among the components of the sub-library. Consequently, the
sub-library gives a positive response in the screening test. On the other hand, if a different amino
acid occupies the non-varied position a negative result is expected.
In order to be able to do positional scanning, all sub-libraries of a full library have to be
prepared and tested. Figure 5.13. shows the full set of sub-libraries of the full library A and their
compositions. The number of sub-libraries in a set is the same as the total number of couplings
executed in the synthesis of the full library. In the synthesis of library A, three couplings are
executed in each of the three coupling positions, so the number of couplings as well as that of the
sub-libraries is nine.

Figure 5.13. Sub-libraries (B1 to B3, C1 to C3 and D1 to D3) of the full trimer library A.
The non-varied amino acids in columns B, C and D are the yellow, blue and red amino
acids, respectively. The indices show the non-varied coupling positions.

If a full pentapeptide library is made using 20 amino acids in all coupling positions, the
set of sub-libraries contains 100 sub-libraries. A practical way of denoting the sub-libraries
is to use a figure for indication of the non-varied coupling position followed by a one
letter symbol for the non-varied amino acid [9], as demonstrated in Figure 5.14.

1A 1C 1D 1E 1F 1G 1H 1I 1K 1L 1M 1N 1P 1Q 1R 1S 1T 1V 1W 1Y
2A 2C 2D 2E 2F 2G 2H 2I 2K 2L 2M 2N 2P 2Q 2R 2S 2T 2V 2W 2Y
3A 3C 3D 3E 3F 3G 3H 3I 3K 3L 3M 3N 3P 3Q 3R 3S 3T 3V 3W 3Y
4A 4C 4D 4E 4F 4G 4H 4I 4K 4L 4M 4N 4P 4Q 4R 4S 4T 4V 4W 4Y
5A 5C 5D 5E 5F 5G 5H 5I 5K 5L 5M 5N 5P 5Q 5R 5S 5T 5V 5W 5Y

Figure 5.14. Components of a positional scanning kit in the case of a pentapeptide library.

The first step in positional scanning is testing the full library. If the result is positive then
all components of the kit are tested. From the result, the amino acid sequence of the bioactive
peptide can be deduced. If the assays show that the sub-libraries marked by boxes in Figure 5.14.
are bioactive then the coupling positions 1, 2, 3, 4 and 5 are occupied in the bioactive peptide by
E, L, V, R and T, respectively. Taking into account that the order of amino acids in peptide
sequences is opposite of their coupling order, the sequence of the bioactive peptide is:
T-R-V-L-E
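
The way the sequence is read from the results of the 100 tests can be sketched as follows. The assay is again replaced by a hypothetical function: a sub-library such as 3V is taken to be active exactly when the hidden peptide carries V in coupling position 3, so the 3.2 million peptides do not have to be enumerated. The hidden peptide and all names are assumptions of the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 one-letter symbols
LENGTH = 5

# Hypothetical bioactive peptide in coupling order (position 1 first).
HIDDEN = "ELVRT"

def sub_library_is_active(position, residue):
    """Stand-in for testing sub-library `position``residue` (e.g. 3V).

    The sub-library contains every peptide with `residue` in the given
    coupling position, so it is active only if the hidden peptide does too.
    """
    return HIDDEN[position - 1] == residue

def positional_scan():
    """Test all 100 sub-libraries and collect the labels of the active ones."""
    hits = {}
    for position in range(1, LENGTH + 1):
        for residue in AMINO_ACIDS:
            if sub_library_is_active(position, residue):
                hits.setdefault(position, []).append(f"{position}{residue}")
    return hits

hits = positional_scan()
coupling_order = "".join(labels[0][1:] for _, labels in sorted(hits.items()))
# The peptide sequence is the reverse of the coupling order.
sequence = "-".join(reversed(coupling_order))
print(hits)
print(sequence)
```

With a single bioactive peptide exactly one sub-library is active for each coupling position, which is the situation described in the text.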
As will be shown below, preparation of all sub-libraries, for example the 100 sub-libraries
of Figure 5.14., needs too much work and too long a time if they are prepared for screening with a
single target. A company, however, can synthesize them in larger quantities and divide them into
smaller equimolar portions. The collections formed from these smaller quantities could be sold as
kits ready for use by biologists.
The synthesis of a single sub-library of Figure 5.14. needs 81 couplings: one coupling in
the non-varied coupling position plus 20 couplings in each of the remaining 4 positions. Since
there are 100 different sub-libraries in the figure, the total number of the required couplings is
8100!
The synthesis of the 100 sub-libraries, however, can be optimized so that the
preparation needs fewer amino acid couplings. The optimization can be achieved by
doing as many couplings as possible with the combined form of the sub-libraries under
preparation. This is briefly described below.
Step 1. Preparation of the 1A to 1Y sub-libraries. The resin is divided into 20 equal
portions then a different amino acid is coupled to each portion. No mixing. One fifth part of each
portion is removed then four full split-mix cycles (split-couple-combine) are executed on each of
the 20 removed portions to get the 20 1X type sub-libraries. The total number of couplings is:
20 + 4x20x20 = 1620
Step 2. Preparation of the 2A to 2Y sub-libraries. The 20 portions remaining in Step 1 are
mixed, divided into 20 equal samples then each of them is coupled with a different amino acid.
No mixing. One fourth part of each sample is removed. The removed samples are separately
submitted to three full split-mix cycles. The result is the 20 2X type sub-libraries. The total
number of couplings is:
20 + 3x20x20 = 1220
Step 3. Preparation of the 3A to 3Y sub-libraries. The 20 samples remaining in Step 2 are
mixed, divided into 20 equal portions then a different amino acid is coupled to each one. No
mixing. After coupling one third part of each sample is removed then separately submitted to two
full split-mix cycles. The product is the 20 3X type sub-libraries. The number of the executed
amino acid couplings is:
20 + 2x20x20 = 820
Step 4. Preparation of the 4A to 4Y sub-libraries. The 20 samples remaining in Step 3 are
mixed, divided into 20 equal parts then a different amino acid is coupled to each part. No mixing.
Half of each sample is removed then one full split-mix cycle is executed on every removed sample.
As a result, the 20 4X type sub-libraries are formed. The total number of the executed amino acid
couplings is:
20 + 20x20 = 420
Step 5. Preparation of the 5A to 5Y sub-libraries. The 20 samples remaining in Step 4 are not
mixed. Each of them is coupled with a different amino acid. The products are the 20 5X type
sub-libraries. The total number of couplings is:
20
The total number of couplings executed in the whole process leading to the 100 sub-libraries
of the positional scanning kit of pentapeptides is 4100. Compare this to the 8100 couplings
needed in the non-optimized process. The number of couplings could be reduced by almost 50%.
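
The coupling counts quoted above can be checked with a short calculation. The sketch only restates the arithmetic of the text; the variable names are chosen for the illustration.

```python
N = 20       # amino acids used in each coupling position
LENGTH = 5   # pentapeptide library

# Non-optimized: one coupling in the non-varied position plus N couplings
# in each of the remaining LENGTH - 1 positions, for every sub-library.
per_sub_library = 1 + N * (LENGTH - 1)
naive_total = per_sub_library * N * LENGTH

# Optimized: in step k, N couplings on the common resin, followed by
# LENGTH - k split-mix cycles (N couplings each) on each of the N samples.
optimized_steps = [N + (LENGTH - k) * N * N for k in range(1, LENGTH + 1)]
optimized_total = sum(optimized_steps)

saving = 1 - optimized_total / naive_total
print(per_sub_library, naive_total)
print(optimized_steps, optimized_total)
print(f"{saving:.1%}")
```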
The fact that the synthesis of a positional scanning kit is still too laborious prompted us to
think about other possibilities to circumvent the need for preparation of full positional scanning
kits. The result was the development of the omission libraries and the amino acid tester libraries.

5.2.1.3. Omission libraries.


The potential applicability of omission libraries in screening was first independently
realized in two laboratories [10, 11]. Peptide omission libraries can be prepared using the split-mix
method like in the synthesis of full libraries, with an important difference: one amino acid, always
the same one, is omitted in all coupling positions. This is illustrated in Figure 5.15.

A C D E F G I K L M N P Q R S T V W Y
A C D E F G I K L M N P Q R S T V W Y
A C D E F G I K L M N P Q R S T V W Y
A C D E F G I K L M N P Q R S T V W Y
A C D E F G I K L M N P Q R S T V W Y

Figure 5.15. Amino acids used in the synthesis of an omission pentapeptide library. Histidine (H)
is omitted in all the five coupling positions.

As a consequence of the omission of histidine in the synthesis, no histidine containing
peptide forms. All histidine containing peptides that otherwise would be present in the full library
are missing. In the figure no H containing lines can be drawn. A short symbol can be used to
denote the omission libraries: a minus sign followed by the one letter symbol of the omitted
amino acid. The symbol of a histidine omission library, for example, is -H. An optional figure
can be appended to the symbol indicating the number of the amino acid building blocks in the
peptides (length). Accordingly, the symbol of a histidine omission pentapeptide library is: -H5.

The number of peptides in omission libraries synthesized from 20 amino acids, as well as
the number of peptides missing from them is summarized in Table 5.2.
Table 5.2. Number of peptides in full and omission libraries and the number of peptides missing
from the omission libraries

Length   Full library   Omission library   Missing peptides
2        400            361                39
3        8,000          6,859              1,141
4        160,000        130,321            29,679
5        3,200,000      2,476,099          723,901
6        64,000,000     47,045,881         16,954,119
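
The entries of Table 5.2 follow from simple powers: a full library of length k made from 20 amino acids has 20^k components, the omission library 19^k, and the difference gives the number of missing peptides. A minimal sketch (the function name is hypothetical):

```python
def library_sizes(n_amino_acids, length):
    """Return (full library, omission library, missing peptides)."""
    full = n_amino_acids ** length
    omission = (n_amino_acids - 1) ** length
    return full, omission, full - omission

rows = [(length, *library_sizes(20, length)) for length in range(2, 7)]
for length, full, omission, missing in rows:
    print(f"{length}  {full:>12,}  {omission:>12,}  {missing:>12,}")
```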

Figure 5.16. Composition of omission libraries. Shown are the full library, the omission
libraries for the yellow, blue and red amino acids, and the amino acids in the bioactive peptide.

Using a very simple example, Figure 5.16. shows how the composition of omission libraries
can be derived from that of the full one. The fact that all the peptides containing the omitted amino
acid are missing from the library makes the omission libraries applicable in the identification of
bioactive peptides. If the bioactive peptide contains the omitted amino acid, for example, the
omission library gives a negative result in testing, since the bioactive peptide is not present in the
omission library. On the other hand, if the bioactive peptide does not contain the omitted amino
acid the omission library gives a positive test since all peptides except those containing the omitted
amino acid are present. Based on these properties the omission libraries can be used for
determination of the amino acid composition of the bioactive peptide.
This is demonstrated in Figure 5.16. If two omission libraries (blue and red) give
a negative result in the screening test this means that the bioactive peptide contains the blue and
red amino acids. The yellow amino acid is not present because the test with the yellow omission
library is positive.
By use of omission libraries the amino acid composition of the bioactive peptide can be
determined. Nothing is known, however, about the coupling position of these amino acids. This
has to be determined by additional experiments. This task, however, is much less complicated
than the original one. This can be illustrated by a simple example.
Suppose we deal with a full pentapeptide library and the result of using the 20 omission
libraries is that the bioactive peptide contains the following four amino acids: A, G, R and H.
Then a much less complex library can be defined that is built up using these 4 amino acids in all
coupling positions.
1  A G R H
2  A G R H
3  A G R H
4  A G R H
5  A G R H

If this simpler library, which contains only 1,024 components (4 amino acids in each of the
5 coupling positions) instead of the 3.2 million in the original pentapeptide library, is prepared,
the bioactive peptide must be present in it. This means that the original task is reduced to a much
simpler one. This simpler library can be named occurrence library. The amino acid sequence can
be determined by application of the iteration or the positional scanning method to the occurrence
library. A practical example will be shown in a separate paragraph.
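
The two stages, determination of the composition with the 20 omission libraries and construction of the occurrence library, can be sketched as follows. The hidden peptide is a hypothetical example containing A, G, R and H, and the assay is replaced by a simple membership check; all names are assumptions of the illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
LENGTH = 5

# Hypothetical bioactive pentapeptide built from A, G, R and H.
HIDDEN = "GARHA"

def omission_library_is_active(omitted):
    """Stand-in for testing the omission library of `omitted`.

    The library holds every peptide that lacks the omitted amino acid,
    so it is active only if the hidden peptide lacks it as well.
    """
    return omitted not in HIDDEN

# Amino acids whose omission library tests negative are in the peptide.
composition = [aa for aa in AMINO_ACIDS if not omission_library_is_active(aa)]

# Occurrence library: the identified amino acids in every coupling position.
occurrence_library = ["".join(p) for p in product(composition, repeat=LENGTH)]

print(composition)
print(len(occurrence_library))
print(HIDDEN in occurrence_library)
```

With four identified amino acids and five coupling positions the occurrence library has 4^5 = 1,024 members. The omission tests reveal only which amino acids are present, not how often or where they occur, so the sequence still has to be found within the occurrence library.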
The synthesis of the omission libraries is simple. The number of libraries in a kit is 20 if
the 20 amino acids are used in preparation of the full library. The number of components of the
kit does not depend on the length of the peptides. Preparation of the kit is simpler and less time
consuming than, for example, the synthesis of the 100 components of the positional scanning kit.

5.2.1.4. The amino acid tester libraries


The peptide mixtures missing from omission libraries contain all peptides containing the omitted
amino acid irrespective of its number or position occupied in the sequences. In addition, there is
no peptide in the mixture that does not contain the omitted amino acid in at least one position. As
pointed out by Câmpian et al. [12, 13], these properties make it possible to use these mixtures, as an
alternative method, for determination of the amino acid composition of bioactive components of
peptide libraries. The amino acid tester libraries form a kit. The number of components in an amino
acid tester kit is the same as the number of amino acid building blocks used in the synthesis of
the full library. If 20 amino acids are used in the synthesis, the number of components is 20, like
in the case of the omission library kit.
An amino acid tester library gives an opposite result in screening compared to an
omission library. An alanine tester library, for example, that comprises all alanine containing
peptides shows activity in screening only when alanine is present in the sequence of the bioactive
peptide. If the test is carried out with a tester library of an amino acid that is not present in the
bioactive component no or a reduced activity is expected.
Figure 5.17. shows the composition of simple amino acid tester libraries that can be
compared to the full one. If the libraries marked by the plus sign prove to be active then the
amino acid composition of the bioactive peptide is: yellow, red.

Figure 5.17. Composition of amino acid tester libraries. Shown are the full library, the tester
libraries for the yellow, blue and red amino acids, and the amino acids in the bioactive peptide.

Figure 5.17. shows that the components of the amino acid tester libraries may contain the
amino acid to be tested in one, two, three etc. positions.
Table 5.3. shows how the components of an alanine tester library can be derived in the
case of a full tripeptide library prepared using 20 amino acids. The components of the tester
library are divided into 7 groups. The group 1 peptides, for example, contain alanine in coupling
position 1, and in the positions 2 and 3 the remaining 19 amino acids are varied. In group 2 and 3
peptides the single alanine occupies position 2 and 3, respectively. In groups 4, 5 and 6 alanine is
found in two positions. Group 7 comprises a single peptide with alanine in all three positions. The
total number of peptides in the seven groups is 1141, in accordance with the corresponding figure
of Table 5.2.
Considering the possibility of the synthesis, groups 1, 4, 5, 7 and groups 2, 6 of
Table 5.3. can be merged into groups 1 and 2, respectively, of Table 5.4. Group 3 of Table
5.3. remains alone and is transferred into Table 5.4. as group 3. The total number of peptides of
course remains unchanged: 1141.

Table 5.3. Groups of peptides occurring in a tripeptide alanine tester mixture

Coupling position    Group 1  Group 2  Group 3  Group 4  Group 5  Group 6  Group 7
1                    A        19       19       A        A        19       A
2                    19       A        19       A        19       A        A
3                    19       19       A        19       A        A        A
Number of peptides   361      361      361      19       19       19       1
The synthesis of the alanine tester library needs the preparation and mixing of the three
partial libraries represented in Table 5.4. as groups 1 to 3. Although not demonstrated in the
tables, the number of component libraries to be prepared in the case of tetrapeptides and
pentapeptides is four and five, respectively.

Table 5.4. Groups of peptides to be synthesized

Coupling position    Group 1  Group 2  Group 3
1                    A        19       19
2                    20       A        19
3                    20       20       A
Number of peptides   400      380      361
As mentioned in paragraph 4.1.1.6, preparation of the missing part of some partial
libraries, by which the partial library can be completed to a full one, is very complicated and time
consuming. The amino acid tester libraries that are the missing part of the omission libraries
clearly exemplify this.
The amino acid tester libraries are less complex than the omission libraries: they contain
fewer components (see Table 5.2.). This may be advantageous in screening. The more
complicated synthesis is, of course, a disadvantage. In paragraph 5.2.1.6. an example will be
presented to show that the component libraries of an amino acid tester library can be prepared in
a single run using the automatic synthesizer aapptec 357.

5.2.1.5. Other methods for identification of the bioactive component of combinatorial libraries
In addition to the deconvolution strategies described above, other methods have also been
published that make possible the identification of the bioactive components of dissolved
combinatorial libraries. These methods separate the bioactive molecules from the rest
of the library components on the basis of the specific binding of the bioactive component to the
target molecule.
In one such approach the target molecule is covalently attached to an insoluble
matrix. A chromatography column is filled with the matrix, then the dissolved library is passed
through the column. The active components are retained in the column since they bind to the
target, while the inactive components pass through. As in other affinity approaches, the active
molecules are eluted after washing the column. The separated compounds are then submitted to
structure determination.
In another approach both the target protein molecule and the components of the library
are dissolved in the same medium. The active molecules are allowed to bind to the target protein
then the mixture is submitted to size exclusion chromatography. The large protein molecule
holding the attached active molecule readily separates from the inactive small molecule library
components. Applying appropriate conditions the ligand dissociates from the receptor and its
structure can be determined.

5.2.1.6. Dynamic combinatorial libraries


The dynamic combinatorial libraries introduced by Professor J.-M. Lehn25 differ from the
static libraries described so far. They are combinatorial libraries since they contain all
components that can be derived by combination of the building blocks. It is very important,
however, that their components are in equilibrium. The constituents undergo continuous
interconversion by recombination of their building blocks. This dynamic interconversion of the
components is the special feature of these libraries and, at the same time, determines their
applicability. Addition of a target molecule to the equilibrium mixture favors the formation of
the component that binds to the target, because binding of the component to the
target creates a strong driving force towards its formation. In principle, this method is capable of
accelerating the identification of lead compounds in drug discovery. It should be noted, however,
that the applicability of the method in pharmaceutical research is limited since the
majority of the combinatorial synthetic methods lead to the formation of static libraries.
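
The shift of the equilibrium by the target can be illustrated numerically. The sketch below is not taken from the book: it assumes a hypothetical library of only two interconverting members, A and B, with K = [B]/[A], in which only B binds the target T with association constant Ka = [BT]/([B][T]). All names and numerical values are illustrative assumptions.

```python
def amplified_fraction(K, Ka, library_total, target_total, steps=200):
    """Fraction of the library material present as B (free or target-bound).

    Hypothetical two-member model: A <=> B with K = [B]/[A];
    B + T <=> BT with Ka = [BT]/([B][T]). Concentrations in mol/l.
    """
    def excess(a):
        b = K * a
        # Bound B from the target mass balance: [BT] = Ka*b*T_total / (1 + Ka*b)
        bt = Ka * b * target_total / (1.0 + Ka * b)
        return a + b + bt - library_total

    # excess() rises monotonically with a, so the root can be found by bisection.
    lo, hi = 0.0, library_total / (1.0 + K)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return 1.0 - a / library_total


without_target = amplified_fraction(K=1.0, Ka=1e6, library_total=1e-3, target_total=0.0)
with_target = amplified_fraction(K=1.0, Ka=1e6, library_total=1e-3, target_total=1e-3)
print(without_target, with_target)
```

In this toy model half of the material is B in the absence of the target, while an equimolar amount of a strongly binding target pulls most of the material into the B form.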

5.2.1.7. Examples
The applicability of the screening strategies described before is demonstrated by a few
model experiments. The task in these experiments was to determine whether or not a synthesized
tripeptide library has a component that inhibits binding of LHRH14 to its antibody. The amino
acid sequence15 of the hormone is shown below:
pGlu-His-Trp-Ser-Tyr-Gly-Leu-Arg-Pro-Gly-NH2

The LHRH polyclonal antibody as well as the radioactively labeled LHRH were products of
Advanced ChemTech. The competitive inhibition of the binding of LHRH to its antibody was
determined by radioimmunoassay16 (RIA).
Testing the full tripeptide library.11 Since LHRH is a decapeptide amide, a tripeptide
amide library was prepared and used in screening. In the split-mix synthesis of the tripeptide
amide library 19 amino acids were used in the first and second coupling positions (cysteine was
omitted) and, because LHRH has pyroglutamic acid at the N-terminal position, pyroglutamic
acid was added to this set in coupling position 3. The library was prepared on Rink amide resin
using the Fmoc strategy.
The tripeptide amide library was added in increasing concentrations to the mixture of
radioactive LHRH and its antibody and their binding was determined by RIA. The result is
demonstrated in Figure 5.18. It can be seen that binding is strongly reduced by increasing the
concentration of the library. This makes it probable that the library has one or more components
that inhibit binding, that is, it is worthwhile to carry out further experiments in order to identify
them. The result also suggests that the optimal concentration for the binding experiments is
around 50 microgram/ml. In all further experiments the libraries were applied in molarities
equivalent to this concentration.

Figure 5.18. Effect of concentration of the full tripeptide library on binding of LHRH to its antibody
Iteration experiment.17 Before the final mixing in the split-mix synthesis of the full
tripeptide amide library, equal samples were removed and cleaved from the support. These
mixtures were suitable to demonstrate the first step in the iteration strategy.

[Bar chart: 100 - LH-RH binding % (y axis) for the sub-libraries 3A to 3Y and 3p (x axis).]
Figure 5.19. Inhibitory effect of sub-libraries used in the first iteration step.
3p denotes pyroglutamic acid in the N-terminal position

Figure 5.19. shows how strong the inhibitory effect of the sub-libraries is. It can be clearly
seen that sub-library 3R exhibits by far the strongest effect. This means that the amino acid
occupying coupling position 3 in the inhibitory tripeptide amide is arginine, R.
Application of omission libraries.10 The omission libraries were derived from the
tripeptide amide full library described above. Thus in the synthesis of each of the 19 omission
libraries (-A to -Y) one amino acid was omitted in all three coupling positions. The remaining 18
amino acids were incorporated into all positions. Pyroglutamic acid was also present in all
omission libraries in coupling position 3.
A pyroglutamic acid omission library of the same kind could not be prepared since this amino
acid can be inserted only into coupling position 3 of the tripeptides. It was also important,
however, to test whether or not this amino acid is present in the active peptide. For this reason a
tripeptide amide library was prepared by using 19 amino acids in each coupling position, that is,
pyroglutamic acid was omitted from position 3 (denoted by -p).
The importance of the amide groups of the peptides in inhibition was also tested. In order
to do this, one part of the full library of tripeptides was cleaved from the support in the form of
carboxylic acids instead of amides and tested in this form (denoted by -a).
When the omission libraries were tested, the full tripeptide amide library (denoted by X) was
also included. The result is demonstrated in Figure 5.20. It can be seen that the omission libraries
that reduce the binding least are -G, -P, -R and -a. This means that the amino acid
composition of the inhibitory tripeptide is glycine (G), proline (P) and arginine (R). The peptides
that do not have amide groups are not effective inhibitors. This means that the amide group is
also an essential part of the inhibitory tripeptide.

[Bar chart: binding of LH-RH % (y axis) for the libraries X, -A to -Y, -p and -a (x axis).]
Figure 5.20. Effect of omission libraries on binding.
X and -a mean full tripeptide amide and full tripeptide acid libraries, respectively, while -p
denotes the library from which pyroglutamic acid was omitted. The other omission libraries are
represented by a minus sign followed by the one letter symbol of the omitted amino acid

The results of the experiments carried out with omission libraries gave no indication
about the position of the amino acids within the sequence of the active peptide. Despite this, the
information gained by only 21 screening experiments is very valuable. They define an amino
acid occurrence library that can be synthesized by varying only three amino acids, Gly, Pro, and
Arg in all of the three coupling positions. The inhibitory tripeptide is expected to be present
among the 27 components of this tripeptide amide library. In other words, by screening with
omission libraries, the complexity of the library in which the active peptide is found could be
reduced from the original 7220 to only 27.
The positions of the identified amino acids could be determined by using one of the
following three possibilities:
1. Preparation by parallel synthesis and screening of the 27 components of the occurrence
library.
2. Application of positional scanning to the occurrence library (preparation and screening of
nine sub-libraries).
3. Positional scanning with nine sub-libraries of the full library (if available).
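
The logic of the omission-library screen and the reduction from 7220 to 27 candidates can be reproduced in a few lines. This is only an illustrative sketch: the active peptide is fixed in advance to mimic the experimental outcome, "p" stands for pyroglutamic acid as in the figures, and the acid library (-a) is not modelled. Function and variable names are assumptions made for the example.

```python
from itertools import product

# 19 amino acids (no cysteine) for coupling positions 1 and 2;
# coupling position 3 additionally allows pyroglutamic acid, written "p".
AA19 = "ADEFGHIKLMNPQRSTVWY"
POS3 = AA19 + "p"


def library(omitted=None):
    """Set of tripeptides (pos1, pos2, pos3) with one amino acid left out everywhere."""
    p12 = [a for a in AA19 if a != omitted]
    p3 = [a for a in POS3 if a != omitted]
    return set(product(p12, p12, p3))


full = library()
active = ("G", "P", "R")  # coupling positions 1, 2, 3, as reported in the text

# An omission library loses its activity only if the active peptide is missing from it.
composition = [aa for aa in POS3 if active not in library(aa)]

# Occurrence library: the identified amino acids varied in all three positions.
occurrence = set(product(composition, repeat=3))

print(len(full), composition, len(occurrence))
```

The screen reveals which amino acids occur in the active peptide but not where they sit, which is why the 27-member occurrence library still has to be resolved by one of the three routes listed above.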
Preparation and use of amino acid tester libraries.12 Amino acid tester libraries offer an
alternative to omission libraries for determination of the amino acid composition of
active peptides. The set of libraries used in the experiments was derived from the full library
prepared from 19 amino acids plus pyroglutamic acid in coupling position 3. Each library of the
set contains 1064 tripeptide amides and can be formed, like the alanine tester library in Table 5.5.,
by mixing groups 1 to 3. Instead of preparing the three groups separately, however, they were
synthesized then mixed in a single optimized process using the ACT 357 automatic synthesizer.

Table 5.5. Component libraries of the alanine tester library

Coupling position    Group 1  Group 2  Group 3
1                    A        18       18
2                    19       A        18
3                    20       20       A
Number of peptides   380      360      324

The flow diagram of the synthesis of the alanine tester mixture is demonstrated in Figure
5.21.
The procedure was started with 1.064 g Rink resin distributed in the reaction block
(Figure 5.22.) according to the number of peptides in Group 1 (0.380 g in one of the reaction
vessels) and in Groups 2 plus 3 (0.684 g in the collection vessel). The Group 1 resin was first
coupled with alanine, distributed into 19 samples, each of which was coupled with one of the 19
amino acids (including alanine), and then mixed. The Group 2 + 3 resin was distributed into 18
equal parts, each coupled with one of the 18 amino acids (no alanine among them), then mixed.
The mixture was split into two parts in quantities corresponding to the number of peptides in
Group 2 (0.360 g) and Group 3 (0.324 g). The Group 2 resin was coupled with alanine without
portioning, then mixed with the resin sample containing the Group 1 peptides. The Group 3 resin
was portioned into 18 equal parts, each coupled with one of the 18 amino acids (no alanine among
them), mixed, then coupled with alanine without portioning.

[Flow diagram: 1.064 g of resin is divided into 0.380 g (Group 1) and 0.684 g (Groups 2 + 3); the
latter is later divided into 0.360 g (Group 2) and 0.324 g (Group 3).]
Figure 5.21. Flow diagram of the synthesis of the alanine tester library (ATL).
A: Coupling with alanine; 18, 19, 20: Portioning, then coupling with 18, 19 and 20 different
amino acids, respectively, then mixing.

The Groups 1 + 2 resin was distributed into 20 equal parts, each coupled with one of the
20 amino acids (including alanine and pyroglutamic acid) then mixed. Finally, the Groups 1+2
resin was mixed with the Group 3 sample to give the alanine tester library (ATL). All operations
were preprogrammed and at the end the product was accumulated and mixed in the collection
vessel.
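
The resin quantities used above follow from dividing the resin in proportion to the number of peptides in each group. A minimal sketch of this arithmetic is given below; the function names are hypothetical and the calculation assumes that equal amounts of resin should carry each peptide.

```python
def tester_groups(n_pos12=19, n_pos3=20):
    """Peptide counts of the three component libraries of one amino acid tester library.

    n_pos12: amino acids available in coupling positions 1 and 2
    n_pos3:  amino acids available in coupling position 3
    """
    group1 = 1 * n_pos12 * n_pos3            # tested amino acid in position 1
    group2 = (n_pos12 - 1) * 1 * n_pos3      # tested amino acid first appears in position 2
    group3 = (n_pos12 - 1) * (n_pos12 - 1)   # tested amino acid only in position 3
    return group1, group2, group3


def resin_split(total_resin_g, counts):
    """Split the resin in proportion to the number of peptides in each group."""
    total = sum(counts)
    return [total_resin_g * c / total for c in counts]


counts = tester_groups()
masses = resin_split(1.064, counts)
print(counts, [round(m, 3) for m in masses])
```

With 1.064 g of starting resin each peptide nominally receives 1 mg, which is why the masses in grams mirror the peptide counts of Table 5.5.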
The applicability of the synthesized amino acid tester libraries in determination of the
amino acid composition of active peptides was tested under the conditions described for the
omission libraries. Figure 5.23. shows that the inhibition of binding of LHRH to its antibody is
strongest in the case of the glycine (G), proline (P) and arginine (R) tester libraries. Note that
the y axis shows 100 - LHRH binding %, not LHRH binding %. Consequently the active
tripeptide amide contains glycine, proline and arginine.

[Figure: reaction block with 0.38 g resin in a reaction vessel and 0.684 g resin in the collection vessel.]
Figure 5.22. The reaction block of the ACT 357 Synthesizer and the quantities and places of the
resin at the start of the process

Applicability of the amino acid tester libraries is the same as that of the omission libraries.

[Bar chart: 100 - LHRH binding % (y axis) for the amino acid tester libraries (x axis).]
Figure 5.23. Effect of amino acid tester libraries on binding of LHRH to its antibody

Synthesis and use of positional scanning libraries. Screening with both omission and
amino acid tester libraries led to the same result: binding of LHRH to its antibody is inhibited by
a tripeptide amide composed of glycine, proline and arginine. As outlined before, an occurrence
library can be defined on the basis of this result. If this library is synthesized, it contains the
inhibitor tripeptide amide among its components (Table 5.6.).

Table 5.6. Building blocks of the occurrence library

Coupling step    Amino acids
1                G P R
2                G P R
3                G P R

The position of the three amino acids in the sequence of the inhibitor tripeptide amide is
determined by synthesizing and testing the nine component libraries of the positional scanning
kit of the occurrence library. The synthesis is optimized so that the nine
component libraries of the kit (1G, 2G, 3G, 1P, 2P, 3P, 1R, 2R and 3R) can be prepared in a single
run on the automatic synthesizer ACT 357. The solid support was again Rink resin. The
synthesizer was pre-programmed, which made it possible to execute the whole process automatically.
The flow diagram is demonstrated in Figure 5.24. The starting resin, placed into the
collection vessel was first divided into three portions then coupled with glycine, proline and
arginine, respectively.

[Flow diagram: the resin is split and coupled with G, P and R; 1/3 and then 1/2 of each product is
set aside to give 1G, 1P, 1R and 2G, 2P, 2R, and the final couplings give 3G, 3P, 3R.]
Figure 5.24. Flow diagram of the synthesis of the 9 positional scanning sub-libraries
of the occurrence library

Before mixing, 1/3 part of each reaction product was transferred into a separate reaction
vessel, then each of them was individually submitted to two consecutive portioning-mixing cycles,
coupling in each cycle with glycine, proline and arginine, yielding 1G, 1P and 1R as end
products. The remainder was mixed and divided into three parts, then each part was coupled with
one of the three amino acids. Again, before mixing, 1/2 part of each sample was transferred to a
separate reaction vessel, then individually submitted to a full portioning-mixing cycle, using again
glycine, proline and arginine in the couplings. These operations resulted in the formation of 2G,
2P and 2R. The remainder was mixed and divided into three portions, then each was coupled with
one of the three amino acids. The three products were 3G, 3P and 3R.
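
The bookkeeping of this single-run procedure can be followed with a small simulation. The sketch below is an illustration under stated assumptions, not the synthesizer program: sequences are written as strings in coupling order (position 1 first), resin amounts are tracked as fractions of the starting resin, and all names are hypothetical.

```python
from fractions import Fraction

BLOCKS = "GPR"


def couple(pool, aa):
    """Append one amino acid to every resin-bound sequence in the pool."""
    return {seq + aa for seq in pool}


def split_mix(pool):
    """One portioning-mixing cycle using all three building blocks."""
    return set().union(*(couple(pool, aa) for aa in BLOCKS))


sublibs, amounts = {}, {}
pool, resin = {""}, Fraction(1)

for position in (1, 2, 3):
    share = resin / 3                        # the resin is divided into three portions
    portions = {aa: couple(pool, aa) for aa in BLOCKS}
    aside = Fraction(1, 4 - position)        # part set aside: 1/3, then 1/2, then all
    for aa, seqs in portions.items():
        for _ in range(3 - position):        # remaining positions are fully varied
            seqs = split_mix(seqs)
        sublibs[f"{position}{aa}"] = seqs
        amounts[f"{position}{aa}"] = share * aside
    pool = set().union(*portions.values())   # the remainder is mixed
    resin -= 3 * share * aside

print({name: len(seqs) for name, seqs in sublibs.items()})
print(amounts)
```

Setting aside first 1/3 and then 1/2 of each product gives every one of the nine sub-libraries the same share, 1/9, of the starting resin.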
The synthesized nine first order sub-libraries were used to determine the position of
glycine, proline and arginine in the tripeptide responsible for competitive inhibition of binding of
LH-RH to its antibody (Figure 5.25).

[Bar chart: 100 - LH-RH binding % (y axis) for the sub-libraries 1R, 2R, 3R, 1G, 2G, 3G, 1P, 2P and 3P (x axis).]
Figure 5.25. Positional scanning by sub-libraries of the occurrence library

Since the y axis shows 100 - LHRH binding % rather than LHRH binding %,
Figure 5.25. shows that the inhibition of binding of LHRH to its antibody is strongest in the case
of the 1G, 2P and 3R sub-libraries. Consequently, arginine, proline and glycine occupy
coupling positions 3, 2 and 1 in the tripeptide, respectively. The sequence of the inhibitor
tripeptide is Arg-Pro-Gly-NH2. This sequence happens to be identical with the C-terminal
sequence of LHRH.
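
The step from the positional scanning result to the written sequence can be made explicit. In the sketch below the list of strongest sub-libraries is simply copied from the text; the conversion relies on the fact that solid-phase synthesis starts at the C-terminus, so coupling position 1 is the C-terminal residue. Variable names are illustrative.

```python
# Sub-libraries showing the strongest inhibition, as reported in the text.
strongest = ["1G", "2P", "3R"]

THREE_LETTER = {"G": "Gly", "P": "Pro", "R": "Arg"}

# Map coupling position -> amino acid.
by_position = {int(label[0]): label[1] for label in strongest}

# The chain grows from the C-terminus, so the conventional N-to-C notation
# lists coupling position 3 first and coupling position 1 last.
residues = [THREE_LETTER[by_position[pos]] for pos in (3, 2, 1)]
sequence = "-".join(residues) + "-NH2"

LHRH = "pGlu-His-Trp-Ser-Tyr-Gly-Leu-Arg-Pro-Gly-NH2"
print(sequence, LHRH.endswith(sequence))
```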

5.2.2. Deconvolution methods of libraries not cleaved from the solid support
The components of the tethered libraries are found in the beads of the solid support as
individual compounds. Consequently, they can be tested as individual substances. It has to be
taken into account, however, that the structure of the compounds present in any particular bead is
unknown. For this reason the deconvolution process has to solve two problems:

1. Identify the bead that contains the component showing the wanted property.
2. Identify the compound tethered to the bead.

The beads containing the individual components of the combinatorial libraries can be, and
are, tested in two different ways:

1. The components of the libraries are tested in tethered form.
2. The screening tests are carried out with compounds cleaved from individual beads.

Both approaches have to make possible the identification of the bead containing the active
compound and the determination of the structure of the wanted compound. As will be shown, both
deconvolution processes need fewer assays than are needed to determine the
activities of compound arrays prepared by parallel synthesis.

5.2.2.1. Screening of combinatorial libraries in tethered form


When the beads are immersed into a solution, the tethered compounds are available for
interaction with dissolved molecules. They can specifically bind to receptors, antibodies,
enzymes, viruses, etc. For this reason binding tests offer themselves as assays in deconvolution
processes. The first example of carrying out binding tests with individually synthesized peptides
tethered to the beads of the solid support was described by Smith and his colleagues.18 After the
split-mix combinatorial synthetic method became available19 this approach was applied to
tethered peptide libraries.20

Figure 5.26. Identification of the beads that specifically bind to target molecules. The beads
containing specifically binding peptide are colored

In application of the method the beads containing the full tethered peptide library, or a
fraction of it, are immersed in the solution of a target molecule. The beads containing a peptide
that binds to the target are identified and separated from the rest of the beads, then the peptide
they contain is sequenced. The binding has to be visualized somehow. This can be achieved by
labeling the target molecule before or after the binding experiment. The target protein can be
labeled by attaching to it a colored, fluorescent or radioactive residue. If the target protein is
labeled with a dye and the beads are examined through a microscope, the binding beads can easily
be distinguished from the rest of the beads by their color, as shown in Figure 5.26. The beads are
colored because the colored target molecules are attached to the peptide molecules of the bead.
The colored beads are manually separated from the rest of the beads, then the attached
labeled target protein is removed from them by washing with an appropriate solvent. After
washing, the sequence of the peptides is determined using an automatic sequencing machine.
A more effective and faster selection process can be applied if the target molecule is
labeled by fluorescence. The beads can be sorted by a fluorescence-activated cell sorting
instrument. A special automatic machine was developed by Meldal21 for sorting fluorescent
beads.
A different approach, infrared thermography, was applied by Taylor and Morken22 to
identify catalysts in non-peptide tethered libraries synthesized by the split-mix procedure. The
method is based on the heat that is evolved in the beads that contain a catalyst when the tethered
library is immersed into a solution of a substrate. The heat increases the temperature of the
catalyst-containing beads. In the beads that do not contain a catalyst no heat is evolved, so their
temperature remains unchanged. Although the increase of the temperature is very small, when the
beads are examined through an infrared microscope the catalyst-containing beads appear as bright
spots, as demonstrated in Figure 5.27.
The screened library was an organic one and the beads were encoded. The bright beads
were separated and the identification process ended with determination of their code.

Figure 5.27. Identification of catalyst containing beads by infrared thermography

5.2.2.2. Screening of combinatorial libraries by releasing the content of individual beads into solution
An alternative way of screening the tethered libraries is to release the content of the
individual beads into solution; the libraries can then be screened like the compound arrays
prepared by the parallel synthetic methods. Once the bead containing the active compound is
identified, however, the procedure has to be continued by determination of the structure of the
released compound. If only a fraction of the content of the bead is released for screening, the rest
can be used for structure determination. If encoded libraries are used, the synthesized compounds
and the codes can be released separately. It is also possible to release the content of the beads
and identify the bead containing the active compound in several stages.
The procedure developed at Pharmacopeia23 uses this latter possibility for screening
libraries of small organic compounds. The libraries are prepared applying the binary encoding
technique and using a photolabile linker which allows a two stage release of the organic
substance. After portions of beads are distributed into small containers (Figure 5.28/A), the first
portions of the substances are released by irradiation. The content of each vessel is then
submitted to screening. If one of them proves to be active (marked by + sign in the figure), the
beads of this container are re-distributed into vessels each containing a single bead (Figure
5.28/B). After releasing the second portion of the substances, a second screening identifies the
bead which carried the active substance (marked by + sign). Finally the encoding molecules are
released from the identified bead and determined by electron capture gas chromatography thus
determining the structure of the organic molecule responsible for the biological activity.

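[Figure: (A) containers each holding a portion of beads, the active container marked by +;
(B) the beads of the active container redistributed one per vessel, the active bead marked by +.]
Figure 5.28. Identification of the bead containing the bioactive component in two stages.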

It is worthwhile to note that this two-stage process needs fewer screening
experiments than the one-stage process in which all beads are tested individually. The 11
containers of Figure 5.28/A contain a total of 110 beads. The one-stage process would need 110
screening experiments. The two-stage process, as shown in the figure, needs only 21 assays.
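
The assay counts quoted above can be reproduced with a short calculation. The sketch assumes that exactly one container proves active in the first stage; the function names are illustrative only.

```python
import math


def one_stage_assays(n_beads):
    """Every bead is assayed individually."""
    return n_beads


def two_stage_assays(n_containers, beads_per_container, active_containers=1):
    """Stage 1: one assay per container. Stage 2: one assay per bead of each active container."""
    return n_containers + active_containers * beads_per_container


n_beads = 11 * 10
print(one_stage_assays(n_beads), two_stage_assays(11, 10))

# Fewest assays over all possible numbers of containers for the same 110 beads.
best = min(c + math.ceil(n_beads / c) for c in range(1, n_beads + 1))
print(best)
```

Under this assumption no other way of dividing the 110 beads into containers needs fewer than 21 assays, since the total is smallest when the number of containers is close to the square root of the number of beads.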

5.2.2.3. Examples
The application of the screening methods developed for libraries not cleaved from the
solid supports is demonstrated by examples described in the literature.
Identification of bioactive peptides with enzyme-linked colorimetric assay.24 The
experiment is carried out with tethered peptide libraries synthesized on TentaGel resin. Binding
of the target molecule to the beads containing the active peptide is indicated by the blue color of
an indigo derivative that forms from 5-bromo-4-chloro-3-indolyl-phosphate (BCIP) when it is
dephosphorylated by the enzyme alkaline phosphatase. The target protein is biotinylated before
the binding experiment and the alkaline phosphatase is derivatized with streptavidin. The
biotinylated target protein therefore strongly binds the reporter enzyme that converts BCIP to the
blue indigo. If the protein binds to a bead containing the active peptide and a solution of BCIP is
then added, insoluble indigo forms on the bead, staining its surface blue. The experiment is
carried out with 5,000 to 50,000 beads at room temperature or 4 °C.
The goal of the experiment is to identify only those beads that bind the target protein by
strong specific interaction. Weak, non-specific interactions may also occur. In order to make
these interactions invisible, the beads are incubated with gelatin that coats the beads by weak,
non-specific interaction. The strong specific binding of the target molecule to the beads is
expected to displace the gelatin from the surface. Non-specific binding, however, may also occur
with the alkaline phosphatase. In order to eliminate the misleading effect of this, the beads are
prescreened with streptavidin-alkaline phosphatase. The beads are incubated with
streptavidin-alkaline phosphatase and, after washing, are incubated with a BCIP solution. As a
result of interaction of the streptavidin-alkaline phosphatase with some peptides, blue beads
appear that are manually removed.
Following the prescreening and washings, the beads are ready for final screening. The beads
are incubated with a solution of the biotinylated target protein for 1-2 hours, during which the
target molecules displace the gelatin from some beads and bind strongly to their peptides. This
is an invisible process. In order to make it visible the beads, again after washings, are first
incubated with streptavidin-alkaline phosphatase, which attaches to the immobilized target
molecules via the strong biotin-streptavidin interaction. Finally, the beads are washed with BCIP
solution. Blue color develops on some beads. The beads are examined in a Petri dish under a
microscope and the blue beads are removed manually. The target protein-enzyme complex is
washed away from the beads with urea solution. The amino acid sequences of the peptides that
show specific binding can be determined without cleaving them from the beads. Automatic
micro-sequencing machines based on the well known Edman degradation can be used for this
purpose.
Whole cell binding assay.24 In addition to target proteins, binding experiments can be
carried out with intact cells, too. This approach can be used to study cell surface receptors and
their ligands. Sterilized tethered peptide libraries are used in the experiments. The beads
surrounded by binding cells can be distinguished from the inactive ones, as seen in Figure 5.29.
The active beads are again manually separated and sequenced after removal of the attached cells.

Figure 5.29. Cell binding. The large circles are beads, the smaller ones are cells. The cell binding
beads are marked by arrow

References
1. Gy. Takátsy Acta Microbiologica Acad. Sci. Hung. 1955, 3, 191.
2. Á. Furka (1982) Tanulmány, gyógyászatilag hasznositható peptidek szisztematikus
felkutatásának lehetőségéről (Study on possibilities of systematic searching for
pharmaceutically useful peptides). Unpublished theoretical study written in Hungarian for
internal use, describing the PM synthesis and an iteration strategy for screening of soluble
libraries. Notarized on June 15, 1982, file number 36237/1982. See also paragraph 1.2.
3. Á. Furka Drug Discovery Today 2002, 7, 1.
4. H. M. Geysen, R. H. Meloen, S. J. Barteling Proc. Natl. Acad. Sci. USA 1984, 81, 3998.
5. R. A. Houghten, C. Pinilla, S. E. Blondelle, J. R. Appel, C. T. Dooley, J. H. Cuervo
Nature 1991, 354, 84.
6. K. D. Janda Proc. Natl. Acad. Sci. 1994, 91, 10779.
7. Á. Furka, F. Sebestyén WO 93/24517.
8. C. Pinilla, J. R. Appel, R. A. Houghten, In C. H. Schneider, A. N. Eberle (Eds) Peptides
1992, 1993, ESCOM, Leiden, 65.
9. Á. Furka Drug Development Research 1994, 33, 90.
10. T. Carell, E. A. Winter, J. Rebek Jr. Angew. Chem. Int. Ed. Engl. 1994, 33, 2061.
11. E. Câmpian, M. Peterson, H. H. Saneii, Á. Furka Bioorg. & Med. Chem. Letters 1998, 8,
2357.
12. E. Câmpian, J. Chou, M. L. Peterson, H. H. Saneii, Á. Furka, In R. Ramage, R. Epton (Eds)
Peptides 1996, 1998, Mayflower Scientific Ltd. England, 131.
13. E. Câmpian, H. H. Saneii, Á. Furka PharmaChem 2003, April, 43.
14. A. V. Schally et al. J. Biol. Chem. 1971, 246, 7230.
15. H. Matsuo et al. Biochem. Biophys. Res. Commun. 1971, 43, 1334.
16. C. Patrono, B. A. Peskar (Eds) Radioimmunoassay in Basic and Clinical Pharmacology,
1987, Springer-Verlag, Heidelberg.
17. E. Câmpian, J. Chou, Á. Furka unpublished results.
18. J. A. Smith, J. G. R. Hurrel, S. J. Leach Immunochemistry 1977, 14, 565.
19. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó, In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP. Utrecht, The
Netherlands, 1988, Vol. 5, p 47.
20. K. S. Lam, S. E. Salmon, E. M. Hersh, V. J. Hruby, W. M. Kazmierski, R. J. Knapp
Nature 1991, 354, 82 and its correction: K. S. Lam, S. E. Salmon, E. M. Hersh, V. J.
Hruby, W. M. Kazmierski, R. J. Knapp Nature 1992, 360, 768.
21. M. Meldal Biopolymers (Peptide Science) 2002, 66, 93.
22. S. J. Taylor, J. P. Morken Science 1998, 280, 267.
23. http://www.pharmacopeia.com
24. K. S. Lam, A. L. Lehman, A. Song, N. Doan, A. M. Enstrom, J. Maxwell, R. Liu, In G. A.
Morales, B. A. Bunin (Eds) Methods in Enzymology, Combinatorial Chemistry Part B,
2003, Elsevier Academic Press, 298.
25. J.-M. Lehn Chem. Eur. J. 1999, 5, 2455.

6. Combinatorial methods in materials and catalyst research


Besides the organic compounds utilized as drugs, pesticides etc. there exists another
important group of materials that have a decisive effect on our everyday life. These are the solid
inorganic materials and polymers. These materials differ substantially from organic compounds,
which have a well defined molecular structure. In the inorganic solid materials the elemental
composition is not always stoichiometric, and the proportions of the component elements may
vary widely. In addition, a considerable part of the elements of the periodic table may occur among
their constituents. The polymers also differ from the small molecular organic compounds. They
usually contain a large but undefined number of building blocks. For this reason preparation and
examination of this class of materials need special methods. Nevertheless, discovery of new
materials that have useful properties also requires preparation and testing of a very large number
of samples. In order to speed up the research, application of combinatorial thinking and
combinatorial methods seems to be a realistic choice.
In 1970 J. J. Hanak1 reported a new methodology for fast screening of new electronic
materials. Hanak smoothly varied the concentration of the components of the mixtures so that the
effects of a large number of composition differences could be measured on a single sample. His
approach was disregarded for a very long time and only the advent of the combinatorial
methods2-5 speeded up the production and testing of new materials. It was only in 1995 that
Xiang et al.6 prepared and tested parallel samples of materials. They also demonstrated that by
application of the combinatorial methods, the already known high temperature superconducting
compounds could be readily identified.
The field of combinatorial materials research is still rapidly expanding. The main research
areas that use the combinatorial high-throughput approach are inorganic materials, catalysts
and polymers. Mapping of composition-property relationships and optimization by use of phase
diagrams and parameter space are an integral part of research.

6.1. Inorganic materials


The classes of inorganic materials that can be studied by combinatorial methods include
semi- and superconductors, dielectrics, phosphors, superalloys, magnetoresistive materials and
others. The main areas where the new inorganic materials may find application are electronic
devices, displays, memory devices, photonic devices, magnetic and optical data storage.
The inorganic solid materials are best investigated in the form of thin films. A number of
methods have been developed to fabricate these films. A few of these methods are outlined below.

6.1.1. Fabrication of thin film libraries


In most inorganic libraries the primary combinatorial variable is the chemical
composition. Additional parameters, however, such as the temperature, the atmosphere and the
pressure, are also very important in library fabrication and should also be varied.
One of the main methods in thin film library fabrication is vapor deposition.
In these techniques, such as sputtering, pulsed laser deposition, electron beam evaporation or laser
molecular epitaxy, the target material is evaporated by a high energy source (ion gun, electron
gun, laser) and is then deposited on the substrate. Another fabrication approach is to deliver the
components of the film into small wells in dissolved form, then to evaporate the solvent.
The physical vapor deposition technique makes it possible to form two kinds of
libraries. In one kind the components are discrete films, each having a definite
composition that differs from film to film. The other kind of library is formed in a single film
in which the composition varies smoothly across the film, so that the composition is
different at every point. The delivery of components in dissolved form into wells leads to the
formation of series of discrete films.
In one of the fabrication methods7 spatially addressable deposition can be performed that
in principle resembles the earlier described light-directed, spatially addressable parallel
chemical synthesis invented by Fodor et al.8 Both methods are based on the use of masks that
cover parts of the solid surface at predetermined places. In the fabrication of thin film libraries the
mask prevents deposition of the vapor on the covered parts of the solid surface on which the
films are made.

Figure 6.1. Quaternary masking system.9 Masks (a to d) and positions of the 1024 library
components (e)
The 5 masks (a through d) of a quaternary masking system are shown in Figure 6.1. The
deposition process begins with the use of mask a in the position shown in the figure. After the first
deposition the mask is rotated by 90° and the deposition is continued. This is followed by two
more rotations with a deposition after each rotation. The deposition is then continued with the
remaining 4 masks, each used in its starting position and after each of 3 rotations. By proper variation of the
precursors and their quantity (thickness of their deposited layers) in the deposition cycles a
library of 4^5 = 1024 discrete thin films is formed and the composition of the films differs in every
position (Figure 6.1/c). Such a thin film library can be composed on a 2.4x2.5 cm silicon plate.
Composition of a library by application of the masks is a fully combinatorial process and for this
reason is fast.
Each deposited film is composed of 5 layers that may differ in composition and/or
thickness. The multilayer films are postannealed at intermediate temperature (200-500 °C) to
homogenize the composition and then heated at elevated temperatures (800-1000 °C) to promote
reaction of the constituents.
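The counting behind the quaternary masking scheme can be sketched in a few lines of Python. This is only an illustration and not the procedure of the cited work: it assumes a 32 x 32 grid of sites and assumes that mask k, in each of its four orientations, exposes exactly one quadrant of every block at scale k. The function name site_code is hypothetical.

```python
from itertools import product

N_MASKS = 5          # masks in the quaternary system (from the text)
N_ROTATIONS = 4      # each mask is used in 4 orientations (0, 90, 180, 270 degrees)
GRID = 2 ** N_MASKS  # 32 x 32 sites (assumed layout)

def site_code(x, y):
    """Return, for each mask, which orientation (0-3) deposits on site (x, y).

    Assumption: mask k splits every block of the grid into 4 quadrants
    at scale k, and each orientation exposes exactly one quadrant.
    """
    code = []
    for k in range(N_MASKS):
        shift = N_MASKS - 1 - k
        quadrant = 2 * ((y >> shift) & 1) + ((x >> shift) & 1)
        code.append(quadrant)
    return tuple(code)

# Every site gets its own deposition history, one digit per mask.
codes = {site_code(x, y) for x, y in product(range(GRID), repeat=2)}
print(len(codes))
```

Under these assumptions each of the 1024 sites receives a different combination of the 20 deposition steps, which is why the library is complete after only 5 x 4 depositions.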
A parallel approach can also be applied to fabricate discrete film libraries: the constituents
of the films are transferred in solution into small wells formatted on plates. The solutions that
have different composition or concentration are delivered by automatic fluid dispensers or ink
jetting.10 The series of liquid samples are first heated to a moderate temperature to evaporate the
solvent then an elevated temperature is applied to bring about the reaction of the constituents and
finish the formation of the films.

Figure 6.2. Fabrication of discrete films by dispensing the constituents in solution


In Figure 6.3 a pulsed-laser deposition system is outlined.10 There are three different
targets (1, 2 and 3) from which the constituents of the film can be transferred onto the substrate
(film). As the figure shows, the targets can be rotated. The constituents of target 1 are
evaporated by the laser beam and deposited on the substrate. Targets 2 and 3 can be
evaporated after 120° and 240° rotations, respectively.

Figure 6.3. Pulsed-laser deposition system with three rotatable targets (1,2,3)

In this deposition system the constituents of the films are deposited in layers that need to
be homogenized by post-annealing. Other multi-target deposition systems are also in use in
which deposition from the targets occurs simultaneously. In these co-deposition systems a
homogeneous layer is formed.
As already mentioned, it is also possible to fabricate a multi-component library in a single
multi-layer film. The composition in each layer of the film changes continuously along a
gradient.11 Such a deposition system is outlined in Figure 6.4.

Figure 6.4. Gradient deposition on a triangle in three steps (a, b, c) using three targets (CaCO3,
SrCO3, BaCO3).
The substrate is an equilateral triangle shaped LaAlO3 piece (height 2.5 cm). The bottom
precursor is TiO2. In the first deposition step (a) the target is CaCO3. A shutter moves at constant
speed across the triangle in the direction of the arrow and gradually covers it. The thickness of
the deposited CaCO3 layer increases continuously (from 0 to 1225 Å) and its maximum is at
corner 1. Before SrCO3 deposition in the second step (b) the triangle is rotated anticlockwise
by 120°; otherwise the process is the same. In the last step (c) BaCO3 is deposited, again after a 120°
rotation. In the final film (d) the composition changes smoothly from point to point and the
maximum thicknesses of the CaCO3 (1225 Å), SrCO3 (1475 Å) and BaCO3 (1647 Å) layers are at
corners 1, 3 and 2, respectively. The chip is heated at 400 °C for 24 hours
for homogenization and at 900 °C for 1.5 hours for crystallization.
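The layer thicknesses at any point of the triangle can be estimated with a short Python sketch. It assumes, as an idealization, that a shutter moving at constant speed gives a strictly linear profile, from zero at the edge opposite to a corner up to the maximum at that corner. In that case each thickness is the maximum value multiplied by the barycentric coordinate of the point for that corner. The function name layer_thicknesses is hypothetical.

```python
# Maximum layer thicknesses in angstrom, taken from the text.
MAX_THICKNESS = {"CaCO3": 1225.0, "SrCO3": 1475.0, "BaCO3": 1647.0}

def layer_thicknesses(weights):
    """Layer thicknesses at a point of the triangular chip.

    weights: barycentric coordinates of the point, keyed by the precursor
    whose maximum lies at the corresponding corner (they must sum to 1).
    Assumption: each layer grows linearly toward its corner.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("barycentric coordinates must sum to 1")
    return {name: MAX_THICKNESS[name] * weights[name] for name in MAX_THICKNESS}

# Centroid of the triangle: one third of each maximum thickness.
centroid = layer_thicknesses({"CaCO3": 1 / 3, "SrCO3": 1 / 3, "BaCO3": 1 / 3})
print({name: round(value, 1) for name, value in centroid.items()})
```

Converting thicknesses into molar ratios would additionally require the densities and molar masses of the deposited layers, which are not given here.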

6.1.2. Screening
The methods used for screening of inorganic film libraries are very diverse since very
different properties have to be measured. The very large number of different compositions
present on a single chip, as well as the small size of the films, present special difficulties. Special
screening methods have been developed and are under development to address the problems
involved in the experiments. Non-destructive physical methods, such as optical
measurements, are preferred. For example, X-ray microbeam techniques available at synchrotron radiation
facilities are used with a spot size of 3x20 µm.
Some measurements are executed in serial mode. These are relatively slow since each
film has to be measured separately. Other screening determinations can be carried out in parallel.
These methods are much faster since in parallel mode a large number of films, often the entire
library, can be measured simultaneously. This technique can be used, for example, in screening
of phosphor libraries.
The new film fabrication and screening methods are subjects of intensive research.
The appearance of even better and faster methods is expected in both synthesis and screening. But
according to some opinions even the use of the existing methods makes materials research
about 10,000 times faster than the conventional ones.
Detailed characterization of the different screening technologies falls outside the scope of
this book.

6.2. Heterogeneous catalysts


Heterogeneous catalysts belong to a very important class of materials since they are
used in the manufacture of a large number (about 7000) of chemicals and for this reason they
significantly contribute to the economy and to our living standards. Catalysts are used in about
60% of chemical production processes.12
Catalysts are complex materials. According to estimates about 50-70 elements of the
periodic table can be regarded as suitable components for heterogeneous catalysts. The activity
and specificity of catalysts depend not only on the elements they are built from but also on the
proportion of the elements and on the conditions of preparation. Despite the research
efforts invested in this area we still cannot predict how the properties of a catalyst depend on
composition. For this reason the discovery of new catalysts can only be accomplished by trial and
error, like that of other new materials and pharmaceuticals. This is the reason why the
combinatorial methods need to be applied in this area.
The multi-sample concept proposed by Hanak1 and his pioneering work in the 1970s
and 1980s were not followed up, but the advent of the combinatorial methods in the pharmaceutical
area2-4 initiated very intensive research on adapting the combinatorial methods to catalyst
research. Today the principles of the combinatorial approach are already accepted and widely
applied.

Figure 6.5. The three kinds of activities in catalyst research: computation methods and
information mining, high throughput catalyst preparation, and high throughput testing

When thinking about preparing new catalysts it has to be taken into account that the
number of possible compositions of the elements is immensely high. It can easily be
calculated, for example, that if 6 elements are used as catalyst components, each of them at
any of 10 different concentrations, the total number of possible compositions is 10^6, one million,
and this is multiplied further if different preparation conditions are applied. For this reason it is absolutely
impossible to consider testing all (50-70) elements in a catalyst search.
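The arithmetic can be checked with a short Python sketch. The first number reproduces the one million compositions mentioned above. The second number is an added illustration, not a figure from the text: it counts how many different sets of 6 elements could be picked from 50 candidate elements, before any concentration is varied.

```python
from math import comb

n_elements = 6    # catalyst components (from the text)
n_levels = 10     # concentration levels per component (from the text)

compositions = n_levels ** n_elements
print(compositions)                # 1000000

# Added illustration: ways to choose which 6 of 50 candidate elements to use.
element_sets = comb(50, n_elements)
print(element_sets)                # 15890700
```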
There are too many parameters that can be varied. This can be expressed by stating that
the parameter space is usually very large in catalyst research. As a consequence, computational
methods need to be applied in order to reduce the number of experiments that have to be executed. In
combinatorial catalyst research usually three kinds of activities are combined (Figure 6.5.).
Catalyst discovery is a multi-step iterative process. It starts with library design that
involves data-mining from the literature and considers many variables like precursor materials
and their relative concentrations, support materials, mixing conditions, calcination temperatures,
the reactor applied in testing and analytical tools. The designed library is then synthesized and
tested. In these phases automated processes are usually applied. In a single step only a small
fraction of the huge parameter space can be explored. For this reason the experimental data
coming out from the first step are usually considered as preliminary results that are the basis of
further iterations. The further experiments are guided by computational methods that can make
predictions based on the already existing experimental data (Figure 6.6.).

Figure 6.6. Iterative scheme of catalyst library design, fabrication and testing

In the informatics platforms used so far the following methods, some of them developed in the
area of artificial intelligence, have been applied:
Artificial neural network13
Genetic algorithm14
Holographic research strategy15
Support vector machines30
Decision trees31
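
The iterative design-test-learn scheme of Figure 6.6 can be illustrated with a deliberately simple Python sketch. It is not one of the published methods listed above: it uses a basic mutation-based search (in the spirit of a genetic algorithm without crossover), and the activity function toy_activity is a made-up stand-in for a real catalytic test. All names and default values are hypothetical.

```python
import random

def optimize(activity, n_vars=6, n_levels=10, library_size=20, n_rounds=5, seed=0):
    """Design-test-learn loop; returns (best composition, all tested results)."""
    rng = random.Random(seed)
    tested = {}
    # Round 1: a random library standing in for the literature-based design.
    library = [tuple(rng.randrange(n_levels) for _ in range(n_vars))
               for _ in range(library_size)]
    for _ in range(n_rounds):
        for composition in library:              # "preparation and testing"
            if composition not in tested:
                tested[composition] = activity(composition)
        # "Information mining": keep the five best compositions found so far.
        parents = sorted(tested, key=tested.get, reverse=True)[:5]
        library = []                             # "design" of the next library
        while len(library) < library_size:
            child = list(rng.choice(parents))
            i = rng.randrange(n_vars)
            step = rng.choice((-1, 1))
            child[i] = min(n_levels - 1, max(0, child[i] + step))
            library.append(tuple(child))
    best = max(tested, key=tested.get)
    return best, tested

def toy_activity(composition, target=(3, 7, 1, 5, 5, 2)):
    """Hypothetical activity: highest when the composition equals `target`."""
    return -sum((c - t) ** 2 for c, t in zip(composition, target))

best, tested = optimize(toy_activity)
print(best, tested[best], len(tested))
```

With the default settings at most 100 of the 10^6 possible compositions are tested, which shows the purpose of such guided iterations.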

6.2.1. Preparation and testing of catalyst libraries


There are different methods for the preparation of catalyst libraries. The fabrication methods
depend on the form in which the catalysts are tested.
Thin film arrays. Like other solid materials, catalysts can also be prepared and tested in
the form of thin film arrays. In their preparation the same or similar methods can be used as those
described in paragraph 6.1.1. for thin film materials. One approach is the use of vapor deposition
techniques: application of sputtering methods and masks. Parallel deposition of the components
in dissolved form is also used. The solutions are usually transferred to their place by automatic
liquid dispensers and are then evaporated. Both kinds of thin films are tested in either calcined or
reduced form.
One analytical method applied in testing thin film libraries is mass spectrometry. As
Figure 6.7. shows, each catalyst film of the library is sequentially heated by a CO2 laser beam to
the desired reaction temperature. The reactant gases are transported to the catalyst site through the
larger diameter tube of the probe and the deflected reaction products pass through the inner tube to the mass
spectrometer for analysis. By moving the probe and the heating beam in x-y directions over the plate (or by
moving the plate itself) containing the films, the probe and the laser beam heating can be
positioned at every catalyst site.
The advantage of this approach is that not only the activity of the catalyst can be tested
but its selectivity too, since the concentration of the reaction products can be determined by this
method. It is a disadvantage, however, that the determinations can be done only sequentially, one
library member at a time. A similar method was applied for analysis of the activity of catalyst
powder libraries deposited on addressable positions of a heated substrate.17

Figure 6.7. Testing the activity of thin film catalyst library by analysis of the
reaction products by mass spectrometry

A different approach is demonstrated in Figure 6.8. The oxidizing activity of 16 catalysts
is tested; the catalysts are deposited as pellets in small wells of a reactor. The gas submitted
to the catalytic process is in contact with all catalysts and the activities are detected using infrared
thermography. The active catalysts appear as bright spots. See the explanation in paragraph
5.2.2.1.18

Figure 6.8. Detection by IR thermography of the catalytic activity of Ir and Pd in a library of 16
components (Ir, Bi, V, Ti, Zn, Rh, Gd, Pd, Cr, Fe, Co, Ni, Cu, Ag, Er, Pt).

The advantage of using the IR thermography is that the activities of the library members
can be determined in parallel, that is, in a single experiment. This is much faster than the serial
determination. The disadvantage is that nothing is known about the products and the selectivity
of the catalysts.
Microreactors. The traditional devices for testing new catalysts are single laboratory
reactors. In combinatorial catalyst research these single reactors are replaced by a parallel array
of microreactors. The catalysts to be tested are prepared in series of small vials using automatic
liquid dispensing systems. After evaporation of the solvents the residues are calcined, ground
and then filled into the microreactors.
A possible arrangement of 16 microreactors is demonstrated in Figure 6.9. The reactant
gas mixture is distributed among the 16 heated microreactors. A constant gas stream flows
through all reactors and finally leaves at the outlet. A probe is sequentially positioned at one of
the reactors and directs the reacted gas mixture to the analyzer. The analysis is usually done by
mass spectrometry, gas chromatography or by a combination of both.

Figure 6.9. Schematic arrangement of 16 microreactors (end view and side view)

This arrangement is advantageous because both the activity and the selectivity of the
catalysts can be determined by MS and/or GC. The effect of other parameters, such as the
temperature, can also be tested. The disadvantage of this experimental setup is that the analysis
can be executed only sequentially, which is relatively slow.
A different experimental arrangement and application of a different analytical method
make possible the parallel analysis of the products of all reactors (Figure 6.10.).19
The device consists of a bundle of 49 catalyst cartridges each attached by a gold plated
nozzle to a 20 cm long analysis tube. The nozzle has at its center a small hole that allows the
gases to pass. At the end part of the bundle of the analysis tubes a CaF2 window, transparent to
IR radiation, is mounted. The distance from the end of the tubes is only 1 mm. Outside, at the
CaF2 window there is a semitransparent mirror that directs the IR radiation into the analysis
tubes.
The gaseous reaction mixture is fed into the catalyst cartridges through a common inlet.
The gas passes through the catalyst cartridges then enters through the nozzles into the analysis
tubes and finally leaves the apparatus in combined form at the outlet. The products are analyzed
while they are in the analysis tubes. The IR radiation reflected by the gold coated nozzles is
simultaneously determined by a focal plane array (FPA) detector.20 The FPA detector contains an
array of small detectors. The density is several thousand elements in a few square millimeters and
each element can record a full IR spectrum.

Figure 6.10. Schematic view of 49 parallel reactors analyzing the products with IR measurement
in parallel mode. a: catalyst cartridge (1 cm), b: analysis cell (20 cm, diameter 4 mm), c: inlet of
gaseous reaction mixture, e: bundle of 49 catalyst cartridges, f: heater, g: bundle of analysis
tubes, h: outlet of gaseous products, i: CaF2 window, j: semitransparent mirror, k: IR beam

The parallel microreactor systems can be applied in monolith form, too.21 The catalyst
active layer is deposited on the walls of the monolith by a wash-coat procedure.
Catalysts on beads. As described by Schunk and his colleagues, the catalysts can be
deposited on beads and can be tested in a new type of microreactor.22,23 As carriers of catalysts
uniform -Al2O3, -Al2O3, SiO2 or TiO2 beads are used. Their diameter is 1 mm and each
bead carries a catalyst of different composition. The catalysts are deposited on the beads by the wet
impregnation method using automated liquid dispensing. The beads are dried at 80 °C for 16 h
and calcined at 420 °C for 3 h in air.
The catalysts carried by the beads are tested in a special single bead reactor developed
for this purpose. The reactor has two identical parts: the base and the top (Figure 6.11.).
The wells in the upper and lower part of the microreactor are composed of silicon
membranes. The wells are etched into the membranes, then the membranes are combined by
silicon fusion bonding. The number of wells in a microreactor is 384 or 625. After filling the
reactors with beads the upper and lower parts of the reactor are pressed together or are
permanently bonded.

Figure 6.11. Beads in the wells of the single bead reactor

Figure 6.12. The one bead microreactor in the flange. a: top view, b: schematic side view

The reactors are mounted into a stainless steel flange (Figure 6.12.). The flange system
provides sealing and heating up to 450 °C. It is connected to the reaction gas feeding system and
provides a continuous flow of the gas through all individual reactor wells. It also contains a
sampling capillary that is sequentially and automatically positioned at the outlet of the reactor
wells and transfers the samples to a scanning mass spectrometer for analysis. The analysis time
per bead is about 25-80 sec.
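The consequence of sequential sampling can be estimated with a few lines of Python. The sketch only multiplies the numbers given above (384 or 625 wells, 25 to 80 seconds per bead) and ignores any time needed for positioning or equilibration, which is an assumption. The function name serial_time_hours is hypothetical.

```python
def serial_time_hours(n_wells, seconds_per_bead):
    """Total analysis time when the wells are sampled one after another."""
    return n_wells * seconds_per_bead / 3600.0

for n_wells in (384, 625):
    fastest = serial_time_hours(n_wells, 25)
    slowest = serial_time_hours(n_wells, 80)
    print(f"{n_wells} wells: {fastest:.1f} to {slowest:.1f} h")
```

A full reactor therefore takes roughly 3 to 14 hours to analyze, depending on the number of wells and the time spent per bead.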
The practice of preparing catalyst libraries on inorganic beads offered the possibility
of speeding up the process by using the split-mix method.2-4 The first such approach was patented
in 2000 (ref. 24) and applied for testing the single bead microreactor.25 A Mo-Bi-Co-Fe-Ni library was
prepared on 3000 -Al2O3 beads using the following solutions: (NH4)6Mo7O24.4H2O,
Bi(NO3)3.5H2O, Co(NO3)2.6H2O, Fe(NO3)2.9H2O and Ni(NO3)2.6H2O in molar concentrations of
0.025, 0.075, 0.25, 0.25 and 0.1, respectively.
The catalyst precursors were applied in the following order: Bi, Co, Fe, Ni. Before
addition of the precursor solutions the beads were split into 4 equal samples placed into small
dishes. The precursors were added in four different concentrations: 0, 0.002, 0.1 and 1 weight %.
After the impregnation period the solutions were evaporated, then the four samples of beads were dried
at 80 °C for 16 h, calcined at 400 °C and finally mixed (Figure 6.13.).

Figure 6.13. The first cycle of the split-mix process: the four samples of beads receive BiIII at
0.0, 0.002, 0.1 and 1.0 weight %, and each sample is dried and calcined


Subsequently three similar cycles were executed by sequentially adding CoII, FeII and
NiII in the same concentrations. Finally, the library was tested in the one bead microreactor. The
split-mix procedure was also used for the preparation of catalyst libraries by others.26
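
The four split-mix cycles described above can be simulated with a short Python sketch. The bead count, the four concentration levels and the order of the precursors are taken from the text. The random mixing with a fixed seed and the function name split_mix are assumptions made for the illustration, and Mo is left out because no variation of its level is described.

```python
import random
from collections import Counter

LEVELS = (0.0, 0.002, 0.1, 1.0)          # weight %, from the text
PRECURSORS = ("Bi", "Co", "Fe", "Ni")    # order of addition, from the text

def split_mix(n_beads=3000, seed=0):
    """Return one tuple of (Bi, Co, Fe, Ni) levels per bead."""
    rng = random.Random(seed)
    beads = [() for _ in range(n_beads)]
    portion = n_beads // len(LEVELS)
    for _ in PRECURSORS:
        rng.shuffle(beads)                                # mix
        treated = []
        for i, level in enumerate(LEVELS):                # split into 4 equal samples
            sample = beads[i * portion:(i + 1) * portion]
            treated.extend(bead + (level,) for bead in sample)  # impregnate
        beads = treated
    return beads

beads = split_mix()
counts = Counter(beads)
print(len(counts), "distinct compositions out of", len(LEVELS) ** len(PRECURSORS))
```

Four cycles with four portions each can give at most 4^4 = 256 different compositions, so with 3000 beads each composition is carried by about 12 beads on average.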

6.2.2. Catalyst library design


In the field of combinatorial catalysis different approaches are used for library design.
Industrial companies, like Symyx, Avantium and hte GmbH, use their own proprietary
methods. In academic research the Genetic Algorithm (GA) is widely applied. The combination
of GA with Artificial Neural Networks (ANNs) has also been reported. Recently, a new
approach, the Holographic Research Strategy (HRS), and its combination with ANNs have been
described.
The holographic research strategy (HRS). HRS was developed with the intention of making
it possible to explore the huge parameter space in catalyst research and to find the best catalyst
composition with a reasonable number of experiments. This approach was developed by a
Hungarian group and its strength has been demonstrated using both
hypothetical and real experimental data.15
HRS is based on a special arrangement of multi-dimensional data in a 2D space. The
compositions of the catalysts are represented in two dimensions by plotting the substrates and
their quantities on the two (x,y) axes of a sheet. Before discussing HRS, the different possibilities for
plotting repeating quantities have to be shown (Figure 6.14.).

Figure 6.14. Two possibilities for plotting repeating quantities along an axis

The same quantities are arranged in the two plots but in different order. In plot a there are
large changes in the quantities along the horizontal axis. In plot b, however, the quantities
change smoothly in a wave-like manner. Any given quantity differs from its neighbors by only
one unit. In HRS this second plotting form is applied.
Let us suppose that in a hypothetical experiment six precursors are used, represented by A,
B, C, D, E and F. Their concentration levels are shown in Table 6.1.

Table 6.1. The concentration levels of the precursors

Level   A     B     C     D     E     F
1       0.0   0.0   0.0   0.0   0.0   1.0
2       0.2   0.2   0.2   0.2   0.2   0.2
3       0.4   -     0.4   0.5   -     0.5
4       -     -     0.6   1.0   -     -

The precursors A, B, C, D, E, F are used in 3, 2, 4, 4, 2 and 3 concentration levels,
respectively. So the number of catalyst compositions is 3x2x4x4x2x3=576. In HRS the different
catalyst compositions can be represented in a holographic sheet demonstrated in Figure 6.15.
Each rectangle corresponds to one of the 576 catalyst compositions. The parallel lines above and
to the left of the axes represent the concentration levels of the precursors incorporated into the catalysts
represented by the rectangles below and to the right of the lines, respectively. The order of precursor
plotting is read upward at the horizontal axis and right to left at the vertical axis (Figure
6.15.).
As shown in Figure 6.15, moving from left to right along the X-axis the value of A
increases in 3 steps to its maximum, thus forming a half wave. At the first level of A the level of
B increases in the possible two steps to its maximum, then at the second level of A it decreases in two
steps to its minimum. At the third level of A, in two additional steps, B increases again to its maximum,
forming altogether one and a half waves. Analogously, the four possible levels of C are plotted
above each single fixed level of B, forming three full waves. The full combination of the levels
of the variables leads to 24 data points along the X-axis.
The remaining D, E and F variables are similarly plotted along the Y-axis, leading to 24
data points. Eventually all of the 576 catalyst compositions appear in the 2D holographic
representation.
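
The wave-like ordering along one axis can be generated with a short Python sketch. It is an interpretation of the description above, not the published HRS code: the first variable changes slowest, and the direction in which each further variable is run through alternates from block to block, so that neighboring points differ by one level of one variable. The function name wave_axis is hypothetical and levels are given as indices starting at 0.

```python
def wave_axis(level_counts):
    """Level-index tuples along one axis in wave-like order.

    level_counts: number of levels of each variable, slowest variable first,
    e.g. [3, 2, 4] for A, B and C.
    """
    sequence = [()]
    for n in level_counts:
        extended = []
        for i, prefix in enumerate(sequence):
            # Alternate between rising and falling runs to form the waves.
            levels = range(n) if i % 2 == 0 else range(n - 1, -1, -1)
            extended.extend(prefix + (k,) for k in levels)
        sequence = extended
    return sequence

x_axis = wave_axis([3, 2, 4])   # A, B, C
y_axis = wave_axis([4, 2, 3])   # D, E, F
print(len(x_axis), len(y_axis), len(x_axis) * len(y_axis))
print([point[1] for point in x_axis[::4]])   # level of B in each block of C
```

The second line of output shows B rising, falling and rising again, the one and a half waves described above, and the product 24 x 24 gives the 576 rectangles of the holographic sheet.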

Figure 6.15. Holographic representation of catalyst compositions by HRS.
The A, B and C precursor concentrations are plotted along the horizontal axis, those of D, E and
F along the vertical axis. The different concentrations are represented by the lines parallel with
the axes.

HRS helps to find the best catalytic performance by testing as few as possible of the
576 potential compositions. The initial step is the selection of the first catalyst
library as shown in Figure 6.15. Its members are highlighted by gray rectangles.
After determination of the best catalytic performance a special two dimensional
transformation is applied. This means that the order of the precursors is changed in plotting.
The A, B, C and D, E, F order of the precursor plotting is replaced by C, B, A and F, D,
E, respectively (Figure 6.16/a). As a result of the transformation the best composition (hit) has a
new neighborhood. In the close vicinity of the hit there are compositions that have not been
tested. As the next step of the iteration an area of 5x5 rectangles, a so called experimental window,
around the black rectangle (shown by the enhanced large rectangle) is assigned for
experimental testing. The best catalyst coming out of these experiments is shown by the small
enhanced rectangle. The two dimensional transformation is then repeated: the orders C, B, A and
F, D, E are replaced by C, B, A and F, E, D, respectively.
The black rectangle in Figure 6.16/b shows the new position of the best catalyst. In the
next iteration a 5x5 set of catalyst compositions is again tested and the result is represented by the
enhanced small rectangle in the figure. The results of two more iterations are seen in Figures
6.16/c and d. The small enhanced rectangle appearing in Figure 6.16/d is supposed to represent
the global maximum of the catalyst activities. By applying HRS, the maximum catalyst activity
can be determined by testing only a fraction of the full parameter space. However, the algorithm
can get stuck in a local optimum if the experimental window is too small. This potential
difficulty can be overcome by selecting not only the single hit, but also the second and third best
compositions. Additionally, the efficiency of the optimization can be enhanced by the combined
application of artificial neural networks and HRS.
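
The effect of the two dimensional transformation on the neighborhood of a hit can be illustrated with a Python sketch. It is a simplified reading of the procedure, not the published implementation: the axes are built with the wave-like ordering described for Figure 6.15, the hit is a made-up composition, and the names wave_axis and experimental_window are hypothetical. Levels are given as indices starting at 0.

```python
LEVELS = {"A": 3, "B": 2, "C": 4, "D": 4, "E": 2, "F": 3}  # level counts, Table 6.1

def wave_axis(level_counts):
    """Level-index tuples along one axis in wave-like order (slowest first)."""
    sequence = [()]
    for n in level_counts:
        extended = []
        for i, prefix in enumerate(sequence):
            levels = range(n) if i % 2 == 0 else range(n - 1, -1, -1)
            extended.extend(prefix + (k,) for k in levels)
        sequence = extended
    return sequence

def experimental_window(hit, x_order, y_order, half_width=2):
    """Compositions inside the square window centered on `hit`."""
    x_seq = wave_axis([LEVELS[v] for v in x_order])
    y_seq = wave_axis([LEVELS[v] for v in y_order])
    cx = x_seq.index(tuple(hit[v] for v in x_order))
    cy = y_seq.index(tuple(hit[v] for v in y_order))
    cells = []
    for x in range(max(0, cx - half_width), min(len(x_seq), cx + half_width + 1)):
        for y in range(max(0, cy - half_width), min(len(y_seq), cy + half_width + 1)):
            composition = dict(zip(x_order, x_seq[x]))
            composition.update(zip(y_order, y_seq[y]))
            cells.append(composition)
    return cells

hit = {"A": 1, "B": 0, "C": 2, "D": 1, "E": 1, "F": 0}    # hypothetical hit
before = experimental_window(hit, "ABC", "DEF")
after = experimental_window(hit, "CBA", "FDE")            # transformed plotting order

def as_set(cells):
    return {tuple(sorted(c.items())) for c in cells}

new_neighbors = as_set(after) - as_set(before)
print(len(before), len(after), len(new_neighbors))
```

The hit itself stays in both windows, but after the plotting order is changed the window contains compositions that were not neighbors of the hit before, which is what allows the search to move through the parameter space.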

Figure 6.16. The holographic presentations after the two dimensional transformations. Black
rectangle: location of the best catalyst in the transformed representation, large enhanced
rectangle: catalyst compositions selected for the next catalyst generation, small enhanced
rectangle: best experimental catalyst composition within the catalyst generation

6.3. Polymers
Polymers are an important class of materials. Their application is so widespread that our
life today could not be imagined without them. They are used as structural, packaging and
coating materials, they are components of our clothes and they are applied even in
microelectronics and nanotechnology. Their properties depend not only on composition but to a
high degree on the conditions of their processing, which is affected by a large number of variables. The
combinatorial methods that have been introduced in this area help to determine the
influence of these variables faster. In this respect the polymer libraries prepared in the form of
continuous thin films are very important. The dependence of the properties of the films, like
dewetting, phase behavior, surface morphology and crystallization, can be studied by optical
means. Studying continuous gradient films instead of using conventional approaches
makes the research faster, cheaper and more successful. Below, the fabrication of some thin film
libraries is described.

Figure 6.17. Principle of the flow coating process generating continuous gradient thickness films.
S: substrate, : blade angle, G: height of the blade above the substrate (typically from tens of
microns to hundreds of microns), H: thickness of the wet film, h: thickness of the dry film. The
substrate is moving in the direction of the white arrow.
(Reprinted with permission from C. M. Stafford et al. Rev. Sci. Instrum. 2006, 77, 023908)

Generating thickness gradient libraries. Thickness is an important factor in the behavior
of thin films. Figure 6.17. shows a velocity-gradient knife coater that can be applied to prepare
coatings and thin films containing a continuous thickness gradient.27 The substrate can be, for
example, a polished silicon wafer or a glass slide. As the figure shows, a polymer solution is spread
on a substrate under a knife edge by moving the substrate at constant acceleration. The result is a
dried film with controllable thickness. The thickness is controlled by the instantaneous velocity
of the substrate relative to the blade. Lower velocities generate thinner films, while higher
acceleration results in a relatively steep gradient.
The thickness of the films varies in one dimension. The thickness at different
positions can be determined by using a UV-visible interferometer.28
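The velocity profile behind the thickness gradient follows from elementary kinematics and can be sketched in Python. For a substrate moving at constant acceleration a from an initial velocity v0, the velocity at the moment when the point at distance x passes under the blade is sqrt(v0^2 + 2ax). The numerical values below are hypothetical, and the conversion of velocity into film thickness is not attempted because it requires a calibration that is not given here.

```python
from math import sqrt

def blade_velocity(x_mm, acceleration, v0=0.0):
    """Substrate velocity (mm/s) when the point at x_mm passes under the blade.

    Constant acceleration (mm/s^2) from the initial velocity v0 (mm/s).
    """
    return sqrt(v0 ** 2 + 2.0 * acceleration * x_mm)

# Hypothetical run: 2 mm/s^2 acceleration over a 25 mm long film.
profile = [(x, round(blade_velocity(x, 2.0), 2)) for x in (0, 5, 10, 15, 20, 25)]
print(profile)
```

Since lower velocities give thinner films, the film is thinnest where coating starts and thickest at the far end; a higher acceleration makes the velocity, and with it the thickness, change more steeply with position.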
Composition-gradient libraries. The properties of thin polymer films largely depend on
composition. Preparation and examination of composition-gradient thin film libraries make it
possible to speed up the determination of composition-property relationships. The
composition-gradient libraries are prepared in a process involving three steps illustrated in Figure 6.18.28
The first step is the preparation of the gradient (Figure 6.18/a). Before starting the preparation the
small vial is filled with polymer solution B. When the preparation starts, two syringe pumps begin
to operate. One of them introduces polymer solution A at rate v1 and the other one withdraws
polymer solution at rate v2. As a result, the composition in the vial continuously changes. A small
amount of the solution is continuously extracted with an automated sample syringe. The
composition of the solution entering the syringe is also continuously changing. At the
beginning the solution is rich in component B and at the end it is rich in polymer A. So at the end
the syringe contains a gradient along its length.
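How the composition in the vial changes with time can be estimated with a simple mass balance, sketched below in Python. The sketch assumes a perfectly stirred vial, treats the fractions as volume fractions, and lumps the withdrawal pump and the sample syringe into one outflow rate; the numerical values and the function name fraction_b are hypothetical.

```python
import math

def fraction_b(t_end, volume0, rate_in, rate_out, dt=0.001):
    """Fraction of solution B in a well-mixed vial after time t_end.

    The vial starts full of B (volume0); solution A enters at rate_in and
    mixed solution leaves at rate_out. Simple Euler integration in time.
    """
    volume, amount_b = volume0, volume0
    for _ in range(round(t_end / dt)):
        phi_b = amount_b / volume
        amount_b -= rate_out * phi_b * dt      # B leaves with the mixture
        volume += (rate_in - rate_out) * dt    # net volume change
        if volume <= 0:
            raise ValueError("the vial has run empty")
    return amount_b / volume

# Hypothetical run: 1 mL vial, 1 mL/min in and out, sampled every half minute.
for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(t, round(fraction_b(t, 1.0, 1.0, 1.0), 3))
```

With equal inflow and outflow the B fraction decays as exp(-v1 t / V0), which the numerical result reproduces; different rates change the shape of the gradient stored in the syringe.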

Figure 6.18. Preparation of a composition-gradient library from polymers A and B. a: mixing A
and B and extracting the gradient, b: deposition of the gradient, c: film spreading

In the second step (Figure 6.18/b) the content of the syringe is deposited on a substrate as
a stripe. The composition of the deposited stripe forms a continuous gradient.
The third step (Figure 6.18/c) is spreading the composition-gradient stripe on the substrate
orthogonal to its direction by using a knife-edge coater. After the solvent evaporates a continuous
linear gradient film remains on the substrate. The remaining solvent is removed under vacuum
during annealing.

Figure 6.19. Introducing the temperature gradient

Temperature-gradient libraries. Two dimensional gradients.28,29 The thin polymer films
deposited on the substrate can be submitted to annealing at continuously changing temperatures.
If the film is a thickness- or composition-gradient library then the direction of the temperature
gradient is chosen to be orthogonal to the primary one. This way two dimensional gradient
libraries can be formed. One of the two parameters changes in the x direction, the other one in the y direction.
The film differs from point to point in at least one of the two parameters and so it can be the
source of a very large number of experimental data.
Figure 6.19. shows how a thickness-temperature-gradient library can be formed. The
direction of the temperature gradient is perpendicular to the direction of the thickness gradient.
The device is an aluminum T-gradient stage. It uses a heat source and a heat sink to produce a
linear gradient ranging between adjustable end point temperatures (160 to 70 °C over 40 mm).
The end point temperatures can be adjusted within the limits of the heater, the cooler and the
maximum heat flow through the aluminum plate.
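For a linear gradient the annealing temperature at any position of the stage follows directly from the end point temperatures. The Python sketch below uses the values quoted above (160 to 70 °C over 40 mm) as defaults and assumes an ideally linear profile; the function name stage_temperature is hypothetical.

```python
def stage_temperature(x_mm, t_hot=160.0, t_cold=70.0, length_mm=40.0):
    """Temperature (deg C) at distance x_mm from the hot end of the stage."""
    if not 0.0 <= x_mm <= length_mm:
        raise ValueError("position lies outside the gradient stage")
    return t_hot + (t_cold - t_hot) * x_mm / length_mm

gradient = (160.0 - 70.0) / 40.0          # deg C per mm
print(gradient)
print([stage_temperature(x) for x in (0, 10, 20, 30, 40)])
```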

Figure 6.20. High-throughput screening by computer controlled microscope
(Reprinted with permission from J.C. Meredith et al. Macromolecules 2000, 33, 9747,
Copyright (2000) American Chemical Society)

High-throughput screening. The simplest way to study the one or two dimensional
gradient libraries is optical microscopy. Figure 6.20. exemplifies this. A digital camera coupled
to an optical microscope takes 1024x1024 pixel images and sends them to a computer for analysis.
The computer also controls the x-y movement of the sample stage over a predetermined grid that
divides the sample area into a virtual array of cells. The cells are photographed in a serial manner
and the magnified images are sent to the computer.
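The predetermined grid of stage positions can be generated with a few lines of Python. The sketch is only an illustration: the serpentine order (alternating direction from row to row) is an assumption that shortens the stage travel, and the grid size, cell size and function name scan_positions are hypothetical.

```python
def scan_positions(n_cols, n_rows, cell_mm):
    """Centers (x, y) in mm of the virtual cells, visited row by row."""
    positions = []
    for row in range(n_rows):
        # Reverse every second row so the stage never jumps back to the start.
        cols = range(n_cols) if row % 2 == 0 else range(n_cols - 1, -1, -1)
        for col in cols:
            positions.append(((col + 0.5) * cell_mm, (row + 0.5) * cell_mm))
    return positions

path = scan_positions(n_cols=4, n_rows=3, cell_mm=2.0)
print(len(path))
print(path[:5])
```

Each position can then be combined with the known gradients of the library, so that every image is labeled with the thickness or composition and the temperature of its cell.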


Figure 6.21. Composite images of a thickness-temperature gradient polystyrene library
(Reprinted with permission from J.C. Meredith et al. Macromolecules 2000, 33, 9747,
Copyright (2000) American Chemical Society)

As an example, Figure 6.21 shows the result of a dewetting experiment carried out with a
thickness-temperature gradient polystyrene library. The thickness ranged from 33 to 90 nm.
The endpoint temperatures were 135.0 ± 0.5 and 75.0 ± 0.1 °C over 40 mm (gradient 2 °C/mm).
Figure 6.21 is a composite picture.
Figure 6.22 shows magnified images of the A, B and D boxed regions of Figure 6.21.
These photos illustrate how the structures within the library depend on thickness and temperature.

Figure 6.22. Close-up images of the boxed areas in Figure 6.21
(Reprinted with permission from J.C. Meredith et al. Macromolecules 2000, 33, 9747,
Copyright (2000) American Chemical Society)

The methods demonstrated above represent only a few examples of the numerous
combinatorial approaches applied in the field of polymer research. Even these few examples
demonstrate the applicability of combinatorial methods in this area.


References
1. J. J. Hanak J. Mater. Sci. 1970, 5, 964.
2. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó, In Highlights of Modern Biochemistry,
Proceedings of the 14th International Congress of Biochemistry, VSP. Utrecht, The
Netherlands, 1988, Vol. 5, p 47.
3. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Proceedings of the 10th International
Symposium of Medicinal Chemistry, Budapest, Hungary, 1988, p 288, Abstract P-168.
4. Á. Furka, F. Sebestyén, M. Asgedom, G. Dibó Int. J. Peptide Protein Res. 1991, 37, 487.
5. H. M. Geysen, R. H. Meloen, S. J. Barteling Proc. Natl. Acad. Sci. USA 1984, 81, 3998.
6. X.-D. Xiang, X. Sun, G. Briceno, Y. Lou, K.-A. Wang, H. Chang, W. G. Wallace-Freedman,
S.-W. Chen, and P. G. Schultz Science 1995, 268, 1738.
7. J. Wang, Y. Yoo, C. Gao, I. Takeuchi, X. Sun, H. Chang, X.-D. Xiang, P. G. Schultz
Science 1998, 279, 1712.
8. S. P. A. Fodor, J. L. Read, M. C. Pirrung, L. Stryer, A. T. Lu and D. Solas Science 1991,
251, 767.
9. X.-D. Xiang In I. Sucholeiki (Ed) High Throughput Synthesis, Principles and Practices,
Marcel Dekker Inc. 2000, 231.
10. J. D. Hewes, D. Kaiser, A. Karim, E. Amis Combinatorial Chemistry
http://polymers.msel.nist.gov/.
11. H. Chang, X.-D. Xiang In I. Sucholeiki (Ed) High Throughput Synthesis, Principles and
Practices, Marcel Dekker Inc. 2000, 251.
12. Selim Senkan Angew. Chem. Int. Ed. 2001, 40, 312.
13. Z. Hou, Q. Dai, X. Wu, G. Chen Appl. Catal. A.: General 1997, 161, 183.
14. V. Nissen Evolutionäre Algorithmen, Deutscher Universitätsverlag, Bamberg, 1994.
15. L. Végvári, A. Tompos, S. Gőbölös, J. L. Margitfalvi Catal. Today 2003, 81, 517.
16. P. Cong, R. D. Doolen, Q. Fan, D. M. Giaquinta, S. Guan, E. W. McFarland, D. M.
Poojary, K. Self, H. W. Turner, W. H. Weinberg Angew. Chem. Int. Ed. 1999, 38, 484.
17. M. Orschel, J. Klein, H. W. Schmidt, W. F. Maier Angew. Chem. Int. Ed. 1999, 38, 2791.
18. F. C. Moates, M. Somani, J. Annamalai, J. T. Richardson, D. Luss, R. C. Willson Ind.
Eng. Chem. Res. 1996, 35, 4801.
19. P. Kubanek, O. Busch, S. Thomson, H. W. Schmidt, F. Schüth J. Comb. Chem. 2004, 6,
420.
20. C. M. Snively, G. Oskarsdottir, J. Lauterbach Angew. Chem., Int. Ed. Engl. 2001, 40,
3028.
21. M. Lucas, P. Claus Appl. Catal. A.: General 2003, 254, 35.
22. S. A. Schunk, C. Baltes, J. Klein OIL GAS European Magazine 2/2005, 77.
23. T. Zech, G. Bohner, O. Laus, J. Klein Rev. Sci. Instrum. 2005, 76, 062215-1.
24. WO 002002043860A2
25. J. Klein, T. Zech, J. M. Newsam, S. A. Schunk Appl. Catal. A.: General 2003, 254, 121.
26. Y. Sun, B. C. Chan, R. Ramnarayanan, W. M. Leventry, T. E. Mallouk, S. R. Bare, R. R.
Willis J. Comb. Chem., 2002, 4, 569.
27. C. M. Stafford, K. E. Roskov, T. H. Epps III, M. J. Fasolka Rev. Sci. Instrum. 2006, 77,
023908.
28. J. C. Meredith, A. Karim, E. J. Amis MRS Bulletin April 2002.
29. J. C. Meredith, A.P. Smith, A. Karim, E.J. Amis, Macromolecules 2000, 33, 9747.
30. L. A. Baumes, J. M. Serra, P. Serna, A. Corma J. Comb. Chem., 2006, 8, 583.
31. L. A. Baumes, M. Moliner, A. Corma, QSAR & Comb. Sci., 2007, 26, 255.


7. Computational aspects of library design and synthesis


The appearance of combinatorial and HTS methods made the synthesis and screening of
millions of compounds a reality. This generated such a huge amount of data that conventional
bookkeeping proved unable to handle it. To overcome this situation, new software
companies were founded that produced many products to serve the needs of combinatorial and
HTS methods. Since the application of combinatorial chemistry began in pharmaceutical research,
the computational tools were developed with the needs of drug research in mind. Some software
packages are used in library design, others support the chemistry, and still others proved
helpful in data recording, analysis and retrieval.
Drug research is a long and expensive process. The chemical part of the discovery of a
drug usually begins after the therapeutic target has been identified. First a lead compound has to
be discovered that shows at least a limited effect on the target. Then comes the optimization
process in which, by modifying the structure of the lead, a more effective compound has to be
found and, at the same time, the unwanted side effects have to be reduced to a minimum.
Although candidate molecules isolated from natural sources are still very important,
synthetic small organic molecules are even more important. Because combinatorial synthetic
processes are highly effective, they are the preferred preparation methods in both lead
discovery and optimization. The best approach to finding a lead compound is to synthesize
compound libraries and screen them. It is still a big problem, however, what to synthesize, in
other words how to design the libraries that are then prepared by the combinatorial methods.
In contrast to peptide libraries, which have a limited number of components that can easily be
calculated, the potential number of small organic compounds is very difficult, if not
impossible, to calculate. According to one estimate,1 their number is in the range of 10^200.
It is very likely that if one could synthesize and test them all, one or even more effective
molecules could be found for every potential therapeutic target. The problem is that even if all
the matter of the visible universe were converted into single molecules, only a very small
fraction of the total number of possible structures would be accessible. To put it differently:
synthesizing 1,000 libraries of 1 million components each every year, the synthesis would
require 10^191 years, while our universe is only about 10^10 years old.
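The arithmetic behind these figures can be checked with a few lines of code. The sketch below simply takes the estimate of 10^200 possible structures as given; the variable names are chosen for the example only.

```python
# Figures quoted in the text; the estimate of 10^200 structures is taken as given.
possible_structures = 10**200
libraries_per_year = 1_000
components_per_library = 10**6
age_of_universe_years = 10**10

compounds_per_year = libraries_per_year * components_per_library  # 10^9
years_needed = possible_structures // compounds_per_year          # 10^191


def power_of_ten(n):
    """Exponent k of an exact power of ten n = 10^k."""
    return len(str(n)) - 1


print(f"compounds per year : 10^{power_of_ten(compounds_per_year)}")
print(f"years needed       : 10^{power_of_ten(years_needed)}")
print(f"universe lifetimes : 10^{power_of_ten(years_needed // age_of_universe_years)}")
```

At a rate of 10^9 compounds per year the synthesis would thus take 10^191 years, that is 10^181 times the age of the universe.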
Taking into account the problem outlined above, one concludes that the libraries to be
prepared need to be carefully designed. A generally accepted approach is to prepare highly
diverse libraries in the lead discovery phase. In the lead optimization phase, on
the other hand, the preferred library components are structurally similar to the lead molecule.
The problem is that neither molecular diversity nor structural similarity can be exactly defined.
Another problem is that structural similarity in many cases does not mean biological similarity:
a very slight modification in the structure of the lead often results in complete loss of the
biological activity.
In the early phase of the pharmaceutical discovery process a high-diversity
compound library is usually needed because this kind of library offers a good chance of finding
a lead. In the optimization stage, on the other hand, it is the similarity of the components to
the lead that matters, not the high diversity of the library.


A considerable theoretical effort has been devoted to making the characterization of
molecular diversity and similarity computable. Hundreds of indices and descriptors have been
introduced for this purpose. The 2D descriptors,2 for example, are based on predefined structure
fragments. The presence or absence of these fragments in a molecule is expressed as a binary
bitstring. The 3D descriptors,3 by using distances between atoms and bond angles, reflect the
three-dimensional shapes of molecules and as such are expected to model biological behavior
better than the 2D descriptors. The conformational flexibility of the molecules is, however, a
source of serious difficulties in applications. Other descriptors like topological indices,4
2D and 3D fingerprints,5 steric fields,6 atom pair fingerprints7,8 and others are also in use.
The software developed to help library design applies these descriptors and indices.
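How the presence or absence of predefined fragments turns into a binary bitstring can be shown with a toy example. This is only a sketch: the six-fragment key list is invented for the illustration (real descriptor sets use hundreds of keys), and the fragments of each molecule are entered by hand instead of being detected by substructure search.

```python
# Hypothetical fragment dictionary; each position of the bitstring
# corresponds to one predefined fragment.
FRAGMENT_KEYS = [
    "carboxylic acid",
    "primary amine",
    "benzene ring",
    "amide",
    "hydroxyl",
    "halogen",
]


def fingerprint(fragments_present):
    """Return a bitstring: '1' where the fragment occurs, '0' otherwise."""
    return "".join("1" if key in fragments_present else "0" for key in FRAGMENT_KEYS)


# Fragments assigned by hand for two simple molecules.
glycine = {"carboxylic acid", "primary amine"}
paracetamol = {"benzene ring", "amide", "hydroxyl"}

print("glycine    ", fingerprint(glycine))
print("paracetamol", fingerprint(paracetamol))
```

Two molecules can then be compared bit by bit, which is much faster than comparing the structures themselves.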
In library design two main strategies can be distinguished:
1. Reactant based approach
2. Product based approach.
In the reactant based approach an optimized subset of building blocks is selected,
considering primarily the experimental possibilities of the synthesis and not the properties of
the products. With this approach the synthesis usually produces combinatorial libraries. In the
product based approach, on the other hand, the properties of the produced molecules are taken
into account and the reactants are selected accordingly. Typically huge virtual combinatorial
libraries are enumerated and then the compound arrays to be synthesized are selected by applying
different filters to remove the unwanted components. Since the selection is a cherry picking
process, the resulting compound array is not a combinatorial type library and cannot be
efficiently synthesized by the split-mix method. As examples, two applications are mentioned:
DiverseSolutions and Selector developed by Tripos.
DiverseSolutions assesses the chemical diversity of libraries, selects diverse or
representative subsets and compares their diversity. It also identifies those molecular metrics that
best distinguish differences between compounds.
Selector characterizes, compares and samples sets of compounds. The available
descriptors include fingerprints and atom pair distances. Clustering tools identify relationships
between compounds based on their similarity. Compound selection includes Tanimoto
Dissimilarity and the Reciprocal Nearest Neighbor approach.
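The idea of selecting compounds by Tanimoto dissimilarity can be sketched in a few lines. The code below is not the algorithm of Selector or of any other commercial program. It uses a simple greedy "MaxMin" picking scheme as a stand-in, and the four fingerprints (given as sets of the positions of the bits that are switched on) are invented for the example.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def dissimilarity(a, b):
    """Tanimoto dissimilarity, 1 - similarity."""
    return 1.0 - tanimoto(a, b)


def pick_diverse(fingerprints, n_pick, start):
    """Greedy MaxMin selection: repeatedly add the compound whose nearest
    already chosen neighbour is farthest away."""
    if n_pick > len(fingerprints):
        raise ValueError("cannot pick more compounds than the library holds")
    chosen = [start]
    while len(chosen) < n_pick:
        candidates = [name for name in fingerprints if name not in chosen]
        best = max(
            candidates,
            key=lambda c: min(
                dissimilarity(fingerprints[c], fingerprints[s]) for s in chosen
            ),
        )
        chosen.append(best)
    return chosen


# Hypothetical library: compound name -> set of fingerprint bits that are on.
library = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {6, 7, 8},
    "D": {1, 2, 7, 8},
}

print(pick_diverse(library, n_pick=3, start="A"))
```

Starting from A, the scheme first adds C, which shares no bits with A, and then D. Compound B is left out because it is nearly identical to A.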
Drug-likeness of the library components is another preferred requirement, even though it
cannot be exactly defined. Physical and chemical properties of the molecules like
solubility, hydrophobicity, number of H-bond donor and acceptor centers, reactivity, biological
relevance and others are also taken into account. Lipinski and his colleagues analyzed the
structures of a large number of marketed drugs and summarized their findings in the very
important Lipinski's Rule,9 which has to be considered in library design:

Molecular weight > 500
Number of hydrogen bond donors > 5
Number of hydrogen bond acceptors > 10
ClogP > 5 or MlogP > 4.15

According to the rule, if any two of the above conditions are satisfied, poor
absorption or permeation of the compound is indicated.
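The rule translates directly into a simple filter. The sketch below follows the formulation given in the text (two or more satisfied conditions indicate a problem) and, as a simplification, uses only the ClogP limit. The function names and the property values of the three example compounds are hypothetical.

```python
def lipinski_flags(mol_weight, h_donors, h_acceptors, clogp):
    """Return one boolean per condition of the rule (True = condition satisfied)."""
    return [
        mol_weight > 500,
        h_donors > 5,
        h_acceptors > 10,
        clogp > 5,  # the MlogP > 4.15 alternative is not modelled here
    ]


def poor_absorption_likely(mol_weight, h_donors, h_acceptors, clogp):
    """True if any two (or more) of the conditions are satisfied."""
    return sum(lipinski_flags(mol_weight, h_donors, h_acceptors, clogp)) >= 2


# Hypothetical compounds: (molecular weight, H-bond donors, H-bond acceptors, ClogP)
candidates = {
    "compound X": (350.0, 2, 5, 2.1),
    "compound Y": (720.0, 6, 12, 3.0),
    "compound Z": (520.0, 1, 3, 4.0),
}

for name, props in candidates.items():
    verdict = "flagged" if poor_absorption_likely(*props) else "passes"
    print(f"{name}: {verdict}")
```

Compound Z exceeds only the molecular weight limit and therefore still passes, whereas compound Y satisfies three conditions and is flagged.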


Blood-brain barrier penetration is another property that needs to be considered.
Methods of quantitative structure-activity relationship (QSAR), including artificial neural
networks and genetic algorithm based approaches, are also widely applied.
Other software used in library design and drug search includes CombiLibMaker,
ChemEnlighten, TOPKAT and ChemSpace.
CombiLibMaker performs virtual combinatorial chemistry. Libraries can be defined and
enumerated with full control of stereochemistry. The generated virtual libraries can be stored in
databases for subsequent searching and retrieval. The libraries can be submitted to virtual
screening including docking. CombiLibMaker reads and writes all of the common 2D and 3D
structural database formats.
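What enumeration of a virtual library followed by filtering means can be illustrated with a toy example. It is in no way a model of CombiLibMaker. The sketch combines three amines with two carboxylic acids into six virtual amides, estimates each molecular weight as the sum of the two building blocks minus one water molecule (masses rounded), and applies an arbitrary, hypothetical molecular weight cutoff as the only filter.

```python
from itertools import product

# Building blocks with rounded molecular weights (g/mol).
amines = {"methylamine": 31.06, "aniline": 93.13, "benzylamine": 107.15}
acids = {"acetic acid": 60.05, "benzoic acid": 122.12}
WATER = 18.02  # lost in the amide-forming condensation

# Enumeration: every amine is combined with every acid.
virtual_library = [
    (amine, acid, round(amines[amine] + acids[acid] - WATER, 2))
    for amine, acid in product(amines, acids)
]

# Filtering: keep only products below a hypothetical cutoff.
MW_CUTOFF = 150.0
selected = [entry for entry in virtual_library if entry[2] < MW_CUTOFF]

# A full combinatorial array contains every pairing of the building
# blocks that occur in it.
used_amines = {amine for amine, _, _ in selected}
used_acids = {acid for _, acid, _ in selected}
is_full_array = len(selected) == len(used_amines) * len(used_acids)

for amine, acid, mw in selected:
    print(f"{amine} + {acid}: MW = {mw}")
print("full combinatorial array:", is_full_array)
```

Four of the six virtual products survive the filter, but they no longer form a complete amine x acid array. This is the reason why a cherry-picked selection cannot be prepared efficiently by the split-mix method.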
ChemEnlighten is a decision support program for scientists who work with lists of
compounds to set priorities for synthesis, screening and purchase. It provides access to vital
information and analysis tools. Different databases can be searched and standard descriptors
such as molecular weight and hydrogen bond donors and acceptors can be calculated. The program
also works with ClogP/CMR to calculate log P, molar refractivity, molecular connectivity, shape
and topology metrics. Different subsets can be quickly selected.
TOPKAT offers quantitative structure-toxicity relationship (QSTR) models that predict
toxicity of a compound solely from its structure.
ChemSpace helps to decide which chemistry should be used in the synthesis of a library
and which chemistry will most likely result in activity. A typical virtual library contains at least
50 million compounds. They can be screened according to their physicochemical properties,
novelty, drug-likeness, diversity, therapeutic relevance and synthetic feasibility.
There are other kinds of software that help the combinatorial chemist in the practical
realization of library synthesis. Which synthesis route to choose? Are the starting materials
available? Where to buy them at a reasonable price? These are important questions. Software of
MDL Information Systems, Inc. helps to answer them.
The Available Chemicals Directory (ACD) is a very important database in which the starting
materials and reagents can be found. The ACD lists 435,000 chemicals that can be
purchased from 680 suppliers. The database is continually updated. Once the list of starting
compounds and reagents has been selected, ACD helps to identify and locate the commercial
sources, and side-by-side comparisons can be made concerning purity, quantity and price.
With MDL ISIS ACD Finder the compounds can be searched by structure, name and
formula. Figure 7.1 shows 3-ethyl butylacetate as an example. The ACD also links to the Pure
Substance Database, which provides safety, hazard and regulatory information. The structures can
be entered using ISIS Draw. This is a structure drawing program that can also be used in the
construction of the library. It can be downloaded free of charge from the home page of MDL.
Other software of MDL provides fast access to synthetic methodology information by
connecting directly to the chemical literature.

CrossFire Beilstein provides access to the most important collection of preparations of
organic compounds. The huge amount of data, like those mentioned below, is easily
searched by computer.
CrossFire Gmelin: preparation of inorganic compounds collected over two centuries.
ChemInform Reaction Library contains new and novel methodologies.
Current Synthetic Methodology comprises the most innovative and significant new
reactions since 1992.


Figure 7.1. Search in Available Compounds Database


MDL Reference Library of Synthetic Methodology contains functional group
transformations, chiral chemistry, metal-mediated transformations and heterocyclic
chemistry.
MDL Solid-Phase Organic Reactions is an important source of new synthetic
methodologies using solid support, including descriptions of polymer and other solid
supports, linkers and protecting groups.
ORGSYN Database contains those general and verified methods that are published in the
Organic Syntheses series.
Comprehensive Asymmetric Catalysis is a major reference work reviewing
catalytic methods for asymmetric organic syntheses.
Comprehensive Organic Functional Group Transformations is a reference work focusing
on construction, introduction and interconversion of functional groups.
Encyclopedia of Reagents for Organic Synthesis is a major reference work on preparation,
handling and use of reagents in organic chemistry.
Science of Synthesis is the world's most comprehensive major reference work covering
functional group transformations and syntheses of compound classes.

Software is also available for searching pharmacology, safety,
metabolism and toxicology information:

MDL Drug Data Report contains current bioactivity findings and newly launched
developmental drugs.
MDL Comprehensive Medicinal Chemistry contains searchable 3D models plus important
biochemical properties including drug class, logP and pKa.
OHS Hazard Communication contains full-service tools for employee safety etc.
MDL Metabolite Database is the world's largest and most comprehensive collection of
xenobiotic transformations compiled from the literature.
MDL Toxicity Database contains the complete content of the Registry of Toxic Effects of
Chemical Substances database.

PubChem, a freely available component of the National Institutes of Health's Molecular
Libraries Roadmap Initiative (http://pubchem.ncbi.nlm.nih.gov/), also provides a wide range of
information on small molecules: structure-activity analysis, structure clustering tools, and
search of unique chemical structures using names, synonyms or keywords. Links to available
biological property information are provided for each compound. Among others, PubChem also
provides a fast chemical structure similarity search tool.
The literature published in the field of combinatorial chemistry is very important for the
combinatorial chemist. A very useful compilation of papers and books published from the
beginnings through 2003 can be found at http://www.5z.com/divinfo/.
Software for the storage and retrieval of compound libraries is also important for
practicing combinatorial chemists. Software developed to operate automatic machines is
exemplified by Odyssey of RoboDesign (now Rigaku).


Odyssey is a high-throughput sample storage and retrieval system. The system is equipped
with an identification system and barcode tracking, enabling programmed access to each
microwell plate, and can process over 400,000 plates a year (Figure 7.2.).

Figure 7.2. The Odyssey storage and retrieval system
(photo: www.rigaku.com)

An LCD touch-screen provides an intuitive user interface. The systems are fully
automatic and include comprehensive safety features assuring reliable and safe operation. Data is
accessed using a customized database. The Odyssey is available in 3 configurations suitable for
storage and retrieval of 2,500, 5,000 and 10,000 plates.
7.1. Software companies
The companies listed below develop and commercialize software. The addresses of their
home pages are given beside the company names.
Aber Genomic Computing (www.abergc.com).
It is an informatics company based in Wales, UK. AberGC provides novel data mining,
scheduling and predictive modeling solutions based upon evolutionary computing, machine
learning and other supervised learning techniques. Their new product, gmax-bio, is the first
commercial package to fully utilize these techniques and is designed for all aspects of drug
discovery research. Gmax-bio is a novel informatics package based on genomic computing
techniques which uniquely utilize Darwinian methods of natural selection to evolve mathematical
algorithms to rapidly solve complex data-mining and predictive modeling problems.
Accelrys (www.accelrys.com).
It is a leading provider of software for biologists, chemists, and materials scientists. Its
products cover computation, simulation, and the management and mining of scientific data.
Afferent Systems (www.afferent.com).
It offers an integrated solution for combinatorial chemistry informatics, including instrument
control and product data generation, access, and storage, all optionally using an Oracle-based
enterprise-wide database.
Anacapa (www.anacapagrp.com).
It specializes in software development for process automation. Proficiencies include
customization, documentation and legacy porting. Anacapa Group, Inc. also specializes in
product marking and handling solutions for pharmaceutical manufacturing and research
applications. They are experts in barcoding and data capture solutions for tubes and vials,
microtiter plates/trays, syringes, ampules and other laboratory vessels.
Automation Developers (www.automationdevelopers.com).
It provides a wide range of custom software solutions, from its start in the field of laboratory
automation for the pharmaceutical and analytical chemistry industries to integrated Microsoft
Office solutions for business.
Automation Partnership (www.automationpartnership.com).
The systems offered help to reduce drug discovery timelines and resolve process bottlenecks, and
are in use at major pharmaceutical companies worldwide. They include sample management software
and inventory data handling at any scale of operation. HomeBase is a narrow-aisle automated
system for the management of sample libraries.
BergenShaw International (www.bergenshaw.com).
Its Focus software product enables high throughput laboratories to accelerate their
response to change by quickly identifying those process factors and combinations of factors
associated with yield loss and yield gain.
Bioreason (www.bioreason.com).
It is in the business of developing proprietary data mining tools and applications for chemical
and biological information. It is currently focused on developing data mining applications for
drug discovery from high throughput screening (HTS) data. The software systems are developed by
combining existing and novel data mining techniques with chemical information and molecular
modeling techniques. Bioreason works with the data from pharmaceutical and biotechnology
firms to discover and optimize drug leads in their HTS data.
Chemical Computing Group (http://www.chemcomp.com/).
Among the solutions offered by the company are focused combinatorial library design, diverse
combinatorial library design and combinatorial library enumeration. Other software for
calculating electrostatic maps, probabilistic contact potentials, ligand-receptor docking,
multi-fragment search and molecular surfaces and maps is also available.
Columbus Molecular Software (www.columbus-molecular.com).
The company was founded in 1997. It develops and markets decision support tools for use by life
scientists engaged in drug discovery, and helps organizations to effectively extract and
visualize the valuable knowledge contained in research data.


CombiChem, Inc. (www.combichem.com).
It engages in collaborations and partnerships to accelerate the discovery process for
pharmaceutical, biotechnology and agrochemical organizations. CCI's proprietary and unique
discovery approach is a platform technology which integrates proprietary molecular design
technology and rapid synthesis with chemistry expertise.
dataBasyx (www.databasyx-inc.com).
The company offers a laboratory instrumentation tracking, service and support application.
Datect (www.datatect.com).
Through its data validation and data processing technology the company increases the automation
of scientific data analysis by alleviating the need for time consuming manual examination of all
data and results. Datect's technology also standardizes data validation and data analysis
decision-making, eliminating any potential variability in individual judgment.
DoubleTwist, Inc. (www.doubletwist.com).
The company is a leading provider of genomic information and bioinformatics analysis
technologies. It provides research environments that leverage information technology and the
World Wide Web to simplify and accelerate genomic research.
EMAX Solution Partners (www.emax.com).
It integrates chemical information systems to speed productivity and compliance for major
corporations. The company specializes in solutions for both research and operations. In the
competition to find breakthrough products, discovery research groups generate huge data stores
through the use of combinatorial chemistry and high throughput screening. OPTIMA from
EMAX can accelerate the drug discovery process by integrating proprietary and commercial
substance information into a software infrastructure for rapid access. The complete OPTIMA
solution is an advantage in the race to increase promising drug leads and feed the new product
pipeline.
Galactic (www.galactic.com/galactic/index).
The company has been dedicated to designing and developing state-of-the-art scientific software
for the spectroscopy and chromatography communities. It offers open architecture software
solutions compatible with virtually any laboratory instrument. The unique approach provides a
single software platform to integrate all laboratory data in one common package for archival
storage, data viewing, processing, plotting, and management.
GeneData (www.genedata.com).
The company specializes in the computational analysis of genomes, transcriptomes, proteomes,
and metabolomes as well as compound libraries, and offers a network of communicating software
modules, each of which addresses a critical step in the product development cycle of life science
companies. For the analysis of high-throughput screening (HTS) data the company provides a
modular and scalable software solution, GeneData Screener(TM), that improves the
understanding of the biological activity of compounds.


geneticXchange (www.genetixchange.com).
It is a software product company that produces the K1 System Data Integration Middleware
Platform for any biotech company needing a solution to the biological data integration challenge.
Gensym Corporation (www.gensym.com).
The company is a leading supplier of software products and services for intelligent real-time
systems that help organizations manage and optimize complex dynamic operations. Common
applications include quality management, process optimization, dynamic scheduling, network
fault management, energy and environmental management, and abnormal situation management.
IBM (www-3.ibm.com/solutions/lifesciences/solutions.html).
The company offers technology infrastructure in high-performance computing, data integration,
knowledge management, storage, e-business, and information services. Today, IBM systems
include the most advanced storage management; and a world-renowned computational biology
center.
IDBS (www.idbs.co.uk).
Its specialized applications are used to acquire, manage, integrate and visualize chemical and
biological data ranging from the large amounts of data generated in High Throughput Screening
and combi-chem programs, through multiple IC50 determinations and profiling, to the complex
experimental protocols of toxicology studies.
Labtronics (www.labtronics.com).
The company is recognized for laboratory automation and instrument interfacing. Labtronics has
an Innovative Software Solution.
LabVantage Solutions (www.labvantage.com).
The company is a provider of state-of-the-art software, implementation services, and consulting
to leading discovery-oriented, conventional research, and quality control laboratories. It offers
configurable, industry-specific solutions for a variety of industries including high throughput
screening, genomics, proteomics, pharmaceuticals, oil and gas, process chemicals, food and
beverage, environmental, and forensics.
Managed Ventures (www.menagedventures.com).
The company has developed custom application components in Java for High Throughput
Screening (HTS), compound registration, inventory and proteomics. Services include the
implementation of solutions for drug discovery using pre-built components to rapidly deliver
working web-based applications. HTS and other informatics applications have been integrated for
Managed Ventures' clients in less than 8 weeks using open source Java, XML, SOAP and any
JDBC-compliant database (Oracle, DB2, SQL Server, MySQL).
MatriCal (http://www.matrical.com/).
The company specializes in microwell plates and automated sample management and storage
solutions. The MatriStore is a compact, economical, climate controlled compound management
system that supports multiple sample formats, including 96 and 384 mini-tubes and 96 to 1536
microplates, and others. Storage capacity ranges from 750k to 40 million samples, with automatic
sample retrieval in plate or individual sample format.

MDL Information Systems Inc. (www.mdli.com).
MDL is the leading provider of integrated scientific information management systems, databases,
and services used worldwide in pharmaceutical, chemical, agrochemical, and biotechnology
research and development, and in other industries that use chemical products.
NuGenesis Scientific Data Management System (www.nugenesis.com).
It is an application-independent software and database platform that automatically creates a
common electronic format for laboratory data and scientific information that can be unified,
managed and shared throughout the enterprise. NuGenesis Technologies also offers
comprehensive consulting, support and training programs.
Odysis (www.odysis.com).
It supplies scheduling software technologies and services for building more reliable, productive
and flexible automated systems for a broad range of industrial, scientific and commercial
applications.
Pharma Algorithms (http://ap-algorithms.com).
The company provides combi-chem and HTS strategies, a comprehensive QSAR approach to
resolve and accelerate the process of data evaluation contributing to early lead optimization. The
software introduces the unique capability of developing and building computational algorithms
in-house utilizing comprehensive yet flexible statistical and fragmental approaches.
Process Analysis & Automation (www.paa.co.uk).
Their software has been written to control the equipment for the National DNA Database at the
Forensic Science Service in the UK. OVERLORD, the company's independent software control
system, is available with over 100 drivers and has an installed user base of 100 systems
worldwide. It has robot drivers for Hudson Plate Crane, Zymark Twister I & II,
Hamilton SWAP/Labsystems Relay, Sands Technology and Mitsubishi robots and pipetting
workstation drivers for Hamilton, PE Life Sciences (Packard), Tecan and Qiagen. OVERLORD
Developer is available for in-house developers, and incorporates a fully licensed version of VBA
(as in Microsoft EXCEL, Word and Access) for the user to write the extensions required for their
application.
Pangea Systems, Inc. (www.pangeasystems.com).
The company is a provider of software for advanced bioinformatics, the application of
information technology to life science research and development. It provides unique
computational tools, an open computing environment, and value-added professional services that
integrate the collection, organization and analysis of biological and biochemical information.
Partek Pro 2000 (www.partek.com).
It is a comprehensive data visualization and pattern recognition system that is being widely
adopted in the market. This is due to a unique combination of interactive visualization and strong
statistical, neural, and other numerical analysis techniques. It is also being used in
cheminformatics, combinatorial chemistry, high-throughput screening, and in analysis of clinical
data.


Prelude Computer Solutions, Inc. (www.preludenic.com).
The company provides a wide variety of consulting services for regulated industry. Prelude
specializes in 21 CFR Part 11 compliant document management, including integration with
LIMS systems and electronic publishing systems. Additionally, Prelude's business partners
include IBM, Lotus, Microsoft, and CDC Solutions.
QUMAS (www.qumas.com).
QUMAS is dedicated to the design and development of Enterprise Compliance Management
Solutions for companies in regulated industries in the pharmaceutical, medical device,
biotechnology, and contract research industry sectors. QUMAS enterprise compliance solutions
offer immediate payback through pre-packaged compliance with FDA, EMEA and other
international regulations, in particular 21 CFR Part 11 (electronic signatures and records),
through rapid deployment, validation and user training, and by automating critical document
processes to dramatically reduce document cycle times.
REMP (http://www.remp.com).
REMP, a Tecan Group company, develops and produces devices, consumables, software and
fully automated sample processing and storage systems mainly used in research and development.
ReTiSoft (www.retisoft.ca).
The company is a software research and development enterprise. Its products for the laboratory
automation market include the software framework Genera, which simplifies instrument integration,
the scheduling software Supra, the 3D simulation viewer SimView, and the instrument testers Clones.
Rigaku Corporation (www.rigaku.com)
The company is engaged in analytical and industrial instrumentation, automation and software
production.
Spotfire, Inc. (www.spotfire.com).
It provides software solutions that empower scientists and engineers, and their enterprises, to
make decisions in eTime that get products to market first. The Web-based offerings are used by
life- and material-sciences companies worldwide for such activities as high-throughput screening,
genomics, lead optimization, combinatorial chemistry, formulation development, and
bioinformatics.
Structural Bioinformatics Inc. (www.strubix.com).
The company is a world leader in proteomics-driven drug discovery: the large-scale generation
and use of protein structural information to accelerate the discovery and optimization processes.
Symyx (www.symyx.com).
The company is developing high-speed combinatorial technologies for the discovery of new
materials. The Renaissance (TM) software and database platform provides tools to enable the
implementation of complete workflows for high throughput design, synthesis and screening of
materials.

System Services Inc. (www.sysservices.com).


The company's Labstar is a comprehensive PC-based laboratory automation software package
designed to simplify data handling needs. The package includes VisualMicroplate graphical
display, robust file import/export, graphing, calculations, reporting, and data sorting. Laboratory
data from HTS machines and other sources can be easily managed to help detect anomalies,
compare data visually, identify errors and sort large volumes of information.
TAL Technologies (www.taltech.com).
It is committed to providing quality software tools to simplify data acquisition and bar code data
collection.
Titian Software (www.titian.co.uk).
The company specializes in sample management software for the life sciences. The software
tracks sample inventory, manages workflows and seamlessly integrates with laboratory
workstations.
Tripos (www.tripos.com)
Tripos is a software company that offers a wide variety of products that support pharmaceutical
research. Molecular modeling, ligand- and receptor-based design, library design, and bio- and
cheminformatics are the main areas of its activity.
Viaken (www.viaken.com).
Viaken is an application service provider (ASP) that provides, manages and supports
bioinformatics applications for small to mid-size biotechnology, pharmaceutical and
agriculture companies. Viaken offers complete application solutions in the areas of Genome
Informatics, Chem Informatics and Pharmaco Informatics. The Viaken solution includes
architecture design, implementation, secure hosting, network infrastructure, 24x7 user support,
and application and server management. Viaken's turn-key solutions, called "Technology
Templates" (TM), fulfill the research and development needs of its customers by providing
solutions that can be rapidly implemented to meet common R&D IT needs. Additionally, Viaken can
design and implement custom solutions to meet any of its customers' IT needs.

References
1. A. W. Czarnik Org. Chem. 1995, 8, 13.
2. J. M. Barnard J. Chem. Inf. Comput. Sci. 1993, 33, 532.
3. G. Moreau, P. Broto Nouv. J. Chim. 1980, 4, 359.
4. L. B. Kier, L. H. Hall In Molecular Connectivity and Drug Research, Academic Press 1976.
5. Software developed at Tripos, Inc.
6. R. D. Cramer, D. E. Patterson, J. E. Bunce J. Am. Chem. Soc. 1988, 110, 5959.
7. R. E. Carhart, D. H. Smith, R. J. Venkataraghavan J. Chem. Inf. Comput. Sci. 1985, 25, 64.
8. R. P. Sheridan, R. B. Nachbar, B. L. Bush J. Comp. Aided. Mol. Design 1994, 8, 323.
9. C. A. Lipinski, F. Lombardo, B. W. Dominy, P. J. Feeney Adv. Drug. Delivery Rev. 1997, 23, 3.

Index

223 Sample Changer, 46

A
ACT 357, 67
Analyst GT, 123
anchors, 22
Apex 396, 43
Available Compounds Database, 173-174

B
Benz, 1

C
catalysts on beads, 159
ChemEnlighten, 173
ChemInform Reaction Library, 173
ChemSpace, 173
Cohen, B., 79
color encoding, 91
CombiLibMaker, 173
Comprehensive Asymmetric Catalysis, 175
Comprehensive Medicinal Chemistry, 175
Comprehensive Organic Functional Group Transformations, 175
CrossFire Beilstein, 173
CrossFire Gmelin, 173
Current Synthetic Methodology, 173

D
De Witt, S. H., 33
deconvolution, 60, 124-138
DiverseSolutions, 172
diversity descriptors, 172
diversity, molecular, 171
Drug Data Report, 175
drug-likeliness, 172

E
Einstein, 2
electrophoretic map, 66
encoding, 61-63
    by position in space, 92
    optical, 89
    by radiofrequency, 115
Encyclopedia of Reagents of Organic Syntheses, 175
ExplorerPLS system, 35

F
fabrication
    of films by evaporating solutions, 153
    of polymer films, 164
    of thin films, 151-154
fluorous separations, 40
Fodor, S. P. A., 80, 81, 84, 152
Ford, H., 1
Frank, R., 32, 36, 37
Furka, Á., 5, 13, 27, 55, 92, 93, 100, 104, 130

G
Geysen, M., 31, 82, 117, 124
gradient deposition, 154

H
Hanak, J. J., 151, 155
high throughput screening, 122-123
holographic research strategy, 161-164
Houghten, R. A., 37, 124

I
ISIS Draw, 173
iteration method, 124-129, 139-140

J
Janda, K. D., 124

K
Kéri, Gy., v

L
LabMate, 33
Levassor, 1
library
    amino acid tester, 135-137
    amino acid tester, application, 143
    amino acid tester, preparation, 141-143
    benzimidazole, 110-115
    catalyst, 156-164
    catalyst prepared by split-mix, 161
    cherry picked, 104-110
    combinatorial dynamic, 138
    composition-gradient, 165
    design, 171-176
    design, product based approach, 172
    design, reactant based approach, 172
    dynamic combinatorial, 138
    inorganic, 151-155
    occurrence, 144
    omission, 133-135, 140-141
    omission, application of, 140-141
    organic, 61-63
    partial, 63-80
    phage display, 86
    piperazine 2-carboxamide, 115-117
    polymer, 164-168
    temperature-gradient, 166
    thickness gradient, 164-165
    thin film, 151-155
    unusual, 79
    virtual, 66, 104-108, 172
light directed synthesis, 84-86
linker, 20
Lipinski, C. A., 172
Lipinski's Rule, 172

M
M 384 Ultra HTS Synthesizer, 44
MALDILC System, 47
manual sorting, 95-97
manufacturers, 48-52
Margitfalvi, J. L., v
masking, 152-153
materials, 151-155, 164-168
    inorganic, 151-155
Meredith, J. C., 167, 168
Merrifield, R. B., 4, 15, 17, 18, 20, 21, 26, 42, 55, 82, 83
Metabolite Database, 175
Meyers, H. V., 33
microreactors, 158-160
microwave heating, 34
multicomponent reaction, 37-38
Multiple Probe 215 Liquid Handler, 46

O
Ohlmeyer, 63
OHS Hazard Communication, 175
Olds, Ransome Eli, 1
one bead-one product, 1
ORGSYN Database

P
Panhard, 1
Parallel evaporation module, 43
PEG, 19
Peugeot, 1
Pinilla, S. E., 130
planning experiments, 71-75
polymers, 164-168
positional scanning, 129-133
    application, 145
preparation of catalyst libraries, 157
protecting group, 22
    Alloc, 114-117
    BOC, 23
    Fmoc, 32
    nitro, 25
    Nvoc, 84
    trityl, 24
    Z, 23-24
PubChem, 175
pulsed-laser deposition, 153
Pure Substance Database, 173

Q
Quad 3+ system, 47

R
Reference library of Synthetic Methodology, 175
resin
    hydroxymethyl, 21
    Merrifield, 20-21
    Rink amide, 21
    Tentagel, 19
    trityl chloride, 20
    Wang, 21

S
SAGIAN Core Systems, 123
Saneii, H., v
scavenger, 27
Science of Synthesis, 175
screening
    by affinity chromatography, 138
    catalyst using MS, 157, 158
    catalysts by GC, 158
    catalysts using IR, 159
    inorganic films, 154-155
    methods, 121-138
    of polymer films, 166-168
    tethered libraries, 146-148
    tethered peptide libraries, 148-149
    using IR thermography, 147, 157-158
    using size exclusion, 138
Selector, 172
similarity, molecular, 171
Skiena, S., 79
Smillie, L. B., 2
Smith, J., 91
software companies, 176-182
solid phase
    reagents, 40
    synthesis, 15-18
Solid Phase Organic Reactions
Solution parallel synthesizer
solution phase synthesis
Sophas HTC
sorter
    crown, manual, 95
    IRORI automatic, 91
    IRORI manual, 90
    lantern, manual, 95
sorting
    directed, 89, 115-117
    parallel, 103
    semi-parallel, 97-102
split-mix synthesis, 55-61, 161
Stafford, C. M., 164
Stille coupling, 40
string synthesis, 93-104
sub-library, 76-79
    preparation, 143-145
support
    solid surface, 20, 84-86, 89
    solid, 18-19
    soluble, 83
Syncore Reactor, 34
SynPhase crowns, 32, 91, 94, 95, 100
SynPhase lanterns, 20, 91, 94
synthesis
    binary, 80-82
    dendrimer supported, 39
    efficiency, 56-57
    encore, 91-93
    guiding tables, 108
    of cherry picked libraries, 104-110
    parallel, 30-37
    solid phase, 15-18
    split-mix, 55-61, 161
    spot, 32-33
    string, 93-104
    tea-bag, 37
    with amino acid mixtures, 82-83
synthesizer
    automatic, 42-45, 66-71
    manual, 33-34, 64
system
    Odyssey, storage and retrieval, 175-176
    pulsed-laser deposition, 153
    quaternary masking, 152-153

T
Takátsy, Gy., 29, 31, 122
Tanimoto, 172
thin film catalyst arrays, 157
TOPKAT, 173
Toxicity Database, 175
Two dimensional gradient films, 166
two libraries on one support, 117-118

U
Ugi, I., 37, 38

V
vapor deposition technique, 152

X
Xiang, X.-D., 151
XP-1500 Plus system, 35