by
IRFAN N
Hyderabad-500037,
India.
QSAR, PHARMACOPHORE AND DOCKING STUDIES ON
1. Abstract
2. Legends
3. Introduction
3.3 Software
4.1.2 Pharmacophore
4.2.2b C-Docker
4.2.2d Ludi
5.1 QSAR
6 Conclusion
7 Reference
1. ABSTRACT
Legends
PLA2 Phospholipase A2
GLY Glycine
HIS Histidine
LYS Lysine
PHE Phenylalanine
LEU Leucine
TYR Tyrosine
LF Ligand fit
QM Quantum mechanics
HYPO Hypothesis
MD Molecular dynamics
µM Micromolar
nM Nanomolar
% Percent
R² Regression coefficient
XVR² Cross-validated regression coefficient
HY Hydrophobic
3.1 Drug Discovery:
Drug discovery is the process by which drugs are discovered and/or designed. In the
past most drugs have been discovered either by identifying the active ingredient from traditional
remedies or by serendipitous discovery. A new approach has been to understand how disease and
infection are controlled at the molecular and physiological level and to target specific entities
based on this knowledge. The process of drug discovery involves the identification of candidates,
synthesis, characterization, screening, and assays for therapeutic efficacy. Once a compound has
shown its value in these tests, it will begin the process of drug development prior to clinical trials.
Problem in drug discovery:
Estimates of time and cost of currently bringing a new drug to market vary, but 7–12
years and $ 1.2 billion are often cited. Furthermore, five out of 40,000 compounds tested in
animals reach human testing and only one of five compounds reaching clinical studies is
approved. This represents an enormous investment in terms of time, money and human and other
resources. It includes chemical synthesis, purchase, curation, and biological screening of hundreds
of thousands of compounds to identify hits, followed by their optimization to generate leads, which
requires further synthesis.
In addition, predictability of animal studies in terms of both efficacy and toxicity is
frequently suboptimal. Therefore, new approaches are needed to facilitate, expedite and streamline
drug discovery and development, to save time, money and resources, and, as per the pharma mantra, to
“fail fast, fail early”. It is estimated that computer modeling and simulations account for ~10% of
pharmaceutical R&D expenditure and that they will rise to 20% by 2016.
Role of computer aided drug designing:
Both computational and experimental techniques have important roles in drug
discovery and development and represent complementary approaches. CADD entails:
Use of computing power to streamline drug discovery and development process
Leverage of chemical and biological information about ligands and/or targets to identify
and optimize new drugs
Design of in silico filters to eliminate compounds with undesirable properties (poor
activity and/or poor Absorption, Distribution, Metabolism, Excretion and Toxicity,
ADMET) and select the most promising candidates
Benefits of CADD
CADD methods and bioinformatics tools offer significant benefits for drug discovery programs.
1. Cost Savings. The Tufts Report suggests that the cost of drug discovery and development
has reached $800 million for each drug successfully brought to market. Many
biopharmaceutical companies now use computational methods and bioinformatics tools to
reduce this cost burden. Virtual screening, lead optimization and predictions of
bioavailability and bioactivity can help guide experimental research. Only the most
promising experimental lines of inquiry can be followed and experimental dead-ends can
be avoided early based on the results of CADD simulations.
2. Time-to-Market. The predictive power of CADD can help drug research programs choose
only the most promising drug candidates. By focusing drug research on specific lead
candidates and avoiding potential “dead-end” compounds, biopharmaceutical companies
can get drugs to market more quickly.
3. Insight. One of the non-quantifiable benefits of CADD and the use of bioinformatics tools
is the deep insight that researchers acquire about drug-receptor interactions. Molecular
models of drug compounds can reveal intricate, atomic scale binding properties that are
difficult to envision in any other way. When we show researchers new molecular models
of their putative drug compounds, their protein targets and how the two bind together, they
often come up with new ideas on how to modify the drug compounds for improved fit.
This is an intangible benefit that can help design research programs.
CADD and bioinformatics together are a powerful combination in drug research and development.
An important challenge for us going forward is finding skilled, experienced people to manage all
the bioinformatics tools available to us, which will be a topic for a future article.
The inflammatory response has two major purposes: to disinfect and to clean injured tissues. In
addition to this, the inflammatory system also helps halt the spread of pathogens to tissues not
already infected. Clotting proteins that are present in the blood plasma also leak into the interstitial
fluid when the blood vessels dilate and become leakier. With platelets, thromboplastin,
prothrombin, fibrinogen, and calcium ions, localized clots can be formed, and healing can be
underway, while the pathogens are also restricted to one area, making it easier for them to be
engulfed by phagocytes.
Although the inflammatory response may be localized, as shown, it may also be widespread and in
effect throughout the body. If there are numerous pathogens, or pathogens have traveled through
the bloodstream and come to reside all over the body, the body will react with a widespread
inflammatory response that has other effects in addition to the ones experienced in localized
responses. The number of leukocytes in the blood may increase. The body may also experience
abnormally high body temperatures, or fever, which may be caused by either toxins released by
pathogens, or due to compounds released by specific leukocytes. Although an extremely high
fever is dangerous to the body, a less extreme temperature may aid the body by stimulating
phagocytosis and inhibiting the reproduction and growth of pathogens.
The classical signs of inflammation:
Pain (dolor),
Heat (calor),
Redness (rubor),
Swelling (tumor), and
Loss of function (functio laesa).
Mediators responsible for inflammation:
Phospholipase A2 (PLA2)
Lipoxygenase (LOX)
Cyclooxygenase (COX)
Figure 4. Inflammatory process
Current drug against inflammation:
NSAIDS (Non Steroidal Anti-inflammatory drugs)
o Aspirin
o Indomethacin
o Ibuprofen
o Diclofenac
o Piroxicam
Corticosteroids
o Prednisolone
o Cortisone
o Betamethasone
o Fludrocortisone
WHY NEW DRUGS ARE NEEDED:
Despite decades of research, corticosteroids and NSAIDs remain the main
pharmacological weapons to control inflammation in the clinic. Unfortunately, these drugs have
significant side effects, especially when used chronically. Consequently, there is tremendous
interest in the development of novel, safer, and more effective anti-inflammatory drugs.
GASTROINTESTINAL:
Gastric irritation, erosions, peptic ulceration, gastric bleeding, esophagitis.
RENAL:
Na+ and water retention, chronic renal failure, interstitial nephritis.
CNS:
Headache, mental confusion, behavioral disturbances, seizure precipitation.
OTHERS:
Asthma exacerbation, nasal polyposis, pruritus, angioedema.
Side effects of CORTICOSTEROIDS:
Cushing's habitus
Hyperglycemia
Muscular weakness
Susceptibility to infection
Delayed healing
Peptic ulceration
Osteoporosis
Glaucoma
Fetal abnormalities
Mental confusion
Phospholipase A2 (PLA2):
The secretory PLA2 (sPLA2) family, in which 10 isozymes have been identified,
consists of low molecular weight, Ca2+-requiring secretory enzymes that have been implicated in
a number of biological processes, such as modification of eicosanoid generation, inflammation,
and host defense.
This enzyme has been proposed to hydrolyze phosphatidylcholine (PC) in lipoproteins to liberate
lyso- PC and free fatty acids in the arterial wall, thereby facilitating the accumulation of bioactive
lipids and modified lipoproteins in atherosclerotic foci.
In mice, sPLA2 expression significantly influences HDL particle size and composition,
and studies demonstrate that an induction of sPLA2 is required for the decrease in plasma HDL
cholesterol in response to inflammatory stimuli. In rats, instillation of bacteria into the bronchi was
associated with surfactant degradation and a decrease in the large:small ratio of surfactant
aggregates.
sPLA2-IIA can exert beneficial action in the context of infectious diseases since recent
studies have shown that this enzyme exhibits potent bactericidal effects. Induction of the synthesis
of sPLA2-IIA is generally initiated by endotoxin and a limited number of cytokines via paracrine
and/or autocrine processes.
Role of phospholipase A2:
Phospholipase A2 (PLA2) catalyzes the hydrolysis of the sn-2 position of membrane
glycerophospholipids to liberate Arachidonic Acid (AA), a precursor of eicosanoids including
prostaglandins and leukotrienes. The same reaction also produces lysophospholipids, which
represent another class of lipid mediators.
Figure 7. Mechanism of PLA2
8. rheumatoid arthritis,
9. gout, and
10. Other diseases.
REPORT HIGHLIGHTS
1. The market for anti-inflammatory drugs to treat the diseases covered in this report was
approximately $21.9 billion in 2005 and is projected to increase to $35.5 billion in 2010.
2. The fastest growing disease category for anti-inflammatory treatment is psoriasis, which
saw the first introductions of expensive monoclonal antibody products in the last two
years.
3. The largest market by far in 2005 is that for the treatment of asthma and chronic
obstructive pulmonary disease, which accounted for approximately 36% of the total market
in 2005. The asthma/COPD market will remain the largest in 2010, but will decline to
31.4% of the total market by the end of the forecast period.
Figure8 Report ID: PHM048A, Published: March 2006, Analyst: Lynn Gray
Target protein
PDB id : 1DB4
Carboxylate
Cofactor : Calcium
Resolution : 2.00Å
R-factor : 0.226
R-free : 0.256
Figure 9. Crystal structure of secretory phospholipase A2 (1DB4)
3.3 Software
The present studies were carried out using the following tools:
Accelrys software
Discovery Studio
Discovery Studio is a complete modeling and simulation environment for life science
researchers. It is a single, easy-to-use graphical interface for drug design
and protein modeling research. Discovery Studio 2.1 combines established
applications such as Catalyst, MODELER, and CHARMm with more recent methods to
address current drug discovery challenges. Discovery Studio
2.1 is built on the Pipeline Pilot open operating platform to integrate protein modeling,
pharmacophore analysis, virtual screening, and third-party applications. It offers:
o Tools for visualization, protein modeling, simulation, docking, pharmacophore
analysis, QSAR and library design
o Access to computational servers and tools, and the ability to share data, monitor jobs, and
prepare and communicate project progress.
4. Materials and methods
In the last few years the role of computational methods in both pharmaceutical
and academic research has grown dramatically. The emphasis placed on high-throughput
methods in the pharmaceutical industry has increased the number of compounds in the
discovery pipeline. Characterizing the position and orientation of small molecules bound to a
protein surface can be an important step in drug design. Computational methods have developed
rapidly as groups seek high-throughput, low-cost approaches to accelerate the drug discovery process.
Such approaches will be necessary as scientists attempt to characterize the large number of drugs
currently being generated. Structural information on biological macromolecules and their
interactions with ligands is increasingly being used in modern medicinal chemistry. There is a
pressing need for novel computational methods that can evaluate the structural information about
ligand-receptor complexes in a more quantitative way, both to improve existing leads and to
design de novo compounds with accurately predicted binding affinities. The following
experimental methods are divided into two parts.
4.2.2b C –Docker
4.2.2d Ludi
Preparation of molecular system:
This summation, when given in an explicit form, represents the force field, which evaluates the
potential energy of a system.
Minimization:
The minimizer uses an algorithm to identify the geometries of the molecule
corresponding to the minimum points on the potential energy surface. Minimization reduces the
unwanted forces present in the molecule and lowers its energy.
Many algorithms are available for minimization. The methods used in the Smart Minimizer
include the steepest descent, conjugate gradient, Newton-Raphson and quasi-Newton methods.
From the DS protocols, select Minimization and run it. The following figure shows the minimized
protein with a fixed constraint. Then save the minimized protein for further studies.
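The idea behind steepest descent can be illustrated with a minimal sketch. This is not the Smart Minimizer or CHARMm implementation; it assumes a single harmonic bond-stretch term, and the force constant, reference length, step size and tolerance are hypothetical values chosen only for illustration.

```python
def bond_energy(r, k=300.0, r0=1.53):
    """Harmonic bond-stretch term E = k * (r - r0)^2 (hypothetical constants)."""
    return k * (r - r0) ** 2


def bond_gradient(r, k=300.0, r0=1.53):
    """First derivative dE/dr of the harmonic term."""
    return 2.0 * k * (r - r0)


def steepest_descent(r_start, step=0.001, tol=1e-6, max_iter=10_000):
    """Step along the negative gradient until the residual force is below tol."""
    r = r_start
    iterations = 0
    for iterations in range(max_iter):
        gradient = bond_gradient(r)
        if abs(gradient) < tol:
            break
        r -= step * gradient
    return r, bond_energy(r), iterations


# Start from a stretched bond and relax it towards the energy minimum.
r_min, e_min, n_iter = steepest_descent(r_start=1.80)
print(f"r = {r_min:.4f}, E = {e_min:.2e}, iterations = {n_iter}")
```

Conjugate gradient and Newton-type methods follow the same loop but choose the search direction and step length differently, which usually gives faster convergence near the minimum.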
Figure 11. Minimized protein with fixed constraint
Preparation of bioactive molecules:
The 111 bioactive compounds were collected from the following journal articles,
with activities ranging from 0.005 to >50 µM.
Journal of Medicinal Chemistry 1996, vol. 39, pp. 3636-3658, with the title "Potent
inhibitors of secretory phospholipase A2: synthesis and inhibitory activities of indolizine
and indene derivatives".
Journal of Medicinal Chemistry 2005, vol. 48, pp. 893-896, with the title "Carbocyclic
[g]indole inhibitors of human nonpancreatic sPLA2".
Journal of Medicinal Chemistry 2008, vol. 51, pp. 4708-4714, with the title "Highly specific and
broadly potent inhibitors of mammalian secreted phospholipase A2".
1. One molecule was drawn with the basic scaffold, and the other molecules were constructed
using it as the reference model.
2. The drawn compounds were typed with the CHARMm force field.
3. The typed molecules were subjected to energy minimization using the Smart Minimizer,
which minimizes a series of ligand poses using CHARMm.
4. The minimized molecules were saved with .sd and .mol2 extensions for further study.
The following table shows the 2D structures of the molecules and their activities.
4.1 Analogue Based Drug Design
When the 3D structure of the target is unknown, knowledge of the ligands is applied to rationally
design a drug; this is referred to as Analogue Based Drug Design. It refers to the application of
knowledge of the ligand structures and their activities when very little or no information is
available about the 3D structure of the target. The binding site must then be inferred from the
known structures of the ligands.
4.1.1 QSAR:
Quantitative structure-activity relationship (QSAR) studies allow structures to be
easily compared, overlaid and displayed. A QSAR is obtained by supplying parameters
with which to optimize a series of bioactive molecules. A QSAR based on physicochemical
properties describes a drug's structural, electronic and physicochemical characteristics. Data sets
are produced using all available descriptors.
Knowledge of the three-dimensional (3D) structure of the target
(receptor/enzyme/DNA) is applied to rationally design drug molecules that bind to the target, for the
following reasons:
1. To understand the atomic details of drug binding strength and specificity (drug-receptor interactions).
2. To develop novel drugs (unique chemical structures) for a selected target via de novo drug design
or database searching techniques.
3. To optimize the therapeutic index of an already available drug or lead compound, determining the
structural requirements for activity from a minimum number of tested compounds.
A QSAR equation numerically relates biological activity to the physicochemical properties of
chemical structures. Biological activity is defined as a pharmacological response,
usually expressed in millimoles, such as the effective dose in 50% of the subjects (ED50), the lethal
dose in 50% of the subjects (LD50) or the concentration giving 50% inhibition (IC50). It is common to
express the biological activity as a reciprocal. The QSAR equation is similar to the equation for a
straight line:
y = mx + c
log (1/biological activity) = a (physicochemical property) + c
a = regression coefficient (slope of the straight line)
c = intercept on the y-axis (when the physicochemical property equals zero)
Biological activity is expressed as a reciprocal to produce a positive slope, because of
the inverse relationship between the physicochemical property and the measured dose or
concentration: a more potent compound needs a lower dose. There is therefore a positive relationship
between the reciprocal of the biological activity (1/BA) and the physicochemical property. Because
these studies are based on the relationship between descriptors and biological activity, the error in
the biological activity data must be minimal, and the choice of descriptors must be accurate and
appropriate.
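The straight-line form of the QSAR equation can be sketched as an ordinary least-squares fit. The snippet below is only an illustration: the activities span the 0.005 to 50 µM range mentioned for the data set, but the individual values and the single descriptor (a made-up log P series) are hypothetical, and the reciprocal activity is taken as log(1/IC50) with IC50 in mol/L.

```python
import math


def fit_line(x, y):
    """Least-squares slope a and intercept c for y = a*x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    c = mean_y - a * mean_x
    return a, c


# Hypothetical training data: IC50 in micromolar and one descriptor per compound.
ic50_micromolar = [50.0, 5.0, 0.5, 0.05, 0.005]
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]

# Reciprocal activity on a log scale: log(1/IC50), with IC50 converted to mol/L.
log_inverse_activity = [math.log10(1.0 / (v * 1e-6)) for v in ic50_micromolar]

slope, intercept = fit_line(descriptor, log_inverse_activity)
print(f"log(1/IC50) = {slope:.3f} * descriptor + {intercept:.3f}")
```

Because the reciprocal is used, more potent compounds (lower IC50) get larger y values, so a descriptor that favors potency shows up with a positive slope.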
OBJECTIVE OF QSAR
A. CSD DATABASE
Experimental information about the structures of molecules can be extremely useful
for forming theories of conformational analysis and for predicting the structures of molecules
for which no experimental information is available. The most important technique currently
available for determining the three-dimensional structure of molecules is X-ray crystallography.
The crystallographic community has distributed in electronic form two databases that are
particularly important for molecular modelers: the Cambridge Structural Database (CSD), which
contains crystal structures of organic and organometallic molecules, and the Protein Data Bank
(PDB), which contains structures of proteins and some DNA fragments.
A database is of little use without software tools to search, extract and manipulate the data. A
simple use of a database is to extract information about a particular molecule or group of
molecules. The data may also be identified by creating a two-dimensional representation of a
molecule and using a substructure search program to search the database. Crystallographic
databases have also been used to develop an understanding of the factors that influence the
conformations of molecules and of the ways in which molecules interact with each other. For
example, the CSD has been comprehensively analyzed to characterize how the lengths of chemical
bonds depend upon the atomic numbers, hybridization and environment of the atoms
involved. Analyses of intermolecular hydrogen bonding have revealed distinct distance and
angular preferences. A major use of the CSD is substructure searching for molecules that contain
a particular fragment, in order to investigate the conformations that the fragment adopts.
A crystallographic database can only provide information about the crystalline state of
matter, and the possible influence of crystal packing forces should always be taken into
account. This is less of a concern for proteins than for small molecules, as protein crystals contain a
large amount of water, and NMR studies have established that proteins have approximately
the same structure in solution as in the crystal.
A second, more subtle bias is that crystallographic databases only contain
molecules that can be crystallized, and indeed only those molecules whose X-ray structures were
considered interesting enough to be published. The structures in a crystallographic database may
therefore not be a wholly representative set.
C. MOLECULAR DESCRIPTORS
The study of steric requirements for interaction between ligands and the corresponding
biological acceptor sites is often of decisive importance in understanding the role played by
structural features in promoting activity. In its most general form, drug-receptor theory requires that
a ligand exerts its biological action as a consequence of binding to, or otherwise interacting with, a
specific biological acceptor site such as a membrane protein or an enzyme, which may be
generally termed the receptor. This concept, the basis of modern drug-receptor theory, involves
the old principle that a ligand fits its receptor much as a key fits a lock. The concept, although
somewhat arbitrary since a high degree of flexibility is present in biomacromolecular structure,
governs the principles of molecular recognition and molecular discrimination. Although
stereochemistry often plays a major role in drug bioactivity, care must be taken when considering
structure-activity relationships to explore whether other differences in physicochemical properties
exist before one makes significant correlations with the steric properties of the structures under
study.
In early studies, organic chemists defined a number of steric parameters in order to
explain steric effects of substituents on the reaction centers of organic molecules. The same type
of steric effects observed in studies of the variation of physical properties and chemical reactivity
with structure may be assumed to be involved in biological activity, which, at least as a first
approximation, may be treated in a similar fashion. In the past 35 years, owing to the development of
drug design and the Hansch approach, many other parameters and methods have been developed
which have the merit of trying to avoid a simple empirical correlation with given ligand
properties and of trying to propose the possible geometric features of the receptor.
Steric descriptors are classified into the following groups:
1. Topological indices, based on characterization of chemical structures by graph theory.
2. Geometric descriptors, resulting from the view of organic molecules as three-dimensional
objects from which standard dimensions can be calculated.
3. Chemical descriptors, derived from the steric influence upon a standard reaction.
4. Physical descriptors, derived when an organic molecule is considered as a three-dimensional
object with size-determined physical properties, and different descriptors that result when an
organic molecule is considered as a three-dimensional object relative to a reference structure.
I. FRAGMENT CONSTANT DESCRIPTORS
Fragment constant descriptors are constants that relate the effect of substituents on a reaction
center from one type of process to another. The basic idea is that similar changes in structure are
likely to produce similar changes in reactivity, ionization or binding. There are different constants
corresponding to different effects. These are typically used to parameterize the Hammett equation
for some series of analogs:
log kX = ρσ + log kH
where kX and kH are the reaction rate constants for the substituents X and H, respectively; σ is an
electronic constant determined from an ionization constant; and ρ is fit to the data set. Constants
for different properties (electronic, steric, etc.) at different R-group positions are used. In this way,
measurements of ionization constants can be used to predict rate constants once a scaling factor (ρ)
is determined. The default database currently contains the following types of
constants. These come from Table VI-I of Hansch, except for the Sterimol constants, which are
calculated.
Sm, Sp
Electronic effect: sigma meta and sigma para, respectively. Positive values correspond to
electron withdrawal, negative ones to electron release. Sigma is generally not appropriate for
ortho substituents because of steric interaction with the reaction center.
F and R
Decomposition of the sigma para constant into an inductive (polar) part F and a resonance part R,
for the case when the substituent is conjugated with the reaction center, producing
through-resonance effects.
Pi
Hydrophobic character. Pi for a substituent X is given by the difference between its log P and the
log P for hydrogen.
HA Hydrogen bond acceptor
HB Hydrogen bond donor
MR Molar refractivity, given by
MR = ((n² - 1)/(n² + 2)) × (MW/d)
where n is the refractive index, MW is the molecular weight and d is the compound density.
Sterimol-L
Steric length parameter, measured along the substitution point bond axis.
Sterimol-B1 through B4
Steric distances perpendicular to the bond axis. These define a bounding box for the substituent and
are numbered in ascending size.
Sterimol-B5
The overall maximum steric distance perpendicular to the bond axis.
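Three of the fragment constants above reduce to simple arithmetic, sketched here in Python. The function names are hypothetical, the molar refractivity uses the standard Lorentz-Lorenz form with n² + 2 in the denominator, and the numerical inputs are illustrative textbook-style values that should be treated as assumptions rather than data from this study.

```python
def hammett_log_k(log_k_h, rho, sigma):
    """Hammett equation: log kX = rho * sigma + log kH."""
    return rho * sigma + log_k_h


def pi_constant(log_p_x, log_p_h):
    """Hydrophobic constant Pi: log P of the substituted compound minus log P of the parent."""
    return log_p_x - log_p_h


def molar_refractivity(n, mw, density):
    """Lorentz-Lorenz molar refractivity: ((n^2 - 1) / (n^2 + 2)) * (MW / d)."""
    n_squared = n * n
    return (n_squared - 1.0) / (n_squared + 2.0) * (mw / density)


# Illustrative inputs (assumed): a para-nitro sigma of 0.78 applied to a parent
# with log kH = -4.20 and rho = 1.0.
predicted_log_k = hammett_log_k(log_k_h=-4.20, rho=1.0, sigma=0.78)

# Assumed log P values for chlorobenzene (2.84) and benzene (2.13).
pi_chloro = pi_constant(log_p_x=2.84, log_p_h=2.13)

# Water as a simple check case: n = 1.333, MW = 18.015 g/mol, d = 0.997 g/cm^3.
mr_water = molar_refractivity(n=1.333, mw=18.015, density=0.997)

print(f"log kX = {predicted_log_k:.2f}, Pi = {pi_chloro:.2f}, MR = {mr_water:.2f}")
```

MR has units of volume per mole (cm³/mol here), which is why it is often used as a steric bulk descriptor alongside the Sterimol parameters.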
The following table lists the MSA descriptors available in QSAR:
V) STRUCTURAL DESCRIPTORS
The following table lists the structural descriptors available in QSAR:
Table 5: Structural descriptors
SYMBOL DESCRIPTOR
Mw Molecular weight
F. INTERPRETING QSAR EQUATION
QSAR is used for predicting the activities of as yet untested (and possibly not yet synthesized)
molecules. The predictive ability of a QSAR is generally better for interpolative predictions (for
compounds that have parameters within the range of those considered in the data set) than for
extrapolative predictions (for compounds that are outside the range).
A QSAR equation also provides insights into the mechanism of the process being studied.
1. SQUARE OF CORRELATION COEFFICIENT (R²): If the x (independent) and y (dependent)
variables are highly correlated, there is considerable information in x and y that is redundant. The
degree of correlation is measured by the squared correlation coefficient (r²).
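As a rough illustration of how r² and its cross-validated counterpart (XVR² in the legends) can be computed, the sketch below fits a one-descriptor line and then repeats the fit with each compound left out in turn. The data are hypothetical, leave-one-out is assumed as the cross-validation scheme, and this is not the statistics code used by the QSAR software.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = a*x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def r_squared(observed, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def leave_one_out_r_squared(x, y):
    """Cross-validated r^2: each point is predicted by a model fitted without it."""
    predictions = []
    for i in range(len(x)):
        a, c = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(a * x[i] + c)
    return r_squared(y, predictions)


# Hypothetical descriptor values and log(1/activity) values.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [4.1, 5.2, 5.9, 7.1, 8.0, 9.2]

a, c = fit_line(descriptor, activity)
r2 = r_squared(activity, [a * xi + c for xi in descriptor])
xv_r2 = leave_one_out_r_squared(descriptor, activity)
print(f"r2 = {r2:.4f}, cross-validated r2 = {xv_r2:.4f}")
```

The cross-validated value cannot exceed the fitted r² for a least-squares line, so a large gap between the two is a common warning sign of overfitting.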
Calculate molecular properties:
The Calculate Molecular Properties protocol will calculate many properties or perform basic
statistical and correlation analysis of the numeric properties as requested.
3. HB DONOR (vector): Matches these types of atoms or groups of atoms:
Non-acidic hydroxyls
Thiols
Acetylenic hydrogens
NHs (except tetrazoles and trifluoromethyl sulfonamide hydrogens)
Does not match: electron-rich pyridines and imidazoles that would be protonated, or nitrogens that
would be protonated due to their high basicity.
4. HYDROPHOBIC (point): Matches these types of groups of atoms:
A contiguous set of atoms that is not adjacent to any concentrations of charge (charged atoms or
electronegative atoms) in a conformer such that the atoms have surface accessibility such as
phenyl, cycloalkyl, isopropyl, and methyl.
5. HYDROPHOBIC ALIPHATIC (point): Matches these types of groups of atoms:
A contiguous set of atoms that is not adjacent to any concentrations of charge (charged atoms or
electronegative atoms) in a conformer such that the atoms have surface accessibility, such as
cycloalkyl, isopropyl, and methyl.
6. HYDROPHOBIC AROMATIC (point): Matches these types of groups of atoms:
A contiguous set of atoms that is not adjacent to any concentrations of charge (charged atoms or
electronegative atoms) in a conformer such that the atoms have surface accessibility such as
phenyl and indole.
7. NEG CHARGE (atom): Matches negative charges not adjacent to a positive charge.
8. NEG IONIZABLE (point): Matches atoms or groups of atoms that are likely to be
deprotonated at physiological pH, such as:
Trifluoromethyl sulfonamide hydrogens
Sulfonic acids (centroid of the three oxygens)
Phosphoric acids (centroid of the three oxygens)
Sulfinic, carboxylic, or phosphinic acids (centroid of the two oxygens)
Tetrazoles
Negative charges not adjacent to a positive charge
9. POS CHARGE (atom): Matches positive charges not adjacent to a negative charge.
10. POS IONIZABLE (point): Matches atoms or groups of atoms that are likely to be protonated
at physiological pH, such as:
Basic amines
Basic secondary amidines (iminyl nitrogen)
Basic primary amidines, except guanidines (centroid of the two nitrogens)
Basic guanidines (centroid of the three nitrogens)
Positive charges not adjacent to a negative charge
Does not match: weakly basic aromatic nitrogens such as pyridine and imidazole.
11. RING AROMATIC (vector and plane): Matches 5- and 6-membered aromatic rings. The
feature defines 2 points, the ring centroid and a projected point normal to the ring plane. The
projected point can map both above and below the ring.
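The geometry of the RING AROMATIC feature (a centroid plus a point projected along the ring normal, on either face) can be sketched as follows. This is an illustration only, not the Catalyst feature definition: the idealized benzene coordinates and the 3.0 Å projection distance are assumptions.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def unit_normal(points):
    """Unit normal of a planar ring from two centroid-to-atom vectors."""
    c = centroid(points)
    v1 = tuple(points[0][i] - c[i] for i in range(3))
    v2 = tuple(points[1][i] - c[i] for i in range(3))
    cross = (
        v1[1] * v2[2] - v1[2] * v2[1],
        v1[2] * v2[0] - v1[0] * v2[2],
        v1[0] * v2[1] - v1[1] * v2[0],
    )
    length = math.sqrt(sum(x * x for x in cross))
    return tuple(x / length for x in cross)


def ring_aromatic_feature(points, projection_distance=3.0):
    """Return the ring centroid and the projected points above and below the plane."""
    c = centroid(points)
    n = unit_normal(points)
    above = tuple(c[i] + projection_distance * n[i] for i in range(3))
    below = tuple(c[i] - projection_distance * n[i] for i in range(3))
    return c, above, below


# Idealized benzene ring in the xy-plane with a 1.39 Angstrom centroid-to-carbon distance.
benzene = [
    (1.39 * math.cos(math.radians(60 * k)), 1.39 * math.sin(math.radians(60 * k)), 0.0)
    for k in range(6)
]
ring_center, point_above, point_below = ring_aromatic_feature(benzene)
print(ring_center, point_above, point_below)
```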
STEPS TO BE FOLLOWED IN DS
1. Construct or import the molecules.
2. Perform a conformational search.
3. Examine each conformer for the presence of chemical features.
4. Determine the set of features that correlate with activity.
STEPS AND APPLICATION OF PARAMETERS USED IN HYPOTHESIS
GENERATION
Import the molecules into the View Compound workbench and clean the constructed
molecules.
Apply the Catalyst force field, then perform 3D minimization.
Conformation search: the aim of the conformation search is to obtain diverse
conformations. Conformation generation methods are of two types: the best
method and the fast method. Both methods emphasize broad coverage of the
conformational space. Fast conformer generation uses a systematic or random search, depending on
the size of the molecule. Systematic
search is useful for small molecules and random search is used for macromolecules. In the case of
macromolecules, the conformers are minimized by the poling algorithm.
CONFORMATIONAL ANALYSIS STOPS WHEN ONE OF THESE THREE CONDITIONS
IS MET
The maximum number of conformers has been generated.
The energy of the newly generated conformer is too high relative to the predefined energy threshold.
No new conformer can be generated after a certain number of trials.
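The three stopping conditions can be sketched as a search loop. This is a toy model, not the Catalyst conformer generator: a conformer is represented by a single torsion angle drawn at random from a 30-degree grid, the three-fold torsional energy function is hypothetical, and all limits are illustrative parameters.

```python
import math
import random

REASON_MAX = "maximum number of conformers generated"
REASON_ENERGY = "new conformer exceeds the energy threshold"
REASON_NO_NEW = "no new conformer after repeated trials"


def torsion_energy(angle_deg):
    """Toy three-fold torsional potential (arbitrary units), never negative."""
    return 2.0 * (1.0 + math.cos(math.radians(3 * angle_deg)))


def search_conformers(max_conformers, energy_threshold, max_failed_trials, seed=0):
    """Generate conformers until one of the three stopping conditions is met."""
    rng = random.Random(seed)
    candidate_angles = range(0, 360, 30)
    accepted = []
    failed_trials = 0
    while True:
        if len(accepted) >= max_conformers:
            return accepted, REASON_MAX
        if failed_trials >= max_failed_trials:
            return accepted, REASON_NO_NEW
        angle = rng.choice(candidate_angles)
        if torsion_energy(angle) > energy_threshold:
            return accepted, REASON_ENERGY
        if angle in accepted:
            failed_trials += 1  # duplicate: counts as a failed trial
        else:
            accepted.append(angle)
            failed_trials = 0


conformers, reason = search_conformers(
    max_conformers=5, energy_threshold=100.0, max_failed_trials=50
)
print(len(conformers), reason)
```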
PHARMACOPHORE HYPOTHESIS
Catalyst's ConFirm, HipHop and HypoGen are applications that provide tools to generate
pharmacophore hypotheses. The hypotheses are created by generating conformations for a set of
study molecules, then using the conformations to find and align chemically important functional
groups common to the molecules in the study set. Each hypothesis can also incorporate data on the
biological activities of the study molecules.
STEPS INVOLVED IN GENERATING A PHARMACOPHORE HYPOTHESIS
I. GENERATE CONFORMATIONS
The interface to ConFirm is used to generate conformations for a single molecule or a set of
molecules. The number of conformations needed to produce a good representation of a compound's
conformational space depends on the molecule. Both conformation-generating algorithms
available in ConFirm (best and fast) are adjusted to produce a diverse set of conformations,
avoiding repetitious groups of conformations all representing local minima.
The conformations generated by ConFirm can be used as input into HipHop and Hypo to
align common molecular features and generate a hypothesis.
Align common features to generate a hypothesis
The procedure involves:
Aligning common molecular features
Setting preferences using the control panel
Incorporating activity data into a hypothesis
Using aligned structures to generate receptor models
HipHop and Hypo use conformations generated in ConFirm to align chemically important
functional groups common to the molecules in the study set. A pharmacophore hypothesis can
then be generated from these aligned structures.
Incorporating biological activity data into a hypothesis
Biological activity data can also be incorporated into the hypothesis-generating
process. Each hypothesis is tested by regression techniques to compare estimated
activity with actual activity data. The software uses the data from these tests to select the
hypotheses that do the best job of predicting activity for the set of study molecules. This capability
is provided by Catalyst/Hypo.
4.2.2a HIP HOP THEORY
HipHop generates pharmacophores based on the alignment of multiple common features and can be
used to generate receptor models. The objective is to identify and enumerate all possible
pharmacophore configurations that are common to the training set. The Model Receptor menu
card is included in the hypothesis models card deck so that structures that have been
aligned in HipHop can be used to generate a receptor surface model. Since structures used in HipHop
are aligned by common chemical features, the receptor surface model that is generated for them can
be significantly different from a receptor surface model generated from template-aligned
structures.
The ideal HipHop training set is as follows:
2-30 compounds, ideally 6 molecules
Structurally diverse set of input molecules
Feature-rich compounds
Include the most active compounds
Spreadsheet setup for HipHop
Molecules are imported from the hypothesis generation workbench into a spreadsheet.
Principal: specifies the reference molecules. Reference configuration models are potential
centers for hypotheses.
If (0), do not consider this molecule.
If (1), consider the configurations of this molecule.
If (2), use this compound as a reference molecule (used only for HipHop hypothesis
generation).
Maximum omit features: how many features may be omitted for each compound.
If (0), all features must map to generate a hypothesis.
If (1), all but one feature must map to generate a hypothesis.
If (2), no features need to map to generate a hypothesis (used only for HipHop hypothesis
generation).
When compound data appear in the spreadsheet, you are ready to add values in the
Principal and MaxOmitFeat columns. Common-features hypothesis generation uses values in
these columns to determine which molecules should be considered when building hypothesis
space and which molecules should map to all or some of the features in the final hypotheses.
In the Principal column, a value of 2 means that all the chemical features in the compound
will be considered in building hypothesis space. A value of 1 means that features will be
considered when generating hypotheses and that at least one mapping for each generated
hypothesis will be found unless the Misses or Complete Misses options are used. A value of 0
means the compound will be ignored.
The MaxOmitFeat column specifies how many hypothesis features may fail to map to the
chemical features in each compound: a 0 in this column forces mapping of all features, a 1 means
that all but one feature must map, and a 2 allows hypotheses to which no compound features map.
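The meaning of the two spreadsheet columns can be summarised in a few lines of Python. This is only an illustrative sketch of the rules stated above; the function names are hypothetical and are not part of Catalyst or Discovery Studio.

```python
def min_features_to_map(n_hypothesis_features: int, max_omit_feat: int) -> int:
    """Minimum number of hypothesis features a compound must map.

    Encodes the MaxOmitFeat rule described in the text (0, 1 or 2).
    """
    if max_omit_feat == 0:
        return n_hypothesis_features          # all features must map
    if max_omit_feat == 1:
        return max(n_hypothesis_features - 1, 0)  # all but one must map
    if max_omit_feat == 2:
        return 0                              # no feature is required to map
    raise ValueError("MaxOmitFeat must be 0, 1 or 2")


def principal_role(principal: int) -> str:
    """Describe how a compound is used, based on its Principal value."""
    roles = {
        0: "ignored",
        1: "considered when generating hypotheses",
        2: "reference compound: all features build the hypothesis space",
    }
    if principal not in roles:
        raise ValueError("Principal must be 0, 1 or 2")
    return roles[principal]
```

For a four-feature hypothesis, a compound with MaxOmitFeat = 1 must therefore map at least three features.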
4.2.2b HYPOGEN
HypoGen attempts to derive SAR models for a set of molecules for which activity values
(IC50 or Ki) on a given biological target are available. HypoGen optimizes hypotheses that are
present in the highly active compounds of the training set but missing among the least active
(or inactive) ones. It attempts to construct the simplest hypothesis that best correlates the
activities (estimated vs. measured). The predictive models are created in three stages:
Constructive
Subtractive
Optimization
The constructive phase identifies hypotheses that are common to the most active set of
compounds. The most active set is determined by the following equation:
(MA × UncMA) − (A / UncA) > 0.0
where MA is the activity of the most active compound,
Unc is the uncertainty in the measured activity, and
A is the activity of the compound.
The most active set of compounds is limited to a maximum of eight. Once the set is
determined, HypoGen enumerates all possible pharmacophore configurations for each of the
conformations of the two most active compounds. Furthermore, a hypothesis must fit a
minimum subset of features of the remaining most active compounds in order to be considered. At
the end of the constructive phase a large database of pharmacophore configurations has been
generated.
The objective of the subtractive phase is to identify those pharmacophore configurations
developed in the constructive phase that are also present in the least active set of molecules
and to remove them. The first step is the identification of the least active compounds. This is
accomplished with the equation
log(A) − log(MA) > 3.5
where A is the activity of the current compound and MA is the activity of the most active
compound. In simple terms, all compounds whose activity is 3.5 orders of magnitude less than
that of the most active compound are considered to be in the set of least active molecules. The
value 3.5 is a user-adjustable parameter that can be changed if needed (i.e., if the activity
range of the dataset does not span more than 3.5 orders of magnitude). The subtractive phase
then checks each configuration from the constructive phase against the least active compounds
and eliminates it if it is shared by more than half of them, leaving the feasible pharmacophores.
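The two selection rules can be written out as a short Python sketch. It assumes that activities are IC50-like values (a lower value means a more active compound), that one uncertainty factor applies to every compound, and that the default uncertainty is 3; the default and the function names are assumptions made for illustration, not values taken from this study.

```python
import math


def most_active_set(activities, uncertainty=3.0, max_size=8):
    """Compounds satisfying (MA x Unc) - (A / Unc) > 0, capped at max_size.

    activities: dict mapping compound name -> activity (lower = more active).
    """
    ma = min(activities.values())
    ranked = sorted(activities.items(), key=lambda item: item[1])
    selected = [name for name, a in ranked
                if ma * uncertainty - a / uncertainty > 0.0]
    return selected[:max_size]


def least_active_set(activities, threshold=3.5):
    """Compounds satisfying log(A) - log(MA) > threshold (orders of magnitude)."""
    ma = min(activities.values())
    return [name for name, a in activities.items()
            if math.log10(a) - math.log10(ma) > threshold]


# Hypothetical activities, for illustration only.
example = {"cpd_a": 0.001, "cpd_b": 0.005, "cpd_c": 0.02, "cpd_d": 10.0}
actives = most_active_set(example)
inactives = least_active_set(example)
```

With these made-up numbers, `cpd_a` and `cpd_b` fall in the most active set and only `cpd_d`, four orders of magnitude weaker than `cpd_a`, falls in the least active set.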
The optimization phase involves improvement of the hypothesis score. Small
perturbations are applied to those pharmacophore configurations that survived the subtractive
phase, and they are scored based on errors in activity estimates from regression and on the
complexity of the hypothesis. The cost of a hypothesis is a quantitative extension of Occam's
razor (everything else being equal, the simplest model is preferred).
Figure 14: HypoGen process flow
In detail, the cost of each pharmacophore is computed as the sum of three costs: weight,
error and configuration. The weight component increases with deviation of the feature weight
from the ideal value of 2.0, while the error component increases with the RMS difference between
the measured and estimated activities. The configuration cost is fixed and depends on the
complexity of the pharmacophore.
HipHop and HypoGen use conformations generated in ConFirm to align chemically important
functional groups common to the molecules in a study set. Biological activity data can be
incorporated into the hypothesis so that the hypotheses that best predict activity are generated
and selected. Additionally, structures that have been aligned in these programs can be used to
generate a receptor surface model.
HYPOGEN TRAINING AND TEST SET SELECTION
Selection of the training set molecules is one of the most important exercises the user
must perform, for the following reasons:
Catalyst derives the information used in subsequent analysis from those structures; thus, the
“garbage in, garbage out” paradigm certainly applies.
The statistical procedures applied during analysis have limits in terms of over- and
under-fitting the data.
Data sets that are ideal for those analysis procedures and data sets from typical medicinal
chemistry structure activity series are often not the same thing.
The ideal training set
1. At least 16 compounds are necessary to assure statistical power.
2. Activities should span 4 orders of magnitude.
3. Each order of magnitude should be represented by at least 3 compounds.
4. No redundant information.
5. No excluded volume problems.
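The three numerical guidelines in this list can be checked automatically before a run. The following Python sketch is one possible reading of them; in particular, treating each decade of activity (floor of log10) as one "order of magnitude" is an assumption, and the function name is hypothetical.

```python
import math
from collections import Counter


def check_training_set(activities, min_compounds=16, min_span=4.0, min_per_order=3):
    """Check activity values (all in the same unit) against the guidelines."""
    logs = [math.log10(a) for a in activities]
    span = max(logs) - min(logs)
    # Count compounds per decade (order of magnitude) of activity.
    per_decade = Counter(math.floor(x) for x in logs)
    decades = range(math.floor(min(logs)), math.floor(max(logs)) + 1)
    return {
        "enough_compounds": len(activities) >= min_compounds,
        "enough_span": span >= min_span,
        "each_order_covered": all(per_decade[d] >= min_per_order for d in decades),
    }


# Hypothetical activities: four compounds in each of five decades.
values = [m * 10.0 ** e for e in range(-3, 2) for m in (2, 3, 5, 7)]
report = check_training_set(values)
```

Redundancy and excluded-volume problems (items 4 and 5) need chemical judgement and are not covered by this check.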
METHODOLOGY
INTRODUCTION
To build a better pharmacophore, the following steps were employed:
1. Building set of molecules
2. Conformer generation
3. Hypothesis generation
4. Database generation
5. Database search
6. Compare / fit to estimate activity
Criteria to generate successful hypothesis are:
1. Cost factor: the difference between the fixed cost and the null cost should be greater
than 50 bits; a larger difference gives better prediction.
2. The fixed cost represents the simplest model that fits all the data perfectly, and the null
cost represents the highest cost of a pharmacophore with no features, which estimates
every activity to be the average of the activity data of the training set molecules.
3. The configuration value, which is a measure of the magnitude of the hypothesis space for a
given training set, should be less than 18. If it is higher, there are too many degrees of
freedom and the result may not be useful.
4. The correlation between the estimated and the actual activity data should be close to 1.0.
5. The RMS deviation, which represents the quality of the correlation between the estimated
and the actual activity data, should be as low as possible, ideally close to 0.
METHOD
BUILDING A SET OF MOLECULES
All molecules were built using the Catalyst View Compound Workbench. They were cleaned
using the 2D Beautify option and minimized using a CHARMm-like force field.
CONFORMER GENERATION
A conformer is a representative model of the possible conformational space of a ligand. It
is assumed that the biologically active conformation of a ligand (or a close approximation
thereof) should be contained within this model. Conformers were generated for all molecules
with a cut-off energy range of 20 kcal/mol and up to a maximum of 255 conformers.
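The two limits (energy window and maximum count) act as a simple filter on the generated conformers. The Python sketch below illustrates that filter only; it does not reproduce the poling algorithm, and keeping the lowest-energy conformers when more than 255 qualify is an assumption made for the example.

```python
def select_conformers(energies_kcal, window=20.0, max_conformers=255):
    """Keep conformers within `window` kcal/mol of the lowest-energy one.

    energies_kcal: list of conformer energies in kcal/mol.
    Returns the retained energies, lowest first, capped at max_conformers.
    """
    lowest = min(energies_kcal)
    kept = sorted(e for e in energies_kcal if e - lowest <= window)
    return kept[:max_conformers]


# Hypothetical energies: the 31.0 kcal/mol conformer lies outside the window.
kept = select_conformers([12.5, 5.0, 31.0, 24.9, 7.2])
```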
COST HYPOTHESIS
The lowest cost hypothesis is considered to be the best. However, hypotheses with costs
within 10-15 of the lowest cost hypothesis are also considered good candidates. The units of
cost are binary bits. Hypothesis costs are calculated according to the number of bits required to
completely describe a hypothesis. Simpler hypotheses require fewer bits for a complete
description, and the assumption is made that simpler hypotheses are better.
HYPOTHESIS GENERATION / PHARMACOPHORE SEARCH
A pharmacophore model consists of a collection of features necessary for the biological
activity of the ligand arranged in 3D space, the common ones being hydrogen bond acceptor,
hydrogen bond donor and hydrophobic features. Hydrogen bond donors are defined as vectors
from the donor atom of the ligand to the corresponding acceptor atom in the receptor. Hydrogen
bond acceptors are analogously defined. Hydrophobic features are located at the centroids of
hydrophobic atoms.
Conformations for all molecules were generated in the View Compound Workbench using the
poling algorithm and the best-quality conformer generation method. The best conformer
generation considers the arrangement of atoms and accepts a maximum of 255 conformers. For
the set of molecules, Catalyst generated conformers that provided the most comprehensive
treatment of flexible ring systems. All the conformers are automatically saved, and the number
of conformers generated for each molecule is reported together with the lowest conformer energy
in kcal/mol. Conformers were selected that fell within a 20 kcal/mol range above the
lowest-energy conformation found.
HYPOTHESIS GENERATION
The pharmacophore hypotheses were generated in the Generate Hypothesis Workbench. The
molecules were selected for the training set based on order of magnitude of activity. Hypothesis
generation was carried out under the following assumptions.
1. The most active and the most inactive molecules should be represented in the training set.
2. At least 3 molecules from each order of magnitude should be selected for
pharmacophore generation.
3. A minimum of 15 molecules should constitute the training set.
4. The molecules selected should be diverse in their chemical features.
HYPOTHESIS CONSIDERATIONS
In order to achieve a better pharmacophore, the following limits or considerations should be
met by the generated hypothesis.
Configuration value should be around 17.
RMS should be as low as possible, preferably near zero.
Correlation should be around 1.0.
Cost factor: the difference between fixed cost and null cost should be between 40 and 80 bits.
FACTORS THAT DETERMINE THE QUALITY OF PHARMACOPHORE
The overall cost of a hypothesis is calculated by summing three cost factors, a weight cost,
an error cost and a configuration cost. These are qualitatively defined.
WEIGHT COST
A value that increases in a Gaussian form as the feature weight in a model deviates from an
idealized value of 2.0. This cost factor is designed to favor hypotheses where the feature weights
are close to 2.
ERROR COST
A value that increases as the RMS difference between estimated and measured activities for the
training set molecules increases. This cost factor is designed to favor models where the correlation
between estimated and measured activities is better.
CONFIGURATION COST
This is a fixed cost which depends on the complexity of the hypothesis space being optimized. It is
equal to the entropy of the hypothesis space.
Of the three, the error cost factor has the major effect in establishing hypothesis cost.
During the beginning phase of an automated hypothesis generation, Catalyst calculates the cost of
two theoretical hypotheses: one in which the error cost is minimal (all compounds fall along a
line of slope = 1), and one where the error cost is high (all compounds fall along a line of
slope = 0). These models can be considered upper and lower bounds for the training set. Because
these experiments can easily require days of run time, the cost values for these two models are
useful guides for estimating the chances of a successful experiment, and they are available
within 15 minutes from the start of the run. The ideal hypothesis cost (fixed cost) is reported
in the log file found in the hypothesis generation directory. This value tends to be 70-100 bits.
The null hypothesis cost is reported in the log file found in the same directory and is usually
higher than the fixed cost. What is important is the difference between these two costs. The
greater the difference, the higher the probability of finding a useful model. In terms of
hypothesis significance, what really matters is the magnitude of the difference between the cost
of any returned hypothesis and the cost of the null hypothesis. In general, if this difference is
greater than 60 bits, there is an excellent chance the model represents a true correlation.
Since most returned hypotheses will be higher in cost than the fixed cost model, a difference
between fixed cost and null cost of 70 or more will be necessary in order to achieve the 60 bit
difference. If a returned hypothesis has a cost that differs from the null hypothesis by 40-60
bits, there is a 75-90% chance that it represents a true correlation in the data. As the
difference becomes less than 40 bits, the likelihood of the hypothesis representing a true
correlation in the data rapidly drops below 50%. Under these conditions, it may be difficult to
find a model that can be shown to be predictive. In the extreme situation where the fixed and null
cost differential is small (<20), there is little chance of succeeding and it is advisable to
reconsider the training set before proceeding. Another useful number is the entropy of hypothesis
space. This value is calculated early in the run and is reported in the log file near the value
for the fixed cost.
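The rule of thumb for interpreting cost differences can be expressed as a small Python helper. The thresholds come from the paragraph above; the function name and the wording of the returned messages are illustrative only.

```python
def interpret_cost_difference(null_cost, hypothesis_cost):
    """Map (null cost - hypothesis cost), in bits, to the stated likelihood."""
    diff = null_cost - hypothesis_cost
    if diff > 60:
        return "excellent chance of a true correlation"
    if diff >= 40:
        return "75-90% chance of a true correlation"
    return "below 50% chance of a true correlation"


def training_set_worth_running(fixed_cost, null_cost):
    """A fixed/null cost differential under 20 bits suggests revising the set."""
    return (null_cost - fixed_cost) >= 20
```

For example, a returned hypothesis costing 100 bits against a null cost of 150 bits falls in the 40-60 bit band.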
TRAINING SET
4.2 STRUCTURE OR TARGET BASED DRUG DESIGN
In structure based drug design (SBDD), the three-dimensional structure of a drug target
interacting with small molecules (drugs) is used to guide drug discovery. Drug targets are
typically key molecules involved in a specific metabolic or cell signaling pathway that is known,
or believed, to be related to a particular disease state. Drug targets are most often proteins
and enzymes in these pathways. Drug compounds are designed to inhibit, restore or otherwise
modify the structure and behavior of disease-related proteins and enzymes.
SBDD uses the known 3D geometrical shape or structure of proteins to assist in the
development of new drug compounds. The 3D structure of protein targets is most often derived
from X-ray crystallography or nuclear magnetic resonance (NMR) techniques, as they have a
resolution of a few angstroms (about 500,000 times smaller than the diameter of a human hair).
At this level of resolution, researchers can precisely examine the interactions between atoms in
protein targets and atoms in potential drug compounds that bind to the proteins. This ability to
work at high resolution with both proteins and drug compounds makes SBDD one of the most
powerful methods in drug design.
Once bound at the receptor site, drugs may act either to initiate a response (agonist action or
stimulant) or to decrease the activity potential of that receptor (antagonist action or
depressant) by blocking access to it by active molecules. Thus, any drug may have structural
features that contribute independently to the affinity for the receptor and to the efficiency with
which the drug receptor combination initiates the response (intrinsic activity or efficiency). The
response is related to the drug receptor complexes. The affinity of a drug may be estimated by
comparison of the dose required to produce a pharmacological response with the dose required by
a reference standard drug or the natural ligand for that receptor. Structure based drug design
was employed with the following parts:
4.2.1 Structure based pharmacophore generation:
The structure based pharmacophore approach was used to find the essential features of the
active site that can contribute to ligand binding.
The interaction generation protocol takes an input receptor and a defined active site and
analyzes the active site for donors, acceptors, and hydrophobes. The result of the calculation is
an interaction map. The density of polar sites parameter specifies the density of the vectors in
the interaction site for hydrogen bonds. The density of lipophilic sites parameter specifies the
density of points in the interaction site for lipophilic atoms.
Procedure:
1. Load the interaction generation protocol from the Protocols Explorer. The parameters
are displayed in the Parameters Explorer.
2. Ensure that the structure you want to define as the receptor is open in a 3D window. Use the
Binding Site tool panel to define the structure as the receptor.
3. Set the input site sphere parameter to define the active site. Select the ligand from the
receptor-ligand complex and define the input site sphere.
4. The radius of the site sphere can be changed by selecting the sphere and changing the radius
in the attributes dialog.
5. Select the receptor structure from the input receptor parameter list.
6. Select the sphere as the input site sphere parameter.
7. Set the remaining parameters as desired and run the protocol.
4.2.2 Docking:
Molecular docking is a technique used to study how molecules bind to one another. The
term “docking” mostly refers to protein-molecule interactions. The following chart shows the
work flow of the docking process.
a. SITE SEARCH
The position and shape of the protein binding site are defined on a grid. The active site
shape can be defined based on the shape of the protein, from which all sites are detected.
Alternatively, the docked ligand method is used to define the active site: unoccupied grid points
within a certain user-definable distance of the ligand atoms are collected to form the site.
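The docked ligand method can be pictured with a short Python sketch: a regular grid is laid around the ligand, and a grid point joins the site if it is close to a ligand atom and not occupied by a protein atom. The grid spacing, the distance cutoff and the clash distance used to decide that a point is "occupied" are hypothetical parameters chosen for illustration, not LigandFit defaults.

```python
import math
from itertools import product


def site_from_docked_ligand(ligand_atoms, protein_atoms,
                            spacing=1.0, cutoff=3.0, clash=2.0):
    """Collect unoccupied grid points within `cutoff` of any ligand atom.

    ligand_atoms, protein_atoms: lists of (x, y, z) coordinates in angstroms.
    """
    lo = [min(a[i] for a in ligand_atoms) - cutoff for i in range(3)]
    hi = [max(a[i] for a in ligand_atoms) + cutoff for i in range(3)]
    counts = [int((hi[i] - lo[i]) / spacing) + 1 for i in range(3)]
    site = []
    for idx in product(*(range(n) for n in counts)):
        point = tuple(lo[i] + idx[i] * spacing for i in range(3))
        near_ligand = any(math.dist(point, a) <= cutoff for a in ligand_atoms)
        occupied = any(math.dist(point, p) < clash for p in protein_atoms)
        if near_ligand and not occupied:
            site.append(point)
    return site
```

The brute-force loop over all grid points is adequate for a sketch; a production implementation would use a spatial index.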
b. CONFORMATIONAL SEARCH
A Monte Carlo simulation is employed in the conformational search of the ligand. During
the search, bond lengths and bond angles are left untouched; only torsional angles (except those
in a ring) are randomized. Therefore, the ligand molecules should be energy minimized to ensure
correct bond lengths and bond angles before using LigandFit.
c. LIGAND FITTING
After a new conformer is generated, the ligand fitting is carried out in two steps. First the
non-mass-weighted principal moments of inertia (PMI) of the binding site are compared with the
non-mass-weighted principal moments of inertia (PMI) of the ligand. If the value (Fit value) is
above the threshold or no better than the fitting results previously saved, no further docking
is performed. If the value (Fit value) is better than the previously saved results, the ligand
is positioned in the binding site according to the PMI. Because PMI is a scalar property, there
are four possible positions for the ligand to orient in the binding site. For each position, the
corresponding docking score is computed.
The docking score is the negative value of the non-bonded intermolecular energy between
ligand and protein. After the docking score is calculated for each orientation, it is compared
with the results saved previously. If the new one is better, it is saved, and then the process of
conformational search and ligand fitting is iterated until the number of trials is reached.
Finally, rigid body minimization is applied to the saved conformations of the ligand to optimize
their positions and docking scores.
PROCEDURE
Steps followed for ligand fit
1. Potent inhibitor molecules that can inhibit the action of sPLA2 were taken.
2. Molecules with diverse structures and pharmacophore features were selected from the
literature.
3. The molecules to be docked in the receptor site were collected in an SD file so that all
molecules are processed for the docking score at the site.
4. The active site of the protein is identified with the find site from receptor cavities
option, which uses the flood-filling algorithm.
5. The active site is located using the already docked ligand.
6. The protein molecule is selected, the set of molecules in the SD file is chosen and the
docking score is calculated.
7. Thus, the docking scores for the set of molecules are calculated through LigandFit.
4.2.2b C-Docker:
C-Docker is a grid based molecular docking method that employs CHARMm. It has been
implemented in Discovery Studio (DS) through the Dock Ligands (C-Docker) protocol. In C-Docker,
the receptor is held rigid while the ligands are allowed to flex during the refinement. Random
ligand conformations are generated from the initial ligand structure through high temperature
molecular dynamics followed by random rotations. The random conformations are refined by grid
based simulated annealing and a final grid based or full force field minimization.
C-Docker steps:
1. Define the receptor and search for binding sites.
2. Prepare and run the Dock Ligands (C-Docker) protocol.
Procedure
1. Open the receptor protein and apply the CHARMm force field.
2. Define the selected molecule as a receptor, then select the ligand and define the sphere
from the selection.
3. Open the C-Docker protocol and set the parameters.
4. Run the protocol.
4.2.2d DENOVO LIGAND DESIGN
LUDI
Ludi is a method for the de novo design of ligands (inhibitors) for proteins.
It can also suggest modifications of known ligands that may enhance their binding to the target
protein. The following chart shows the Ludi work flow.
Ludi method:
Ludi is based on a fragment approach. It suggests how suitable small fragments can be
positioned in the cleft of a protein structure. This positioning is the strength of Ludi because
it immediately provides ideas about how the putative binding site on the protein can be
saturated by fragments and how those fragments might be linked together. Ludi works in three steps:
It calculates interaction sites within the protein active site or from the active analogs.
It searches libraries for fragments and fits them onto the interaction sites.
It proposes an alignment or linking for the fragments.
Ludi distinguishes four types of interaction sites:
H-donor
H-acceptor
Lipophilic aliphatic
Lipophilic aromatic
The aromatic and aliphatic sites are suitable for hydrophobic interactions.
The H-donor and H-acceptor interaction sites are suitable for H-bond formation. Ludi is capable
of fitting fragments onto the interaction sites and simultaneously aligning (i.e., linking) them
to an existing ligand.
Method:
1. Identification of chemical nature of active site amino acids
2. Fragments identification and analysis of ludi score
3. Searching for link
4. Linking the fragments
5. Fusing the fragment and linking
6. Docking validation.
FRAGMENT FITTING
The next step is to fit fragments onto the interaction sites. Ludi searches the list of
interaction sites by distance criteria for suitable sets of two or more sites to match the
fragments. Required interactions are specified using targeted mode. In targeted mode, fragments
are required to interact with the protein atom or atoms specified by the user. Any fragment fit
that does not interact with the entire set of specified target atoms is rejected.
To fit the fragment, Ludi performs a root mean square (RMS) superimposition using the
algorithm given by Kabsch (1978). A fragment fit is accepted if the RMS value is less than a
user-defined threshold (typically 0.2 Å to 0.6 Å), if no van der Waals overlap of the fitted
fragment with the protein occurs, and, when the electrostatic check parameter on the Ludi
runtime parameters control panel is checked, if no unacceptable electrostatic repulsions are
found. When the receptor structure is not known, a fragment fit is rejected if the fragment
extends outside the volume defined by the set of active analogs.
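The acceptance test for a fitted fragment can be sketched in Python as follows. The sketch assumes the fragment has already been superimposed on the interaction sites (the Kabsch superposition itself is not implemented here), and the default threshold of 0.4 Å is simply a value inside the typical range quoted above; all function and parameter names are hypothetical.

```python
import math


def rms_deviation(fragment_atoms, site_points):
    """RMS distance between paired, already superimposed coordinates."""
    squared = [math.dist(a, s) ** 2 for a, s in zip(fragment_atoms, site_points)]
    return math.sqrt(sum(squared) / len(squared))


def accept_fragment_fit(rms, vdw_overlap, electrostatic_check=False,
                        electrostatic_repulsion=False, rms_threshold=0.4):
    """Apply the three acceptance criteria described in the text."""
    if rms >= rms_threshold:
        return False                      # fit is not tight enough
    if vdw_overlap:
        return False                      # fragment bumps into the protein
    if electrostatic_check and electrostatic_repulsion:
        return False                      # only tested when the check is on
    return True


# Hypothetical coordinates: each fragment atom sits 0.3 angstrom from its site.
rms = rms_deviation([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                    [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)])
accepted = accept_fragment_fit(rms, vdw_overlap=False)
```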
LINK SITES: ALIGNING FRAGMENTS WITH PARTIALLY BUILT LIGANDS
Ludi is capable of fitting fragments onto the interaction sites and simultaneously aligning
(i.e., linking) them to an existing ligand. For this purpose, link sites are defined on the
ligand. A link site is a hydrogen atom: all the hydrogen atoms of the positioned ligand (within a
user-specified cutoff radius) are link sites.
Ludi works as described above:
LUDI FRAGMENT LIBRARIES
The Ludi fragment library is divided into two parts. The de novo library is used when Ludi is run
in no-link mode. The link library is used when Ludi is run in link mode. The de novo library and
the link library each consist of two files, a file that specifies the fragment topologies and a file that
specifies the interaction types of fragment functional groups.
PROCEDURE
1. It calculates interaction sites within the sPLA2 protein active site or from the active
analogs.
2. It searches libraries for fragments and fits them onto the five interaction sites that are
present at the active site.
3. It proposes an alignment or linking for the fragments, and the new ligand is designed.
The compound with the highest activity and the best dock score is better fitted when
compared to the others. A knowledge based approach is used to suggest possible binding
positions. The present studies were carried out using the Ludi program. This program is designed
to dock small molecular fragments within protein binding sites using knowledge of preferred
interaction geometries; for example, in a hydrogen bond the distance between the donor hydrogen
and its acceptor is close to 1.8 Å and the angle subtended at the hydrogen is rarely less than
120°. Information about the preferred geometries of such interactions can be obtained from
analysis of X-ray crystallographic databases. Klebe has performed a very careful analysis of
non-bonded contacts observed in the CSD.
5.1. QSAR:
In the present study, quantitative structure activity relationship (QSAR) studies were
carried out on phospholipase A2 inhibitors in order to design selective and potent inhibitors.
QSAR models were developed using 1D and 2D descriptors in Discovery Studio software. QSAR
attempts to model the activity of a series of compounds using measured or computed properties of
the compounds. In the equation, the term N means the number of data points and r² is the
square of the correlation coefficient, which describes how well the compounds fit the QSAR
model. XV r² is a squared correlation coefficient generated during a validation procedure using
the equation
XV r² = (SD − PRESS) / SD
SD means the sum of squared deviations of the dependent variable values from their
mean. The predicted sum of squares (PRESS) is the sum, over all compounds, of the squared
differences between the actual and the predicted values of the dependent variable. The PRESS
value is computed during a validation procedure for the entire training set. The smaller the
PRESS value, the more reliable is the equation. XV r² is usually smaller than the overall r² for
a QSAR equation. It is used as a diagnostic tool to evaluate the predictive power of an equation
generated using the multiple linear regression method.
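The cross-validated coefficient follows directly from the two sums defined above, as this minimal Python sketch shows. It assumes the predicted values passed in were obtained during cross-validation (each compound predicted by a model that did not include it); the function name is hypothetical.

```python
def cross_validated_r2(actual, predicted):
    """XV r2 = (SD - PRESS) / SD for paired actual and predicted activities."""
    mean = sum(actual) / len(actual)
    # SD: sum of squared deviations of the actual values from their mean.
    sd = sum((y - mean) ** 2 for y in actual)
    # PRESS: sum of squared differences between actual and predicted values.
    press = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return (sd - press) / sd
```

Perfect predictions give PRESS = 0 and XV r² = 1, while a model that only ever predicts the mean gives PRESS = SD and XV r² = 0, which is why a smaller PRESS indicates a more reliable equation.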
GFA works by generating random populations of solutions to a problem, scoring the
relative quality of the solutions, and carrying forward the fittest solutions or analogues
(generated through mutation and crossover) of other solutions to iteratively generate (and
finally converge on) new, fitter solutions. In this study GFA analysis was done with the
following parameters:
Population size
Initial equation length
Final equation length
Number of generations
Bootstrap r²: correlation coefficient calculated during the validation procedure.
79 compounds were included in the training set to generate the primitive QSAR model, covering
the widest data range of IC50 values, 0.005 to 50.01 µM. The predictive character of the QSAR
model was further assessed using test molecules. To judge the predictive ability of the QSAR
model for new drug candidates, the IC50 values for the test and training sets were evaluated.
GFA parameters
Population 100
The GFA method performs a search over the space of possible QSAR models using lack of fit
(LOF) scores to estimate the fitness of each model. These models lead to the discovery of
predictive QSAR equations.
In the above equation, positive coefficients indicate the presence of a specific group at that
point that increases the activity of the molecule, and negative coefficients indicate the
presence of an ionic group that reduces the activity.
Table 8: Experimental and predicted values of Training set compounds using GFA
Test Set
The purpose of QSAR is not only to reproduce the biological activity of the training set but
also to predict the values of the test set molecules. A series of molecules of known activity,
the test set molecules, were introduced into the study table, and the equation obtained for the
training set was used to predict their biological activity. For the test set molecules, the
accuracy of prediction exceeded 80%.
Table 9: Experimental and predicted values of Test set compounds using GFA
Compound    Experimental    Predicted
11g         7               6.95
1b          6.9             5.523
15          8               7.114
22          8               7.584
28iii       8.22            7.941
Graph 2: Showing correlation between experimental and predicted activities by
QSAR equation using GFA method for test set.
The results generated from the QSAR equation using the GFA method show that the values
observed for r² and XV r² are within a specific range and that there is a good correlation
between the experimental and GFA-predicted activities, as listed. Good correlation is observed
between the experimental IC50 and computationally predicted IC50 values. Since the predictive
ability of the equations is good, they can be used to develop new analogs.
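The agreement between experimental and predicted activities can be quantified with the Pearson correlation coefficient. The Python sketch below applies it to the five test set rows that appear in Table 9 of this text; it is an illustration of the calculation on those rows only, not a reproduction of the statistics reported for the full test set.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Rows 11g, 1b, 15, 22 and 28iii as listed in Table 9.
experimental = [7.0, 6.9, 8.0, 8.0, 8.22]
predicted = [6.95, 5.523, 7.114, 7.584, 7.941]
r = pearson_r(experimental, predicted)
```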
Pharmacophore
The work in Discovery Studio shows how the chemical features (hydrogen bond acceptor,
hydrogen bond donor, hydrophobic aliphatic) of a set of compounds, along with their activities
ranging over several orders of magnitude, can be used to generate pharmacophore hypotheses that
can successfully predict the activity. The models were not only predictive within the same
series of compounds; different classes of diverse compounds also mapped effectively onto most of
the features important for activity. The pharmacophore generated can be used to discover
structurally diverse compounds that potentially inhibit lethal factor and to evaluate how well
any newly designed compound maps onto the pharmacophore developed in this study. The inhibitors
against lethal factor showed distinct features that may be responsible for their activity.
Analogue based pharmacophore generation:
28v 7.70 7.70 1.73 4.00 2.00 2.00 1.00 0.00 0.00 6.00
28ii 7.80 7.80 1.80 4.00 3.00 1.00 1.00 0.00 0.00 4.00
28xl 8.05 8.05 1.81 2.91 1.82 1.00 1.00 0.00 0.00 4.00
67a 5.63 5.63 1.43 3.77 1.77 2.00 0.00 0.00 0.00 4.00
43d 7.48 7.48 1.78 3.00 3.00 0.00 1.00 0.00 0.00 2.00
12d 6.59 6.59 5.57 4.93 3.93 1.00 0.00 0.00 0.00 4.00
A: hydrogen bond acceptor; H: hydrogen bond acceptor lipid; D: hydrogen bond donor;
Z: hydrophobic; Y: hydrophobic aliphatic; X: hydrophobic aromatic; N: negative ionizable;
P: positive with exclusions; W: positive ionizable; R: ring aromatic.
63
Table 11: Common Feature Pharmacophore Generation Rank File:
Hypo. Pharmacophore Rank score Direct hit Partial hit Max fit
no feature
1 ZDA 192.485 111111111 000000000 3
64
5.3. HYPOGEN (Training set):
A set of 5 hypotheses was generated using the data from the 22 training set compounds. The cost
values, correlation coefficients, RMS deviations and pharmacophore features are listed in Table 12.
Table 12: The 5 pharmacophore models generated by the HypoGen algorithm
Hypothesis 1 (Hypo1) is taken as the best pharmacophore. It has the highest cost difference
(35.867), the lowest error cost, the lowest RMS difference and the best correlation coefficient,
and it consists of two hydrogen bond acceptor features, one hydrophobic feature and one hydrogen
bond donor feature.
For the highly active compound 28v in the training set, all the features mapped perfectly onto the
features of Hypo1. In compound 28v, the HBA1 feature mapped to an electron rich O atom of the
sulfonyl (SO2) group and the HBA2 feature corresponded to the other O atom of the same group. The
HBD feature mapped to the NH group attached to the sulfonyl group. The hydrophobic feature mapped
to the methyl group attached to the 3-position of the benzene ring of the compound.
Figure 18: Blank pharmacophore feature of sPLA2 inhibitors
Figure 19: Showing the distances between pharmacophore features
Figure 20: Overlapping of highest active inhibitor molecule (28v) of
training set with the best pharmacophore (Hypo1).
Figure 21: Overlapping of lowest active inhibitor molecule 28xl of training set with
the best pharmacophore (Hypo1)
Table 13: Results of pharmacophore hypothesis generated using training set.
Discussion
The dataset was divided into a training set (22 compounds) and a test set (89 compounds),
considering both structural diversity and wide coverage of the activity range. Compounds with an
activity of < 1 μM were considered highly active (+++), compounds with an activity in the range
of 1-10 μM moderately active (++) and compounds with an activity of > 10 μM least active (+). At
the end of the run, HypoGen generated 5 pharmacophore models. The null cost for the hypotheses was
128.556, the fixed cost of the run was 79.954 and the configuration cost was 18.83. The difference
of 48.602 bits between the fixed and null costs is a sign of the highly predictive nature of the
hypotheses. All 5 hypotheses showed a high correlation coefficient between experimental and
predicted IC50 values, in the range of 0.89 to 0.95, and each has a cost difference of less than
45 bits between its own cost and the null cost. This indicates that all the hypotheses have a
80-95% probability of representing a true correlation. The cost values, correlation coefficients
(r), RMSD and pharmacophore features are listed in Table 12. The best pharmacophore (Hypo 1),
consisting of two H-bond acceptors (HBA), an H-bond donor (HBD) and a hydrophobic feature, with a
correlation coefficient (r) of 0.95, a total cost of 92.689 and the lowest RMSD value (0.89), was
chosen to further validate its predictive power by estimating the activity of the test set.
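The cost arithmetic and the activity classes described above can be checked with a short sketch. The cost values are those reported for this run; the function names are illustrative, and treating the boundary values 1 μM and 10 μM as moderately active is an assumption, since the text does not say which class they belong to.

```python
# Cost values reported for the HypoGen run (in bits).
NULL_COST = 128.556
FIXED_COST = 79.954
HYPO1_TOTAL_COST = 92.689


def cost_difference(null_cost, other_cost):
    """Difference between the null cost and another cost, rounded to 3 decimals."""
    return round(null_cost - other_cost, 3)


def activity_class(ic50_um):
    """Activity label for an IC50 in micromolar, using the thresholds in the text.

    Assumption: the boundary values 1 and 10 uM count as moderately active.
    """
    if ic50_um < 1:
        return "+++"
    if ic50_um <= 10:
        return "++"
    return "+"


print(cost_difference(NULL_COST, FIXED_COST))        # null cost minus fixed cost
print(cost_difference(NULL_COST, HYPO1_TOTAL_COST))  # null cost minus total cost of Hypo 1
print(activity_class(0.5), activity_class(5), activity_class(50))
```

The first difference reproduces the 48.602 bits quoted for the run, and the second reproduces the cost difference of 35.867 reported for Hypo 1.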
The predictability of HypoGen hypothesis 1 (Hypo 1) was evaluated using the diverse test set
compounds. The generated pharmacophore model predicted the activity of the diverse dataset of 89
test set compounds with a correlation value of 0.7987. Hence, from this analysis, Hypo 1 was able
to distinguish active compounds from inactive compounds.
5.4 Structure based pharmacophore:
The structure based pharmacophore approach was used to find the essential features of the
active site that can contribute to ligand binding.
Interaction generation:
The Interaction Generation protocol enumerates pharmacophore features from a protein active site.
It uses the site finding algorithm from Ludi to identify points in the active site where a ligand
could interact with the receptor, and creates a pharmacophore query containing hydrogen bond
acceptor, donor and hydrophobic features from these points.
The interaction generation run reported the following for minimizied1DB41:
Found 329 features
Found 98 lipophilic features
Figure 23: Center points of cluster feature
Figure 24: Mapping of 28v molecule with structure based pharmacophore feature.
These structure based pharmacophore features are useful for virtual screening of large
databases.
5.5 LIGAND FIT
Every molecule in the prepared bioactive compound SD file is docked into the chosen binding
site; the fits are automatically processed according to the chosen preferences and saved into the
output SD file. The results contain RMS calculations, performed by comparing the RMS difference
between every fit and the first conformer in the input SD file. The minimization energies of the
fits in the presence of the protein and the Ludi scores, according to the preferences, can be seen
in the SD file. Ligand fit was performed using the flexible fit method, starting from a random
conformation. The docking score is the negative value of the non-bonded intermolecular energy. If
a ligand atom carries a partial charge, the electrostatic grid is used to estimate its
electrostatic energy. If the atom is a hydrogen, the hydrogen grid is used for the van der Waals
energy; otherwise the carbon grid is used. The following table lists the docking score and the
corresponding minimization energy obtained for the best conformer of each molecule. The activity
of each molecule may be related to the lowest energy obtained in ligand fit; the corresponding
dock scores are given in Table 14.
Table 14: Docking scores of inhibitor molecules of LF obtained after subjecting them to ligand fit.
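The grid selection rule described for the ligand fit dock score can be sketched as follows. This is a simplified illustration, not the LigandFit implementation: the `Atom` class, the grid callables and the use of charge times grid value for the electrostatic term are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float, float]
Grid = Callable[[Point], float]  # hypothetical grid lookup: position -> energy value


@dataclass
class Atom:
    element: str
    position: Point
    partial_charge: float = 0.0


def dock_score(atoms: Sequence[Atom], hydrogen_grid: Grid,
               carbon_grid: Grid, electrostatic_grid: Grid) -> float:
    """Negative of the summed non-bonded interaction energy of the ligand."""
    energy = 0.0
    for atom in atoms:
        # van der Waals term: hydrogen grid for H atoms, carbon grid otherwise
        vdw_grid = hydrogen_grid if atom.element == "H" else carbon_grid
        energy += vdw_grid(atom.position)
        # electrostatic term only for atoms carrying a partial charge
        if atom.partial_charge != 0.0:
            energy += atom.partial_charge * electrostatic_grid(atom.position)
    return -energy


# Hypothetical two-atom ligand and constant-valued grids, for illustration only.
ligand = [Atom("C", (0.0, 0.0, 0.0)), Atom("H", (1.0, 0.0, 0.0), 0.25)]
score = dock_score(ligand, lambda p: -0.5, lambda p: -2.0, lambda p: -4.0)
print(score)
```

Because the score is the negative of the energy, a more favourable (more negative) interaction energy gives a higher dock score, which is why higher scores are read as better fits.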
Figure 25: Conformation search of high active compound (28v) inside the protein
(1DB4) binding site.
Figure 26: Highest acting molecule 28v which has been subjected to ligand fit,
showing its interaction.
Figure 27: Hydrogen bond interaction of high active compound 28v with active site
amino acids
Figure 28: Hydrogen bond interaction of low active compound 11d with active site
amino acids
Discussion:
Docking studies show that compound 28v has a high dock score of 75.456, whereas compound 11d
has a low dock score of 44.873. The following table shows the distances and the active site amino
acids forming hydrogen bond interactions with 28v and 11d.
Table 15: Hydrogen bond distances and hydrogen bond forming amino acids with compounds 28v and
11d
5.6 C docker:
Every molecule in the SD file is docked into the chosen binding site. C-Docker docks ligands
into an active site using a CHARMm-based molecular dynamics (MD) scheme. Random ligand
conformations are generated using high-temperature MD. The conformations are then translated into
the binding site. Candidate poses are then created using random rigid-body rotations followed by
simulated annealing. A final minimization is then used to refine the ligand poses. The following
table lists the docking energy and the corresponding minimization energy obtained for the best
conformer of each molecule. The activity of each molecule may be related to the lowest energy
obtained in C-Docker; the corresponding dock energies are given in Table 15.
Table 15: C-Docker energies of inhibitor molecules obtained after subjecting them to C-Docker.
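Of the steps in the C-Docker scheme described above, the random rigid-body rotation is the easiest to illustrate in isolation. The sketch below is not the CHARMm implementation; it shows one common way to draw a uniformly random rotation (from a normalised random quaternion) and apply it about the ligand centroid. The coordinates are hypothetical, and the MD, simulated annealing and minimization steps are not shown.

```python
import math
import random


def random_rotation_matrix(rng):
    """Uniform random 3D rotation built from a random unit quaternion."""
    # Normalising four Gaussian samples gives a uniformly distributed unit quaternion.
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def rotate_about_centroid(coords, matrix):
    """Rotate a set of coordinates rigidly about their centroid."""
    n = len(coords)
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    rotated = []
    for p in coords:
        d = [p[i] - centroid[i] for i in range(3)]
        rotated.append(tuple(
            centroid[r] + sum(matrix[r][c] * d[c] for c in range(3))
            for r in range(3)
        ))
    return rotated


rng = random.Random(0)  # fixed seed so the sketch is reproducible
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]  # hypothetical coordinates
pose = rotate_about_centroid(ligand, random_rotation_matrix(rng))
```

A rigid-body rotation changes the orientation of the ligand in the site but leaves all interatomic distances and the centroid unchanged, so each candidate pose keeps the conformation generated in the preceding MD step.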
Figure 29: Conformation search of high active compound (28v) inside the protein
(1DB4) binding site.
Figure 30: Highest acting molecule 28v which has been subjected to C-Docker,
showing its interaction.
Figure 31: Hydrogen bond interaction of high active compound 28v with active site
amino acids
Figure 32: Low acting molecule 11d which has been subjected to C-Docker, showing
its interaction.
Discussion:
C-Docker studies show that compound 28v has a high C-Docker energy of 28.754, whereas
compound 28xl has a low dock score of -38.076. The following table shows the distances and the
active site amino acids forming hydrogen bond interactions with 28v and 28xl.
Table 16: Hydrogen bond distances and hydrogen bond forming amino acids with compounds 28v and
11d
5.7 Lib dock
Lib dock docks ligands into an active site using hotspots. Hotspots are polar and apolar interaction sites.
Ligand conformations can be precalculated or generated on the fly using DS.
Table 17: Lib dock scores of inhibitor molecules obtained after subjecting them to Lib dock.
55.71,31.12,44.29,P,36,24
62.91,33.92,48.49,A,94,23
62.71,33.73,47.09,A,77,23
57.31,26.32,45.89,A,56,23
57.31,33.73,46.49,A,69,14
59.31,28.73,44.49,A,30,36
Figure 33: Highest acting molecule 28v which has been subjected to Lib dock,
showing its interaction.
Figure 34: Hydrogen bond interaction of high active compound 28v with active site
amino acids
Figure 35: Low acting molecule 11d which has been subjected to Lib dock, showing
its interaction.
Figure 36: Hydrogen bond interaction of low active compound 11d with active site
amino acids
Discussion:
The Lib dock scores of the above stated molecules are all positive values. Thus the molecules
can be used as potential ligands for the inhibition of sPLA2. Molecules 28v and 28xl were found to
have dock scores of 161.275 and 146.516, respectively.
5.8 Ludi
Ludi is a method for the de novo design of ligands (inhibitors) for proteins; it can also
suggest modifications of known ligands that may enhance binding to the target protein. In this
study, the de novo ligand UA6 was found by Ludi. The 2D structure of the ligand and its molecular
properties are shown below.
[Figure: 2D structure of ligand UA6]
5-(1-methoxy-4-methylpentan-3-yl)[1]benzothieno[3,2-b]furan
Molecular Formula = C17H20O2S
Formula Weight = 288.4045
Composition = C(70.80%) H(6.99%) O(11.10%) S(11.12%)
Molar Refractivity = 87.04 ± 0.3 cm3
Molar Volume = 252.7 ± 3.0 cm3
Parachor = 643.3 ± 4.0 cm3
Index of Refraction = 1.604 ± 0.02
Surface Tension = 41.9 ± 3.0 dyne/cm
Density = 1.140 ± 0.06 g/cm3
Dielectric Constant = Not available
Polarizability = 34.50 ± 0.5 × 10^-24 cm3
Monoisotopic Mass = 288.1184 Da
Nominal Mass = 288 Da
Average Mass = 288.4045 Da
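The formula weight and elemental composition listed above follow directly from the molecular formula. The sketch below recomputes them; the atomic weights are assumed values, chosen because they reproduce the formula weight of 288.4045 quoted above, and the function names are illustrative.

```python
# Atomic weights in g/mol. Assumed values, chosen because they reproduce
# the formula weight quoted in the text.
ATOMIC_WEIGHT = {"C": 12.0107, "H": 1.00794, "O": 15.9994, "S": 32.065}

# Element counts for C17H20O2S
UA6_FORMULA = {"C": 17, "H": 20, "O": 2, "S": 1}


def formula_weight(formula):
    """Sum of atomic weight times element count over all elements."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())


def composition(formula):
    """Mass percentage contributed by each element."""
    total = formula_weight(formula)
    return {el: 100.0 * ATOMIC_WEIGHT[el] * n / total for el, n in formula.items()}


weight = formula_weight(UA6_FORMULA)
percent = composition(UA6_FORMULA)
print(f"{weight:.4f}")
print({el: round(p, 2) for el, p in percent.items()})
```

Rounded to two decimals, the percentages agree with the composition given above: C 70.80%, H 6.99%, O 11.10% and S 11.12%.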
The protocol uses Ludi to search a library of small fragments to find candidates that bind in an
active site. Fragments in the library that overlay with a calculated interaction map are found.
Figure 37: Ludi molecule with interaction map
C-Docker result for Ludi ligand:
Figure 38: Ludi molecule UA6 which has been subjected to C-Docker, showing its
interaction
Figure 42: Hydrogen bond interaction of Ludi compound UA6 with active site amino acids
Pharmacophore mapping of ludi ligand:
Name   Estimate   Mapped atoms   Fit value
4,0.231,-3.402,2.042,1.6,HBA 1.11 3.35
18,4.905,0.304,-1.932,1.6,HBA 2.11
19,2.548,5.188,1.714,1.6,Hydrophobic1
Discussion
Docking shows that the new ligand molecule (UA6) has a C-Docker energy of -21.094 and
forms hydrogen bonds with the active site amino acids Gly 22, Gly 27 and His 47. The
pharmacophore feature mapping studies showed that the new compound has an estimated value of
1.901 and a fit value of 3.35.
6. Conclusion
The 3D QSAR studies conducted on the training set compounds gave a good r2 value of 0.936
with four outliers, and the fit line of the GFA graph shows a good correlation between the
compounds and their activities. The pharmacophore studies gave the best quantitative
pharmacophore model in terms of predictive value, consisting of four features: hydrogen bond
acceptor, hydrogen bond acceptor lipid, hydrophobic and ring aromatic. The HypoGen model, which
was further validated using a set of sPLA2 inhibitors, gave a correlation value of 0.968. The
pharmacophore studies showed four regions with interactions, i.e., hydrogen bond acceptor,
hydrophobic, hydrogen bond acceptor lipid and ring aromatic. Docking studies show that compound
28v has a high dock score of 75.456 and compound 11d has a lower dock score of 44.87.
The in silico modeling helped to guide the lead optimization and led to the generation of a
highly potent series of sPLA2 inhibitors with good drug-like properties, which is the subject of
another communication. However, the scope for fine tuning and optimizing this potent class of
sPLA2 inhibitors could lead to the generation of new therapeutic agents.
The combined approach of analogue and structure based drug designing methods allowed
us to gain an insight into predicting the enhanced activity and exploring the docking interactions
between amino acid residues of lethal factor and the ligand. Good ligands may not act as good
drugs. Thus, the prime objective of this project, to prove the authenticity of the techniques
obtained from the various journals, was completed using computer aided drug designing. The
results obtained are used to develop new ligand molecules, to find their activities in silico and
to show that these agree with the experimental values. Thus, the results reported can be
successfully employed in the rational drug designing of novel and potent lethal factor inhibitors.