QSAR, Pharmacophore and Docking Studies on Human Phospholipase A2 Inhibitors

QSAR, PHARMACOPHORE AND DOCKING STUDIES ON HUMAN PLA2 INHIBITORS
by
IRFAN N
GVK Biosciences Private Limited
nirfan_05@rediffmail.com
# Phase-1, Technocrats Industrial Estates, Balanagar,
Hyderabad-500037, India.
Protein Modeling and Rational Drug Designing
In bioCampus Centre of Excellence
GVK Biosciences Private Limited

S.NO DESCRIPTION
1. Abstract
2. Legends
3. Introduction
3.1 Drug discovery
3.2 Introduction to protein
3.3 Software
4. Materials and Methods
4.1 Analogue based drug designing
4.1.1 Quantitative structure activity relationships (QSAR)
4.1.2 Pharmacophore
4.2 Structure based drug designing
4.2.1 Structure based pharmacophore generation
4.2.2 Docking studies
4.2.2a Ligand Fit
4.2.2b C-Docker
4.2.2c Lib Dock
4.2.2d Ludi
5. Results and Discussion
5.1 QSAR
5.2 Common feature pharmacophore generation
5.3 3D QSAR pharmacophore generation
5.4 Structure based pharmacophore generation
5.5 Ligand Fit
5.6 C-Docker
5.7 Lib Dock
5.8 Ludi
6. Conclusion
7. References
1. ABSTRACT
Phospholipase A2 (PLA2) is an enzyme that hydrolyzes the sn-2 position of certain cellular
phospholipids. The liberated lysophospholipid and arachidonic acid are precursors in the
biosynthesis of various biologically active products. Human non-pancreatic sPLA2 is present at
high levels in the blood of patients with several pathological conditions, such as septic shock,
pancreatitis, trauma, bronchial asthma and gout, so potent PLA2 inhibitors have been suggested
to be useful drugs. The QSAR, pharmacophore and docking studies on human PLA2 inhibitors
reported here are intended to help find new and potent compounds against these conditions.
In these studies compound 28v had a high dock score and formed hydrogen bond interactions
with the amino acids Gly29, His27, His47 and Lys62. A novel compound,
5-(1-methoxy-4-methylpentan-3-yl)[1]benzothieno[3,2-b]furan, found through Ludi, had a C-Docker
energy of -21.094 and formed hydrogen bond interactions with the active site amino acids Gly22,
Gly29 and His47. Analogue based studies were performed using QSAR and pharmacophore
generation on sPLA2 inhibitors; the QSAR model had an r² value of 0.968. This study provides
insight into the binding interactions between receptor and ligands and is useful in designing
novel and potent inhibitors against inflammatory conditions.
Legends
PLA2 Phospholipase A2
GLY Glycine
HIS Histidine
LYS Lysine
CADD Computer-aided drug design
NSAIDs Nonsteroidal anti-inflammatory drugs
CNS Central nervous system
HDL High-density lipoprotein
ASP Aspartic acid
PHE Phenylalanine
LEU Leucine
TYR Tyrosine
LF Ligand fit
CHARMM Chemistry at Harvard macromolecular mechanics
QM Quantum mechanics
HYPO Hypothesis
MD Molecular dynamics
SD FILE Structure-data file
µM Micromolar
nM Nanomolar
% Percent
IC50 Half maximal inhibitory concentration
R² Squared correlation coefficient
XVR2 Cross-validated r²
PRESS Predicted residual error sum of squares
LOF Lack of fit
CSD Cambridge Structural Database
MLR Multiple linear regression
HBD Hydrogen bond donor
HBA Hydrogen bond acceptor
HY Hydrophobic
PDB Protein data bank
SBDD Structure based drug designing
ABGD Analog based drug designing
RMS Root mean square
HTS High throughput screening
DNA Deoxyribonucleic acid
NMR Nuclear magnetic resonance
QSAR Quantitative structure activity relationship
SAR Structure activity relationship
ADMET Absorption, distribution, metabolism, excretion and toxicity
3.1 Drug Discovery:
Drug discovery is the process by which drugs are discovered and/or designed. In the
past most drugs have been discovered either by identifying the active ingredient from traditional
remedies or by serendipitous discovery. A new approach has been to understand how disease and
infection are controlled at the molecular and physiological level and to target specific entities
based on this knowledge. The process of drug discovery involves the identification of candidates,
synthesis, characterization, screening, and assays for therapeutic efficacy. Once a compound has
shown its value in these tests, it will begin the process of drug development prior to clinical trials.
Figure 1. Drug Discovery and development.
Problem in drug discovery:
Estimates of the time and cost of bringing a new drug to market vary, but 7–12
years and $1.2 billion are often cited. Furthermore, only five out of 40,000 compounds tested in
animals reach human testing, and only one of five compounds reaching clinical studies is
approved. This represents an enormous investment of time, money, and human and other
resources. It includes the chemical synthesis, purchase, curation and biological screening of
hundreds of thousands of compounds to identify hits, followed by their optimization to generate
leads, which requires further synthesis.
In addition, the predictability of animal studies in terms of both efficacy and toxicity is
frequently suboptimal. Therefore, new approaches are needed to facilitate, expedite and streamline
drug discovery and development, to save time, money and resources, and to follow the pharma
mantra “fail fast, fail early”. It is estimated that computer modeling and simulations account for
~10% of pharmaceutical R&D expenditure and that they will rise to 20% by 2016.
Role of computer aided drug designing:
Both computational and experimental techniques have important roles in drug
discovery and development and represent complementary approaches. CADD entails:
• Use of computing power to streamline the drug discovery and development process
• Leverage of chemical and biological information about ligands and/or targets to identify
and optimize new drugs
• Design of in silico filters to eliminate compounds with undesirable properties (poor
activity and/or poor Absorption, Distribution, Metabolism, Excretion and Toxicity,
ADMET) and select the most promising candidates
Figure 2.Role of computer aided drug designing
Benefits of CADD
CADD methods and bioinformatics tools offer significant benefits for drug discovery programs.
1. Cost Savings. The Tufts Report suggests that the cost of drug discovery and development
has reached $800 million for each drug successfully brought to market. Many
biopharmaceutical companies now use computational methods and bioinformatics tools to
reduce this cost burden. Virtual screening, lead optimization and predictions of
bioavailability and bioactivity can help guide experimental research. Only the most
promising experimental lines of inquiry can be followed and experimental dead-ends can
be avoided early based on the results of CADD simulations.
2. Time-to-Market. The predictive power of CADD can help drug research programs choose
only the most promising drug candidates. By focusing drug research on specific lead
candidates and avoiding potential “dead-end” compounds, biopharmaceutical companies
can get drugs to market more quickly.
3. Insight. One of the non-quantifiable benefits of CADD and the use of bioinformatics tools
is the deep insight that researchers acquire about drug-receptor interactions. Molecular
models of drug compounds can reveal intricate, atomic scale binding properties that are
difficult to envision in any other way. When we show researchers new molecular models
of their putative drug compounds, their protein targets and how the two bind together, they
often come up with new ideas on how to modify the drug compounds for improved fit.
This is an intangible benefit that can help design research programs.
CADD and bioinformatics together are a powerful combination in drug research and development.
An important challenge for us going forward is finding skilled, experienced people to manage all
the bioinformatics tools available to us, which will be a topic for a future article.
3.2 Introduction to target protein:
The Inflammatory Response

The inflammatory response is a major part of the non-specific defense system, and is activated by
any damage caused to the tissues of the body, whether caused by a pathogen (such as damage
caused by an infectious microorganism) or even physical injury such as that caused by a scratch or
an insect bite. The affected area becomes red and swollen, or inflamed.
Figure 3 shows the steps of the inflammatory response. In the example shown, a pin pierces
the skin surface and infects the tissue with bacteria. The steps shown are:
• As soon as the tissue is ruptured, the damaged cells release chemicals such as histamine,
which serve as alarm signals.
• The chemicals released activate numerous defense mechanisms in the body. For example,
histamine forces nearby blood vessels to dilate and to allow more diffusion by becoming
leakier. Due to this, blood flow to the affected area increases, and the plasma of the blood
seeps into the interstitial fluid of the damaged tissues. Other chemicals that are released
attract phagocytes and other leukocytes to the affected area. These leukocytes squeeze out
of the blood vessels into the interstitial fluid and tissue spaces. This increase in blood
flow, blood plasma, and white blood cells causes the redness, heat, and swelling that are
normally found in inflammation.
• The leukocytes that have been attracted to the area engulf the bacteria, and any dead body
cells damaged by the pathogens or by the injury. This may result in the death of the
leukocytes, as well, and their remains are also digested. Pus found at the site of infection
consists mainly of white blood cells and blood plasma.
Figure 3. The inflammatory response against pathogens.
The inflammatory response has two major purposes: to disinfect and to clean injured tissues. In
addition to this, the inflammatory system also helps halt the spread of pathogens to tissues not
already infected. Clotting proteins that are present in the blood plasma also leak into the interstitial
fluid when the blood vessels dilate and become leakier. With platelets, thromboplastin,
prothrombin, fibrinogen, and calcium ions, localized clots can be formed, and healing can be
underway, while the pathogens are also restricted to one area, making it easier for them to be
engulfed by phagocytes.
Although the inflammatory response may be localized, as shown, it may also be widespread and in
effect throughout the body. If there are numerous pathogens, or pathogens have traveled through
the bloodstream and come to reside all over the body, the body will react with a widespread
inflammatory response that has other effects in addition to the ones experienced in localized
responses. The number of leukocytes in the blood may increase. The body may also experience
abnormally high body temperatures, or fever, which may be caused by either toxins released by
pathogens, or due to compounds released by specific leukocytes. Although an extremely high
fever is dangerous to the body, a less extreme temperature may aid the body by stimulating
phagocytosis and inhibiting the reproduction and growth of pathogens.
The classical signs of inflammation:
• Pain (dolor),
• Heat (calor),
• Redness (rubor),
• Swelling (tumor), and
• Loss of function (functio laesa).
Mediators responsible for inflammation:
• Phospholipase A2 (PLA2)
• Lipoxygenase (LOX)
• Cyclooxygenase (COX)
Figure 4. Inflammatory process
Current drugs against inflammation:
• NSAIDs (nonsteroidal anti-inflammatory drugs)
o Aspirin
o Indomethacin
o Ibuprofen
o Diclofenac
o Piroxicam
• Corticosteroids
o Prednisolone
o Cortisone
o Betamethasone
o Fludrocortisone
WHY NEW DRUGS ARE NEEDED:
Despite decades of research, corticosteroids and NSAIDs remain the main
pharmacological weapons to control inflammation in the clinic. Unfortunately, these drugs have
significant side effects, especially when used chronically. Consequently, there is tremendous
interest in the development of novel, safer, and more effective anti-inflammatory drugs.
Side effects of NSAIDs:
GASTROINTESTINAL:
Gastric irritation, erosions, peptic ulceration, gastric bleeding, esophagitis.
RENAL:
Na+ and water retention, chronic renal failure, interstitial nephritis.
CNS:
Headache, mental confusion, behavioral disturbances, seizure precipitation.
OTHERS:
Asthma exacerbation, nasal polyposis, pruritus, angioedema.
Side effects of corticosteroids:
Cushing's habitus
Hyperglycemia
Muscular weakness
Susceptibility to infection
Delayed healing
Peptic ulceration
Osteoporosis
Glaucoma
Fetal abnormalities
Mental confusion
Phospholipase A2 (PLA2):
The secretory PLA2 (sPLA2) family, in which 10 isozymes have been identified,
consists of low molecular weight, Ca2+-requiring secretory enzymes that have been implicated in
a number of biological processes, such as modification of eicosanoid generation, inflammation,
and host defense.
This enzyme has been proposed to hydrolyze phosphatidylcholine (PC) in lipoproteins to liberate
lyso-PC and free fatty acids in the arterial wall, thereby facilitating the accumulation of bioactive
lipids and modified lipoproteins in atherosclerotic foci.
Studies in mice show that sPLA2 expression significantly influences HDL particle size and
composition, and demonstrate that induction of sPLA2 is required for the decrease in plasma HDL
cholesterol in response to inflammatory stimuli. In rats, instillation of bacteria into the bronchi
was associated with surfactant degradation and a decrease in the large:small ratio of surfactant
aggregates.
sPLA2-IIA can exert beneficial action in the context of infectious diseases since recent
studies have shown that this enzyme exhibits potent bactericidal effects. Induction of the synthesis
of sPLA2-IIA is generally initiated by endotoxin and a limited number of cytokines via paracrine
and/or autocrine processes.
Figure 5. Biosynthesis of arachidonic acid
Role of phospholipase A2:
Phospholipase A2 (PLA2) catalyzes the hydrolysis of the sn-2 position of membrane
glycerophospholipids to liberate arachidonic acid (AA), a precursor of eicosanoids including
prostaglandins and leukotrienes. The same reaction also produces lysophospholipids, which
represent another class of lipid mediators.
Figure 6. Role of phospholipase A2

Mechanism
Close-up rendering of the PLA2 active site with a phosphate enzyme inhibitor: the calcium
ion (pink) coordinates with the phosphate (light blue). The phosphate mimics the tetrahedral
intermediate, blocking substrate access to the active site. His-48, Asp-99 and two water molecules
are also shown.
The suggested catalytic mechanism of pancreatic sPLA2 is initiated by a His-48/Asp-99/calcium
complex within the active site. The calcium ion polarizes the sn-2 carbonyl oxygen while also
coordinating with a catalytic water molecule, w5. His-48 improves the nucleophilicity of the
catalytic water via a bridging second water molecule, w6. It has been suggested that two water
molecules are necessary to traverse the distance between the catalytic histidine and the ester. The
basicity of His-48 is thought to be enhanced through hydrogen bonding with Asp-99. An
asparagine substitution for His-48 maintains wild-type activity, as the amide functional group on
asparagine can also function to lower the pKa, or acid dissociation constant, of the bridging water
molecule. The rate-limiting step is characterized as the degradation of the tetrahedral intermediate
composed of a calcium-coordinated oxyanion. The role of calcium can also be duplicated by other
relatively small cations like cobalt and nickel.
Figure 7. Mechanism of PLA2
PLA2 can also be characterized as having a channel featuring a hydrophobic wall
in which hydrophobic amino acid residues such as Phe, Leu, and Tyr serve to bind the substrate.
Another component of PLA2 is the seven disulfide bridges, which are influential in regulation and
stable protein folding.
Why phospholipase A2 inhibitors are needed:
Activation of PLA2 leads to the release of fatty acids and lysophospholipids, which
are then converted to mediators of inflammation and allergy, such as prostaglandins, leukotrienes
and platelet-activating factor. Blockade of PLA2 can therefore result in the suppression
of three important classes of lipid mediators and offers an attractive therapeutic approach to
inflammation.
Inhibition of phospholipase A2:
An sPLA2 inhibitor can be a therapeutically useful drug in the treatment of
1. septic shock,
2. acute respiratory distress syndrome,
3. pancreatitis,
4. trauma,
5. bronchial asthma,
6. allergic rhinitis,
7. rheumatoid arthritis,
8. gout, and
9. other diseases.
REPORT HIGHLIGHTS
1. The market for anti-inflammatory drugs to treat the diseases covered in this report was
approximately $21.9 billion in 2005 and is projected to increase to $35.5 billion in 2010.
2. The fastest growing disease category for anti-inflammatory treatment is psoriasis, which
saw the first introductions of expensive monoclonal antibody products in the last two
years.
3. The largest market by far in 2005 is that for the treatment of asthma and chronic
obstructive pulmonary disease, which accounted for approximately 36% of the total market
in 2005. The asthma/COPD market will remain the largest in 2010, but will decline to
31.4% of the total market by the end of the forecast period.
Figure 8. Report ID: PHM048A, Published: March 2006, Analyst: Lynn Gray
Pipeline drugs against PLA2:
Target protein
PDB id : 1DB4
Name : Hydrolase/hydrolase inhibitor
Title : Human s-PLA2 in complex with indole 8
Structure : Phospholipase A2. Chain: A. Synonym: hnp-sPLA2
Source : Homo sapiens
Biological unit : Dimer
Enzyme class : E.C.3.1.1.4
Reaction : Phosphatidylcholine + H2O = 1-acylglycerophosphocholine + a carboxylate
Cofactor : Calcium
Resolution : 2.00 Å
R-factor : 0.226
R-free : 0.256
Amino acid length : 124 AA
Authors : N.Y.Chirgadze, R.W.Schevitz and J.-P.Wery
Figure 9. Crystal structure of secretory phospholipase A2 (1DB4)
3.3 Software
The present studies were carried out using the following tools:
• Accelrys software
Discovery Studio
Discovery Studio is a complete modeling and simulation environment for life science
researchers. It is a single, easy-to-use graphical interface for powerful drug design
and protein modeling research. Discovery Studio 2.1 combines established gold-standard
applications such as Catalyst, Modeler and CHARMm, which have years of proven results, with
cutting-edge science to address the drug discovery challenges of today. Discovery Studio
2.1 is built on the Pipeline Pilot open operating platform to seamlessly integrate protein modeling,
pharmacophore analysis, virtual screening and third-party applications. It offers:
o Interactive, visual and integrated software
o A consistent, contemporary user interface for added ease of use
o Tools for visualization, protein modeling, simulation, docking, pharmacophore
analysis, QSAR and library design
o Access to computational servers and tools, so that users can share data, monitor jobs, and
prepare and communicate their project progress
Figure 10. Features of Discovery Studio
4. Materials and methods
In the last few years the role of computational methods in both pharmaceutical
and academic research has grown dramatically. The emphasis placed on high-throughput
methods in the pharmaceutical industry has increased the number of compounds in the
discovery pipeline. Characterizing the position and orientation of small molecules bound to a
protein surface can be an important step in drug design. Computational methods have developed
rapidly as groups seek high-throughput, low-cost approaches to accelerate the drug discovery
process. Such approaches will be necessary as scientists attempt to characterize the large number
of drug candidates currently being generated. Structural information on biological macromolecules
and their interactions with ligands is increasingly being used in modern medicinal chemistry.
There is a pressing need for novel computational methods that can evaluate the structural
information about ligand-receptor complexes in a more quantitative way, both to improve existing
leads and to design de novo compounds with accurately predicted binding affinities. The methods
used in this study are divided into two parts.
4.1 Analogue based drug designing
4.1.1 Quantitative structure activity relationships (Qsar)
4.1.2 Common feature pharmacophore (hip hop)
4.1.3 3D Qsar pharmacophore (hypogen)
4.2 structure based drug designing
4.2.1 Structure based pharmacophore generation
4.2.2 Docking studies
4.2.2a Ligand Fit
4.2.2b C –Docker
4.2.2c Lib Dock
4.2.2d Ludi
Preparation of molecular system:
Macromolecule (protein 1DB4) preparation:
Load the protein and apply the force field
For these QSAR, pharmacophore and docking studies, the protein 1DB4 was loaded from the
RCSB Protein Data Bank (www.rcsb.org/pdb/) and a force field was applied. A force field is
the functional form and the parameter sets used to calculate the potential energy of a system. Its
parameters are obtained from experimental work and quantum mechanical calculations. In a
molecular mechanics description, every molecule is made up of a number of components.
Covalently bonded atoms contribute terms that depend on parameters such as bond lengths,
bond angles and dihedral angles; non-bonded atoms contribute van der Waals and electrostatic
interactions. The total potential energy of the system is therefore calculated as
$$E_{\mathrm{total}} = E_{\mathrm{bond}} + E_{\mathrm{angle}} + E_{\mathrm{torsion}} + E_{\mathrm{van\,der\,Waals}} + E_{\mathrm{electrostatic}}$$
This summation, when written in explicit form, represents the force field and is used to evaluate
the potential energy of the system.
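The energy summation above can be illustrated with a minimal Python sketch. It assumes simple textbook functional forms (harmonic bonds and angles, a cosine torsion, a Lennard-Jones 12-6 term and Coulomb's law). All function names and parameter values are hypothetical illustrations and are not the CHARMm parameters applied by Discovery Studio.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); illustrative value.
COULOMB_CONSTANT = 332.06


def bond_energy(r, r0, k):
    """Harmonic bond stretch: k * (r - r0)^2."""
    return k * (r - r0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic angle bend (angles in radians): k * (theta - theta0)^2."""
    return k * (theta - theta0) ** 2


def torsion_energy(phi, k, n, delta):
    """Cosine torsion term: k * (1 + cos(n * phi - delta))."""
    return k * (1.0 + math.cos(n * phi - delta))


def vdw_energy(r, epsilon, r_min):
    """Lennard-Jones 12-6 term with well depth epsilon at distance r_min."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 ** 2 - 2.0 * ratio6)


def electrostatic_energy(r, q1, q2):
    """Coulomb interaction between two point charges at distance r."""
    return COULOMB_CONSTANT * q1 * q2 / r


def total_energy(terms):
    """Sum the individual contributions, as in the expression for E_total."""
    return sum(terms.values())


# Hypothetical single-term contributions for a toy system.
terms = {
    "bond": bond_energy(r=1.54, r0=1.53, k=300.0),
    "angle": angle_energy(theta=math.radians(112.0), theta0=math.radians(109.5), k=50.0),
    "torsion": torsion_energy(phi=math.radians(60.0), k=0.2, n=3, delta=0.0),
    "vdw": vdw_energy(r=4.0, epsilon=0.1, r_min=3.8),
    "electrostatic": electrostatic_energy(r=4.0, q1=0.3, q2=-0.3),
}
print("E_total =", total_energy(terms))
```

A real force field sums such terms over every bond, angle, dihedral and non-bonded pair in the system; the sketch evaluates one term of each type only to show how the contributions combine.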
Minimization:
The minimizer uses an algorithm to identify the geometries of the molecule
that correspond to minimum points on the potential energy surface. Minimization relieves
unfavourable forces present in the molecule and lowers its energy.
Many algorithms are available for minimization. The methods used by the Smart Minimizer
include the steepest descent, conjugate gradient, Newton-Raphson and quasi-Newton methods.
From the DS protocols, select the minimization protocol and run it. The following figure shows
the minimized protein with a fixed constraint. Then save the minimized protein for further studies.
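To make the idea of energy minimization concrete, the following is a minimal sketch of the steepest descent method on a toy two-variable energy surface. The energy function, step size and tolerance are hypothetical choices for illustration; this is not the Smart Minimizer implementation, which operates on the full molecular force field.

```python
def energy(x, y):
    """Toy quadratic energy surface with its minimum at (1.0, -0.5)."""
    return (x - 1.0) ** 2 + 2.0 * (y + 0.5) ** 2


def gradient(x, y):
    """Analytical gradient of the toy energy surface."""
    return 2.0 * (x - 1.0), 4.0 * (y + 0.5)


def steepest_descent(x, y, step=0.1, tol=1e-6, max_iter=10000):
    """Move against the gradient until its norm falls below tol."""
    for iteration in range(max_iter):
        gx, gy = gradient(x, y)
        if (gx * gx + gy * gy) ** 0.5 < tol:
            return x, y, iteration
        x -= step * gx
        y -= step * gy
    return x, y, max_iter


x_min, y_min, n_steps = steepest_descent(x=5.0, y=3.0)
print("minimum near", (x_min, y_min), "after", n_steps, "steps")
print("energy at minimum:", energy(x_min, y_min))
```

Steepest descent is robust far from a minimum but converges slowly close to it, which is one reason minimizers commonly switch to conjugate gradient or Newton-type methods for the final refinement.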
Figure 11. Minimized protein with fixed constraint
Preparation of bioactive molecules:
The 111 bioactive compounds, with activities ranging from 0.005 to >50 µM, were collected
from the following journal articles:
Journal of Medicinal Chemistry 1996, vol 39, pages 3636-3658, "Potent inhibitors of
secretory phospholipase A2: synthesis and inhibitory activities of indolizine and indene
derivatives".
Journal of Medicinal Chemistry 2005, vol 48, pages 893-896, "Carbocyclic[g]indole
inhibitors of human nonpancreatic sPLA2".
Journal of Medicinal Chemistry 2008, vol 51, pages 4708-4714, "Highly specific and
broadly potent inhibitors of mammalian secreted phospholipase A2".
1. One molecule was drawn with the basic scaffold, and the other molecules were constructed
using it as the reference model.
2. The drawn compounds were typed with the CHARMm force field.
3. The typed molecules were subjected to energy minimization using the Smart Minimizer,
which minimizes a series of ligand poses using CHARMm.
4. Each minimized molecule was saved with the .sd and .mol2 extensions for further study.
The following table shows the 2D structures of the molecules and their activities.
4.1 Analogue Based Drug Design
When the 3D structure of the target is poorly characterized or completely unknown, a drug is
designed rationally from knowledge of the structures and activities of known ligands; this is
referred to as Analogue Based Drug Design. The features of the binding site must then be
inferred from the known structures of the ligands.
4.1.1 QSAR:
Quantitative structure activity relationship (QSAR) studies rest on the fact
that structures can easily be compared, overlaid and displayed. A QSAR is obtained by fitting
parameters that describe a series of bioactive molecules so that the series can be optimized.
A QSAR based on physicochemical properties describes a drug's structural, electronic and
physicochemical characteristics. Data sets are produced using all available descriptors.
Knowledge of the three-dimensional (3D) structure of the target
(receptor/enzyme/DNA) is applied to rationally design drug molecules that bind to the target,
for the following reasons:
1. Understand atomic details of drug binding strength and specificity (drug-receptor interactions).
2. Develop novel drugs (unique chemical structures) for a selected target via de novo drug design
or database searching techniques.
3. Optimize the therapeutic index of an already available drug or lead compound by establishing
the structural requirements for activity from a minimum number of tested compounds.
Figure 12. Concept of QSAR
A QSAR equation numerically relates biological activity to physicochemical properties.
Biological activity is defined as a pharmacological response, usually expressed as a dose or
concentration such as the effective dose in 50% of the subjects (ED50), the lethal dose in 50%
of the subjects (LD50) or the half-maximal inhibitory concentration (IC50). It is common to
express the biological activity as a reciprocal. The QSAR equation is similar to the equation for a
straight line:
$$y = mx + c$$
$$\log(1/\mathrm{BA}) = a\,(\text{physicochemical property}) + c$$
a = regression coefficient, i.e. the slope of the straight line.
c = intercept on the y-axis (when the physicochemical property equals zero).
Biological activity is expressed as a reciprocal to produce a positive slope, because of the
inverse relationship between the dose or concentration required and biological potency.
There is then a positive relationship between the reciprocal of the biological activity (1/BA) and
the physicochemical property, because 1/BA increases as potency increases. Since these studies
are based on the relationship between descriptors and biological activity, the error in the
biological activity data must be minimal, and the choice of descriptors must be accurate and
appropriate.
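The straight-line QSAR equation above can be fitted by ordinary least squares. The sketch below is a minimal illustration with a single descriptor: it converts IC50 values to the reciprocal logarithmic scale (pIC50, assumed here as the usual form of log(1/BA)), fits the slope a and intercept c, and reports r². The descriptor values and IC50 values are made up for illustration and are not data from this study.

```python
import math


def pic50_from_micromolar(ic50_um):
    """Convert an IC50 in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_um * 1e-6)


def fit_line(x, y):
    """Least-squares slope a and intercept c for y = a * x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    c = mean_y - a * mean_x
    return a, c


def r_squared(x, y, a, c):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (a * xi + c)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical descriptor values (e.g. a hydrophobicity term) and IC50s in µM.
descriptor = [1.0, 2.0, 3.0, 4.0]
ic50_um = [10.0, 1.0, 0.1, 0.01]

activity = [pic50_from_micromolar(v) for v in ic50_um]
a, c = fit_line(descriptor, activity)
print("slope a =", a, "intercept c =", c)
print("r^2 =", r_squared(descriptor, activity, a, c))
```

With several descriptors the same idea generalizes to multiple linear regression (MLR), where one coefficient is fitted per descriptor.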
OBJECTIVE OF QSAR
1. Drug transport/mechanism
2. Prediction of activity
3. Classification of molecules as highly active, moderately active and inactive
4. Optimization of activity through steric, electrostatic and hydrophobic properties
5. Refinement of synthetic targets
6. Reduction and replacement of animal testing of drug action
BASIC REQUIREMENTS IN QSAR STUDIES
1. All analogues belong to a congeneric series.
2. All analogues exert the same mechanism of action.
3. All analogues bind in a comparable manner.
4. The effect of isosteric replacement can be predicted.
5. Binding affinity is correlated with interaction energies.
6. Biological activities are correlated with binding affinity.
QSAR STUDIES INVOLVE THE FOLLOWING STEPS
A. CSD database.
C. Choice of descriptors.
D. Statistical methods to derive the QSAR equation.
E. Validation.
A. CSD DATABASE
Experimental information about the structures of molecules can be extremely useful
for forming theories of conformational analysis and for predicting the structures of molecules
for which no experimental information is available. The most important technique currently
available for determining the three-dimensional structure of molecules is X-ray crystallography.
The crystallographic community has distributed its results in electronic form; two particularly
important databases for molecular modellers are the Cambridge Structural Database (CSD), which
contains crystal structures of organic and organometallic molecules, and the Protein Data Bank
(PDB), which contains structures of proteins and some DNA fragments.
A database is of little use without software tools to search, extract and manipulate the data. A
simple use of a database is to extract information about a particular molecule or group of
molecules. The data may also be identified by creating a two-dimensional representation of a
molecule and using a substructure search program to search the database. Crystallographic
databases have also been used to develop an understanding of the factors that influence the
conformations of molecules and of the ways in which molecules interact with each other. For
example, the CSD has been comprehensively analyzed to characterize how the lengths of chemical
bonds depend upon the atomic numbers, hybridization and environment of the atoms
involved. Analyses of intermolecular hydrogen bonding have revealed distinct distance and
angular preferences. A major use of the CSD is substructure searching for molecules that contain
a particular fragment, in order to investigate the conformations that the fragment adopts.
A crystallographic database can only provide information about the crystalline state of
matter, and the possible influence of crystal packing forces should always be taken into
account. This is less of a concern for proteins than for small molecules, as protein crystals contain
a large amount of water, and NMR studies have established that proteins have approximately
the same structure in solution as in the crystal.
A second, more subtle, bias is that crystallographic databases only contain
molecules that can be crystallized, and indeed only those molecules whose X-ray structures were
considered interesting enough to be published. The structures in a crystallographic database may
therefore not be a wholly representative set.
C. MOLECULAR DESCRIPTORS
The study of the steric requirements for interaction between ligands and their corresponding
biological acceptor sites is often of decisive importance in understanding the role played by
structural features in promoting activity. In its most general form, drug-receptor theory requires
that a ligand exerts its biological action as a consequence of binding to, or otherwise interacting
with, a specific biological acceptor site such as a membrane protein or an enzyme, which may be
generally termed the receptor. This concept, the basis of modern drug-receptor theory, involves
the old principle that a ligand fits its receptor much as a key fits a lock. Although
somewhat arbitrary, since a high degree of flexibility is present in biomacromolecular structures,
it governs the principles of molecular recognition and molecular discrimination. Although
stereochemistry often plays a major role in drug bioactivity, care must be taken when considering
structure activity relationships to explore whether other differences in physicochemical properties
exist before one makes significant correlations with the steric properties of the structures under
study.
In early studies organic chemists defined a number of steric parameters in order to
explain the steric effects of substituents on the reaction centers of organic molecules. The same
type of steric effects observed in studies of the variation of physical properties and chemical
reactivity with structure may be assumed to be involved in biological activity, which, at least as a
first approximation, may be treated in a similar fashion. In the past 35 years, owing to the
development of drug design and the Hansch approach, many other parameters and methods have
been developed which have the merit of trying to avoid a simple empirical correlation with given
ligand properties and of trying to propose the possible geometric features of the receptor.
Steric descriptors are classified into the following groups:
1. Topological indices, based on the characterization of chemical structures by graph theory.
2. Geometric descriptors, resulting from the view of organic molecules as three-dimensional
objects from which standard dimensions can be calculated.
3. Chemical descriptors, derived from steric influence upon a standard reaction.
4. Physical descriptors, derived when an organic molecule is considered as a three-dimensional
object with size-determined physical properties, and difference descriptors, which result when an
organic molecule is compared as a three-dimensional object with a reference structure.
I. FRAGMENT CONSTANT DESCRIPTORS
Fragment constant descriptors are constants that relate the effect of substituents on a reaction
center from one type of process to another. The basic idea is that similar changes in structure are likely to
produce similar changes in reactivity, ionization or binding. There are different constants
corresponding to different effects. These are typically used to parameterize the Hammett equation
for some series of analogs:
log k_X = ρσ + log k_H
where k_X and k_H are the reaction rate constants for the substituents X and H, respectively; σ is an
electronic constant determined from an ionization constant; and ρ is fit to the set of rate data. Constants
for different properties (electronic, steric, etc.) at different R group positions are used. In this way, measurements of
ionization constants can be used to predict rate constants once a scaling factor (ρ) is determined
for the rate constant. The default database currently contains the following types of
constants. These come from Table VI-I of Hansch, except for the Sterimol constants, which are
calculated.
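The Hammett relation above can be illustrated with a minimal Python sketch. The function names and the numbers in the usage note are hypothetical and are not taken from the QSAR software; the sketch only assumes the linear form log k_X = ρσ + log k_H, with ρ estimated by a least-squares fit through the origin.

```python
def hammett_log_k(sigma, rho, log_k_h):
    """Predict log k_X for a substituent from log k_X = rho * sigma + log k_H."""
    return rho * sigma + log_k_h


def fit_rho(sigmas, log_k_ratios):
    """Estimate the scaling factor rho by least squares through the origin.

    sigmas:        substituent constants (sigma) for a series of analogs
    log_k_ratios:  measured log(k_X / k_H) for the same analogs
    """
    numerator = sum(s * y for s, y in zip(sigmas, log_k_ratios))
    denominator = sum(s * s for s in sigmas)
    return numerator / denominator
```

For example, fitting `fit_rho([0.1, 0.2, 0.4], [0.2, 0.4, 0.8])` to made-up data gives a scaling factor close to 2, which can then be passed to `hammett_log_k` to predict the rate constant of an untested substituent.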
Sm, Sp
Electronic effect: sigma meta and sigma para, respectively. Positive values correspond to
electron withdrawal, negative ones to electron release. Sigma is generally not appropriate for
ortho substituents because of steric interaction with the reaction center.
F and R
Decomposition of the sigma para constant into an inductive (polar) part F and a resonance part R for
the case when the substituent is conjugated with the reaction center, producing through-resonance
effects.
Pi
Hydrophobic character. Pi for substituent X is given by the difference of its log P from the
log P for hydrogen.
HA hydrogen bond acceptor
HD hydrogen bond donor
MR molar refractivity, given by
MR = ((n^2 - 1)/(n^2 + 2)) * (MW/d)
where n is the refractive index, MW is the molecular weight and d is the compound density.
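As a small worked illustration of the molar refractivity definition, the sketch below assumes the standard Lorentz-Lorenz form MR = ((n^2 - 1)/(n^2 + 2)) * (MW/d). The function name is hypothetical, and the result has the units of MW/d (cm^3/mol when MW is in g/mol and d in g/cm^3).

```python
def molar_refractivity(n, mw, d):
    """Lorentz-Lorenz molar refractivity.

    n:  refractive index (dimensionless)
    mw: molecular weight (g/mol)
    d:  density (g/cm^3)
    """
    n2 = n * n
    return (n2 - 1.0) / (n2 + 2.0) * (mw / d)
```

With approximate literature values for benzene (n about 1.50, MW about 78.1 g/mol, d about 0.877 g/cm^3, quoted here only as an illustration), the function returns a value of roughly 26 cm^3/mol.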
Sterimol-L
Steric length parameter, measured along the substitution point bond axis.
Sterimol-B1 through B4
Steric distances perpendicular to the bond axis. These define a bounding box for the substituent and
are numbered in ascending size.
Sterimol-B5
The overall maximum steric distance perpendicular to the bond axis.
II. CONFORMATIONAL DESCRIPTORS

Energy: the energy of the currently selected conformation in the study table.
Low energy: the energy of the most stable conformation in the set of conformations
belonging to each molecular model.
E penalty: the difference between Energy and Low energy.
III. ELECTRONIC DESCRIPTORS

The following table lists the electronic descriptors available in QSAR:
Table 3: Electronic descriptors
SYMBOL      DESCRIPTION
Charge      sum of partial charges
F charge    sum of formal charges
Apol        sum of atomic polarizabilities
Dipole      dipole moment
HOMO        highest occupied molecular orbital
LUMO        lowest unoccupied molecular orbital
SR          superdelocalizability
IV. MOLECULAR SHAPE ANALYSIS (MSA) DESCRIPTORS
The following table lists the MSA descriptors available in QSAR:
Table 4: Molecular shape analysis descriptors
SYMBOL      DESCRIPTION
DIFF        difference volume
Fo          common overlap volume (ratio)
NCOSV       non-common overlap steric volume
Shape RMS   RMS to shape reference
COSV        common overlap steric volume
SR vol      volume of shape reference compound
V. STRUCTURAL DESCRIPTORS
The following table lists the structural descriptors available in QSAR:
Table 5: Structural descriptors
SYMBOL             DESCRIPTION
Mw                 molecular weight
Rot bonds          number of rotatable bonds
H bond acceptors   number of hydrogen bond acceptors
H bond donors      number of hydrogen bond donors
VI. THERMODYNAMIC DESCRIPTORS

The following table lists the thermodynamic descriptors available in QSAR:
Table 6: Thermodynamic descriptors
SYMBOL      DESCRIPTION
AlogP       log of the octanol/water partition coefficient
FH2O        desolvation free energy for water
Foct        desolvation free energy for octanol
HF          heat of formation
Molref      molar refractivity
VII. RECEPTOR DESCRIPTORS

Quantitative values, such as the interaction energy calculated for a generated
receptor model, are available to use in QSAR. By using receptor data to develop a QSAR model,
you can evaluate the goodness of fit between a candidate's structure and a postulated pseudo
receptor. When you have generated a receptor model and have aligned the models you want to
study, you can proceed to build a QSAR using data from the receptor-structure interactions.
The following table lists the receptor descriptors available to QSAR:
Table 7: Receptor descriptors
SYMBOL              DESCRIPTION
Intra energy        molecular internal energy inside receptor
Inter Elec energy   non-bonded electrostatic energy between molecule and receptor
InterVDW energy     non-bonded van der Waals energy between molecule and receptor
Inter energy        total non-bonded energy between molecule and receptor
Min intra energy    molecular internal energy minimized without receptor
Strain energy       molecular strain energy within receptor
VIII. MOLECULAR FIELD ANALYSIS (MFA) DESCRIPTORS:

Molecular field analysis (MFA) evaluates the interaction energy between a probe and a molecular
model at a series of points defined by a rectangular or spherical grid. This method quantifies the
interaction energy between a probe molecule and a set of aligned target molecules in QSAR. These
energies may be added to the study table to form new columns headed according to the probe type,
and the new columns may be used as independent X variables in the generation of a QSAR. Six
descriptors are available in this family.
1. H+ probe: This selects a proton as the probe, having a +1 charge and zero van der Waals radius.
It has only electrostatic interactions; non-bonded (van der Waals) interactions are not considered.
2. CH3 probe: This is a probe with the van der Waals radius of a united CH3 group but with zero
charge. The energy of interaction of this probe with a study molecule will include only
non-bonded interactions.
3. Donor/acceptor probe: This is a two-atom probe consisting of an oxygen bonded to a hydrogen.
The van der Waals radii of the atoms are exactly as they are defined in the particular force
field loaded. The probe is neutral. Depending on its orientation, this probe is capable
of behaving as a hydrogen bond donor or an acceptor.
4. CH3- probe: This is a single-atom probe with the van der Waals radius of a united CH3 group
and a charge of -1. The energy of interaction of this probe includes both non-bonded and
electrostatic interactions.
5. Generic probe: This is a generic single-atom probe with a user-specified van der Waals radius
and charge.
6. Other probes: Any multi-atom model may be employed as a probe by specifying it in the MSI file
format.
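The idea of evaluating a probe on a grid can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the force field used by the QSAR software: the energy is taken as a Coulomb term plus a 12-6 Lennard-Jones term, the conversion constant is approximate, and the atom dictionary keys (`xyz`, `charge`, `radius`, `well_depth`) and all function names are hypothetical.

```python
import itertools
import math

# Approximate conversion constant, kcal*Angstrom/(mol*e^2) (assumption).
COULOMB_CONSTANT = 332.06


def rectangular_grid(origin, counts, spacing):
    """Yield the (x, y, z) points of a regular rectangular grid."""
    for i, j, k in itertools.product(range(counts[0]), range(counts[1]), range(counts[2])):
        yield (origin[0] + i * spacing, origin[1] + j * spacing, origin[2] + k * spacing)


def probe_energy(point, atoms, probe_charge, probe_radius, probe_well_depth):
    """Interaction energy of a single-atom probe placed at `point`.

    A probe with zero radius (such as H+) contributes only the electrostatic term;
    a probe with zero charge (such as neutral CH3) contributes only the van der Waals term.
    """
    energy = 0.0
    for atom in atoms:
        # Clamp very short distances so the energy stays finite near atom centres.
        r = max(math.dist(point, atom["xyz"]), 0.5)
        energy += COULOMB_CONSTANT * probe_charge * atom["charge"] / r
        if probe_radius > 0.0:
            r_min = probe_radius + atom["radius"]
            depth = math.sqrt(probe_well_depth * atom["well_depth"])
            energy += depth * ((r_min / r) ** 12 - 2.0 * (r_min / r) ** 6)
    return energy


def field_column(grid, atoms, probe_charge, probe_radius, probe_well_depth):
    """Return one energy per grid point: the values of one new study-table column."""
    return [probe_energy(p, atoms, probe_charge, probe_radius, probe_well_depth) for p in grid]
```

Calling `field_column` once per probe type for every aligned molecule would give the additional columns that serve as independent variables in the QSAR.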
D. STATISTICAL METHOD TO EVALUATE QSAR EQUATION

QSAR analysis uses statistical methods for studying the correlation of biological activity with
structural and physicochemical properties of candidate molecules. Different statistical
techniques from multivariate statistics are used to fit the data, including the following:
1. PCA (PRINCIPAL COMPONENT ANALYSIS): It aims at representing large amounts of
multidimensional data by transforming them into a more intuitive low-dimensional representation.
This method does not create a model, but searches for relationships among the independent
variables. It then creates new variables (the principal components) which represent most of the
information contained in the independent variables.
2. CLUSTER ANALYSIS: The goal of cluster analysis is to partition a data set (typically representing a
set of models in a molecular descriptor property space) into classes or categories consisting of
elements of comparable similarity. The algorithm assumes that models are represented by points
in a multidimensional property space, with the Euclidean distance between points representing model
dissimilarity. The types in this category are:
1. Jarvis-Patrick clustering
2. Variable-length Jarvis-Patrick clustering
3. Relocation clustering
4. Hierarchical clustering analysis (HCA)
3. SIMPLE LINEAR REGRESSION: It performs a standard linear regression calculation to
generate a set of QSAR equations that includes one equation for each independent variable. It is
good for exploring simple relations between structure and activity.
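Simple linear regression, as described above, fits one equation per independent variable. The following minimal sketch shows the idea with ordinary least squares in plain Python; the function names and the layout of the descriptor table (a dictionary of columns) are assumptions for illustration only.

```python
def simple_linear_regression(x, y):
    """Fit y = slope * x + intercept by least squares; return (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def one_equation_per_descriptor(descriptor_columns, activity):
    """Return {descriptor name: (slope, intercept, r2)}, one QSAR equation per descriptor."""
    return {name: simple_linear_regression(column, activity)
            for name, column in descriptor_columns.items()}
```

Sorting the resulting dictionary by r2 gives a quick view of which single descriptors relate most simply to the activity.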
4. LINEAR (MULTIPLE LINEAR REGRESSION): This method calculates a QSAR equation
by performing standard multivariable regression calculations using multiple variables in a single
equation. In this method, the variables should be independent (not correlated).
5. STEPWISE MULTIPLE LINEAR REGRESSION: It calculates QSAR equations by adding
one variable at a time and testing each addition for significance; only significant variables are used in the
QSAR equation. It is useful when the number of variables is large and when the key descriptors
are not known. If the number of variables exceeds the number of structures, this method should not be
used.
6. PLS (PARTIAL LEAST SQUARES): This method carries out regression using latent
variables derived from the independent and dependent data that lie along their axes of greatest variation
and are most highly correlated. It can be used with more than one dependent variable. It is
typically applied when the independent variables are correlated or the number of independent
variables exceeds the number of observations (rows).
7. GFA (GENETIC FUNCTION APPROXIMATION): It is an alternative to standard
regression analysis for constructing QSAR equations. The method provides multiple models that
are created by evolving random initial models using a genetic algorithm. GFA can build linear and
higher-order polynomials, splines and other non-linear equations. In this method, models are
collected that have a randomly chosen proper subset of the independent variables, and then the
collected models are evolved. A generation is the set of models resulting from performing multiple
linear regression on each model; a selection of the best ones becomes the next generation. Crossover
operations are performed on these equations.
8. G/PLS (GENETIC PARTIAL LEAST SQUARES): It is a method derived from GFA and
PLS, which are valuable analytical tools for datasets that have more descriptors than samples. The
following statistical methods are useful in combinatorial chemistry and the analog builder.
10. FA (FACTOR ANALYSIS): It addresses one of the main problems found in PCA, namely that it is not
simple to relate the principal components to molecular properties. All the common factors have a
close relationship to real molecular properties.
11. RP (RECURSIVE PARTITIONING): It identifies the internal representation of classes used by
classification structure-activity relationships (CSAR) for deriving recursive partitioning models.
E. VALIDATION METHODS
Once a regression equation is obtained, it is important to determine its reliability and its
significance. Internal validation uses the data set from which the model is derived and checks for
internal consistency. The procedure leaves out one or more compounds, derives a new model, and uses
it to predict the activities of the molecules that were not included in the new model set. This is
repeated until all compounds have been deleted and predicted once. Internal validation is less
rigorous than external validation.
External validation evaluates how well the equation generalizes. The original data are divided
into two groups, the training set and the test set. The training set is used to derive a model, and the
model is used to predict the activities of the test set members. The following procedures are used to
check that the size of the model is appropriate for the quantity of data available, and to
provide some estimate of how well the model can predict activity for new molecules:
1. CROSS VALIDATION: This process repeats the regression many times on subsets of the data.
Usually each molecule is left out in turn, and r2 is computed using the predicted values of the
missing molecules (cross-validated r2).
2. RANDOMIZATION TEST: Even with a large number of observations and a small number of
terms, an equation can still have very poor predictive power. This can come about if the
observations are not sufficiently independent of each other.
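A common form of the randomization test (often called y-scrambling) can be sketched as follows. This is an assumed, simplified version for a single descriptor: the activities are shuffled at random, the model is refitted, and the fraction of scrambled trials whose r2 reaches the r2 of the real data is reported. The function names, the number of trials and the fixed random seed are illustrative choices, not settings of the QSAR software.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between x and y (r2 of a one-variable linear fit)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def randomization_test(x, y, trials=99, seed=0):
    """Return (observed r2, fraction of scrambled trials with r2 >= observed r2)."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    scrambled = list(y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(scrambled)  # break the link between structure and activity
        if r_squared(x, scrambled) >= observed:
            hits += 1
    return observed, hits / trials
```

A small fraction suggests that the observed correlation is unlikely to arise by chance; a large fraction indicates that the equation should not be trusted.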
F. INTERPRETING QSAR EQUATION
QSAR is used for predicting the activities of as yet untested (and possibly not yet synthesized)
molecules. The predictive ability of a QSAR is generally more accurate for interpolative predictions (for
compounds that have parameters within the range of those considered in the data set) than for
extrapolative predictions (for compounds that are outside the range).
A QSAR equation also provides insights into the mechanism of the process being studied.
1. SQUARE OF CORRELATION COEFFICIENT (R2): If x (independent) and y (dependent)
variables are highly correlated, there is considerable information in x and y that is redundant. The
degree of correlation is measured by the squared correlation coefficient (r2).
2. CROSS VALIDATED R2 (TERMED AS Q2 OR XVR2): r2 can be computed using cross
validation methods (XVr2) or bootstrap methods (BSr2). It is also the fraction of the variance
explained by the model. Cross validated r2 is always somewhat lower and often much lower than
the r2.
3. PRESS (PREDICTIVE ERROR SUM OF SQUARES): The sum over all compounds of the
squared differences between the actual and the predicted values of the dependent variable,
PRESS = sum_i (y_i(actual) - y_i(predicted))^2.
The intensity of the cross validation process is controlled by selecting the number of groups or the
number of times the cross validation step is to be carried out while predicting all rows (at each
stage of model development).
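The relation between r2, PRESS and cross-validated r2 can be made concrete with a short leave-one-out sketch. It assumes a one-descriptor linear model and the usual definition q2 = 1 - PRESS / SS_total; the function names are hypothetical, and the inputs are expected to be Python lists.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_press_q2(x, y):
    """Return (r2, PRESS, q2) using leave-one-out cross validation."""
    n = len(y)
    mean_y = sum(y) / n
    ss_tot = sum((b - mean_y) ** 2 for b in y)

    # Conventional r2 from the model fitted to all compounds.
    slope, intercept = fit_line(x, y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))

    # Leave each compound out in turn, refit, and predict the omitted compound.
    press = 0.0
    for i in range(n):
        s, c = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (s * x[i] + c)) ** 2

    return 1.0 - ss_res / ss_tot, press, 1.0 - press / ss_tot
```

Because each left-out compound is predicted by a model that never saw it, PRESS is at least as large as the residual sum of squares, which is why the cross-validated value does not exceed r2 for this kind of model.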
PROCEDURE:
Figure 13: Flowchart of QSAR procedure
Calculate molecular properties:
The Calculate Molecular Properties protocol will calculate many properties or perform basic
statistical and correlation analysis of the numeric properties as requested.
To set up a Calculate Molecular Properties protocol:

1. Apply the force field to the molecules and load the QSAR | Calculate Molecular
Properties protocol from the Protocols Explorer. The parameters display in the Parameters
Explorer.
2. On the Parameters Explorer, click in the cell for the Input Ligands parameter and click
the button to specify the ligand source on the Specify Ligands dialog. On the dialog, select
all ligands from a Table Browser, a 3D Window, or a file.
3. Select the properties to calculate by clicking the button in a cell for the Molecular
Properties, Semi empirical QM descriptors, or Density Functional QM descriptors, and
follow the instructions in the popup dialog window.
The Create Genetic Function Approximation Model protocol can build a genetic function approximation
model for a dependent property using the selected molecular descriptors.
To set up a Create Genetic Function Approximation Model protocol:
1. Load the QSAR | Create Genetic Function Approximation Model protocol from the
Protocols Explorer. The parameters display in the Parameters Explorer.
2. On the Parameters Explorer, click in the cell for the Input Ligands parameter and click
the button to specify the ligand source on the Specify Ligands dialog. On the dialog, select
all ligands from a Table Browser, a 3D Window, or a file.
3. Set the desired model name using the Model Name parameter. Once created, this model
will appear under the Other category of the Molecular Properties parameter in the Calculate
Molecular Properties protocol and can be used to compute the property for future ligands.
4. Set the initial equation length and remaining parameters as desired. Parameters
presented in red are required.
4.1.2. PHARMACOPHORE:
“A pharmacophore is the ensemble of steric and electronic features that is necessary to
ensure the optimal supramolecular interactions with a specific biological target and to trigger (or
block) its biological response.” Perceiving a pharmacophore is the most important first step
towards understanding the interaction between a receptor and a ligand. A pharmacophore was first
defined by Paul Ehrlich in 1909 as "a molecular framework that carries (phoros) the essential features
responsible for a drug's (= pharmacon's) biological activity". Catalyst provides the tools for
selecting potential ligand compounds prior to synthesis. The aim of this software is to reduce the
time and cost of screening, synthesis and biological testing. It accelerates the drug discovery
process by identifying lead candidates faster.
A pharmacophore, or hypothesis, describes the generalized molecular features involved in the
binding of a ligand to the active site of a protein. Molecular features include 1D features, which
represent physical and biological properties; 2D features, which represent substructures; and 3D
features, which represent chemical features such as acceptors, donors, positive and negative
ionizable groups, hydrophobic (aromatic and aliphatic) groups and rings. In Catalyst, each hypothesis
is defined in four parts: chemical features, location and orientation in 3D space,
tolerance, and weight. Weight represents the relative importance of each chemical function in
conferring activity.
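The four parts of a hypothesis feature can be pictured as a small data structure. The sketch below is an illustration only and does not reproduce Catalyst's internal representation: the class and function names are hypothetical, orientation (for vector features) is omitted, and the tolerance is treated simply as the radius of a sphere around the feature location.

```python
import math
from dataclasses import dataclass


@dataclass
class HypothesisFeature:
    kind: str            # chemical feature type, e.g. "HB ACCEPTOR"
    center: tuple        # (x, y, z) location in Angstrom
    tolerance: float     # radius of the tolerance sphere in Angstrom
    weight: float = 1.0  # relative importance of the feature


def feature_maps(feature, ligand_kind, ligand_position):
    """True if a ligand feature of the same kind lies inside the tolerance sphere."""
    same_kind = ligand_kind == feature.kind
    inside = math.dist(feature.center, ligand_position) <= feature.tolerance
    return same_kind and inside


def mapped_features(hypothesis, ligand_features):
    """Return the hypothesis features matched by at least one ligand feature."""
    return [f for f in hypothesis
            if any(feature_maps(f, kind, pos) for kind, pos in ligand_features)]
```

Here `ligand_features` is assumed to be a list of `(kind, (x, y, z))` pairs for one conformer; comparing the mapped features with the full hypothesis shows which features the conformer misses.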
A pharmacophore model or hypothesis consists of a three-dimensional configuration of
chemical functions surrounded by tolerance spheres. A tolerance sphere defines that area in space
that should be occupied by a specific type of chemical functionality. Pharmacophore models are
routinely used in lead identification and optimization in the areas of library focusing, evaluation
and prioritization of virtual high throughput screening (VHTS) results, de novo design, and
scaffold hopping. Pharmacophore models can be constructed using analog-based (using known
active ligands) or receptor-based techniques (using receptor active site information). In the
absence of crystallographic structure data of a protein for which the active site for receptor binding
is clearly identified, a chemist must rely on the structure activity data for a given set of ligands. If
these ligands are known to bind to the same receptor, then one can attempt to define the
commonality between them. The Accelrys Catalyst program can generate two types of automated
pharmacophore models, HypoGen and HipHop, depending on whether or not activity data are
used. In the presence of protein crystal structure data, active site pharmacophore models can be
used as a pre-filter for docking large libraries. Generation of a pharmacophore model using the
active site residue information is key to the success of any pharmacophore-based docking
algorithm. In the absence of X-ray bound ligand information, it is a challenge to select a single
pharmacophore model that represents the binding characteristics. A methodology is proposed in
this case study that can be used to analyze and visualize multiple pharmacophore models. This
methodology can be applied to different types of Catalyst pharmacophore models (qualitative,
quantitative, receptor-based, etc.) as it only considers feature types and coordinates.
This methodology can be applied successfully to the following applications:
1. VHTS screening
2. multiple binding mode identification
3. classification of proteins based on binding characteristics
4. visualization of pharmacophore model space
To build a better pharmacophore, the following steps were employed:
1. Building a set of molecules
2. Conformer generation
3. Hypothesis Generation
4. Database Search
5. Compare/Fit to estimate Activity
The Feature Dictionary list contains the generalized chemical functions in Catalyst.
Definitions of these functions are:
1. HB ACCEPTOR (vector): Matches these types of atoms or groups of atoms with surface
accessibility:
• sp or sp2 nitrogens that have a lone pair and charge less than or equal to zero
• sp3 oxygens or sulfurs that have a lone pair and charge less than or equal to zero
• non-basic amines that have a lone pair
Does not match: basic primary, secondary, and tertiary amines that are protonated at physiological
pH. There is no exclusion of electron-deficient pyridines and imidazoles.
2. HB ACCEPTOR lipid (vector): Matches these types of atoms or groups of atoms: nitrogens,
oxygens, or sulfurs (except hypervalent) that have a lone pair and charge less than or equal to zero.
This function is the same as HB ACCEPTOR except that it includes basic nitrogen. There is no
exclusion of electron-deficient pyridines and imidazoles.
3. HB DONOR (vector): Matches these types of atoms or groups of atoms:
• Non-acidic hydroxyls
• Thiols
• Acetylenic hydrogens
• NHs (except tetrazoles and trifluoromethyl sulfonamide hydrogens)
Does not match: electron-rich pyridines and imidazoles that would be protonated, or nitrogens that
would be protonated due to their high basicity.
4. HYDROPHOBIC (point): Matches these types of groups of atoms:
A contiguous set of atoms that is not adjacent to any concentrations of charge (charged atoms or
electronegative atoms) in a conformer such that the atoms have surface accessibility, such as
phenyl, cycloalkyl, isopropyl, and methyl.
5. HYDROPHOBIC ALIPHATIC (point): Matches these types of groups of atoms:
A contiguous set of atoms that is not adjacent to any concentrations of charge (charged atoms or
electronegative atoms) in a conformer such that the atoms have surface accessibility, such as cycloalkyl,
isopropyl, and methyl.
6. HYDROPHOBIC AROMATIC (point): Matches these types of groups of atoms:
A contiguous set of atoms that is not adjacent to any concentrations of charge (charged atoms or
electronegative atoms) in a conformer such that the atoms have surface accessibility, such as
phenyl and indole.
7. NEG CHARGE (atom): Matches negative charges not adjacent to a positive charge.
8. NEG IONIZABLE (point): Matches atoms or groups of atoms that are likely to be
deprotonated at physiological pH, such as:
• Trifluoromethyl sulfonamide hydrogens
• Sulfonic acids (centroid of the three oxygens)
• Phosphoric acids (centroid of the three oxygens)
• Sulfinic, carboxylic, or phosphinic acids (centroid of the two oxygens)
• Tetrazoles
• Negative charges not adjacent to a positive charge
9. POS CHARGE (atom): Matches positive charges not adjacent to a negative charge.
10. POS IONIZABLE (point): Matches atoms or groups of atoms that are likely to be protonated
at physiological pH, such as:
• Basic amines
• Basic secondary amidines (iminyl nitrogen)
• Basic primary amidines, except guanidines (centroid of the two nitrogens)
• Basic guanidines (centroid of the three nitrogens)
• Positive charges not adjacent to a negative charge
Does not match: weakly basic aromatic nitrogens such as pyridine and imidazole.
11. RING AROMATIC (vector and plane): Matches 5- and 6-membered aromatic rings. The
feature defines 2 points, the ring centroid and a projected point normal to the ring plane. The
projected point can map both above and below the ring.
STEPS TO BE FOLLOWED IN DS
1. Construct or import the molecules.
2. Perform conformational search
3. Examine each conformer for the presence of chemical features.
4. Determine the set of features that correlate with activity
STEPS AND APPLICATION OF PARAMETERS WHICH ARE USED IN HYPOTHESIS
GENERATION
• Import the molecules into the View Compound workbench and clean the constructed
molecules.
• Apply the Catalyst force field, then perform 3D minimization.
Conformation search: The aim of the conformation search is to obtain diversified
conformations. Conformation generation methods are classified into two types: the best
method and the fast method. Both methods emphasize broad coverage of the
conformational space. Fast conformer generation is used to cover the conformational space of
molecules quickly. It uses a systematic or random search depending on the size of the molecule. Systematic
search is useful for small molecules and random search is used for macromolecules. In the case of
macromolecules, the conformers are minimized using the poling algorithm.
CONFORMATIONAL ANALYSIS STOPS WHEN ONE OF THESE THREE CONDITIONS
IS MET
• The maximum number of conformers has been generated.
• The energy of the newly generated conformer is too high relative to the predefined energy threshold.
• No new conformer can be generated after a certain number of trials.
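The three stopping conditions listed above can be expressed as a simple loop. This sketch is not the Catalyst conformer generator: `propose` is a hypothetical callable that returns a `(conformer, relative_energy)` pair for a new conformer, or `None` when a trial yields nothing new, and the three limits are left as required arguments because their values are not given here.

```python
def conformer_search(propose, max_conformers, energy_threshold, max_failed_trials):
    """Collect conformers until one of the three stopping conditions is met.

    Returns (accepted conformers, reason the search stopped).
    """
    accepted = []
    failed_trials = 0
    while True:
        # Condition 1: the maximum number of conformers has been generated.
        if len(accepted) >= max_conformers:
            return accepted, "max_conformers"

        candidate = propose()
        if candidate is None:
            # Condition 3: no new conformer after a certain number of trials.
            failed_trials += 1
            if failed_trials >= max_failed_trials:
                return accepted, "no_new_conformers"
            continue

        failed_trials = 0
        conformer, energy = candidate
        # Condition 2: the new conformer is above the predefined energy threshold.
        if energy > energy_threshold:
            return accepted, "energy_threshold"
        accepted.append(conformer)
```

The second condition is implemented literally, so the search ends as soon as one conformer exceeds the threshold; a variant that merely discards such conformers and keeps searching would be an equally plausible reading.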
PHARMACOPHORE HYPOTHESIS
Catalyst's ConFirm, HipHop and HypoGen are applications that provide tools to generate
pharmacophore hypotheses. The hypotheses are created by generating conformations for a set of
study molecules, then using the conformations to find and align chemically important functional
groups common to the molecules in the study set. Each hypothesis can also incorporate data on the
biological activities of the study molecules.
STEPS INVOLVED GENERATING A PHARMACOPHORE HYPOTHESIS
I. GENERATE CONFORMATIONS
The interface to ConFirm is used to generate conformations for a single molecule or a set of
molecules. The number of conformations needed to produce a good representation of a compound's
conformational space depends on the molecule. Both conformation-generating algorithms
available in ConFirm (best and fast) are adjusted to produce a diverse set of conformations,
avoiding repetitious groups of conformations all representing local minima.
• The conformations generated by ConFirm can be used as input into HipHop and HypoGen to
align common molecular features and generate a hypothesis.
• Align common features to generate a hypothesis.
The following procedure involves:
• Aligning common molecular features
• Setting preferences using the control panel
• Incorporating activity data into a hypothesis
• Using aligned structures to generate receptor models
HipHop and HypoGen use conformations generated in ConFirm to align chemically important
functional groups common to the molecules in the study set. A pharmacophore hypothesis can
then be generated from these aligned structures.
Incorporating biological activity data into a hypothesis
The software can also incorporate biological activity data into the hypothesis
generating process. Each hypothesis is tested by regression techniques to compare estimated
activity with actual activity data. The software uses the data from these tests to select the
hypotheses that do the best job of predicting activity for the set of study molecules. This capability is
provided by Catalyst/Hypo.
4.2.2a HIP HOP THEORY
HipHop generates pharmacophores based on the alignment of multiple common features, and the
aligned structures can be used to generate receptor models. The objective is to identify and
enumerate all possible pharmacophore configurations that are common to the training set. The
Model Receptor menu card is included in the Hypothesis Models card deck so that you can use
structures that have been aligned in HipHop to generate a receptor surface model. Since structures
used in HipHop are aligned by common chemical features, the receptor surface model that is
generated for them can be significantly different from a receptor surface model generated from
template-aligned structures.
The ideal HipHop training set is as follows:
• 2-30 compounds, ideally 6 molecules
• Structurally diverse set of input molecules
• Feature-rich compounds
• Include the most active compounds
Spreadsheet setup for HipHop
Molecules are imported into a spreadsheet in the hypothesis generation workbench. The Principal
column specifies the reference molecules; reference configuration models are potential centers for
hypotheses.
• If (0), do not consider this molecule.
• If (1), consider the configurations of this molecule.
• If (2), use this compound as a reference molecule (used only for HipHop hypothesis
generation).
The Maximum Omit Features column specifies how many features may be omitted for each compound.
• If (0), all features must map to generate a hypothesis.
• If (1), all but one feature must map to generate a hypothesis.
• If (2), no features need to map to generate a hypothesis (used only for HipHop hypothesis
generation).
When compound data appear in the spreadsheet, you are ready to add values in the
Principal and MaxOmitFeat columns. Common-features hypothesis generation uses values in
these columns to determine which molecules should be considered when building hypothesis
space and which molecules should map to all or some of the features in the final hypotheses.
In the Principal column, a value of 2 means that all the chemical features in the compound
will be considered in building hypothesis space. A value of 1 means that features will be
considered when generating hypotheses and that at least one mapping for each generated
hypothesis will be found unless the Misses or Complete Misses options are used. A value of 0
means the compound will be ignored.
The MaxOmitFeat column specifies how many hypothesis features must map to the
chemical features in each compound. A 0 in this column forces mapping of all features, a 1 means
that all but one feature must map, and a 2 allows hypotheses to which no compound features map.
4.2.2b HYPOGEN
Hypogen attempts to derive SAR models for a set of molecules for which activity value
(IC50 or Ki) on a given biological target are available. Hypogen optimizes hypothesis that are
present in the highly active compounds in the training set. But missing among the least active (or
inactive) ones. It attempts to construct the simplest hypothesis that best correlates that activity
(estimates vs. measured) the predicted models are created the predicted models are created in
three stages:
 Constructive
 Subtractive
 Optimization
The constructive phase identifies hypotheses that are common to the most active set of
compounds. The most active set is determined by the following equation:
(MA x UncMA) - (A / UncA) > 0.0
where MA is the activity of the most active compound,
Unc is the uncertainty in the measured activity and
A is the activity of the compound under consideration.
The most active set of compounds is limited to a maximum of eight. Once the set is
determined, HypoGen enumerates all possible pharmacophore configurations for each of the
conformations of the two most active compounds. Furthermore, a hypothesis must fit a
minimum subset of features of the remaining most active compounds in order to be considered. At
the end of the constructive phase a database of pharmacophore configurations has been generated.
The objective of the subtractive phase is to identify those pharmacophore configurations
developed in the constructive phase that are also present in the least active set of molecules, and
to remove them. The first step is the identification of the least active compounds. This is
accomplished with the equation
log (A) - log (MA) > 3.5
where A is the activity of the current compound and MA is the activity of the most active
compound. In simple terms, all compounds whose activity is 3.5 orders of magnitude less than that
of the most active compound are considered to be in the set of least active molecules. The value
3.5 is a user adjustable parameter and can be changed if needed (i.e., if the activity range of the
dataset does not span more than 3.5 orders of magnitude). HypoGen then checks each configuration
obtained from the most active compounds and eliminates it if it is shared by more than half of the
least active compounds, leaving the feasible pharmacophores.
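The two selection rules above can be sketched in a few lines of Python. This is only an illustration and not the Catalyst implementation: the function names are hypothetical, activities are assumed to be IC50-like values (lower means more active), and a single uncertainty value of 3 is assumed for every compound.

```python
import math

def most_active_set(activities, uncertainty=3.0, max_size=8):
    """Names of compounds satisfying (MA * Unc) - (A / Unc) > 0.0.

    activities: dict of name -> activity (lower value = more active).
    uncertainty: assumed to be identical for all compounds (illustrative).
    """
    ma = min(activities.values())                 # activity of the most active compound
    ranked = sorted(activities, key=activities.get)
    selected = [name for name in ranked
                if ma * uncertainty - activities[name] / uncertainty > 0.0]
    return selected[:max_size]                    # the set is limited to eight compounds

def least_active_set(activities, cutoff=3.5):
    """Names of compounds with log(A) - log(MA) > cutoff (orders of magnitude)."""
    ma = min(activities.values())
    return [name for name, a in activities.items()
            if math.log10(a) - math.log10(ma) > cutoff]

example = {"a": 0.005, "b": 0.02, "c": 0.05, "d": 50.0}
print(most_active_set(example))
print(least_active_set(example))
```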
The optimization phase involves improvement of the hypothesis score. Small
perturbations are applied to those pharmacophore configurations that survived the subtractive
phase, and they are scored based on the errors in the activity estimates from regression and on the
complexity of the hypothesis. The cost of a hypothesis is a quantitative extension of Occam's razor
(everything else being equal, the simplest model is preferred).
Figure 14: HypoGen process flow
The cost of each pharmacophore is computed as the sum of three costs: weight, error and
configuration. While the weight component increases with deviation of the feature weight
from the ideal value of 2.0, the error component increases with the RMS difference between the
measured and estimated activities. The configuration cost is fixed and depends on the complexity
of the pharmacophore upon completion of this phase.
HipHop and HypoGen use conformations generated in ConFirm to align chemically important
functional groups common to the molecules in a study set. Biological activity data can be
incorporated into the hypotheses so that the best hypotheses for predicting activity are generated
and selected. Additionally, structures that have been aligned in these programs can be used to
generate a receptor surface model.
HYPOGEN TRAINING AND TEST SET SELECTION
Selection of the training set molecules is one of the most important exercises the user
must perform, for the following reasons:
• Catalyst derives the information used in subsequent analysis from those structures; thus, the
“garbage in, garbage out” paradigm certainly applies.
• The statistical procedures applied during analysis have limits in terms of overfitting and
underfitting the data.
• Data sets that are ideal for those analysis procedures and data sets from typical medicinal
chemistry structure activity series are often not the same thing.
The ideal training set
1. At least 16 compounds are necessary to assure statistical power.
2. Activities should span 4 orders of magnitude.
3. Each order of magnitude should be represented by at least 3 compounds.
4. No redundant information.
5. No excluded volume problems.
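The first three criteria are numerical and can be checked before a run. The sketch below is an illustration only: the function name is hypothetical, and assigning a compound to an order of magnitude by the floor of log10(activity) is an assumed binning choice, not something prescribed by Catalyst.

```python
import math
from collections import Counter

def check_training_set(activities, min_compounds=16, min_span=4.0, min_per_order=3):
    """Check the size, activity span and per-decade coverage of a training set.

    activities: list of positive activity values (e.g. IC50 in a common unit).
    """
    logs = [math.log10(a) for a in activities]
    span = max(logs) - min(logs)                      # span in orders of magnitude
    per_order = Counter(math.floor(x) for x in logs)  # assumed binning: floor(log10)
    orders = range(math.floor(min(logs)), math.floor(max(logs)) + 1)
    return {
        "enough_compounds": len(activities) >= min_compounds,
        "span_ok": span >= min_span,
        "sparse_orders": [o for o in orders if per_order[o] < min_per_order],
    }

values = [0.002, 0.003, 0.005, 0.02, 0.03, 0.05, 0.2, 0.3, 0.5,
          2, 3, 5, 20, 30, 50, 200]
print(check_training_set(values))
```

Criteria 4 and 5 (redundancy and excluded volume) depend on structural information and are not covered by this check.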
METHODOLOGY
INTRODUCTION
To build a better pharmacophore the following steps were employed:
1. Building a set of molecules
2. Conformer generation
3. Hypothesis generation
4. Database generation
5. Database search
6. Compare / fit to estimate activity
Criteria to generate a successful hypothesis are:
1. Cost factor: the difference between the fixed cost and the null cost should be large,
i.e., a larger difference gives better prediction.
2. The fixed cost represents the simplest model that fits all data perfectly, and the null cost
represents the highest cost of a pharmacophore with no features, which estimates every
activity to be the average of the activity data of the training set molecules.
3. The configuration value, which is a measure of the magnitude of the hypothesis space for a
given training set, should be less than 18. If it is higher, there are too many degrees of
freedom and the result may not be useful.
4. The correlation between the estimated and the actual activity data should be close to 1.0.
5. The RMS deviation, which represents the quality of the correlation between the estimated
and the actual activity data, should be as low as possible, close to 0.
METHOD
BUILDING A SET OF MOLECULES
All molecules were built using the Catalyst View Compound Workbench. They were cleaned
using the 2D Beautify option and minimized using a CHARMm-like force field.
CONFORMER GENERATION
A conformer model is a representation of the possible conformational space of a ligand. It
is assumed that the biologically active conformation of a ligand (or a close approximation thereof)
should be contained within this model. Conformers were generated for all molecules with a cut off
energy range of 20 kcal/mol and up to a maximum of 255 conformers.
COST HYPOTHESIS
The lowest cost hypothesis is considered to be the best. However, hypotheses with costs
within 10-15 of the lowest cost hypothesis are also considered good candidates. The units of
cost are binary bits. Hypothesis costs are calculated according to the number of bits required to
completely describe a hypothesis. Simpler hypotheses require fewer bits for a complete description,
and the assumption is made that simpler hypotheses are better.
HYPOTHESIS GENERATION / PHARMACOPHORE SEARCH
A pharmacophore model consists of a collection of features necessary for the biological
activity of the ligand arranged in 3D space, the common ones being hydrogen bond acceptor,
hydrogen bond donor and hydrophobic features. Hydrogen bond donors are defined as vectors
from the donor atom of the ligand to the corresponding acceptor atom in the receptor. Hydrogen
bond acceptors are analogously defined. Hydrophobic features are located at the centroids of
hydrophobic atoms.
Conformations for all molecules were generated in the View Compound Workbench using the
poling algorithm and the best quality conformer generation method. The best conformer
generation method considers the arrangement of atoms and accepts a maximum of 255 conformers.
For the set of molecules, Catalyst generated conformers with this method, which provides the most
comprehensive treatment of flexible ring systems. All the conformers are automatically saved, and
the number of conformers generated for each molecule is reported with the lowest conformer energy
in kcal/mol. Conformers were selected that fell within a 20 kcal/mol range above the lowest energy
conformation found.
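The energy window and the conformer limit described above amount to a simple filter. The sketch below assumes the conformer energies are already available as a list in kcal/mol; the function name is hypothetical and the code does not generate conformers itself.

```python
def select_conformers(energies, window=20.0, max_conformers=255):
    """Keep conformers within `window` kcal/mol of the lowest energy, capped in number.

    energies: list of conformer energies in kcal/mol.
    Returns the retained energies sorted from lowest to highest.
    """
    lowest = min(energies)
    kept = sorted(e for e in energies if e - lowest <= window)
    return kept[:max_conformers]

print(select_conformers([5.0, 12.5, 30.0, 24.9, 25.1]))
```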
HYPOTHESIS GENERATION
The pharmacophore hypotheses were generated in the Generate Hypothesis Workbench. The
molecules were selected as the training set based on order of magnitude of activity. Hypothesis
generation was carried out under the following assumptions.
1. Highly active and most inactive molecules should be represented in the training set.
2. At least 3 molecules from each order of magnitude should be selected for
pharmacophore generation.
3. A minimum of 15 molecules should constitute the training set.
4. The molecules selected should represent diversity in chemical features.
HYPOTHESIS CONSIDERATIONS
In order to achieve a better pharmacophore, the following limits or considerations should be
met by the generated hypothesis.
• Configuration value should be around 17.
• RMS should be as low as possible, preferably close to zero.
• Correlation should be around 1.0.
• The cost difference between the fixed cost and the null cost should be between 40 and 80 bits.
FACTORS THAT DETERMINE THE QUALITY OF PHARMACOPHORE
The overall cost of a hypothesis is calculated by summing three cost factors, a weight cost,
an error cost and a configuration cost. These are qualitatively defined.
WEIGHT COST
A value that increases in a Gaussian form as the feature weight in a model deviates from an
idealized value of 2.0. This cost factor is designed to favor hypotheses where the feature weights
are close to 2.
ERROR COST
A value that increases as the RMS difference between estimated and measured activities for the
training set molecules increases. This cost factor is designed to favor models where the correlation
between estimated and measured activities is better.
CONFIGURATION COST
This is a fixed cost which depends on the complexity of the hypothesis space being optimized. It is
equal to the entropy of the hypothesis space.
Of the three, the error cost factor has the major effect in establishing the hypothesis cost.
During the beginning phase of an automated hypothesis generation, Catalyst calculates the cost of
two theoretical hypotheses: one in which the error cost is minimal (all compounds fall along a line
of slope = 1), and one where the error cost is high (all compounds fall along a line of slope = 0).
These models can be considered upper and lower bounds for the training set. The cost values for
them are useful guides for estimating the chances of a successful experiment and are available
within 15 minutes from the start of the run, which is valuable because these experiments can easily
require days of run time. The ideal hypothesis cost (fixed cost) is reported in the full file found
in the hypothesis generation directory. This value tends to be 70-100 bits. The null hypothesis cost
is reported in the log file found in the same directory and is usually higher than the fixed cost.
What is important is the difference between these two costs. The greater the difference, the higher
is the probability of finding a useful model. In terms of hypothesis significance, what really
matters is the magnitude of the difference between the cost of any returned hypothesis and the cost
of the null hypothesis. In general, if this difference is greater than 60 bits, there is an
excellent chance that the model represents a true correlation. Since most returned hypotheses will
be higher in cost than the fixed cost model, a difference between fixed cost and null cost of 70 or
more will be necessary in order to achieve the 60 bit difference. If a returned hypothesis has a
cost that differs from the null hypothesis by 40-60 bits, there is a 75-90% chance that it
represents a true correlation in the data. As the difference becomes less than 40 bits, the
likelihood of the hypothesis representing a true correlation in the data rapidly drops below 50%.
Under these conditions, it may be difficult to find a model that can be shown to be predictive. In
the extreme situation where the fixed and null cost differential is small (<20), there is little
chance of succeeding and it is advisable to reconsider the training set before proceeding. Another
useful number is the entropy of hypothesis space. This value is calculated early in the run and is
reported near the value for the fixed cost.
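The thresholds quoted above can be collected in a small helper for reading a run report. This is a sketch with a hypothetical function name; the labels restate the ranges given in the text, and the treatment of the exact boundary values (40 and 60 bits) is an assumption.

```python
def interpret_cost(total_cost, null_cost):
    """Classify a returned hypothesis by (null cost - total cost), in bits."""
    diff = null_cost - total_cost
    if diff > 60:
        label = "excellent chance of a true correlation"
    elif diff >= 40:
        label = "75-90% chance of a true correlation"
    else:
        label = "below 50% chance of a true correlation"
    return diff, label

# Illustrative numbers only, not values from this study.
print(interpret_cost(total_cost=100.0, null_cost=170.0))
```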
TRAINING SET
1. The training set should contain the most active compounds.
2. Each compound must possess something new to teach Catalyst.
3. If two compounds have similar structures (collections of features), they must differ in
activity by an order of magnitude to be included; otherwise, pick only the more active of
the two.
4. If two compounds have similar activities (within one order of magnitude), they must be
structurally distinct (from a chemical feature point of view) in order to both be included;
otherwise, pick only the more active of the two.
The pharmacophore features were taken from the HipHop results: the features present
in the training set molecules are hydrogen bond acceptor, hydrophobic aliphatic and hydrophobic
aromatic. Twenty-two molecules were selected for the training set, their activity values were loaded
into a spreadsheet, and all the preferences and uncertainty values were set. The HypoGen
algorithm was then used to generate the hypotheses.
4.2 STRUCTURE OR TARGET BASED DRUG DESIGN
In structure based drug design (SBDD), the three dimensional structure of a drug target
interacting with small molecules (drugs) is used to guide drug discovery. Drug targets are typically
key molecules involved in a specific metabolic or cell signaling pathway that is known, or believed,
to be related to a particular disease state. Drug targets are most often proteins and enzymes in
these pathways. Drug compounds are designed to inhibit, restore or otherwise modify the structure
and behavior of disease-related proteins and enzymes.
SBDD uses the known 3D geometrical shape or structure of proteins to assist in the
development of new drug compounds. The 3D structure of protein targets is most often derived
from X-ray crystallography or nuclear magnetic resonance (NMR) techniques, as they have a
resolution of a few angstroms (about 500,000 times smaller than the diameter of a human hair). At
this level of resolution, researchers can precisely examine the interactions between atoms in protein
targets and atoms in potential drug compounds that bind to the proteins. This ability to work at
high resolution with both proteins and drug compounds makes SBDD one of the most powerful
methods in drug design.
Once bound at the receptor site, drugs may act either to initiate a response (agonist action or
stimulant) or to decrease the activity potential of that receptor (antagonist action or depressant)
by blocking access to it by active molecules. Thus, any drug may have structural features that
contribute independently to the affinity for the receptor and to the efficiency with which the drug
receptor combination initiates the response (intrinsic activity or efficacy). The response is
related to the drug receptor complexes. The affinity of a drug may be estimated by comparison of
the dose required to produce a pharmacological response with the dose required by a reference
standard drug or the natural ligand for that receptor. Structure based drug design was carried out
in the following parts:
4.2.1 Structure based pharmacophore generation:
The structure based pharmacophore approach was used to find the essential features of the
active site which can contribute to ligand binding.
The interaction generation protocol takes an input receptor and a defined active site and
analyzes the active site for donors, acceptors and hydrophobes. The result of the calculation is an
interaction map. The density of polar sites parameter specifies the density of the vectors in the
interaction site for hydrogen bonds. The density of lipophilic sites parameter specifies the density
of points in the interaction site for lipophilic atoms.
Procedure:
1. Load the interaction generation protocol from the protocols explorer. The parameters
are displayed in the parameter explorer.
2. Ensure that the structure you want to define as the receptor is open in a 3D window. Use the
binding site tool panel to define the structure as the receptor.
3. Set the input site sphere parameter to define the active site. Select the ligand from the
receptor ligand complex and define the input site sphere.
4. The radius of the site sphere can be changed by selecting the sphere and changing the radius in
the attributes dialog.
5. Select the receptor structure from the input receptor parameter list.
6. Select the sphere as the input site sphere parameter.
7. Set the remaining parameters as desired and run the protocol.
4.2.2 Docking:
Molecular docking is a technique used to study how molecules bind to one another. The term
“docking” mostly refers to protein-molecule interactions. The following chart shows the work flow
of the docking process.
Figure 15: Docking work flow
4.2.2a LIGAND FIT
Ligand Fit is designed to search the binding site of a protein and dock a series of potential
ligands into the binding site. During docking the protein is held rigid, while the ligand remains
flexible, allowing its conformations to be searched and docked within the binding site. The three
dimensional structures of the protein and the ligand are required. There are three key steps in this
process.
a. Site search
b. Conformational search
c. Ligand fitting
a. SITE SEARCH
The position and shape of the binding site of the protein are defined on a grid. The active site
shape can be defined based on the shape of the protein, in which case all sites are detected. Here
the docked ligand method is used to define the active site: unoccupied grid points within a certain
user definable distance of the ligand atoms are collected to form the site.
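The docked ligand method of site definition can be illustrated with a small grid sketch. This is not the Ligand Fit implementation: the function name, the grid spacing, the cutoff and the `clash` distance used to decide whether a grid point is occupied by the protein are all assumptions made for the example.

```python
import math
from itertools import product

def site_from_ligand(ligand_atoms, protein_atoms, spacing=1.0, cutoff=3.0, clash=2.0):
    """Collect unoccupied grid points lying within `cutoff` of any ligand atom.

    ligand_atoms, protein_atoms: lists of (x, y, z) coordinates in angstroms.
    A grid point counts as occupied if a protein atom lies closer than `clash`.
    """
    lo = [min(a[i] for a in ligand_atoms) - cutoff for i in range(3)]
    hi = [max(a[i] for a in ligand_atoms) + cutoff for i in range(3)]
    axes = []
    for i in range(3):
        n = int(math.floor((hi[i] - lo[i]) / spacing)) + 1
        axes.append([lo[i] + k * spacing for k in range(n)])
    site = []
    for point in product(*axes):
        near_ligand = any(math.dist(point, a) <= cutoff for a in ligand_atoms)
        occupied = any(math.dist(point, a) < clash for a in protein_atoms)
        if near_ligand and not occupied:
            site.append(point)
    return site

points = site_from_ligand([(0, 0, 0)], [(2.0, 0, 0)], spacing=1.0, cutoff=2.0, clash=1.5)
print(len(points))
```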
b. CONFORMATIONAL SEARCH
A Monte Carlo simulation is employed in the conformational search of the ligand. During
the search, bond lengths and bond angles are untouched; only torsional angles (except those in a
ring) are randomized. Therefore, the ligand molecules should be energy minimized to ensure
correct bond lengths and bond angles before using Ligand Fit.
c. LIGAND FITTING
After a new conformer is generated, the ligand fitting is carried out in two steps. First the
non mass-weighted principal moment of inertia (PMI) of the binding site is compared with the non
mass-weighted principal moment of inertia (PMI) of the ligand. If the resulting fit value is above
the threshold, or is not better than the fitting results previously saved, no further docking is
performed for that conformer. If the fit value is better than the previously saved results, the
ligand is positioned in the binding site according to the PMI. Because PMI is a scalar property,
there are four possible positions for the ligand to orient in the binding site. For each position,
the corresponding docking score is computed.
The docking score is the negative value of the non-bonded intermolecular energy between
ligand and protein. After the docking score is calculated for each orientation, it is compared with
the results saved previously. If the new one is better, it is saved. The process of conformational
search and ligand fitting is then iterated until the specified number of trials is reached. Finally,
rigid body minimization is applied to the saved conformations of the ligand to optimize their
positions and docking scores.
PROCEDURE
Steps followed for Ligand Fit
1. Potent inhibitor molecules which can inhibit the action of sPLA2 were taken.
2. Molecules with diverse structures and pharmacophore features were selected from the
literature.
3. The molecules to be docked into the receptor site were collected in an SD file so that all
molecules are processed for the docking score at the site.
4. The active site of a protein can be identified by the find site from receptor cavities option,
which uses a flood-filling algorithm.
5. Here the active site was located using the already docked ligand.
6. The protein molecule is selected, the set of molecules in the SD file is chosen and the docking
score is calculated.
7. Thus, the docking scores for the set of molecules are calculated through Ligand Fit.
4.2.2b C-Docker:
C-Docker is a grid based molecular docking method that employs CHARMm. It is
available in Discovery Studio (DS) through the dock ligands (C-Docker) protocol. In C-Docker, the
receptor is held rigid while the ligands are allowed to flex during the refinement. Random ligand
conformations are generated from the initial ligand structure through high temperature molecular
dynamics followed by random rotations. The random conformations are refined by grid based
simulated annealing and a final grid based or full force field minimization.
C-Docker steps:
1. Define the receptor and search for binding sites.
2. Prepare and run the dock ligands (C-Docker) protocol.
Procedure
1. Open the receptor protein and apply the CHARMm force field.
2. Define the selected molecule as a receptor; after that, select the ligand and define the sphere
from the selection.
3. Open the C-Docker protocol and set the parameters.
4. Run the protocol.
4.2.2c Lib Dock:
Lib Dock uses protein site features referred to as hot spots. Hot spots are of two types:
polar and apolar. A polar hot spot is preferred by a polar ligand atom (for example a hydrogen bond
donor or acceptor) and an apolar hot spot is preferred by an apolar atom. The receptor hot spot file
is calculated prior to the docking procedure. However, if desired, a predefined or user adjusted
hot spot file can be used. The protocol allows the user to specify several modes for generating
ligand conformations for docking.
The rigid ligand poses are placed into the active site and hot spots are matched as triplets. The
poses are pruned and a final optimization step is performed before the poses are scored. The
Lib Dock algorithm has four functional aspects:
1. Conformation generation of the ligands
2. Creating a binding site image
3. Matching the binding site image and the ligand
4. Optimization stage and scoring
Procedure:
• In the protocol explorer, expand ligand receptor interactions and then double click dock
ligands (Lib Dock). The Lib Dock ligands protocol opens in the parameter view with the
parameters for setting up the protocol.
• Ensure that the structure you want to define as the receptor is open and selected in a 3D
window. The receptor should have hydrogens added and all atom valences satisfied. Open
the binding site tool panel and click define selected molecule as a receptor.
• On the parameter explorer, select the sphere as the input site sphere parameter. Before the
input site sphere parameter can be assigned, define the site sphere from the binding site
tool panel.
• Click on the button for the input ligand parameter; this opens the specify ligands dialog.
On the dialog, select the ligands from a table browser, a 3D window or an .sd file.
• The number of hot spots parameter determines how many hot spots to generate.
• Set the remaining parameters as desired. Run the Lib Dock protocol.
4.2.2d DENOVO LIGAND DESIGN
LUDI
Ludi is a method for the de novo design of ligands (inhibitors) for proteins.
It can also suggest modifications of known ligands that may enhance binding to the target protein.
The following chart shows the Ludi work flow.
Figure 16: Ludi work flow
Ligand design:
The design of new ligands (enzyme inhibitors) for a protein can be carried out if the structure of
the protein is known. If the structure of one or more protein-inhibitor complexes is known, the
design may be aided by a study that identifies the essential ligand-protein interactions. There are
two approaches to finding a compound that can fit into the active site.
The known structure approach:
Searching through a database such as the Cambridge Structural Database identifies structures that
fit the active site. The advantage of this approach is that the molecules retrieved from the
database do exist and the structures represent low energy conformations. This approach does not
address the issue of conformational flexibility.
The fragment approach:
This approach uses a library of fragments. The idea is to position molecular fragments into the
active site in such a way that hydrogen bonds can be formed with the protein and hydrophobic pockets
are filled with hydrophobic groups. The fragments are then connected by suitable spacer fragments to
form a single molecule.
Ludi method:
Ludi is based on the fragment approach. It suggests how suitable small fragments can be
positioned into clefts of protein structures. This positioning is the strength of Ludi because it
immediately provides ideas about how a putative binding site on the protein can be
saturated by fragments and how those fragments might be linked together. Ludi works in three steps:
1. It calculates interaction sites within the protein active site or from the active analogs.
2. It searches libraries for fragments and fits them onto the interaction sites.
3. It proposes an alignment or linking for the fragments.
Ludi distinguishes four types of interaction sites:
H-donor
H-acceptor
Lipophilic aliphatic
Lipophilic aromatic
The aromatic and aliphatic sites are suitable for hydrophobic interactions.
The H-donor and H-acceptor interaction sites are suitable for H-bond formation. Ludi is capable
of fitting fragments onto the interaction sites and simultaneously aligning (i.e., linking) them to
an existing ligand.
Method:
1. Identification of the chemical nature of active site amino acids
2. Fragment identification and analysis of the Ludi score
3. Searching for links
4. Linking the fragments
5. Fusing the fragments and linking
6. Docking validation
FRAGMENT FITTING
The next step is to fit fragments onto the interaction sites. Ludi searches the list of
interaction sites by distance criteria for suitable sets of two or more sites to match the fragments.
Required interactions are specified using targeted mode. In targeted mode, fragments
are required to interact with the protein atom or atoms specified by the user. Any fragment fit that
does not interact with the entire set of specified target atoms is rejected.
To fit the fragment, Ludi performs a root mean square (RMS) superimposition using the
algorithm given by Kabsch (1978). A fragment fit is accepted if the RMS value is less than a user
defined threshold (typically 0.2 Å to 0.6 Å), if no van der Waals overlap of the fitted fragment with
the protein occurs and, when the electrostatic check parameter on the Ludi runtime parameters
control panel is checked, if no unacceptable electrostatic repulsions are found. When the receptor
structure is not known, a fragment fit is rejected if the fragment extends outside the volume
defined by the set of active analogs.
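The acceptance test for a fragment fit can be sketched as follows. This is an illustration, not Ludi code: the fragment is assumed to be already superimposed on the interaction sites (the Kabsch superposition itself is not performed here), the function names are hypothetical, and a single minimum contact distance `min_contact` stands in for the van der Waals overlap check. The electrostatic check is omitted.

```python
import math

def rms_deviation(fragment_points, site_points):
    """RMS distance between matched fragment points and interaction sites."""
    squared = [math.dist(f, s) ** 2 for f, s in zip(fragment_points, site_points)]
    return math.sqrt(sum(squared) / len(squared))

def accept_fit(fragment_points, site_points, fragment_atoms, protein_atoms,
               rms_threshold=0.6, min_contact=3.0):
    """Accept a fit if the RMS is below the threshold and no atom pair overlaps.

    min_contact is an assumed stand-in for a proper van der Waals overlap test.
    """
    if rms_deviation(fragment_points, site_points) >= rms_threshold:
        return False
    for f in fragment_atoms:
        for p in protein_atoms:
            if math.dist(f, p) < min_contact:
                return False
    return True

frag = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
sites = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)]
print(accept_fit(frag, sites, frag, protein_atoms=[(0.0, 0.0, 5.0)]))
```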
LINK SITES: ALIGNING FRAGMENTS WITH PARTIALLY BUILT LIGANDS
Ludi is capable of fitting fragments onto the interaction sites and simultaneously aligning
(i.e., linking) them to an existing ligand. For this purpose, link sites are defined on the ligand.
A link site is a hydrogen atom: all the hydrogen atoms of the positioned ligand that lie within a
user specified cutoff radius are link sites.
Otherwise, Ludi works as described above.
LUDI FRAGMENT LIBRARIES
The Ludi fragment library is divided into two parts. The de novo library is used when Ludi is run
in no-link mode. The link library is used when Ludi is run in link mode. The de novo library and
the link library each consist of two files, a file that specifies the fragment topologies and a file that
specifies the interaction types of fragment functional groups.
PROCEDURE
1. It calculates interaction sites within the protein sPLA2 active site or from the active analogs.
2. It searches libraries for fragments and fits them onto the five interaction sites which are
present at the active site.
3. It proposes an alignment or linking for the fragments and the new ligand is designed.
The compound with the highest activity and the best dock score is better fitted when compared
to the others. A knowledge based approach is used to suggest possible binding positions. The
present experimental studies were carried out using the Ludi program. This program is designed to
dock small molecular fragments within protein binding sites using rules about preferred interaction
geometries. For a hydrogen bond, the distance between the donor hydrogen and its acceptor is close
to 1.8 Å and the angle subtended at the hydrogen is rarely less than 120°.
Information about the preferred geometries of such interactions can be obtained from analysis of
X-ray crystallographic databases. Klebe has performed a very careful analysis of non-bonded contacts
observed in the CSD.
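The hydrogen bond geometry mentioned above (a hydrogen to acceptor distance near 1.8 Å and an angle at the hydrogen that is rarely below 120°) can be measured directly from coordinates. The sketch below uses hypothetical function names; the `max_distance` tolerance of 2.5 Å is an assumption for the example and is not a Ludi parameter.

```python
import math

def hbond_geometry(donor, hydrogen, acceptor):
    """Return (H...A distance in angstroms, D-H...A angle in degrees)."""
    v_dh = [d - h for d, h in zip(donor, hydrogen)]      # vector from H to donor
    v_ah = [a - h for a, h in zip(acceptor, hydrogen)]   # vector from H to acceptor
    n_dh = math.hypot(*v_dh)
    n_ah = math.hypot(*v_ah)
    cos_angle = sum(x * y for x, y in zip(v_dh, v_ah)) / (n_dh * n_ah)
    cos_angle = max(-1.0, min(1.0, cos_angle))           # guard against rounding
    return n_ah, math.degrees(math.acos(cos_angle))

def is_plausible_hbond(donor, hydrogen, acceptor, max_distance=2.5, min_angle=120.0):
    """Simple geometric filter; max_distance is an assumed tolerance."""
    distance, angle = hbond_geometry(donor, hydrogen, acceptor)
    return distance <= max_distance and angle >= min_angle

print(hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.8, 0.0, 0.0)))
```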
5.1. QSAR:
In the present study, quantitative structure activity relationship (QSAR) studies were carried
out on phospholipase A2 inhibitors in order to design selective and potent inhibitors. QSAR
models were developed with 1D and 2D descriptors using the Discovery Studio software. QSAR
attempts to model the activity of a series of compounds using measured or computed properties of
the compounds. In the equation, the term 'N' is the number of data points and r2 is the
square of the correlation coefficient, which describes how well the compounds fit the QSAR
model. XV r2 is a squared correlation coefficient generated during a validation procedure using the
equation
XV r2 = (SD - PRESS)/SD
SD is the sum of squared deviations of the dependent variable values from their
mean. The predicted sum of squares (PRESS) is the sum, over all compounds, of the squared
differences between the actual and the predicted values of the dependent variable. The PRESS
value is computed during a validation procedure for the entire training set. The smaller the PRESS
value, the more reliable is the equation. XV r2 is usually smaller than the overall r2 for a QSAR
equation. It is used as a diagnostic tool to evaluate the predictive power of an equation generated
using the multiple linear regression method.
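As a worked illustration of the cross validated statistic, the sketch below computes SD, PRESS and XV r2, assuming the formula is read as (SD - PRESS)/SD. The function name is hypothetical, and the predicted values passed in are assumed to come from a cross validation procedure (each compound predicted by a model that did not see it); the code does not perform the cross validation itself.

```python
def xv_r2(actual, predicted):
    """Cross validated r2 = (SD - PRESS) / SD.

    actual: measured activities of the training set compounds.
    predicted: cross validated predictions for the same compounds.
    """
    mean = sum(actual) / len(actual)
    sd = sum((y - mean) ** 2 for y in actual)                     # sum of squared deviations
    press = sum((y - p) ** 2 for y, p in zip(actual, predicted))  # predicted sum of squares
    return (sd - press) / sd

# Illustrative numbers only, not data from this study.
print(xv_r2([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

A smaller PRESS gives a value closer to 1, and predictions no better than the mean give a value of 0.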
The genetic function approximation (GFA) works by generating random populations of solutions
to a problem, scoring the relative quality of the solutions, and carrying forward the most fit
solutions or analogues (generated through mutation and crossover) of other solutions to iteratively
generate (and finally converge on) new, more fit solutions. In this study the GFA analysis was done
with the following parameters.
• Population size
• Initial equation length
• Final equation length
• Number of generations
Bootstrap r2 is the correlation coefficient calculated during the validation procedure. 79
compounds were included in the training set to generate the primitive QSAR model, covering the
widest data range of IC50 values, 0.005 to 50.01 µM. The predictive character of the QSAR model was
further assessed using test molecules. To judge the predictive ability of the QSAR model for new
drug candidates, the IC50 values for the test and training sets were evaluated.
GFA parameters
Number of rows in model        79
Population                     100
Maximum generation             50000
Initial terms per equation     20
Scoring function               Friedman LOF
Mutation probability           0.1
The GFA method performs a search over the space of possible QSAR models using lack of fit
(LOF) scores to estimate the fitness of each model. These models lead to the discovery of
predictive QSAR equations.
GFA equation = 4.7849

+ 0.00716121 * −In−Situ Starting Energy
− 2.0176 * Activ
+ 0.10343 * Dipole_Mag_Propgen_VAMP
− 0.610585 * Local_polarity_Propgen_VAMP
+ 0.26681 * Mean_Polarizability_VAMP
+ 0.633171 * Num_H_Acceptors
− 0.149947 * Num_RotatableBonds
− 0.000507116 * Octupole_XYY_VAMP
+ 0.0647933 * RIJestateSumHal_Propgen_VAMP
+ 8.73998e−05 * −In−Situ Final Energy * −In−Situ Final Energy
+ 5.40146e−05 * ESP_point_count__3_Propgen_VAMP * ESP_point_count__3_Propgen_VAMP
+ 17.3371 * Molecular_FractionalPolarSurfaceArea * Molecular_FractionalPolarSurfaceArea
− 7.29063e−07 * No._of_surface_points_with_−ve_ESP_Propgen_VAMP *
No._of_surface_points_with_−ve_ESP_Propgen_VAMP
+ 86.4313 * QsumHal_Propgen_VAMP * QsumHal_Propgen_VAMP
− 0.000148821 * Quadrupole_YY_VAMP * Quadrupole_YY_VAMP
− 2.65319e−07 * Total_Energy_VAMP * Total_Energy_VAMP
+ 13.6832 * Activ * Covalent_hydrogen_bond_acidity_Propgen_VAMP
− 0.000694297 * Activ * No._of_surface_points_with_+ve_ESP_Propgen_VAMP
+ 1.85589e−06 * DMol3_Mol_ID * Electronic_Energy_VAMP
+ 0.0165753 * Num_AromaticRings * RIJestateSumO_Propgen_VAMP
In the above equation, terms with positive coefficients indicate the presence of specific groups
that increase the activity of the molecule, and terms with negative coefficients indicate the
presence of ionic groups that reduce the activity.
Table: Validation statistics for the model
Friedman LOF                          0.323
R-squared                             0.968
Adjusted R-squared                    0.957
Cross validated R-squared             0.941
Lack of fit points                    58
Error for non significant LOF         0.176
Significance of regression F value    89.789
Table 8: Experimental and predicted values of Training set compounds using GFA
MOLECULE NAME Activity GFA predicted value

1 7.89 7.829
10 6.97 7.091
11 8 7.983
12 7.14 7.103
13 8.05 7.953
14 6.88 6.489
16 6.09 6.336
17 7.09 7.383
18 6.97 7.373
19 7.89 7.743
20 8.15 8.314
21 6.96 6.866
23 7 6.813
24 7.96 7.684
25 6.83 7.062
26 7.96 7.82
12d 4.72 4.727
15b 7.85 7.813
15c 7.82 7.651
15e 7.89 7.624
15g 7.52 7.998
27ii 7.77 7.816
28i 8.1 7.984
28ii 8.22 7.958
28iv 8.05 8.121
28ix 7.68 7.834
28v 8.3 8.178
28viii 7.74 7.6
28x 7.72 7.556
28xi 7.64 7.831
28xii 7.21 7.563
28xiii 7.77 7.628
28xix 8 7.88
28xl 4.3 4.315
28xv 5.18 5.235
28xvi 7.29 7.545
28xviii 7.96 8.108
28xx 7.68 7.59
28xxi 7.44 7.481
28xxii 7.6 7.476
28xxiii 6.92 6.9
28xxix 7.85 7.752
28xxv 7.34 7.428
28xxvii 8 7.777
28xxviii 7.54 7.419
28xxx 7.37 7.42
28xxxi 7.46 7.434
28xxxii 7 7.179
28xxxiii 6.55 6.931
28xxxv 7.42 7.265
28xxxvii 7.64 7.668
28xxxviii 7.82 7.623
43a 6.7 6.906
43b 5.36 5.201
43c 6.85 6.722
43d 5.47 5.449
43e 5.85 5.888
43f 5.89 5.827
43g 6.62 6.907
48b 4.66 4.701
49b 8.15 8.25
49c 7.34 7.224
49d 7.08 7.158
49e 7.26 7.307
49g 7.1 6.768
49h 7.36 7.424
50b 7.52 7.535
51a 4.3 4.288
65a 6.04 6.206
65b 6.38 6.487
65c 6.8 6.757
65d 7 6.956
65e 6.4 6.667
65f 6.7 6.623
65g 7.17 7.066
65i 7.6 7.399
67a 4.82 4.823
67b 4.92 4.791
6b 5.96 6.009
Graph 1: Correlation between experimental and predicted activities by the QSAR equation using the
GFA method (x-axis: experimental activity)
Test Set
The purpose of QSAR is not only to reproduce the biological activity of the training set but
also to predict the values of the test set molecules. Using the above equation obtained from the
training set, a series of molecules of known activity, referred to as test set molecules, is
introduced into the study table so that their biological activity can be predicted. After the
activities of the test set molecules were predicted, the accuracy of prediction exceeded 80%.
Table 9: Experimental and predicted values of Test set compounds using GFA
Molecule Name Activity GFA predicted activity
11c 6.6 6.676
11d 5.79 5.641
11g 7 6.95
11h 5.79 5.494
12a 6.77 5.789
12b 5.79 5.005
12e 7.4 5.808
12f 5.79 5.057
14a 7.85 6.128
16a 5.79 5.35
16b 7.52 5.601
16c 7.46 6.055
1b 6.9 5.523
indoxam 7.22 5.856
15 8 7.114
22 8 7.584
15f 7.89 7.544
28iii 8.22 7.941
28vi 8.05 7.749
28vii 8.1 7.907
28xiv 7.77 8.006
28xli 4.3 -5.902
28xvii 8.22 7.841
28xxiv 8.1 8.062
28xxvi 7.96 7.915
28xxxiv 8.05 7.372
28xxxix 7.8 8.938
28xxxvi 7.29 7.543
44b 7.1 6.663
51b 4.3 -4.277
65h 6.62 7.418
65j 6.77 7.154
Graph 2: Correlation between experimental and predicted activities by the QSAR equation
using the GFA method for the test set.
The r2 and cross-validated r2 values obtained from the QSAR equation using the GFA method
lie within the acceptable range, and there is a good correlation between the experimental and
GFA-predicted activities listed above. Since the predictive ability of the equation is good, it can
be used to develop new analogs.
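The correlation between experimental and predicted activities can be checked numerically. The following is a minimal sketch, not the procedure used in the study: it computes the Pearson correlation coefficient in plain Python, and the sample data are only the first four rows of Table 9, chosen for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# First four rows of Table 9 (11c, 11d, 11g, 11h), for illustration only.
experimental = [6.6, 5.79, 7.0, 5.79]
predicted = [6.676, 5.641, 6.95, 5.494]

r = pearson_r(experimental, predicted)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

Running the same function over all rows of Table 8 or Table 9 gives the correlation for the full training or test set.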
Pharmacophore
The work in Discovery Studio shows how chemical features (hydrogen bond acceptor, hydrogen
bond donor, hydrophobic aliphatic) of a set of compounds, together with activities ranging over
several orders of magnitude, can be used to generate pharmacophore hypotheses that successfully
predict activity. The models were not only predictive within the same series of compounds;
different classes of diverse compounds also mapped effectively onto most of the features
important for activity. The pharmacophore generated can be used to search diversified structures
for potential sPLA2 inhibitors and to evaluate how well any newly designed compound maps onto
the pharmacophore developed in this study. The inhibitors showed distinct features that may be
responsible for their activity.
Analogue based pharmacophore generation:
5.2. Common Feature Pharmacophore Generation (HipHop):

The 10 most active molecules were used to derive common feature based alignments, and all 10
were considered as reference molecules to obtain the best features. The best features obtained
from the HipHop run are:
1. Hydrogen bond acceptor
2. Hydrogen bond acceptor lipid
3. Hydrogen bond donor
4. Hydrophobic
5. Ring aromatic
Table 10: Summary of feature definition hits by molecule

Molecule  A     H     D     Z     Y     X     N     P     W     R
28v       7.70  7.70  1.73  4.00  2.00  2.00  1.00  0.00  0.00  6.00
28ii      7.80  7.80  1.80  4.00  3.00  1.00  1.00  0.00  0.00  4.00
28xl      8.05  8.05  1.81  2.91  1.82  1.00  1.00  0.00  0.00  4.00
67a       5.63  5.63  1.43  3.77  1.77  2.00  0.00  0.00  0.00  4.00
43d       7.48  7.48  1.78  3.00  3.00  0.00  1.00  0.00  0.00  2.00
12d       6.59  6.59  5.57  4.93  3.93  1.00  0.00  0.00  0.00  4.00
A: hydrogen bond acceptor; H: hydrogen bond acceptor lipid; D: hydrogen bond donor;
Z: hydrophobic; Y: hydrophobic aliphatic; X: hydrophobic aromatic; N: negative ionizable;
P: positive with exclusions; W: positive ionizable; R: ring aromatic.
Table 11: Common Feature Pharmacophore Generation Rank File:
Hypo. no  Pharmacophore feature  Rank score  Direct hit  Partial hit  Max fit
1         ZDA                    192.485     111111111   000000000    3
2         ZHA                    191.559     111111111   000000000    3
3         ZHA                    191.559     111111111   000000000    3
4         YZA                    190.387     111111111   000000000    3
5         ZHA                    190.012     111111111   000000000    3
6         ZHA                    190.012     111111111   000000000    3
7         ZHA                    190.012     111111111   000000000    3
8         ZHA                    190.012     111111111   000000000    3
9         ZDA                    189.761     111111111   000000000    3
10        ZDA                    189.735     111111111   000000000    3
5.3. HYPOGEN (Training set):
A set of 5 hypotheses was generated using the data from 22 training set compounds. The cost
values, correlation coefficients, RMS deviations and pharmacophore features are listed in
Table 12.
TABLE 12: The 5 pharmacophore models generated by the HypoGen algorithm

Hypo. no  Total cost  Cost difference  Error cost  RMS    Correlation  Feature
1         92.689      35.867           70.98       1.077  0.951        AADZ
2         98.9081     29.6479          77.6027     1.376  0.896        AADZ
3         104.54      24.016           86.435      1.296  0.898        AADZ
4         117.235     11.321           91.143      1.24   0.908        AAZR
5         122         6.556            94.289      1.36   0.893        AAZR

Note: Cost difference = null cost (128.556) - total cost
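The cost differences in Table 12 can be reproduced by simple subtraction. The sketch below assumes the null cost of 128.556 reported in the HypoGen discussion, which is the value that reproduces every cost difference in the table; the variable names are illustrative only.

```python
NULL_COST = 128.556  # null cost reported in the HypoGen discussion

# Total cost of each hypothesis, taken from Table 12.
total_costs = {1: 92.689, 2: 98.9081, 3: 104.54, 4: 117.235, 5: 122.0}

cost_differences = {
    hypo: round(NULL_COST - cost, 4) for hypo, cost in total_costs.items()
}
best = max(cost_differences, key=cost_differences.get)

for hypo, diff in cost_differences.items():
    print(f"Hypo {hypo}: cost difference = {diff}")
print("Largest cost difference: Hypo", best)
```

The hypothesis with the largest cost difference is Hypo 1, in line with the choice made in the text.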
Hypothesis 1 was taken as the best pharmacophore: it has the highest cost difference (35.867),
the lowest error cost, the lowest RMS deviation and the best correlation coefficient. It consists
of two hydrogen bond acceptor features, one hydrogen bond donor feature and one hydrophobic
feature.
The most active compound in the training set (28v) mapped perfectly onto all the features of
Hypo 1. In compound 28v, the HBA1 feature mapped to one electron-rich O atom of the sulfonyl
(SO2) group and the HBA2 feature to the other O atom of the same group. The HBD feature
mapped to the NH group attached to the sulfonyl group. The hydrophobic feature mapped to the
methyl group attached to the 3-position of the benzene ring of the compound.
Figure 18: Blank pharmacophore features of sPLA2 inhibitors
Figure 19: Distances between pharmacophore features
Figure 20: Overlap of the most active inhibitor molecule (28v) of the training set with the
best pharmacophore (Hypo1).
Figure 21: Overlap of the least active inhibitor molecule (28xl) of the training set with the
best pharmacophore (Hypo1).
Table 13: Results of pharmacophore hypothesis generated using training set.
Name Activity ConfNumber Estimate Fit Value

28v 0.005 199 0.009 10.976
28iii 0.006 18 0.011 10.887
20 0.007 142 0.034 10.413
13 0.009 78 0.005 11.24
28iv 0.009 201 0.07 10.098
22 0.01 23 0.015 10.768
26 0.011 10 0.096 9.956
1 0.013 43 0.162 9.73
10 0.106 81 0.037 10.369
21 0.109 182 0.139 9.798
25 0.148 42 0.207 9.625
28xxxiii 0.28 153 0.126 9.838
16 0.806 119 0.55 9.2
43e 1.4 18 0.083 10.019
43d 3.4 139 3.222 8.432
43b 4.4 24 2.778 8.496
28xv 6.6 210 3.918 8.347
67b 12 88 16.157 7.732
67a 15 46 29.519 7.47
12d 19 191 6.771 8.109
48b 22 41 7.699 8.054
28xl 50.01 169 2.114 8.615
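The activity values in Table 13 appear to be IC50 values in μM, while the activities in Table 8 appear to be the corresponding negative logarithms (for example, 0.005 for 28v here and 8.3 there). That reading is an assumption, not something the text states. Under it, the conversion can be sketched as follows:

```python
import math


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    return 6.0 - math.log10(ic50_um)


# IC50 values (assumed to be in micromolar) taken from Table 13.
for name, ic50 in [("28v", 0.005), ("12d", 19.0), ("28xl", 50.01)]:
    print(name, round(pic50_from_ic50_um(ic50), 2))
# 28v 8.3
# 12d 4.72
# 28xl 4.3
```

The three results match the activities listed for 28v, 12d and 28xl in Table 8.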
Discussion
Pharmacophore models of sPLA2 inhibitors were generated with the HypoGen module of the DS
software. HypoGen attempts to construct the simplest hypotheses that best correlate the
experimental and predicted activities.
The dataset was divided into a training set (22 compounds) and a test set (89 compounds),
considering both structural diversity and wide coverage of the activity range. Compounds with an
activity of < 1 μM were considered highly active (+++), compounds with an activity of 1-10 μM
moderately active (++) and compounds with an activity of > 10 μM least active (+). At the end of
the run, HypoGen generated 5 pharmacophore models. The null cost was 128.556, the fixed cost
of the run was 79.954 and the configuration cost was 18.83. The difference of 48.602 bits between
the fixed and null costs is a sign of the highly predictive nature of the hypotheses. All 5
hypotheses showed a high correlation coefficient between experimental and predicted IC50
values, in the range of 0.89 to 0.95; moreover, the difference between the cost of each hypothesis
and the null cost is less than 45 bits. This is taken to indicate that all the hypotheses have an
80-95% probability of representing a true correlation. The cost values, correlation coefficients
(r), RMSD values and pharmacophore features are listed in Table 12. The best pharmacophore
(Hypo 1), consisting of two H-bond acceptors (HBA), an H-bond donor (HBD) and a hydrophobic
feature, with a correlation coefficient (r) of 0.95, a total cost of 92.689 and the lowest RMSD
value (1.077), was chosen for further validation of its predictive power by estimating the activity
of the test set.
The predictive ability of Hypo 1 was evaluated using the diverse test set compounds. The
pharmacophore model predicted the activity of the 89 test set compounds with a correlation value
of 0.7987. Hence, from this analysis, Hypo1 was able to distinguish active compounds from
inactive compounds.
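The activity classes used in this discussion can be written as a small function. This is an illustrative sketch: the thresholds are those given in the text, the activities are the training set values of Table 13 (taken here as IC50 in μM), and the treatment of the exact boundary values 1 and 10 μM is an assumption, since the text does not specify it.

```python
from collections import Counter


def activity_class(ic50_um: float) -> str:
    """Classify a compound as highly (+++), moderately (++) or least (+) active."""
    if ic50_um < 1.0:
        return "+++"
    if ic50_um <= 10.0:  # boundary handling is an assumption
        return "++"
    return "+"


# Training set activities from Table 13.
training_activities = [
    0.005, 0.006, 0.007, 0.009, 0.009, 0.01, 0.011, 0.013, 0.106, 0.109, 0.148,
    0.28, 0.806, 1.4, 3.4, 4.4, 6.6, 12, 15, 19, 22, 50.01,
]

counts = Counter(activity_class(value) for value in training_activities)
print(counts["+++"], counts["++"], counts["+"])  # 13 4 5
```

With these thresholds the 22 training set compounds split into 13 highly active, 4 moderately active and 5 least active compounds.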
5.4 Structure based pharmacophore:
The structure based pharmacophore approach was used to find the essential features of the
active site that can contribute to ligand binding.
Interaction generation:
This protocol enumerates pharmacophore features from a protein active site. It uses the site
finding algorithm from Ludi to identify points in the active site that could interact with the
receptor, and creates a pharmacophore query containing hydrogen bond acceptor, hydrogen bond
donor and hydrophobic features from these points.
The interaction generation run found 329 features for minimizied1DB41:
Found 98 lipophilic features
Found 131 H-acceptor features
Found 100 H-donor features
Found 0 Link features
Figure 22: Cluster features from interaction generation.
Figure 23: Center points of the cluster features
Figure 24: Mapping of the 28v molecule onto the structure based pharmacophore features.
These structure based pharmacophore features are useful for the virtual screening of large
databases.
5.5 LIGAND FIT
Every molecule in the prepared SD file of bioactive compounds is docked into the chosen
binding site; the fits are processed automatically according to the chosen preferences and saved
to the output SD file. The results include RMS calculations, performed by comparing every fit
with the first conformer in the input SD file. The minimization energies of the fits in the
presence of the protein and the Ludi scores, computed according to the preferences, can be seen
in the output SD file. Ligand fit was carried out with the flexible fit method, starting from a
random conformation. The dock score is the negative of the non-bonded intermolecular energy: if
a ligand atom carries a partial charge, the electrostatic grid is used to estimate its electrostatic
energy; if it is a hydrogen atom, the hydrogen grid is used for the van der Waals energy;
otherwise the carbon grid is used. The following table lists the dock score and the corresponding
energy obtained for the best conformer of each molecule. The activity of each molecule may be
related to the lowest energy obtained in the ligand fit; the corresponding dock scores are given in
Table 14.
Table 14: Dock scores of sPLA2 inhibitor molecules obtained after ligand fit.
S.no  Compound name  Dock score  Internal energy
Highest active compounds
1     28v            75.456      -16.873
2     28ii           73.954      26.318
Lowest active compounds
3     28xl           81.063      -18.654
4     51a            69.885      19.771
Intermediate active compounds
5     43d            57.36       -2.282
6     11d            44.873      -2.553
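The grid-based dock score described for ligand fit (the negative of the non-bonded intermolecular energy, with an electrostatic grid for charged atoms and a hydrogen or carbon grid for the van der Waals term) can be illustrated with a toy model. This is only one reading of that description, not the Discovery Studio implementation: the Atom class, the grid callables and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float, float]
Grid = Callable[[Point], float]  # hypothetical: grid energy looked up at a point


@dataclass
class Atom:
    element: str
    charge: float
    xyz: Point


def dock_score(atoms: Sequence[Atom], electrostatic_grid: Grid,
               hydrogen_grid: Grid, carbon_grid: Grid) -> float:
    """Negative of the summed non-bonded ligand-receptor energy."""
    energy = 0.0
    for atom in atoms:
        if atom.charge != 0.0:
            # Charged atoms pick up an electrostatic term from the electrostatic grid.
            energy += atom.charge * electrostatic_grid(atom.xyz)
        # Hydrogens use the hydrogen grid for van der Waals; all other atoms the carbon grid.
        vdw_grid = hydrogen_grid if atom.element == "H" else carbon_grid
        energy += vdw_grid(atom.xyz)
    return -energy


# Toy example with constant grids (hypothetical values).
ligand = [Atom("C", 0.0, (0.0, 0.0, 0.0)), Atom("H", 0.5, (1.0, 0.0, 0.0))]
score = dock_score(ligand, lambda p: -2.0, lambda p: -0.1, lambda p: -1.0)
print(round(score, 3))  # 2.1
```

Because the score is the negated energy, a more favourable (more negative) interaction energy gives a higher dock score, which is why higher dock scores are read as better fits.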
Figure 25: Conformation search of the highly active compound (28v) inside the protein
(1DB4) binding site.
Figure 26: The most active molecule, 28v, after ligand fit, showing its interactions.
Figure 27: Hydrogen bond interactions of the highly active compound 28v with active site
amino acids
Figure 28: Hydrogen bond interactions of the low active compound 11d with active site
amino acids
Discussion:
The docking studies show that compound 28v has a high dock score of 75.456 and compound
11d has a low dock score of 44.873. The following table shows the active site amino acids forming
hydrogen bonds with 28v and 11d and the corresponding distances.
Table 15: Hydrogen bond distances and hydrogen bond forming amino acids for compounds 28v
and 11d
Compound name  Hydrogen bond forming amino acids  Hydrogen bond distance (Å)
28v            1. Gly29                           3.159
               2. His27                           2.930
               3. His47                           2.17
               4. Lys62                           2.95
11d            1. Gly29                           3.11
               2. His47                           2.14
5.6 C docker:
Every molecule in the SD file is docked into the chosen binding site with C-Docker, which
docks ligands into an active site using CHARMm.
C-Docker uses a CHARMm-based molecular dynamics (MD) scheme to dock ligands into a
receptor binding site. Random ligand conformations are generated using high-temperature MD.
The conformations are then translated into the binding site. Candidate poses are then created
using random rigid-body rotations followed by simulated annealing. A final minimization is then
used to refine the ligand poses. The following table lists the docking energy and the corresponding
interaction energy obtained for the best conformer of each molecule. The activity of each
molecule may be related to the lowest energy obtained in C-Docker; the corresponding energies
are given in Table 15.
Table 15: C-Docker energies of sPLA2 inhibitor molecules obtained after C-Docker docking.
S.no  Compound name  C-Docker energy  C-Docker interaction energy
Highest active compounds
1     28v            -29.754          -57.136
2     28ii           -29.436          -52.133
Lowest active compounds
3     28xl           -38.076          -50.647
4     51a            -36.065          -47.674
Intermediate active compounds
5     43d            -31.718          -55.081
6     11d            -29.952          -55.592
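Two of the C-Docker steps described above, translating a conformation into the binding site and applying a random rigid-body rotation, can be sketched in plain Python. This is an illustrative sketch and not the CHARMm-based implementation: the function names, the ligand coordinates and the site center are hypothetical, and the MD, simulated annealing and minimization steps are not modelled.

```python
import math
import random


def random_rotation_matrix(rng: random.Random):
    """Uniform random 3D rotation built from a random unit quaternion."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    x = math.sqrt(1 - u1) * math.sin(2 * math.pi * u2)
    y = math.sqrt(1 - u1) * math.cos(2 * math.pi * u2)
    z = math.sqrt(u1) * math.sin(2 * math.pi * u3)
    w = math.sqrt(u1) * math.cos(2 * math.pi * u3)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def place_pose(coords, site_center, rng):
    """Rotate a conformation about its centroid, then move the centroid to the site center."""
    n = len(coords)
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    rot = random_rotation_matrix(rng)
    pose = []
    for c in coords:
        d = [c[i] - centroid[i] for i in range(3)]
        pose.append(tuple(
            sum(rot[i][j] * d[j] for j in range(3)) + site_center[i]
            for i in range(3)
        ))
    return pose


rng = random.Random(0)
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]  # hypothetical coordinates
site_center = (10.0, 5.0, -3.0)  # hypothetical binding site center
pose = place_pose(ligand, site_center, rng)
print(pose)
```

The rotation is rigid, so bond lengths and angles of the conformation are unchanged; only its orientation and position in the site differ between candidate poses.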
Figure 29: Conformation search of the highly active compound (28v) inside the protein
(1DB4) binding site.
Figure 30: The most active molecule, 28v, after C-Docker docking
amino acids
Figure 32: The low active molecule 11d after C-Docker docking, showing its interactions.
Discussion:
The C-Docker studies show that compound 28v has a high C-Docker energy of -29.754 and
compound 28xl has a low C-Docker energy of -38.076. The following table shows the active site
amino acids forming hydrogen bonds with 28v and 28xl and the corresponding distances.
Table 16: Hydrogen bond distances and hydrogen bond forming amino acids for compounds 28v
and 28xl
Compound name  Hydrogen bond forming amino acids  Hydrogen bond distance (Å)
28v            1. Gly29                           2.723
               2. His47                           2.38
28xl           1. His47                           2.29
5.7 Lib dock
LibDock docks ligands into an active site using hotspots. Hotspots are polar and apolar
interaction sites. Ligand conformations can be precalculated or generated on the fly using DS.
Table 17: LibDock scores of sPLA2 inhibitor molecules obtained after LibDock docking.
Name   Activity  LibDock score  Hot spots
High active
28v    0.005     161.275        58.71,27.73,46.89,A,73,2
                                55.71,31.12,44.29,P,36,24
                                58.11,25.93,42.29,P,13,27
28ii   0.006     160.269        58.71,27.73,46.89,A,73,16
                                62.91,33.92,48.49,A,94,23
                                56.71,32.53,44.89,P,44,24
Low active
28xl   50.01     146.517        55.91,30.53,43.09,P,18,21
                                62.71,33.73,47.09,A,77,23
                                63.91,33.13,47.49,A,83,24
51a    50.01     151.647        64.11,33.73,48.09,A,90,19
                                57.31,26.32,45.89,A,56,23
                                58.91,32.73,41.49,A,4,29
Intermediate
43d    3.4       160.571        60.11,27.73,47.89,A,87,2
                                57.31,33.73,46.49,A,69,14
                                62.11,33.13,47.89,A,88,30
11d    1.61      147.918        60.11,34.53,47.09,A,78,11
                                59.31,28.73,44.49,A,30,36
                                56.71,32.53,44.89,P,44,38
Figure 33: The most active molecule, 28v, after LibDock docking
amino acids
Figure 35: The low active molecule 11d after LibDock docking, showing its interactions.
Figure 36: Hydrogen bond interactions of the low active compound 11d with active site
amino acids
Discussion:
The LibDock scores of the above molecules are all positive. Thus, the molecules can be used
as potential ligands for the inhibition of sPLA2. Molecules 28v and 28xl have LibDock scores of
161.275 and 146.517, respectively.
5.8 Ludi
Ludi is a method for the de novo design of ligands (inhibitors) for proteins; it can also
suggest modifications of known ligands that may enhance their binding to the target protein. In
this study, the de novo ligand UA6 was found by Ludi. The 2D structure of the ligand and its
molecular properties are given below.
[2D structure of ligand UA6]
5-(1-methoxy-4-methylpentan-3-yl)[1]benzothieno[3,2-b]furan
Molecular Formula = C17H20O2S
Formula Weight = 288.4045
Composition = C(70.80%) H(6.99%) O(11.10%) S(11.12%)
Molar Refractivity = 87.04 ± 0.3 cm3
Molar Volume = 252.7 ± 3.0 cm3
Parachor = 643.3 ± 4.0 cm3
Index of Refraction = 1.604 ± 0.02
Surface Tension = 41.9 ± 3.0 dyne/cm
Density = 1.140 ± 0.06 g/cm3
Dielectric Constant = Not available
Polarizability = 34.50 ± 0.5 × 10^-24 cm3
Monoisotopic Mass = 288.1184 Da
Nominal Mass = 288 Da
Average Mass = 288.4045 Da
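The formula weight and elemental composition listed for UA6 can be recomputed from the molecular formula. The sketch below is only a consistency check, not the tool used to generate the values above; the atomic masses are an assumed set of standard atomic weights, chosen because they reproduce the reported formula weight.

```python
import re

# Assumed standard atomic weights (g/mol); not taken from the text.
ATOMIC_MASS = {"C": 12.0107, "H": 1.00794, "O": 15.9994, "S": 32.065}


def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'C17H20O2S' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def formula_weight(formula: str) -> float:
    """Sum of atomic masses for all atoms in the formula."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())


def composition(formula: str) -> dict:
    """Mass percentage of each element in the formula."""
    total = formula_weight(formula)
    return {
        el: 100.0 * ATOMIC_MASS[el] * n / total
        for el, n in parse_formula(formula).items()
    }


print(round(formula_weight("C17H20O2S"), 4))  # 288.4045
for element, percent in composition("C17H20O2S").items():
    print(f"{element}: {percent:.2f}%")
```

With these atomic masses the computed weight and percentages agree with the formula weight and composition reported for UA6.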
The Ludi protocol searches a library of small fragments to find candidates that bind in an
active site. Fragments in the library that overlay with a calculated interaction map are found.
Figure 37: Ludi molecule with interaction map
C-Docker results for the Ludi ligand:
Table 18: C-Docker energies of the Ludi molecule obtained after C-Docker docking.
Ludi ligand  C-Docker energy  C-Docker internal energy
ua6-1 -21.094 35.271
ua6-2 -21.172 35.408
ua6-3 -21.221 35.25
ua6-4 -21.373 35.031
ua6-5 -21.768 35.159
ua6-6 -21.769 35.173
ua6-7 -21.845 34.848
ua6-8 -21.889 35.449
ua6-9 -21.98 34.728
ua6-10 -22.007 34.815
Figure 38: Ludi molecule UA6 after C-Docker docking, showing its interactions
Figure 42: Hydrogen bond interactions of the Ludi compound UA6 with active site amino acids
Pharmacophore mapping of the Ludi ligand:
Table 19: Estimated value of the Ludi ligand using the pharmacophore model
Name  Estimate  Fit value  Mapped atoms
UA6   1.901     3.35       4,0.231,-3.402,2.042,1.6,HBA 1.11
                           18,4.905,0.304,-1.932,1.6,HBA 2.11
                           19,2.548,5.188,1.714,1.6,Hydrophobic1
                           27,2.147,0.625,0.307,1.6,centroid1
Figure 43: Feature mapping of compound UA6 onto the pharmacophore model
Discussion
Docking shows that the new ligand molecule (UA6) has a C-Docker energy of -21.094 and
forms hydrogen bonds with the active site amino acids Gly22, Gly27 and His47. The
pharmacophore feature mapping studies gave the new compound an estimated value of 1.901 and
a fit value of 3.35.
6. Conclusion
In the in silico studies on human secretory phospholipase A2 (sPLA2), QSAR, pharmacophore
and docking methods were used. These methods gave good results, and further investigation
towards drug development can be carried out.
The QSAR study on the training set compounds gave a good r2 of 0.968 with four outliers, and
the GFA graph with its fit line shows the good correlation between the predicted and experimental
activities. The pharmacophore studies gave, as the best quantitative pharmacophore model in
terms of predictive value, Hypo 1, which consists of two hydrogen bond acceptors, one hydrogen
bond donor and one hydrophobic feature and has a correlation of 0.951 for the training set.
Further validation of this HypoGen model with a test set of sPLA2 inhibitors gave a correlation
value of 0.7987. The common feature pharmacophore studies identified hydrogen bond acceptor,
hydrogen bond acceptor lipid, hydrogen bond donor, hydrophobic and ring aromatic features.
The docking studies show that compound 28v has a high dock score of 75.456 and compound 11d
a lower dock score of 44.873.
The in silico modeling helped to guide lead optimization and led to a highly potent series of
sPLA2 inhibitors with good drug-like properties, which is the subject of another communication.
Fine tuning and optimization of this potent class of sPLA2 inhibitors could lead to new
therapeutic agents.
The combined approach of analogue and structure based drug design allowed us to gain
insight into predicting enhanced activity and to explore the docking interactions between the
amino acid residues of sPLA2 and the ligand. Good ligands may not act as good drugs. The prime
objective of this project, to validate techniques taken from various journals using computer aided
drug design, has thus been completed. The results obtained were used to develop new ligand
molecules, to estimate their activities in silico and to compare the predictions with experimental
values. The results reported can therefore be employed in the rational design of novel and potent
sPLA2 inhibitors.