
Computers & Chemistry 26 (2002) 601–632

www.elsevier.com/locate/compchem

Automatic identification by 13C NMR of substituent groups bonded in natural product skeletons
Marcelo J.P. Ferreira a, Francimeiry C. Oliveira a, Sandra A.V. Alvarenga b, Patrícia A.T. Macari c, Gilberto V. Rodrigues d, Vicente P. Emerenciano a,*
a Instituto de Química, Universidade de São Paulo, Caixa Postal 26077, 05513-970, São Paulo, Brazil
b Faculdade de Engenharia de Guaratinguetá, UNESP, CEP 12516-410, Guaratinguetá, São Paulo, Brazil
c Faculdade de Ciências Exatas e Experimentais, Universidade Mackenzie, 01239-902, São Paulo, Brazil
d Departamento de Química, ICEx, Universidade Federal de Minas Gerais, 30161-000 Belo Horizonte, Brazil

Received 10 December 2001; received in revised form 23 January 2002; accepted 15 March 2002

Abstract
The aim of this paper is to present a procedure that uses 13C NMR to identify the substituent groups bonded to the carbon skeletons of natural products. For this purpose, a new version of the program MACRONO was developed, with a database of 161 substituent types found in the most varied terpenoids. This new version was extensively tested on the identification of the substituents of 60 compounds; after removal of the signals that did not belong to the carbon skeleton, these compounds served to test the prediction of skeletons by other programs of the expert system SISTEMAT.
© 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Terpenoids; Natural products; 13C NMR; Substituent elucidation; Expert system

1. Introduction
The recognition of substructures, that is, parts of a structure, has always been one of the steps followed during spectral analysis, whose main objective is the structural determination of a compound. However, automating this process is not simple, owing to the complexity and great structural diversity found, for example, in natural product chemistry. To help the user resolve these problems, numerous expert systems have been developed and tested for the automatic identification of substructures (Lindsay et al., 1980; Carhart et al., 1981; Shelley and Munk, 1982; Attias, 1983; Gray, 1986; Munk et al., 1986; Christie and Munk, 1987; Carabedian et al., 1988; Funatsu and Sasaki, 1996; Will et al., 1996; Munk, 1998; Jaspars, 1999; Badertscher et al., 2000).

* Corresponding author. Tel.: +55-11-38182056; fax: +55-11-38155579. E-mail address: vdpemere@quim.iq.usp.br (V.P. Emerenciano).

In the 1990s our artificial intelligence group developed an expert system named SISTEMAT (Gastmans et al., 1990; Emerenciano et al., 1994). The major goal of this system is to serve as an auxiliary tool for natural product chemists, enabling them to reach the most likely carbon skeletons of their compounds more quickly and effectively. The notion of the carbon skeleton was built into the system because it is one of the fundamental points in the structural determination of naturally occurring substances. Knowledge of the carbon skeleton avoids a combinatorial explosion in the structure generator. Several classes of natural products have already been studied with this system, for example, sesquiterpenes (Emerenciano et al., 1993), diterpenes (Alvarenga et al., 1997), triterpenes (Macari et al., 1994/1995) and flavonoids (Emerenciano et al., 1997). Several programs that rely mainly on 13C NMR spectral data have also been developed (Macari et al., 1994/1995; Fromanteau et al., 1993; Rodrigues et al., 1997; Ferreira et al., 2001a).

0097-8485/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.
PII: S0097-8485(02)00029-3

Fig. 1. Monoterpene used to demonstrate the influence of the substituent signals on the prediction of skeletons.

During the many analyses already carried out with SISTEMAT (Emerenciano et al., 1993; Alvarenga et al., 1997; Macari et al., 1994/1995; Emerenciano et al., 1997), it has been verified that erroneous skeleton predictions can occur, especially when the structure of the substance has macronodes. Macronodes are groups of carbon atoms that are not directly linked to the skeleton of the substance; they are usually linked to the skeleton carbons through an ether or ester function. These errors become more pronounced when the program SISCONST is used (Fromanteau et al., 1993), because it predicts the probable skeleton of the substance and supplies large substructures with signal assignments compatible with the supplied 13C NMR data (chemical shifts and multiplicities). The probability that SISCONST calculates for a carbon skeleton is based on the skeletons of the selected substructures. A substructure is selected only if it presents a subspectrum with at least the minimum number of signals demanded by the user, and if these signals belong to interlinked carbon atoms (the substructure) and match signals in the tested spectrum. Therefore, when signals that are not part of the skeleton are present, the program can select substructures on the strength of chemical shifts that belong to the substituents. As a result, the skeletons proposed at the end of the analysis may be wrong, because certain substructures are selected only because of the macronode signals.
In order to demonstrate the influence of the substituent signals on the prediction of skeletons, a monoterpene (Fig. 1) was chosen and tested with and without the 13C NMR signals of its substituents. The results obtained with the program SISCONST are shown in Table 1.
In an attempt to improve the performance of SISCONST and to eliminate the problems caused by macronodes, an initial version of the program MACRONO (Rodrigues et al., 1997) was developed. In tests on sesquiterpene lactones, it identified their most common types of substituents with a high hit rate. When the same program was tested on other terpene types, however, its performance dropped drastically compared with that obtained for the sesquiterpene lactones. The deficient results were undoubtedly due to the great variety of substituents in terpenes whose data had not yet been entered into the database.

Table 1
Results presented by the SISCONST program

Skeletal type      With substituents   Without substituents
Myrcane            25.0                84.7
Menthane           13.9                10.6
Ionane             35.0                4.7
Cyclogeraniolane   20.0                -
Bornane            16.1                -

In view of these facts, it was necessary to create a new version of the program MACRONO, whose introduction is the main purpose of this paper. This new version corrects some imperfections of the first one and works with an updated database containing the most diverse kinds of macronodes generally encountered in various classes of natural products. In addition, we intend to evaluate, through this program, the efficiency of the search for the most likely substituents present in the carbon skeletons of several natural products and, afterwards, to make the prediction of the complete structures (main carbon backbones plus macronodes) of such compounds more successful by employing the programs MACRONO and SISCONST jointly in this search.

Fig. 2. Macronodes inserted in database.


2. The program MACRONO's database

The initial version of the program MACRONO (Rodrigues et al., 1997) operated by comparing the 13C NMR data obtained from a substance with the information contained in a database that held only 58 macronode types. At the end of the analysis, the probable substituents found in the substance were supplied together with the range of average error and the number of macronode carbons. The range of average error is the average deviation between the chemical shifts of the tested sample and those of the database. Therefore, the smaller this error range, the more reliable the result. Hence, the best results were obtained when the smallest ranges of average error were displayed for macronodes whose number of carbons equaled the number of signals in excess in the spectrum, that is, the signals left over when the spectrum was compared with that of the substance without the macronodes.
During the construction of the database of the program's initial version, we used the chemical shifts of the free acids for many macronodes, which caused numerous errors. In this new version, we took care to insert only substituents whose chemical shifts come from the corresponding esterified acid forms, and we also corrected the chemical shifts of the substituents already present, in order to avoid the cause of the mistakes detected previously (Rodrigues et al., 1997).
As we need a program for general use, but with the main emphasis on the elucidation of terpenoid structures, we collected the macronodes present in the most varied types of terpenes (Connolly and Hill, 1991) and built the current program's database, which holds the chemical shifts and multiplicities of 161 different types of macronodes. Fig. 2 shows the names and structures of the macronodes inserted into the database.
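
The comparison described in this section (sample signals against database shifts and multiplicities, scored by an average deviation) might be sketched as follows. This is only an illustration under stated assumptions: the function names, the brute-force assignment search, the 2.0 ppm cut-off, and the two database entries with their shift values are hypothetical and are not taken from the MACRONO database.

```python
from itertools import permutations

def average_error(macronode, signals):
    """Smallest mean absolute shift deviation over all assignments of distinct
    sample signals to the macronode carbons; multiplicities must agree.
    Returns None when no assignment exists. Brute force, small inputs only."""
    best = None
    for combo in permutations(signals, len(macronode)):
        if all(s[1] == m[1] for s, m in zip(combo, macronode)):
            err = sum(abs(s[0] - m[0]) for s, m in zip(combo, macronode)) / len(macronode)
            if best is None or err < best:
                best = err
    return best

def propose(database, signals, max_error=2.0):
    """List (name, average error, number of carbons), smallest error first."""
    proposals = []
    for name, carbons in database.items():
        err = average_error(carbons, signals)
        if err is not None and err <= max_error:
            proposals.append((name, round(err, 3), len(carbons)))
    return sorted(proposals, key=lambda p: p[1])

# Hypothetical database entries: [(shift in ppm, multiplicity), ...].
database = {
    "acetyl": [(170.5, "s"), (21.0, "q")],
    "propionyl": [(174.0, "s"), (27.5, "t"), (9.0, "q")],
}
# Hypothetical sample signals left over after the skeleton is accounted for.
sample = [(170.3, "s"), (20.8, "q"), (72.1, "d")]

print(propose(database, sample))
```

Here the propionyl entry is rejected because the sample has no triplet to assign to its CH2 carbon, which mirrors the role that multiplicities play alongside chemical shifts in the database.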

3. Results
In order to test the performance of the new program and to evaluate its efficiency, we randomly selected from the literature the 13C NMR spectral data of 60 compounds bearing substituents of well-known structure (Table 2). The structures of these compounds, used to test both programs jointly, are shown in Fig. 3.


The results obtained with the programs MACRONO and SISCONST are shown in Table 3. This table contains the names, the error ranges and the number of carbons of the macronodes proposed by the program MACRONO; the names of the macronodes actually present in the structure of the substance; and the three most probable skeletons proposed by the program SISCONST.


4. Discussion of the results


A total of 121 substituents were analyzed with the program MACRONO. A summary of the number of hits and errors made by the program is presented in Table 4.


The program MACRONO showed an accuracy of 95.04%. The six incorrect macronode proposals can be described as follows:

1) In test 12, the program supplies the substituents glucopyranosyl and allopyranosyl with similar ranges of average error. Therefore, the user is not able to distinguish between the two proposals.

Table 2
Chemical shifts (ppm) of the substances used to test the programs

Each line gives one compound (solvent code in parentheses) and the values of its column, listed in row order. A hyphen marks an empty cell; empty cells after the last value of a column are omitted.

Compounds 1-10 (rows: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, then OR)
1 (M): 66.3, 121.4, 142.1, 36.6, 34.2, 76.1, 148.7, 111.5, 29.9, 16.6, 102.5, 75.5, 84.3, 70.6, 75.5, 64.7, 102.7, 72.3, 72.3, 70.4, 70.1, 17.9, 169.0, 115.0, 146.8, 127.2, 131.2, 116.9, 161.2, 116.9, 131.2
2: 66.0, 119.8, 140.4, 37.8, 26.9, 153.1, 139.8, 195.0, 9.2, 16.8, 161.4, 113.5, 143.3, 111.5, 108.3, 146.7, 151.9, 101.2, 149.9, 56.4
3: 69.2, 121.0, 140.5, 35.5, 28.7, 88.9, 143.8, 114.1, 17.1, 16.1, 132.4, 103.7, 153.7, 132.4, 153.7, 103.7, 127.9, 131.2, 63.6, 56.1, 56.1q
4: 139.6, 138.6, 67.4, 46.8, 73.8, 195.0, 60.5, 27.8, 19.7, 19.6, 165.8, 143.4, 124.8, 66.6, 21.5, 170.0, 21.0
5: 139.7, 124.6, 69.2, 40.2, 68.9, 65.9, 63.0, 26.7, 20.1, 20.0, 167.0, 128.3, 138.2, 11.9, 14.5, 167.0, 128.3, 138.2, 11.9, 14.5, 170.4, 21.3
6: 45.0, 37.4, 118.2, 31.9, 50.6, 141.9, 83.4, 26.5, 22.7, 22.6, 99.0, 73.9, 75.4, 73.7, 70.7, 63.1, 171.0, 20.8, 171.8, 21.0
7 (P): 88.7, 85.8, 44.5, 105.7, 43.6, 22.7, 71.3, 61.2, 101.5, 19.6, 100.1, 74.9, 78.0, 71.6, 75.0, 64.7, 130.4, 129.8, 128.6, 133.2, 128.6, 129.8, 166.2, 122.7, 131.8, 114.1, 163.6, 114.1, 131.8, 166.4, 55.2
8 (P): 88.7, 85.8, 44.5, 105.7, 43.6, 22.7, 71.4, 60.6, 101.5, 19.6, 100.1, 74.7, 78.0, 71.6, 74.9, 64.6, 121.1, 132.2, 115.9, 163.4, 115.9, 132.2, 166.4, 123.3, 131.8, 114.0, 163.7, 114.0, 131.8, 166.0, 55.2
9 (W): 48.8, 37.6, 75.2, 83.6, 40.0, 82.3, 13.9, 47.3, 17.7, 17.6, 101.8, 73.8, 76.8, 70.5, 76.5, 61.5
10 (M): 37.1, 48.2, 202.0, 126.0, 165.8, 56.7, 128.7, 138.1, 76.5, 20.9, 27.7, 28.0, 23.8, 100.9, 79.2, 77.8, 71.4, 77.9, 62.6, 110.7, 78.5, 80.6, 75.3, 66.0

Compounds 11-20 (rows: 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, OMe, Gly-1', 2', 3', 4', 5', 6', then OR)
11 (M): 94.0, 152.2, 115.7, 69.0, 45.5, 80.3, 78.7, 58.2, 21.3, 167.9, 51.7, 99.6, 74.5, 77.5, 71.7, 78.4, 62.8, 128.7, 133.6, 114.4, 162.1, 114.4, 133.6, 145.0, 117.8, 167.4
12 (W): 95.1, 148.8, 116.5, 32.5, 30.0, 39.8, 81.1, 51.5, 24.4, 170.9, -, 98.8, 73.6, 76.4, 70.8, 74.8, 63.9, 170.9, 127.9, 145.6, 24.2, 40.7, 74.4, 144.6, 113.2, 12.5, 27.0
13: 93.8, 152.3, 105.8, 37.6, 76.0, 43.8, 87.0, 48.0, 21.2, 165.6, 50.4, 98.9, 72.2, 76.0, 69.9, 76.1, 61.7, 124.6, 128.9, 115.2, 158.9, 115.2, 128.9, 113.5, 144.0, 165.7, 170.3, 20.8
14 (M): 97.4, 152.5, 113.1, 32.5, 40.4, 78.7, 40.9, 47.1, 13.7, 169.2, 51.8, 100.1, 74.6, 77.8, 71.5, 78.2, 62.7, 168.9, 129.3, 142.9, 28.1, 31.5, 138.5, 126.5, 59.1, 12.5, 23.5
15 (M): 95.7, 141.8, 103.8, 39.1, 79.5, 62.8, 63.6, 43.7, 64.3, -, -, 100.4, 74.8, 78.5, 71.5, 77.9, 63.0, 128.9, 112.5, 151.6, 148.0, 114.9, 122.9, 146.9, 115.9, 168.9, 56.4
16 (M): 93.0, 140.9, 105.8, 38.1, 83.5, 73.5, 79.5, 49.1, 66.2, -, -, 99.9, 74.8, 78.1, 71.4, 78.0, 62.7, 128.4, 131.1, 115.5, 163.3, 115.5, 131.1, 146.5, 116.0, 168.7, 55.9
17 (M): 66.1, 68.2, 33.2, 52.2, 84.4, 127.8, 152.4, 98.2, 59.2, -, -, -, -, -, -, -, -, 168.1, 122.5, 132.8, 116.2, 163.6, 132.8, 116.2
18 (M): 96.5, 154.2, 109.5, 28.6, 34.6, 174.5, 75.5, 41.8, 21.4, 168.2, 51.9, 101.0, 74.6, 77.7, 71.7, 75.7, 64.6, 113.0, 155.9, 115.3, 119.3, 150.9, 125.2, 170.7
19 (M): 94.5, 154.4, 108.8, 31.0, 40.5, 172.5, 124.1, 130.0, 12.9, 167.5, -, 100.2, 74.1, 77.7, 70.8, 70.3, 62.0, 65.7, 34.6, 129.7, 130.3, 115.5, 156.3, 115.5, 130.3, 66.2, 34.7, 129.5, 116.4, 145.5, 144.2, 115.8, 120.6
20 (M): 94.7, 154.8, 109.3, 30.8, 41.1, 173.2, 129.3, 131.9, 59.4, 167.6, 52.4, 97.8, 74.9, 79.5, 68.6, 74.6, 65.1, 130.8, 116.9, 146.2, 144.8, 117.5, 121.8, 72.6, 36.1, 127.1, 115.2, 148.6, 147.3, 116.7, 123.4, 116.5, 147.8, 169.6, 171.0, 21.3

Compounds 21-30 (rows: 1 to 15, then OR)
21 (M): 130.3, 147.5, 143.4, 130.3, 122.6, 136.5, 44.6, 24.1, 77.2, 34.1, 35.9, 19.3, 22.4, 17.2, 14.8, 105.5, 75.5, 77.6, 71.2, 66.9, 107.8, 75.4, 78.3, 71.7, 78.0, 62.8, 103.8, 75.3, 78.4, 71.7, 78.0, 70.1
22 (P): 225.6, 53.2, 46.1, 81.6, 125.0, 141.7, 29.9, 71.3, 64.2, 21.5, 19.2, 6.2, 9.9, 27.4, 67.8, 99.6, 75.4, 78.4, 72.1, 78.9, 63.1
23: 36.5, 25.6, 72.9, 85.7, 48.6, 140.1, 145.1, 200.3, 57.6, 40.0, 71.7, 28.9, 29.1, 18.5, 18.7, 167.0, 127.8, 138.1, 15.7, 20.6, 170.7, 22.8
24: 77.5, 29.3, 76.2, 37.2, 46.7, 22.5, 117.0, 134.9, 57.7, 37.4, 98.2, 69.4, 27.1, 15.5, 9.1, 169.7, 20.8, 170.0, 20.8, 170.9, 21.0
25: 124.3, 21.3, 31.3, 31.9, 42.5, 72.1, 119.7, 151.3, 26.3, 135.3, 117.4, 138.2, 8.7, 16.6, 15.1, 177.0, 34.4, 19.4, 18.7
26: 209.0, 36.0, 29.8, 30.3, 44.1, 67.9, 116.0, 149.8, 22.3, 51.1, 119.4, 138.4, 8.2, 17.8, 14.8, 176.6, 41.2, 26.4, 17.0, 11.7
27 (A): 64.5, 135.6, 132.6, 77.0, 77.0, 33.7, 39.7, 22.1, 44.8, 37.6, 44.5, 43.5, 39.8, 72.4, 27.1, 172.0, 105.5, 166.1, 101.6, 163.7, 112.5, 144.6, 24.3
28: 48.2, 80.3, 41.6, 85.6, 54.6, 70.5, 40.6, 133.6, 124.6, 35.0, 37.0, 18.2, 17.7, 20.1, 25.8, 166.6, 122.6, 113.7, 131.6, 163.5, 131.6, 113.7, 55.3
29: 41.2, 48.8, 205.3, 144.0, 174.6, 21.1, 40.4, 85.5, 207.2, 42.8, 25.3, 20.1, 20.5, 30.1, 21.4, 167.6, 127.0, 140.3, 15.9, 20.5
30: 69.7, 67.6, 41.2, 72.2, 91.1, 78.7, 50.2, 34.4, 68.8, 53.6, 84.8, 26.4, 30.0, 24.6, 65.8, 170.3, 21.4, 109.9, 118.7, 143.7, 148.9, 161.8, 11.3, 11.6, 15.3, 16.4, 25.5, 26.5, 40.6, 41.6, 174.5, 175.1

Compounds 31-40 (rows: 1 to 15, then OR)
31: 129.5, 26.3, 34.7, 143.8, 128.7, 76.6, 52.7, 72.5, 49.0, 132.6, 135.3, 169.7, 125.8, 16.7, 61.4, 172.1, 38.4, 32.7, 67.1, 16.7
32: 129.5, 26.3, 34.7, 143.7, 128.7, 76.5, 53.1, 72.5, 48.8, 132.9, 135.5, 169.9, 125.2, 16.8, 61.5, 166.7, 128.4, 138.6, 14.6, 12.8
33: 129.4, 26.1, 34.7, 142.8, 129.1, 76.1, 58.1, 73.3, 49.0, 132.7, 40.3, 178.1, 17.2, 16.6, 61.5, 174.8, 41.9, 64.4, 13.5
34: 129.4, 26.1, 34.7, 142.8, 129.1, 76.1, 58.4, 73.4, 49.1, 132.8, 40.2, 178.0, 17.0, 16.6, 61.5, 166.4, 131.9, 141.4, 14.4, 56.9
35: 78.0, 27.2, 22.4, 44.9, 48.9, 76.1, 53.8, 69.9, 43.9, 41.5, 136.3, 169.2, 120.5, 13.9, 201.8, 165.2, 139.0, 126.7, 62.3
36: 37.3, 25.1, 73.6, 86.9, 47.5, 22.6, 160.3, 77.4, 50.2, 35.8, 120.8, 174.2, 8.0, 19.2, 16.8, 166.8, 127.9, 137.9, 15.7, 20.6, 170.2, 22.5
37: 37.9, 25.2, 73.4, 87.3, 49.0, 22.7, 159.4, 102.9, 54.0, 35.8, 122.6, 172.9, 8.0, 19.8, 17.2, 167.9, 127.8, 138.0, 15.8, 20.6, 170.2, 22.7
38 (M): 48.3, 40.2, 77.0, 85.2, 59.2, 77.3, 46.9, 75.0, 35.6, 145.0, 139.3, 170.8, 122.1, 117.2, 50.1, 167.4, 132.1, 141.9, 14.3, 55.7
39: 153.0, 188.9, 153.7, 146.1, 52.4, 85.1, 47.9, 24.1, 36.9, 124.6, 139.5, 169.1, 117.9, 21.7, 14.9, 101.6, 75.3, 78.3, 71.1, 78.3, 62.3
40: 45.5, 36.3, 75.0, 152.2, 51.4, 77.7, 47.7, 74.1, 37.5, 141.6, 137.2, 169.1, 122.9, 118.4, 116.0, 167.2, 128.4, 141.3, 59.9, 12.8, 166.4, 128.0, 140.6, 59.8, 12.8

Compounds 41-50 (rows: 1 to 20, then OR)
41: 46.2, 86.2, 78.1, 49.5, 68.1, 143.3, 68.0, 73.8, 205.5, 48.5, 136.3, 133.3, 44.8, 210.7, 91.5, 17.6, 113.0, 22.5, 27.7, 20.3, 170.4, 170.1, 169.5, 169.2, 169.2, 22.1, 21.4, 20.9, 20.8, 20.6, 175.7, 33.5, 19.1, 18.5
42: 46.3, 86.2, 78.1, 49.5, 68.0, 143.3, 68.0, 73.9, 205.5, 48.5, 136.3, 133.3, 44.8, 210.7, 91.5, 17.7, 113.1, 22.6, 27.8, 20.3, 170.4, 170.1, 169.5, 169.2, 169.2, 22.1, 21.4, 20.9, 20.8, 20.6, 174.2, 40.4, 26.7, 16.3, 11.1
43: 131.6, 136.4, 82.3, 84.9, 74.6, 136.2, 128.1, 43.1, 205.5, 72.0, 38.4, 30.8, 24.2, 23.4, 27.6, 24.5, 65.1, 16.8, 15.5, 66.5, 170.9, 20.9, 168.2, 127.8, 139.7, 15.8, 20.7, 168.2, 127.1, 137.8, 15.7, 20.6
44: 131.6, 136.2, 82.8, 84.7, 77.5, 138.3, 122.1, 42.6, 205.1, 72.0, 38.1, 35.2, 68.7, 28.8, 34.1, 18.6, 65.6, 17.8, 15.9, 21.8, 170.8, 21.2, 168.4, 126.9, 140.2, 15.6, 20.7, 166.7, 130.2, 129.7, 128.4, 132.9, 128.4, 129.7
45: 31.7, 29.6, 76.7, 73.4, 116.5, 140.2, 76.4, 71.3, 24.9, 25.3, 31.1, 70.5, 43.6, 207.2, 71.2, 17.0, 17.4, 24.5, 62.6, 13.3, 172.2, 21.2, 170.6, 20.6, 166.8, 138.5, 128.4, 14.6, 12.2, 166.4, 138.0, 128.1, 14.5, 12.0
46: 31.5, 29.5, 76.9, 73.3, 117.2, 139.7, 76.2, 71.5, 24.7, 19.3, 30.7, 70.6, 43.1, 207.6, 71.1, 16.9, 17.6, 29.0, 16.1, 13.4, 170.5, 21.0, 170.4, 20.9, 170.3, 20.5, 171.8, 43.5, 26.0, 22.2, 22.1
47: 38.9, 18.8, 36.0, 37.4, 56.1, 24.4, 38.3, 147.0, 56.0, 39.6, 21.3, 27.4, 171.0, 115.2, 174.3, 73.1, 107.0, 27.5, 67.0, 15.1, 177.6, 28.9, 28.9, 172.2
48: 39.3, 18.0, 36.3, 36.3, 46.9, 26.2, 74.2, 75.0, 54.4, 38.4, 24.1, 34.9, 147.1, 138.7, 113.5, 115.8, 22.9, 27.0, 67.3, 15.3, 126.9, 132.1, 115.1, 157.3, 115.1, 132.1, 143.8, 116.7, 167.4
49: 27.3, 70.2, 36.9, 61.4, 45.1, 72.2, 33.3, 36.4, 39.7, 42.5, 84.6, 30.8, 46.1, 101.8, 146.9, 107.6, 16.2, 49.7, 61.8, 13.9, 170.6, 21.2, 170.1, 21.1, 175.3, 41.4, 26.7, 11.7, 16.6
50 (M): 41.9, 20.3, 39.1, 44.1, 49.0, 36.9, 77.1, 51.7, 49.6, 40.8, 19.0, 34.2, 43.5, 35.7, 80.8, 160.3, 109.1, 29.3, 181.4, 16.4, 136.0, 130.0, 129.2, 131.3, 129.2, 130.0, 145.5, 120.1, 168.1

Compounds 51-60 (rows: 1 to 30, then OR)
51 (P): 32.5, 30.4, 89.1, 42.9, 54.2, 67.9, 38.6, 46.9, 21.2, 29.5, 26.6, 33.9, 46.6, 47.0, 49.0, 73.0, 56.6, 21.2, 30.4, 86.8, 26.5, 37.7, 24.5, 85.1, 70.4, 28.3, 27.0, 29.0, 16.7, 20.6, 106.7, 75.6, 78.4, 71.5, 77.9, 62.7, 105.1, 83.7, 78.5, 71.7, 78.1, 63.0, 106.2, 77.2, 78.2, 72.0, 78.3, 63.0
52 (M): 33.1, 30.5, 91.0, 42.1, 48.9, 22.1, 27.2, 49.6, 21.1, 27.4, 27.3, 34.4, 49.9, 47.9, 47.7, 73.0, 52.9, 19.4, 31.0, 33.4, 15.9, 80.9, 36.3, 78.3, 84.0, 26.0, 23.1, 20.4, 26.1, 15.5, 104.2, 81.8, 77.7, 71.6, 77.1, 62.6
53: 38.3, 23.7, 80.6, 37.8, 55.3, 18.2, 34.2, 40.8, 50.3, 37.1, 20.9, 25.1, 38.0, 42.8, 27.4, 35.5, 43.0, 48.3, 48.0, 151.0, 29.8, 40.0, 27.9, 19.3, 18.0, 16.6, 16.2, 15.9, 109.3, 14.5, 173.7, 34.8, 24.8, 31.3, 22.3, 13.9
54 (P): 39.0, 26.6, 89.0, 39.5, 55.7, 18.3, 34.9, 41.6, 50.9, 36.9, 21.8, 29.6, 38.6, 43.5, 30.5, 32.4, 59.5, 49.2, 49.8, 72.1, 29.0, 37.0, 27.9, 16.5, 16.3, 16.7, 15.2, 175.2, 26.9, 31.5, 104.9, 83.2, 77.9, 71.5, 78.2, 62.7, 105.8, 76.7, 77.8, 71.5, 78.1, 62.1, 95.2, 74.1, 78.8, 71.0, 79.2, 62.6
55 (M): 49.0, 67.9, 77.8, 41.0, 48.6, 19.0, 33.1, 40.3, 47.5, 37.7, 24.8, 128.9, 140.1, 41.2, 28.0, 26.1, 48.0, 51.1, 154.0, 38.1, 32.5, 38.0, 67.1, 14.0, 16.9, 17.1, 23.8, 182.0, 107.8, 21.5, 168.7, 115.1, 145.8, 125.3, 130.5, 116.0, 160.4, 116.0, 130.5
56 (M): 39.6, 28.5, 90.8, 40.2, 57.0, 19.5, 34.1, 41.3, 47.8, 37.9, 24.7, 129.7, 139.6, 42.6, 29.7, 26.5, 48.7, 54.8, 73.7, 42.8, 28.5, 38.2, 28.6, 17.7, 16.1, 17.0, 27.1, 178.6, 24.7, 16.6, 107.2, 75.5, 78.0, 71.2, 67.0, 95.9, 73.8, 78.1, 71.0, 77.9, 69.6, 102.5, 72.4, 72.7, 68.9, 75.5, 63.1
57 (P): 47.9, 68.7, 85.8, 44.0, 56.6, 19.4, 33.9, 40.4, 47.8, 38.3, 24.4, 127.9, 140.4, 42.1, 29.3, 27.0, 48.3, 54.6, 72.7, 42.4, 26.4, 38.5, 24.2, 71.9, 17.3, 17.1, 24.7, 180.7, 27.1, 16.8, 65.7, 30.8, 19.4, 13.8
58 (M): 47.9, 68.1, 87.2, 45.3, 47.8, 18.8, 33.1, 40.6, 47.0, 38.5, 24.6, 123.4, 145.0, 42.9, 28.8, 24.0, 47.7, 42.4, 47.0, 30.6, 34.7, 33.2, 64.1, 14.7, 17.7, 17.9, 26.4, 177.5, 33.5, 24.0, 105.1, 75.1, 78.1, 73.7, 75.6, -, 95.9, 74.1, 78.7, 71.3, 79.1, 62.5
59 (M): 44.5, 71.1, 87.2, 53.1, 52.9, 21.4, 33.4, 40.9, 49.0, 37.2, 24.5, 124.0, 144.0, 42.4, 28.6, 23.9, 48.6, 48.2, 42.8, 149.2, 30.7, 38.2, 181.6, 13.4, 16.9, 17.5, 26.2, 177.2, 107.2, -, 105.5, 75.6, 83.2, 72.3, 76.6, 172.6, 95.6, 74.0, 78.2, 71.0, 78.6, 62.3
60: 150.1, 127.8, 202.9, 53.4, 134.1, 140.9, 196.9, 45.5, 42.3, 48.4, 72.5, 78.7, 40.0, 77.2, 55.1, 32.0, 41.6, 121.7, 140.4, 111.0, 142.8, 26.8, 24.7, -, -, -, -, 22.6, 21.1, 15.6, 169.3, 20.9, 168.9, 21.2

Solvents: CDCl3 (no code); M = CD3OD; P = C5D5N; W = D2O; A = acetone-d6. Compound (reference): (1) Tian et al., 1998; (2) Kwak et al., 1997; (3) Ma et al., 1997; (4,5) Ahmed and Mahmoud, 1997; (6) Ragasa et al., 1997; (7,8) Lin et al., 1996; (9) Lemmich, 1995; (10) Takeda et al., 1997; (11) Schuquel et al., 1998; (12) Taskova et al., 1998; (13) Tuntiwachwuttikul et al., 1998; (14) Damtoft et al., 1997; (15,16) Sudo et al., 1997; (17) Kanai et al., 1996; (18) Tan and Kong, 1997; (19) Iossifova et al., 1998; (20) Hosny, 1998; (21) Otsuka et al., 1996; (22) Castillo et al., 1997; (23) Guilhon and Muller, 1996; (24) Siems et al., 1996; (25,26) Torres et al., 1998; (27) Donnelly et al., 1997; (28,29) Mahmoud, 1997; (30) Ujita et al., 1992; (31-35) Lazari et al., 1998; (36,37) Guilhon and Muller, 1998; (38) Youssef, 1998; (39) Ma et al., 1998; (40) Helal et al., 1997; (41,42,44) Jakupovic et al., 1998b; (43) Jakupovic et al., 1998a; (45,46) Marco et al., 1998; (47) Galal et al., 1998; (48) Lin et al., 1998; (49) Bruno et al., 1998; (50) Lobitz et al., 1998; (51) Verotta et al., 1998; (52) Ahmad et al., 1998; (53) Brum et al., 1998; (54) Elgamal et al., 1998; (55) Tommasi et al., 1998; (56) Baykal et al., 1998; (57) Wang and Jia, 1998; (58) Pollmann et al., 1998; (59) Junkuszew et al., 1998; (60) Mulholland et al., 1998.

2) In test 15, the isoferuloyl macronode was proposed. However, the isomeric cis- and trans-feruloyl, whose structures are closely related to that of the isoferuloyl macronode, were also proposed, with smaller ranges of average error.

3) In the three incorrect proposals for the angelate macronodes (tests 23, 29 and 36), the proposed macronode was the tiglate, which is an isomer of the angelate, together with the isovaleroyl macronode in test 29.

4) In test 50, the macronodes cis- and trans-cinnamoyl were proposed, but the incorrect macronode (cis-isomer) was proposed with a smaller range of average error.
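
Several of these cases are near-ties between isomeric macronodes with the same number of carbons. As a hedged illustration (not a feature of MACRONO as described), such ties could be flagged for the analyst with a small Python helper. The function name and the 0.1 tolerance are assumptions; the proposal lists use the error ranges and carbon counts reported in Table 3 for tests 12 and 36.

```python
from itertools import combinations

def ambiguous_pairs(proposals, tolerance=0.1):
    """Return pairs of proposed macronodes that have the same number of
    carbons and error ranges within `tolerance` of each other."""
    return [
        (a[0], b[0])
        for a, b in combinations(proposals, 2)
        if a[2] == b[2] and abs(a[1] - b[1]) <= tolerance
    ]

# (name, error range, number of carbons), as listed in Table 3.
test_12 = [
    ("Menthiafoloyl", 1.415, 10),
    ("Glucopyranosyl", 1.567, 6),
    ("Allopyranosyl", 1.567, 6),
]
test_36 = [
    ("Acetoyl", 0.800, 2),
    ("Tigloyl", 0.410, 5),
    ("Angeloyl", 0.480, 5),
]

print(ambiguous_pairs(test_12))
print(ambiguous_pairs(test_36))
```

A flag of this kind does not decide between the candidates; it only marks the proposals for which the error range alone is not a sufficient criterion.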

It should also be noted that in tests 1 and 10, although the macronodes glucopyranosyl-(1→3)-rhamnopyranosyl and glucopyranosyl-(6→1)-apiofuranosyl had not been registered in the database, the program was still able to supply correctly the two sugar units belonging to each of these structures. It is up to the analyst to consider the position of the linkage between them. In tests 25 and 26, although the propionyl macronode presented the smaller range of average error in both cases, it was discarded, because the corresponding spectra showed 4 and 5 signals, respectively, in addition to the signals of the skeleton of the substance. After removal of the substituent signals, the program SISCONST displayed the correct skeleton of the substance in 96.67% of the cases. In the two incorrect proposals for the carbon skeletons, the correct skeletons were the second (test 55) and the fourth (test 59) in the ranking. In the latter case, we could deduce the following: the skeleton that presented the highest probability was the biosynthetic precursor of the correct skeleton, and the error was due to the small number of 13C NMR spectra of the correct skeleton in the database, only 12 (Macari et al., 1994/1995).

Fig. 3. Substances used to test the programs.

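
The reasoning used for tests 25 and 26, and the two success rates quoted in this section, can be reproduced with a short Python sketch. The helper names are hypothetical and the exhaustive search over combinations is an assumption about how such a check could be automated; the proposal data are those reported in Table 3, and the counts (121 substituents with 6 errors, 60 compounds with 2 errors) are those given in the text.

```python
from itertools import combinations

def consistent_sets(proposals, excess_signals):
    """Return every set of proposed macronodes whose carbon counts add up
    to the number of signals in excess of the skeleton."""
    result = []
    for size in range(1, len(proposals) + 1):
        for combo in combinations(proposals, size):
            if sum(p[2] for p in combo) == excess_signals:
                result.append([p[0] for p in combo])
    return result

def hit_rate(total, errors):
    """Percentage of correct results, rounded to two decimals."""
    return round(100.0 * (total - errors) / total, 2)

# (name, error range, number of carbons), as listed in Table 3.
test_25 = [("Propionyl", 0.800, 3), ("Isobutiroyl", 2.825, 4)]
test_26 = [("Propionyl", 0.733, 3), ("2Me-isobutyroyl", 1.030, 5)]

print(consistent_sets(test_25, excess_signals=4))
print(consistent_sets(test_26, excess_signals=5))
print(hit_rate(121, 6), hit_rate(60, 2))
```

In both tests the propionyl proposal has the smaller error range but only three carbons, so it cannot account for the 4 or 5 extra signals and is excluded by the count alone.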

5. Conclusions
The errors observed during the tests probably originated in the structure of the database itself, which, up to the time of these experiments, did not allow the macronodes to be separated and consulted according to the solvents usually employed in 13C NMR. Because of this limitation, when the database was built and the chemical shift ranges of the macronode carbons were created, we used the chemical shifts of the macronodes in different solvents. This procedure often generated very wide chemical shift ranges which, during the tests, could lead the program to supply the false results observed. One solution to this problem would be to build another database in which the macronodes are organized by the solvent used. Hence, if the 13C NMR spectrum of the substance under study is obtained, for example, in CD3OD, the search for the possible macronodes present in its structure will be carried out more efficiently from the file containing the chemical shift ranges of the substituents in CD3OD. With this change in the methodology, which will be made as soon as possible, we believe that the observed errors can be significantly reduced or fully eliminated.
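
The effect of pooling solvents on the width of a shift range can be illustrated with a minimal Python sketch. The function name and the record layout are assumptions made for illustration and do not describe the actual file format of the database; the six values are the Gly-1' entries of Table 2 for compounds 11 to 16, with the solvent codes given in its footnote.

```python
from collections import defaultdict

def shift_ranges(observations, by_solvent=True):
    """Build (min, max) shift ranges per macronode carbon, either separately
    for each solvent or pooled over all solvents (key "any")."""
    grouped = defaultdict(list)
    for macronode, carbon, solvent, shift in observations:
        key = (macronode, carbon, solvent if by_solvent else "any")
        grouped[key].append(shift)
    return {key: (min(vals), max(vals)) for key, vals in grouped.items()}

# (macronode, carbon, solvent, shift in ppm): Gly-1' row of Table 2.
observations = [
    ("Glucopyranosyl", "1'", "CD3OD", 99.6),   # compound 11
    ("Glucopyranosyl", "1'", "D2O", 98.8),     # compound 12
    ("Glucopyranosyl", "1'", "CDCl3", 98.9),   # compound 13
    ("Glucopyranosyl", "1'", "CD3OD", 100.1),  # compound 14
    ("Glucopyranosyl", "1'", "CD3OD", 100.4),  # compound 15
    ("Glucopyranosyl", "1'", "CD3OD", 99.9),   # compound 16
]

print(shift_ranges(observations, by_solvent=False))
print(shift_ranges(observations, by_solvent=True))
```

With only six observations the pooled range is already wider than the CD3OD range, which is the behavior the proposed solvent-specific files are meant to avoid; the differences here may also reflect the different aglycones, so the example should be read as a layout sketch rather than as evidence of a solvent effect.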
From the results obtained, it can be concluded that, as the database grows, the program MACRONO may be used with any kind of natural product, since its database now includes a more complete set of macronodes than that of the initial version.
It is worth emphasizing here that, after the identification and removal of the 13C NMR data from


Table 3
Results obtained by the
Substance

SISCONST

and

MACRONO

Present substituents

programs
MACRONO

program

program
Skeletal type (%)a

SISCONST

Proposed substituents

Error
range

No. of
carbons

Trans -p -coumaroyl
Glucose-(10/3)-rhamnose

Trans -p -coumaroyl
Glucopyranosyl
Rhamnopyranosyl

0.267
1.250
1.033

9
6
6

Myrcane : 92.5
8nor-myrcane: 5.0
Menthane: 2.5

6OMe-7O-Cumarin

6OMe-7O-Cumarin

0.100

10

Myrcane : 96.5
Menthane: 2.3
10nor-myrcane: 1.2

Sinapyl-alcohol

Sinapyl-alcohol

0.823

11

Myrcane : 100.0

Acetoyl
2OH-et-acryloyl

Acetoyl
2OH-et-acryloyl
3OH-metacriloyl

0.500
0.600
1.050

2
5
4

Menthane : 96.8
Skeletal-09: 3.2

Tigloyl
Tigloyl
Acetoyl

Tigloyl
Tigloyl
Acetoyl

0.410
0.410
0.850

5
5
2

Menthane : 100.0

Glucopyranosyl-2?,6?diacetate

Glucopyranosyl-2?,6 acetate

0.515

10

Pinane : 85.8
Others: 5.2
7nor-pinane: 3.0

M.J.P. Ferreira et al. / Computers & Chemistry 26 (2002) 601 /632

627

Table 3 (Continued )
Substance

Present substituents

MACRONO

program

program
Skeletal type (%)a

SISCONST

Proposed substituents

Error
range

No. of
carbons

p -MeO-benzoyl (anisate)
Benzoyl
Glucopyranosyl

p -MeO-benzoyl (anisate)
Benzoyl
Glucopyranosyl

0.413
1.814
1.067

8
7
6

Pinane : 69.4
Menthane: 30.6

p -MeO-benzoyl (anisate)
p -OH-benzoyl
Glucopyranosyl

p -MeO-benzoyl (anisate)
p -OH-benzoyl
Glucopyranosyl

0.300
0.850
1.033

8
7
6

Pinane : 84.7
Menthane: 15.3

Glucopyranosyl

Glucopyranosyl

0.633

Bornane : 95.0
2et-bornane: 3.7
Others: 1.4

10

Glucose-(10/6)-apiose(fur)

Apiofuranosyl
Glucopyranosyl

0.470
0.750

5
6

Ionane : 98.6
Cyclogeraniolane: 1.1
Others: 0.3

11

Glucopyranosyl
p -MeO-cis -cinnamoyl
Metoxi

Glucopyranosyl
p -MeO-cis -cinnamoyl
Metoxi

0.683
1.915
1.700

6
10
1

Iridane : 90.2
11nor-iridane: 9.8

12

Menthiafoloyl
Glucopyranosyl

Menthiafoloyl
Glucopyranosyl
Allopyranosyl

1.415
1.567
1.567

10
6
6

Iridane : 97.8
11nor-iridane: 1.2
Others: 1.0

13

Acetoyl
Glucopyranosyl
Cis -p -coumaroyl
Metoxi

Acetoyl
Glucopyranosyl
Cis -p -coumaroyl
Trans -p -coumaroyl
metoxi

0.750
1.483
2.217
2.367
0.650

2
6
9
9
1

Iridane : 78.5
7,8seco-iridane: 12.9
11nor-iridane: 5.8

14

Foliamenthoyl
Glucopyranosyl

Foliamenthoyl
Nerol-8oyl
Glucopyranosyl

0.880
1.140
0.617

10
10
6

Iridane : 90.1
7,8seco-iridane: 8.5
Others: 1.4

15

Glucopyranosyl
Isoferuloyl

Glucopyranosyl
Trans -feruloyl
Cis -feruloyl
Isoferuloyl

0.850
0.685
1.365
2.735

6
10
10
10

11nor-iridane : 94.2
10,11dinor-iridane: 2.5
10nor-iridane: 1.7

16

Glucopyranosyl
p -Meo-trans -cinnamoyl

Glucopyranosyl
p -Meo-trans -cinnamoyl

0.683
0.670

6
10

11nor-iridane : 73.6
Iridane: 20.5

17

p -OH-benzoyl

p -OH-benzoyl

0.321

11nor-iridane : 96.6
Iridane: 3.4

18

Gentisoyl
Glucopyranosyl

Gentisoyl
Glucopyranosyl
Acetoyl

0.443
0.933
0.600

7
6
2

7 ,8seco-iridane : 72.2
Iridane: 27.7
Others: 0.1

19

Glucopyranosyl
2[p -OH-phenyl]-ethoxide
2[3,4diOH-phenyl]-ethoxide

Glucopyranosyl
2[p -OH-phenyl]-ethoxide
2[3,4diOH-phenyl]-ethoxide

0.617
0.431
0.631

6
8
8

7,8seco-iridane : 95.1
Iridane: 4.9

20

Acetoyl
Trans -caffeoyl
2[3,4diOH-phenyl]-ethoxide
Glucopyranosyl
Metoxi

Acetoyl
Trans -caffeoyl
2[3,4diOH-phenyl]-ethoxide
Glucopyranosyl
Metoxi

0.450
0.517
0.781
2.167
0.825

2
9
8
6
1

7,8seco-iridane : 68.7
Iridane: 30.9
Others: 0.4

21

Glucose-(10/6)-xilose
Glucopyranosyl

Glucose-(10/6)-xilose
Glucopyranosyl

0.686
0.650

11
6

Cadinane : 37.9
Valerenane: 14.8
Eudesmane: 8.3

M.J.P. Ferreira et al. / Computers & Chemistry 26 (2002) 601 /632

628
Table 3 (Continued)

| Substance | Present substituents | Proposed substituents (MACRONO program) | Error range | No. of carbons | Skeletal type (%)a (SISCONST program) |
|---|---|---|---|---|---|
| 22 | Glucose | Glucose | 1.092 | | Iludane : 100.0 |
| 23 | Angeloyl; Acetoyl | Tigloyl; Acetoyl | 1.300; 1.050 | 5; 2 | Eudesmane : 86.2; Nardosinane: 5.3; Others: 5.0 |
| 24 | Acetoyl; Acetoyl; Acetoyl | Acetoyl; Acetoyl; Acetoyl | 0.250; 0.450; 0.500 | 2; 2; 2 | Drimane : 86.7; Eudesmane: 8.1; Others: 4.0 |
| 25 | Isobutiroyl | Propionyl; Isobutiroyl | 0.800; 2.825 | 3; 4 | Eremofilane : 73.3; Furodisane: 4.8; Germacrane: 3.3 |
| 26 | 2Me-isobutyroyl | Propionyl; 2Me-isobutyroyl | 0.733; 1.030 | 3; 5 | Eremofilane : 36.7; 1-3-Eudesmane: 27.8; Eudesmane: 8.2 |
| 27 | Ortoselinoyl | Ortoselinoyl | 0.537 | | protoiludane : 44.7; Trifarane: 23.1; Eudesmane: 5.8 |
| 28 | p-MeO-benzoyl (anisate) | p-MeO-benzoyl (anisate) | 0.550 | | Carotane : 90.7; Eudesmane: 4.5; Others: 2.8 |
| 29 | Angeloyl | Isovaleroyl; Tigloyl | 0.970; 0.970 | 5; 5 | Carotane : 100.0 |
| 30 | 2Me-butyroyl; Acetoyl; 2Me-butyroyl; Furoyl | 2Me-butyroyl; Acetoyl; 2Me-butyroyl; Furoyl | 0.820; 0.850; 1.380; 1.820 | 5; 2; 5; 5 | Eudesmane : 98.2; Others: 1.8 |
| 31 | 4OH-3Me-butiroyl | 4OH-3Me-butiroyl; Etoxide; Isobutiroyl | 0.010; 1.125; 1.425 | 5; 2; 4 | Germacranolide : 53.2; Guaianolide: 38.3; Eudesmanolide: 8.5 |
| 32 | Tigloyl | Tigloyl; Metacryloyl | 0.830; 0.925 | 5; 4 | Germacranolide: 65.8; Guaianolide: 23.7; Eudesmanolide: 10.5 |
| 33 | 4OH-isobutyroyl | 4OH-isobutyroyl; 2Me-butyroyl | 0.900; 1.210 | 4; 5 | Germacranolide : 54.7; Guaianolide: 30.7; Eudesmanolide: 14.7 |
| 34 | 5OH-tigloyl | 5OH-tigloyl; Etoxide | 0.540; 1.025 | 5; 2 | Germacranolide : 62.5; Guaianolide: 27.8; Eudesmanolide: 9.7 |
| 35 | 3OH-metacryloyl | Metacryloyl; 3OH-metacryloyl; Butoxi | 1.750; 0.550; 3.663 | 4; 4; 4 | Eudesmanolide : 86.2; Guaianolide: 13.1; Others: 0.7 |
| 36 | Acetoyl; Angeloyl | Acetoyl; Tigloyl; Angeloyl | 0.800; 0.410; 0.480 | 2; 5; 5 | Eudesmanolide : 100.0 |
| 37 | Angeloyl; Acetoyl | Angeloyl; Tigloyl; Acetoyl | 0.480; 0.690; 1.200 | 5; 5; 2 | Eudesmanolide : 87.1; Pseudoguaianolide: 6.5; Germacranolide: 6.5 |
| 38 | 5OH-tigloyl | 5OH-tigloyl | 0.920 | | Guaianolide : 76.9; Eudesmanolide: 23.1 |


Table 3 (Continued)

| Substance | Present substituents | Proposed substituents (MACRONO program) | Error range | No. of carbons | Skeletal type (%)a (SISCONST program) |
|---|---|---|---|---|---|
| 39 | Glucopyranosyl | Glucopyranosyl; Etoxide | 0.683; 0.475 | 6; 2 | Guaianolide : 48.8; Eudesmanolide: 45.9; Pseudoguaianolide: 3.9 |
| 40 | 4OH-tigloyl; 4OH-tigloyl | 4OH-tigloyl; 4OH-tigloyl; 3OH-metacryloyl | 0.260; 0.480; 2.425 | 5; 5; 4 | Guaianolide : 80.4; Eudesmanolide: 19.6 |
| 41 | (5) Acetoyl; Isobutyroyl | (5) Acetoyl; Isobutyroyl | 0.250; 0.513 | 2; 4 | Jatrophane : 97.2; Briarane: 2.8 |
| 42 | (5) Acetoyl; 2Me-butyroyl | (4) Acetoyl; 2Me-butyroyl; Acetoyl | 0.250; 0.810; 0.400 | 2; 5; 2 | Jatrophane : 88.6; Briarane: 11.4 |
| 43 | Angeloyl; Acetoyl; Angeloyl | Angeloyl; Tigloyl; Angeloyl; Acetoyl; Tigloyl | 0.340; 0.890; 0.620; 0.750; 1.050 | 5; 5; 5; 2; 5 | Ingenane : 100.0 |
| 44 | Angeloyl; Acetoyl; Benzoyl | Angeloyl; Acetoyl; Benzoyl | 0.620; 0.750; 1.357 | 5; 2; 7 | Ingenane : 77.6; Latirane: 16.9; Abietane: 2.5 |
| 45 | Tigloyl; Acetoyl; Tigloyl; Acetoyl | Tigloyl; Acetoyl; Tigloyl; Acetoyl | 0.450; 0.650; 0.750; 3.100 | 5; 2; 5; 2 | Latirane : 100.0 |
| 46 | (3) Acetoyl; Isovaleroyl | (3) Acetoyl; Isovaleroyl | 0.500; 0.910 | 2; 5 | Latirane : 100.0 |
| 47 | Succinoyl | Succinoyl | 0.125 | | Labdane : 30.7; Ent-caurane: 26.2 |
| 48 | Cis-p-coumaroyl | Cis-p-coumaroyl; Angeloyl; Butoxi | 1.039; 1.140; 1.387 | 9; 5; 4 | Labdane : 51.0; Others: 11.0; Ent-caurane: 10.7 |
| 49 | Acetoyl; 2Me-butyroyl; Acetoyl | Acetoyl; 2Me-butyroyl; Acetoyl | 0.350; 0.590; 0.650 | 2; 5; 2 | Clerodane : 77.6; Ent-caurane: 9.1; Others: 9.0 |
| 50 | Trans-cinnamoyl | Cis-cinnamoyl; Trans-cinnamoyl | 0.456; 0.811 | 9; 9 | Ent-caurane : 53.8; Others: 11.1; Isopimarane: 5.5 |
| 51 | Sophoroside | Glucopyranosyl; Sophoroside; Cellobioside | 0.983; 1.104; 1.442 | 6; 12; 12 | Cicloartane : 47.5; Oleane: 11.9; Ursane: 8.2 |
| 52 | Glucopyranosyl | Glucopyranosyl | 0.783 | | Cicloartane : 66.3; Oleane: 6.0; Lupane: 5.0 |
| 53 | n-Hexanoyl | n-Hexanoyl | 0.283 | | Lupane : 34.5; Oleane: 21.4; Ursane: 9.7 |
| 54 | Glucopyranosyl; Sophoroside | Glucopyranosyl; Sophoroside; Etoxide | 0.700; 1.163; 0.225 | 6; 12; 2 | Lupane : 30.5; Oleane: 20.3; Ursane: 15.5 |


Table 3 (Continued)

| Substance | Present substituents | Proposed substituents (MACRONO program) | Error range | No. of carbons | Skeletal type (%)a (SISCONST program) |
|---|---|---|---|---|---|
| 55 | Trans-cinnamoyl | Trans-cinnamoyl; Cis-cinnamoyl | 0.978; 1.194 | 9; 9 | Oleane: 38.4; Ursane : 31.4; Others: 9.6 |
| 56 | Glucose-(1→6)-allose; Xilopyranosyl | Glucose-(1→6)-allose; Butoxi; Xilopyranosyl | 0.021; 1.212; 2.500 | 12; 4; 5 | Ursane : 36.0; Oleane: 29.0; Lupane: 6.5 |
| 57 | Butoxi | Butoxi | 1.013 | | Ursane : 39.8; Oleane: 34.1 |
| 58 | Glucuronic-acid; Glucopyranosyl | Glucuronic-acid; 4OH-isobutyroyl; Butoxi; Glucopyranosyl | 0.758; 1.275; 1.388; 1.467 | 6; 4; 4; 6 | Oleane : 63.9; Ursane: 17.1; Lanostane: 5.6 |
| 59 | Glucuronic-acid; Glucopyranosyl | Glucuronic-acid; Propionyl; Glucopyranosyl | 0.692; 1.600; 1.650 | 6; 3; 6 | Oleane: 44.9; Ursane: 18.7; Lupane: 14.0; 30nor-oleane : 3.9 |
| 60 | Acetoyl; Acetoyl | Acetoyl; Acetoyl | 0.100; 0.250 | 2; 2 | Meliacane : 74.4; Abeolimonoid: 8.3; Abeoursane: 7.4 |

a Italics indicate the correct skeleton of the substance.

Table 4
Summary of the analyses performed by the MACRONO program

| Macronodes tested | Number of hits | Number of errors |
|---|---|---|
| trans-p-Coumaroyl | 1 | - |
| Glucopyranosyl-(1→6)-Rhamnopyranosyl | 1 | - |
| 6OMe-7O-Cumarin | 1 | - |
| Sinapyl-alcohol | 1 | - |
| Acetoyl | 32 | - |
| 2OH-et-acryloyl | 1 | - |
| Tigloyl | 5 | - |
| Glucopyranosyl-2′,6′diacetate | 1 | - |
| p-OMe-benzoyl (anisate) | 3 | - |
| Benzoyl | 2 | - |
| Glucopyranosyl | 18 | 1 |
| p-OH-benzoyl | 2 | - |
| Glucopyranosyl-(1→6)-Apiofuranosyl | 1 | - |
| p-MeO-cis-cinnamoyl | 1 | - |
| Metoxi | 3 | - |
| Menthiafoloyl | 1 | - |
| Cis-p-Coumaroyl | 2 | - |
| Foliamenthoyl | 1 | - |
| Isoferuloyl | - | 1 |
| p-MeO-trans-cinnamoyl | 1 | - |
| Gentisoyl | 1 | - |
| 2[p-OH-phenyl]-ethoxide | 1 | - |
| 2[3,4-diOH-phenyl]-ethoxide | 2 | - |
| Trans-Caffeoyl | 1 | - |
| Glucose(6←1)Xilose | 1 | - |
| Angeloyl | 4 | 3 |
| Isobutyroyl | 2 | - |
| 2Me-butiroyl | 5 | - |
| Ortoselinoyl | 1 | - |
| Furoyl | 1 | - |
| 4OH-3Me-butiroyl | 1 | - |
| 4OH-isobutiroyl | 1 | - |
| 5OH-tigloyl | 2 | - |
| 3OH-metacryloyl | 1 | - |
| 4OH-tigloyl | 2 | - |
| Isovaleroyl | 1 | - |
| Succinoyl | 1 | - |
| Trans-cinnamoyl | 1 | 1 |
| Sophoroside | 2 | - |
| n-hexanoyl | 1 | - |
| Glucopyranosyl-(1→6)-Allopyranosyl | 1 | - |
| Xilopyranosyl | 1 | - |
| Butoxi | 1 | - |
| Glucuronic acid | 2 | - |
| Total | 115 | |
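The totals in Table 4 can be cross-checked with a few lines of arithmetic. The sketch below transcribes the rows of Table 4 and sums them; it assumes that a dash in the table means zero, and the names `TABLE_4`, `summarize` and `hit_fraction` are illustrative only. The error total and the hit fraction are derived here from the rows and are not figures printed in the table.

```python
# Rows transcribed from Table 4 as (macronode, hits, errors).
# Assumption: a dash in the table is read as zero.
TABLE_4 = [
    ("trans-p-Coumaroyl", 1, 0),
    ("Glucopyranosyl-(1->6)-Rhamnopyranosyl", 1, 0),
    ("6OMe-7O-Cumarin", 1, 0),
    ("Sinapyl-alcohol", 1, 0),
    ("Acetoyl", 32, 0),
    ("2OH-et-acryloyl", 1, 0),
    ("Tigloyl", 5, 0),
    ("Glucopyranosyl-2',6'diacetate", 1, 0),
    ("p-OMe-benzoyl (anisate)", 3, 0),
    ("Benzoyl", 2, 0),
    ("Glucopyranosyl", 18, 1),
    ("p-OH-benzoyl", 2, 0),
    ("Glucopyranosyl-(1->6)-Apiofuranosyl", 1, 0),
    ("p-MeO-cis-cinnamoyl", 1, 0),
    ("Metoxi", 3, 0),
    ("Menthiafoloyl", 1, 0),
    ("Cis-p-Coumaroyl", 2, 0),
    ("Foliamenthoyl", 1, 0),
    ("Isoferuloyl", 0, 1),
    ("p-MeO-trans-cinnamoyl", 1, 0),
    ("Gentisoyl", 1, 0),
    ("2[p-OH-phenyl]-ethoxide", 1, 0),
    ("2[3,4-diOH-phenyl]-ethoxide", 2, 0),
    ("Trans-Caffeoyl", 1, 0),
    ("Glucose(6<-1)Xilose", 1, 0),
    ("Angeloyl", 4, 3),
    ("Isobutyroyl", 2, 0),
    ("2Me-butiroyl", 5, 0),
    ("Ortoselinoyl", 1, 0),
    ("Furoyl", 1, 0),
    ("4OH-3Me-butiroyl", 1, 0),
    ("4OH-isobutiroyl", 1, 0),
    ("5OH-tigloyl", 2, 0),
    ("3OH-metacryloyl", 1, 0),
    ("4OH-tigloyl", 2, 0),
    ("Isovaleroyl", 1, 0),
    ("Succinoyl", 1, 0),
    ("Trans-cinnamoyl", 1, 1),
    ("Sophoroside", 2, 0),
    ("n-hexanoyl", 1, 0),
    ("Glucopyranosyl-(1->6)-Allopyranosyl", 1, 0),
    ("Xilopyranosyl", 1, 0),
    ("Butoxi", 1, 0),
    ("Glucuronic acid", 2, 0),
]


def summarize(rows):
    """Return total hits, total errors, and hits / (hits + errors)."""
    hits = sum(h for _, h, _ in rows)
    errors = sum(e for _, _, e in rows)
    return hits, errors, hits / (hits + errors)


total_hits, total_errors, hit_fraction = summarize(TABLE_4)
print(f"hits={total_hits}, errors={total_errors}, hit fraction={hit_fraction:.3f}")
```

The summed hits reproduce the printed total of 115, which is a useful check that the rows were read in the right order.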

macronodes present in the substance, the use of the
other SISTEMAT programs, for example the program
SISCONST, greatly facilitates the prediction of the
probable skeleton of a given substance, because skeletons
based on chemical shifts that do not belong to the
substance skeleton will not be predicted. In the near
future, the program MACRONO, if coupled with the 13C NMR
characteristic ranges already obtained for several terpenoids (Emerenciano et al., 1993; Alvarenga et al., 1997;
Macari et al., 1994/1995; Ferreira et al., 1998, 2001b;
Oliveira et al., 2000), might be used as a restriction
module for the structure generator. The latter is already
being developed for the expert system SISTEMAT. Thus,
instead of working with all the 13C NMR data of the
substance to start the generation of likely structures,
the generator will start from only the 13C NMR data of
the skeleton, since the other chemical shifts will have
been removed previously by the program MACRONO. The
immediate consequence of applying this novel program to
the structure generator will be a reduction in the
computational time spent and in the number of structural
proposals displayed, which avoids the combinatorial
explosion problem observed in other expert systems
developed up to now (Carhart et al., 1981; Attias, 1983;
Munk, 1998; Jaspars, 1999; Badertscher et al., 2000).
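The filtering idea described above, in which signals assigned to substituents are removed so that only skeleton shifts reach the next program, can be sketched as follows. This is a minimal illustration and not the MACRONO algorithm: the tolerance-based matching, the function name `strip_substituent_signals`, the reference shifts and the example spectrum are all assumptions made for the sketch.

```python
from typing import Dict, List, Tuple


def strip_substituent_signals(
    observed: List[float],
    references: Dict[str, List[float]],
    tolerance: float,
) -> Tuple[List[float], List[str]]:
    """Split observed 13C shifts into skeleton shifts and matched substituents.

    A substituent counts as matched only if every one of its reference
    shifts pairs with a distinct observed shift within `tolerance` ppm.
    """
    remaining = sorted(observed)
    identified: List[str] = []
    for name, ref_shifts in references.items():
        pool = list(remaining)
        complete = True
        for ref in ref_shifts:
            candidates = [s for s in pool if abs(s - ref) <= tolerance]
            if not candidates:
                complete = False
                break
            # Consume the closest observed shift so it is not reused.
            pool.remove(min(candidates, key=lambda s: abs(s - ref)))
        if complete:
            identified.append(name)
            remaining = pool
    return remaining, identified


# Illustrative reference shifts (ppm); not taken from the MACRONO database.
REFERENCES = {
    "Acetoyl": [170.0, 21.0],
    "Benzoyl": [166.0, 133.0, 130.0, 129.6, 128.4],
}
# Hypothetical spectrum: five skeleton carbons plus one acetate group.
observed = [170.4, 78.5, 45.2, 38.0, 27.9, 21.1, 16.3]

skeleton, found = strip_substituent_signals(observed, REFERENCES, tolerance=1.0)
print(found)
print(skeleton)
```

In this toy case only the acetate pattern is matched, so its two signals are dropped and the five remaining shifts are what a skeleton-prediction step would receive.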

Acknowledgements
This work was supported by grants from the Fundação de Amparo à Pesquisa do Estado de São Paulo
(FAPESP), CAPES and by the Conselho Nacional de
Desenvolvimento Científico e Tecnológico (CNPq). The
authors thank Antonio J.C. Brant for helpful discussion
during the preparation of the manuscript.

References
Ahmad, V.U., Ali, A., Ali, Z., Baqai, F.T., Zafar, F.N., 1998.
Phytochemistry 49, 829.
Ahmed, A.A., Mahmoud, A.A., 1997. Phytochemistry 45, 533.
Alvarenga, S.A.V., Gastmans, J.P., Rodrigues, G.V., Emerenciano, V.P., 1997. Spectroscopy 13, 227.
Attias, R., 1983. J. Chem. Inf. Comp. Sci. 23, 102.
Badertscher, M., Korytko, A., Schulz, K.P., Madison, M.,
Munk, M.E., Portmann, P., Junghans, M., Fontana, P.,
Pretsch, E., 2000. Chemom. Intell. Lab. Syst. 51, 73.
Baykal, T., Panayir, T., Tasdemir, D., Sticher, O., Çalis, I.,
1998. Phytochemistry 48, 867.
Brum, R.L., Honda, N.K., Hess, S.C., Cavalheiro, A.J.,
Monache, F.D., 1998. Phytochemistry 49, 1127.
Bruno, M., Cruciata, M., Bondi, M.L., Piozzi, F., Torre,
M.C.L., Rodriguez, B., Servettaz, O., 1998. Phytochemistry
48, 687.

Carabedian, M., Dagane, I., Dubois, J.-E., 1988. Anal. Chem.
60, 2186.
Carhart, R.E., Smith, D.H., Gray, N.A.B., Nourse, J.G.,
Djerassi, C., 1981. J. Org. Chem. 46, 1708.
Castillo, U.F., Wilkins, A.L., Lauren, D.R., Smith, B.L.,
Towers, N.R., Alonso-Amelot, M.E., Jaimes-Espinosa, R.,
1997. Phytochemistry 44, 901.
Christie, B.D., Munk, M.E., 1987. Anal. Chim. Acta 200, 347.
Connolly, J.D., Hill, R.A., 1991. Dictionary of Terpenoids.
Chapman & Hall, London.
Damtoft, S., Franzys, H., Jensen, S.R., 1997. Phytochemistry
45, 743.
Donnelly, D.M.X., Konishi, T., Dunne, O., Cremin, P., 1997.
Phytochemistry 44, 1473.
Elgamal, M.H.A., Soliman, H.S.M., Elmunajjed, D.T., Toth,
G., Simon, A., Duddeck, H., 1998. Phytochemistry 49, 189.
Emerenciano, V.P., Bussolini, A.C., Furlan, M., Rodrigues,
G.V., Fromanteau, D.L.G., 1993. Spectroscopy 11, 95.
Emerenciano, V.P., Melo, L.D., Rodrigues, G.V., Gastmans,
J.P., 1997. Spectroscopy 13, 181.
Emerenciano, V.P., Rodrigues, G.V., Macari, P.A.T., Vestri,
S.A., Borges, J.H.G., Gastmans, J.P., Fromanteau, D.L.G.,
1994. Spectroscopy 12, 91.
Ferreira, M.J.P., Brant, A.J.C., Rodrigues, G.V., Emerenciano,
V.P., 2001. Anal. Chim. Acta 429, 151.
Ferreira, M.J.P., Emerenciano, V.P., Linia, G.A.R., Romoff,
P., Macari, P.A.T., Rodrigues, G.V., 1998. Prog. Nucl.
Magn. Reson. Spectrosc. 33, 153.
Ferreira, M.J.P., Rodrigues, G.V., Brant, A.J.C., Emerenciano,
V.P., 2001. Spectroscopy 15, 65.
Fromanteau, D.L.G., Gastmans, J.P., Vestri, S.A., Emerenciano, V.P., Borges, J.H.G., 1993. Comp. Chem. 17, 369.
Funatsu, K., Sasaki, S., 1996. J. Chem. Inf. Comp. Sci. 36, 190.
Galal, A.M., Abdel-Sattar, E., El-Feraly, F.S., Mossa, J.S.,
Meselhy, M.R., Kadota, S., Namba, T., 1998. Phytochemistry 48, 159.
Gastmans, J.P., Furlan, M., Lopes, M.N., Borges, J.H.G.,
Emerenciano, V.P., 1990. Quím. Nova 13, 10.
Gray, N.A.B., 1986. Computer-Assisted Structure Elucidation.
John Wiley & Sons, New York.
Guilhon, G.M.S.P., Muller, A.H., 1996. Phytochemistry 43,
417.
Guilhon, G.M.S.P., Muller, A.H., 1998. Phytochemistry 49,
1347.
Helal, A.M., Nakamura, N., Meselhy, M.R., El-Fishawy,
A.M., Hattori, M., Mahran, G.H., 1997. Phytochemistry
45, 551.
Hosny, M., 1998. Phytochemistry 47, 1569.
Iossifova, T., Vogler, B., Kostova, I., 1998. Phytochemistry 49,
1329.
Jakupovic, J., Jeske, F., Morgenstern, T., Tsichritzis, F.,
Marco, J.A., Berendsohn, W., 1998. Phytochemistry 47,
1583.
Jakupovic, J., Morgenstern, T., Marco, J.A., Berendsohn, W.,
1998. Phytochemistry 47, 1611.
Jaspars, M., 1999. Nat. Prod. Rep. 16, 241.
Junkuszew, M., Oleszek, W., Jurzysta, M., Piancente, S., Pizza,
C., 1998. Phytochemistry 49, 195.
Kanai, E., Machida, K., Kikuchi, M., 1996. Chem. Pharm.
Bull. 44, 1607.

Kwak, J.H., Jang, W.Y., Zee, O.P., Lee, K.R., 1997. Planta
Medica 63, 474.
Lazari, D., Garcia, B., Skaltsa, H., Pedro, J.R., Harvala, C.,
1998. Phytochemistry 47, 415.
Lemmich, J., 1995. Phytochemistry 38, 427.
Lin, H., Ding, H., Wu, T., Wu, P., 1996. Phytochemistry 41,
237.
Lin, W., Fang, J., Cheng, Y., 1998. Phytochemistry 48, 1391.
Lindsay, R.K., Buchanan, B.G., Feigenbaum, E.A., Lederberg,
J., 1980. Applications of Artificial Intelligence for Organic
Chemistry: The DENDRAL Project. McGraw-Hill, New
York.
Lobitz, G.O., Tamayo-Castillo, G., Poveda, L., Merfort, I.,
1998. Phytochemistry 49, 805.
Ma, B., Gao, K., Shi, Y., Jia, Z., 1997. Phytochemistry 46, 915.
Ma, J., Wang, Z., Xu, L., Kadota, S., Namba, T., 1998.
Phytochemistry 48, 201.
Macari, P.A.T., Gastmans, J.P., Rodrigues, G.V., Emerenciano, V.P., 1994/1995. Spectroscopy 12, 139.
Mahmoud, A.A., 1997. Phytochemistry 45, 1633.
Marco, J.A., Sanz-Cervera, J.F., Ropero, F.J., Checa, J.,
Fraga, M., 1998. Phytochemistry 49, 1095.
Mulholland, D.A., Monkhe, T.V., Coombes, P.H., Rajab,
M.S., 1998. Phytochemistry 49, 2585.
Munk, M.E., 1998. J. Chem. Inf. Comp. Sci. 38, 997.
Munk, M.E., Farkas, M., Lipkis, A.H., Christie, B.D., 1986.
Mikrochim. Acta II, 199.
Oliveira, F.C., Ferreira, M.J.P., Nunez, C.V., Rodrigues, G.V.,
Emerenciano, V.P., 2000. Prog. Nucl. Magn. Reson.
Spectrosc. 37, 1.
Otsuka, H., Yao, M., Hirata, E., Takushi, A., Takeda, Y.,
1996. Phytochemistry 41, 1351.
Pollmann, K., Schaller, K., Schweizer, U., Elgamal, M.H.A.,
Shaker, K.H., Seifert, K., 1998. Phytochemistry 48, 875.
Ragasa, C.Y., Rideout, J.A., Sy, J.O., Alcachupas, D., Inte,
V.M.L., Coll, J.C., 1997. Phytochemistry 46, 151.
Rodrigues, G.V., Campos, I.P.A., Emerenciano, V.P., 1997.
Spectroscopy 13, 191.
Schuquel, I.T.A., Malheiros, A., Sarragiotto, M.H., Vidotti,
G.J., 1998. Phytochemistry 49, 2409.
Shelley, C., Munk, M.E., 1982. Anal. Chem. 54, 516.
Siems, K., Weigt, F., Wollenweber, E., 1996. Phytochemistry
41, 1119.
Sudo, H., Ide, T., Otsuka, H., Hirata, E., Takushi, A., Takeda,
Y., 1997. Phytochemistry 46, 1231.
Takeda, Y., Zhang, H., Matsumoto, T., Otsuka, H., Oosio, Y.,
Honda, G., Tabata, M., Fujita, T., Sun, H., Sezik, E.,
Yesilada, E., 1997. Phytochemistry 44, 117.
Tan, R.X., Kong, L.D., 1997. Phytochemistry 46, 1035.
Taskova, R., Handjieva, N., Peev, D., Popov, S., 1998.
Phytochemistry 49, 1323.
Tian, J., Zhang, H., Sun, H., Pan, L., Yao, P., Chen, D., 1998.
Phytochemistry 48, 1013.
Tommasi, N.D., Rastrelli, L., Lauro, M.R., Aquino, R., 1998.
Phytochemistry 49, 1123.
Torres, P., Ayala, J., Grande, C., Macas, M.J., Grande, M.,
1998. Phytochemistry 47, 57.
Tuntiwachwuttikul, P., Pancharoen, O., Taylor, W.C., 1998.
Phytochemistry 49, 163.
Ujita, K., Takaishi, Y., Iida, A., Fujita, T., 1992. Phytochemistry 31, 1289.
