

RESEARCH ARTICLE OPEN ACCESS

Testing Ambiguity through Horizontal and Vertical Ambiguity Using Grammar Approximation
M. Tiwari [1], S. Shukla [2], A. Pandey [3]
Department of Computer Science
Kanpur Institute of Technology
Kanpur – India
ABSTRACT
The problem of detecting whether a single sentence of a grammar can be interpreted in multiple ways is known in computer science to be undecidable. This unsolvable problem is known as the ambiguity of a grammar. For a grammar which can generate an infinite number of strings, it is even more difficult to locate and remove the cause of ambiguities in general. So it is natural to study and analyze the various methods available so far for detecting ambiguity in various types of grammars. One of the four grammar types described by Chomsky is the context free grammar (CFG). CFGs have a vital role in defining the constructs of programming languages, and the grammars that define the various syntax rules and constructs of languages are often ambiguous. This paper uses some parameters to compare various ambiguity detection methods, and at the end one method is also proposed.
Keywords: Ambiguity, Context Free Grammar, Chomsky Normal Form, Horizontal Ambiguity, Vertical Ambiguity

I. INTRODUCTION

Languages can be defined in a syntactic, generative way provided by formal grammars. Languages are sets of strings. A language consists of three different entities, letters, words, and sentences, and there is a certain trivial relation between them as well. If we collect some random letters from an alphabet, they do not always make a valid word. In the same way, a random collection of words (strings) does not always make a grammatically valid sentence. A grammar uniquely determines a structure for each string in the language [1].

The Chomsky classification [2] is a hierarchy of formal grammars, shown in Figure 1. According to the Chomsky hierarchy, formal languages can be classified as the regular, context-free, context-sensitive and recursively enumerable languages.

Figure 1: Chomsky Hierarchy

- Type-0 grammars (unrestricted grammars): These grammars can generate all languages that can be recognized by a Turing machine [2]. These languages are also called the recursively enumerable languages.
- Type-1 grammars: These grammars generate the context-sensitive languages. A linear bounded automaton recognizes this set of languages [2].
- Type-2 grammars: These grammars generate the context-free languages (CFLs). A non-deterministic pushdown automaton can recognize these languages. Context free languages are the basis for the syntax of most programming languages [2].
- Type-3 grammars: The languages generated by these grammars are called regular languages. Such a grammar restricts its rules to a single non-terminal on the left-hand side, and a right-hand side consisting of terminals, possibly followed by a single non-terminal. Regular languages are commonly used to define search patterns and the lexical structure of programming languages [2].

Context Free Grammar

The sentences of a language can be generated by some set of rules, and these rules are described in the form of a grammar. All languages, whether programming or natural, have an associated grammar [1]. These grammars can be represented using construction rules in the form of productions.

Context-free grammars have a wide variety of applications [1]. One of the most common applications is to use these grammars to build compilers, so that the syntax of a high level language program can be verified; other applications require the verification of a definite structure of sentences. The major problem with context-free grammars, however, is that in these grammars multiple structures for a single sentence are allowed. This problem is known as ambiguity [2].



This ambiguity creates multiple meanings for a single sentence, which can cause serious problems and results in an unclear, non-unique interpretation of the sentence.

A grammar G is a 4-tuple (NT, T, P, S), where NT is the set of non-terminals (also called variables), T is the set of symbols from which the sentences (strings) of the language are built, also called terminals, S is the start symbol of the grammar, from which all derivations start, and P is the finite set of productions or rules that represent the basic construction rules defining the language [1]. We require that the grammar always contains at least one production with the start symbol S on its left hand side [2].

A production is a construction rule which has a variable (non-terminal) on one side that can be replaced by the combination of variables and terminals present on the other side of the production (A → α, where A belongs to the set of non-terminals and α is any combination of variables and terminals). At any point of time the left side can be replaced by the right side [1]. The symbol "→" is written between the left-hand side and the right-hand side. We always start the derivation of a string from the start symbol [1],[2]. All intermediate stages of the strings which arise in the derivation process starting from the start symbol S are called sentential forms [2]. To verify whether a string can be derived from a grammar or not, we build a parse tree, also called a derivation tree.

Figure 2: Representation of a parse tree for an English language sentence
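As a concrete illustration of the 4-tuple definition above, the sketch below shows one possible in-memory representation of a grammar and its productions. The class and field names (Production, Grammar, and so on) are illustrative choices for this sketch, not the representation used by the paper's implementation.

import java.util.*;

// A production A -> X1 X2 ... Xk, stored as a left-hand-side variable
// and a list of right-hand-side symbols (terminals or non-terminals).
class Production {
    final String lhs;
    final List<String> rhs;
    Production(String lhs, List<String> rhs) { this.lhs = lhs; this.rhs = rhs; }
    public String toString() { return lhs + " -> " + String.join(" ", rhs); }
}

// A grammar G = (NT, T, P, S) as described in the text.
class Grammar {
    final Set<String> nonTerminals = new LinkedHashSet<>();
    final Set<String> terminals = new LinkedHashSet<>();
    final List<Production> productions = new ArrayList<>();
    final String startSymbol;

    Grammar(String startSymbol) { this.startSymbol = startSymbol; nonTerminals.add(startSymbol); }

    void addProduction(String lhs, String... rhs) {
        nonTerminals.add(lhs);
        productions.add(new Production(lhs, Arrays.asList(rhs)));
    }

    // Any right-hand-side symbol that never appears as a left-hand side is a terminal.
    void classifySymbols() {
        for (Production p : productions)
            for (String s : p.rhs)
                if (!nonTerminals.contains(s)) terminals.add(s);
    }

    public static void main(String[] args) {
        Grammar g = new Grammar("S");      // the grammar of Example 1 (S -> SaS | SbS | c, A -> SaS)
        g.addProduction("S", "S", "a", "S");
        g.addProduction("S", "S", "b", "S");
        g.addProduction("S", "c");
        g.addProduction("A", "S", "a", "S");
        g.classifySymbols();
        g.productions.forEach(System.out::println);
    }
}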
The class of languages recognized by context-free grammars is known as the context free languages. The context free languages are rich enough to describe the syntax of the languages we use to write programs. One sub-class of the context free grammars is the LALR(1) grammars (Look Ahead Left to Right). In these grammars we try to reduce the given string by looking at the symbol which will come next. Instead of predicting the entire remaining sequence, an LALR parser just looks at the next single symbol. By looking at that symbol it can perform two kinds of actions, namely shift or reduce. If the symbol under consideration, together with the previous sequence of symbols (not reducible to any left hand side of the given production rules), does not match the right hand side of any production, a shift operation takes place. Starting from the given string, if the parser is able to reach the starting point, that is the start symbol of the grammar, then the given string can be derived from the grammar unambiguously. If at any sentential form a situation arises where we have two options, the option to shift as well as the option to reduce, this is called a shift-reduce conflict [3]. It may also be the case that the parser has two options to reduce the sequence, that is, two productions are present in the grammar by which the given sentential form can be reduced. This is called a reduce-reduce conflict [3]. If any of these conflicts is found during the parsing of a string, the grammar is stated to be ambiguous. The following example illustrates these conflicts.

Example 1: Given the following grammar and the string x = cac,
S → SaS | SbS
A → SaS
S → c
For the string "cac" we have two choices, to reduce it with S as well as with A, so this results in a reduce-reduce conflict. For the string x = cacac, while parsing the second 'a' we again have two choices, a shift option as well as a reduce option, resulting in a shift-reduce conflict.

Chomsky Normal Form, a specific CFG:

A grammar G = (NT, T, P, S) is in Chomsky normal form if its productions are restricted to the forms:
- X → YZ, where X, Y and Z are non-terminals and exactly two non-terminals appear on the right (i.e. X, Y, Z ∈ NT)
- X → a, where X is a variable in NT and a is exactly one terminal symbol (i.e. X ∈ NT and a ∈ T)

Given a string, the Cocke-Younger-Kasami (CYK) algorithm can decide whether the string can be derived from a given grammar or not. This can be done in polynomial time; that is, an algorithm with polynomial running time can decide whether a string is in the language or not. We can easily show that if a grammar is in Chomsky Normal Form, a binary tree is constructed each time we parse a given string [3].

A grammar in CNF makes parsing a string even easier, just because of its very simple structure. In an arbitrary CFG we do not have any upper bound on the length of a derivation of the string to be parsed (there could be many useless rules and productions that later expand into the empty string), so it is not possible to test membership by generating all possible derivations. In contrast, for a grammar in CNF there is an upper limit on the length of any derivation of a string of length n: it is exactly 2n - 1 steps.

The normal forms (Chomsky, Greibach, etc.) were invented to solve the elementary problems involving CFLs, such as deciding membership and testing emptiness, more easily. The Chomsky normal form of a grammar yields efficient algorithms.
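To make the CYK membership test concrete, here is a minimal sketch of the algorithm for a grammar already in CNF. The encoding of productions as plain strings ("X->YZ" for binary rules, "X->a" for terminal rules, with single-character variables) is an assumption made for brevity, not the representation used by the paper's implementation.

import java.util.*;

// Minimal CYK membership test for a CNF grammar given as strings like "S->AB" or "A->a".
public class Cyk {
    public static boolean inLanguage(List<String> rules, String start, String w) {
        int n = w.length();
        if (n == 0) return false;                      // CNF as used here has no epsilon rule
        // table[i][len-1] = set of variables deriving the substring of length len starting at i
        Set<String>[][] table = new HashSet[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) table[i][j] = new HashSet<>();
        // Fill in length-1 substrings from terminal rules A->a.
        for (int i = 0; i < n; i++)
            for (String r : rules) {
                String[] p = r.split("->");
                if (p[1].length() == 1 && p[1].charAt(0) == w.charAt(i)) table[i][0].add(p[0]);
            }
        // Combine shorter spans using binary rules X->YZ.
        for (int len = 2; len <= n; len++)
            for (int i = 0; i + len <= n; i++)
                for (int k = 1; k < len; k++)
                    for (String r : rules) {
                        String[] p = r.split("->");
                        if (p[1].length() != 2) continue;
                        String y = p[1].substring(0, 1), z = p[1].substring(1);
                        if (table[i][k - 1].contains(y) && table[i + k][len - k - 1].contains(z))
                            table[i][len - 1].add(p[0]);
                    }
        return table[0][n - 1].contains(start);
    }

    public static void main(String[] args) {
        // A CNF grammar for {a^n b^n | n >= 1}: S->AT | AB, T->SB, A->a, B->b
        List<String> g = Arrays.asList("S->AT", "S->AB", "T->SB", "A->a", "B->b");
        System.out.println(inLanguage(g, "S", "aabb"));  // true
        System.out.println(inLanguage(g, "S", "aab"));   // false
    }
}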




Ambiguity in Context Free Grammar

If a CFG G = (NT, T, P, S) generates at least one string w for which more than one distinct parse tree exists, each with S as its root and generating the string w, then that CFG is known to be ambiguous [1]. Each parse tree so generated corresponds either to a left-most derivation or to a right-most derivation. The degree of ambiguity of a string w is the number of different parse trees for w under a grammar G [5]. Comparing the ambiguities of all the strings of a grammar, the maximum degree over all strings becomes the degree of ambiguity of that grammar. We can classify ambiguous grammars according to their degree of ambiguity. If, for a grammar G, we gradually increase the length of the string and the resulting number of parse trees also keeps increasing, then it may be the case that the degree of ambiguity of that grammar is infinite [5].

Example 1.3: Consider the following grammar
S → S + S
S → S * S
S → (S)
S → a
There are two left-most derivations for "a*a+a".

Figure 3: Two parse trees for the string "a*a+a"

Inherent Ambiguity

There may exist more than one grammar generating the same language. If all grammars generating the same language have at least one string that can be generated ambiguously, then the ambiguity exists in the language itself [1]. Such a language is called inherently ambiguous [1]. The existence of an ambiguous grammar for a language does not necessarily mean that ambiguity exists in the language; i.e., out of the many grammars generating the language, if an unambiguous grammar exists, then the language is known to be unambiguous. In other words, a language L is unambiguous if at least one unambiguous grammar exists that generates it.
The problem of determining whether a language is inherently ambiguous or not is undecidable. There is no algorithm which, taking a grammar as input, can always tell whether it is ambiguous or not. If we define a language L as the union of two languages, say L1 and L2, and L contains strings which are present in both L1 and L2, then these strings will have two different meanings, the first meaning defined by the property of the first language (L1) and the second meaning defined by the property of the language L2. Clearly this ambiguity is inherent in L, and it is not possible to find an equivalent grammar which is free from ambiguity. For example, the language L = {a^n b^n c^m | n >= 1, m >= 1} ∪ {a^n b^m c^m | n >= 1, m >= 1}, with the language {a^n b^n c^n | n >= 1} as L1 ∩ L2, is inherently ambiguous [6].

II. LITERATURE REVIEW

Gorn Method

In [12] Gorn describes a Turing machine that generates all possible strings that can be derived from a given grammar. This machine is called the 'generator' of derivations. Whenever a new string is derived, it is immediately checked against the other strings which have already been generated. A simple BFS is used in this method, i.e., every possible string is generated one by one by replacing the variables present at any point of time in the sentential form. So the process starts by generating the very first string, then another, and the comparison is done. The comparison is done between generated strings only; there is no comparison between sentential forms. Sentential forms have to be expanded until they consist of terminals only.
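The following sketch illustrates the brute-force idea behind Gorn's generator: breadth-first expansion of sentential forms, reporting ambiguity as soon as the same terminal string is produced by two different leftmost derivations. It is a simplified illustration bounded by a maximum sentential-form length so that it terminates, not Gorn's original Turing-machine construction; the representation of productions is an assumption of this sketch.

import java.util.*;

// Breadth-first enumeration of leftmost derivations; reports a string that is derived twice.
public class BruteForceAmbiguity {
    // Productions map a variable to its alternative right-hand sides,
    // each rhs being a space-separated sequence of symbols.
    public static Optional<String> findAmbiguousString(Map<String, List<String>> prods,
                                                       String start, int maxLen) {
        Set<String> derivedStrings = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String form = queue.poll();
            String[] symbols = form.split(" ");
            int firstVar = -1;
            for (int i = 0; i < symbols.length; i++)
                if (prods.containsKey(symbols[i])) { firstVar = i; break; }
            if (firstVar < 0) {                       // all terminals: a derived string
                if (!derivedStrings.add(form)) return Optional.of(form);
                continue;
            }
            if (symbols.length > maxLen) continue;    // prune over-long sentential forms
            for (String rhs : prods.get(symbols[firstVar])) {   // leftmost expansion
                List<String> next = new ArrayList<>(Arrays.asList(symbols));
                next.remove(firstVar);
                next.addAll(firstVar, Arrays.asList(rhs.split(" ")));
                queue.add(String.join(" ", next));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = new HashMap<>();
        g.put("S", Arrays.asList("S + S", "a"));      // ambiguous: "a + a + a" has two parses
        System.out.println(findAmbiguousString(g, "S", 7));
    }
}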

Cheung and Uzgalis's Method

This method is an optimized version of Gorn's method. It performs a breadth first search in the same way as Gorn's method, but with the extent of the sentential forms reduced by removing superfluous or unwanted parts. After the derivations are pruned, the method applied is the same as Gorn's. Cheung and Uzgalis [13] check all sentential forms for repetition against the other forms as well. So it does not attempt to reach the all-terminal stage; it checks for duplicate sentential forms and reports ambiguity immediately. In general it generates all possible intermediate forms, but under some predefined conditions it can also terminate the process, namely when there is no possibility of reaching an ambiguous state or when it is confirmed that this kind of string has already been searched. It checks the prefix and postfix terminals of a sentential form against the other sentential forms: while generating a sentential form, it matches its prefix and postfix terminals with all other sentential forms, and if there is no match, it terminates further expansion of that form.




In that case the form will generate only unique strings, i.e., a single derivation for each such string. By this process the repetition of the same derivation is stopped and some patterns are expanded only once [13].

To study this, let us take an example of a grammar which is infinite and unambiguous; it can generate an endless number of derivations recursively.
S → V1 | V2
V1 → abV1 | ab
V2 → cdV2 | cd
The expansion at its first stage gives the sentential forms "V1" and "V2". Expanding further, the forms "abV1", "ab", "cdV2" and "cd" are generated. After this the expansion is terminated: if we look at the forms "abV1" and "cdV2", they have altogether dissimilar prefixes, and the remaining parts "V1" and "V2" have already been expanded during the second expansion. So the algorithm can now be terminated and unambiguity can be reported successfully.

AMBER

AMBER was developed by Schroer [14]. To generate strings AMBER uses the Earley parser [15]. As strings are obtained, all of them are checked for duplicates. This works just like Gorn's method with some minor variations. The paths which result in all derivations need to be traced; basically this is the same as Gorn's method [12]. This method can take some parameters, with some deviations from the previous methods, to find the ambiguity. Not only are all the strings compared, but an "ellipsis" option is also available to match all intermediate forms. This way we can conclude about ambiguity in a smaller number of steps. The search can be bounded at a certain stage by applying a condition of maximum length, or an upper limit can be put on the number of strings it generates [15]. If we combine this parameter with another option (limiting the expansion) applied at each step, certain derivations can be stopped from being parsed.

Jampana Method

Jampana's ambiguity detection method works on a grammar when it is in Chomsky Normal Form (CNF). It is assumed that ambiguity can be present in a string only if a duplicate (live) production is present in the derivation, otherwise not. This method follows the same principle; the rest of the method is the same as searching for duplicate strings as in Gorn's method. The only requirement is that the grammar must be in CNF [16].

LR(k) Test

This is a parser which uses a parse table to do the parsing of a string. Parsing is the process of finding a tree structure, also called a parse tree, and checking whether we are able to reach the start symbol starting from the given string. The basis for the decision is the symbol which is treated as look-ahead. If we take k symbols as look-ahead then the parser is known as LR(k): at every step k symbols are taken into consideration for reducing them by a variable. If a variable has these k symbols on the rhs of one of its productions, these symbols can be reduced to that variable at this step [17]. This process is done using a parse table. The following actions are possible: first, the next symbol is shifted; second, a reduction is made with a variable which contains the symbols in its rhs; third, the symbols are reduced to the start symbol (showing acceptance of the string); or, at last, an error is reported. The grammar is called an LR(k) grammar if we can parse deterministically with this algorithm. The table constructed by this method has entries for k symbols with no conflicts. A conflict is a state where more than one action is possible; a nondeterministic situation for k look-ahead symbols indicates either a shift-reduce or a reduce-reduce conflict [17].

If we obtain a parse table which has no nondeterministic situation of the above type, then every string in the grammar has a unique parse tree. This results in an unambiguous grammar. But grammars of this type are only a subset of the context free grammars; the test does not cover the entire class of context free grammars [20]. For a grammar which does not belong to LR(k), we will not be able to say anything about that grammar using this method. Only if the grammar belongs to LR(k) can verification be done by creating its parse table for k symbols. A parse table without conflicts concludes that the grammar is LR(k) and is unambiguous.

Brabrand, Giegerich and Møller Method

The method of Brabrand et al. [18] uses horizontal and vertical ambiguity as the basis for detecting ambiguity in a grammar. Every production rule is checked for horizontal ambiguity, and every combination of productions with exactly the same non-terminal on the left is tested for vertical ambiguity. In this way (vertical and horizontal) ambiguity is explained in terms of the language; in the previous methods it was explained using the grammar itself. Vertical ambiguity is verified by finding the common strings generated by two production rules: say the first production generates the set L1 and the second production generates the set L2; we find the intersection of L1 and L2, and if it is not empty, vertical ambiguity is reported. If the rhs of a production can be broken into parts whose languages overlap, horizontal ambiguity is reported for that rule. So in this method the languages generated by the productions are taken into consideration, not the actual production rules. An approximation is applied to the languages generated by the production rules, making the intersection and overlap operations decidable.




For various approximation methods this algorithm provides a base of building blocks. In the algorithm by Mohri et al. [19] the original language is extended into a language which can essentially be described by a regular grammar.

Proposed Technique

The following steps are used for ambiguity detection in the grammar:
- Convert the given grammar into Chomsky normal form (CNF).
- Compute the First and Last functions for the productions P.
- Identify the productions which have a possibility of vertical or horizontal ambiguity.
- Check the vertical and horizontal ambiguity for those productions.

1. Conversion of Grammar into Chomsky Normal Form

The following steps are used for converting a CFG into CNF.

Removal of Useless Symbols

Useless symbols are the variables which do not appear in any string generated from the starting non-terminal.

Algorithm 1: Identify the useless (unreachable) non-terminals in grammar G
Input: A grammar G(NT, T, P, S)
Output: The set of variables reachable from the start symbol S; the productions of any other variable are unreachable.
1. Wold ← ∅
2. Wnew ← {S}
3. Repeat steps 4 to 7 while Wold ≠ Wnew
4. Wold ← Wnew
5. for A ∈ Wold do
6. for (A → w) ∈ P do
7. add all variables appearing in w to Wnew
8. return Wnew

Algorithm 2: Remove the useless (non-generating) symbols from G
Input: A grammar G(NT, T, P, S)
Output: The set of variables which generate a terminal string; variables outside this set generate nothing.
1. Wold ← ∅
2. Wnew ← Wold
3. do
4. Wold ← Wnew
5. for A ∈ NT do
6. for (A → w) ∈ P do
7. if w ∈ (T ∪ Wold)* then
8. Wnew ← Wnew ∪ {A}
9. while (Wold ≠ Wnew)
Return Wnew
The output grammar, restricted to these variables, is free from productions which cannot generate strings consisting entirely of terminals.
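A minimal sketch of the fixed-point computation behind Algorithm 2 is shown below: a variable is added to the set as long as some production rewrites it entirely into terminals and already-accepted variables. The representation of productions as a map from a variable to its alternative right-hand sides is an illustrative assumption of this sketch.

import java.util.*;

// Fixed-point computation of the "generating" variables of a grammar:
// a variable is generating if some rhs consists only of terminals and
// variables already known to be generating.
public class GeneratingSymbols {
    public static Set<String> generating(Map<String, List<List<String>>> prods,
                                         Set<String> terminals) {
        Set<String> w = new HashSet<>();
        boolean changed = true;
        while (changed) {                       // iterate until no variable is added
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : prods.entrySet()) {
                if (w.contains(e.getKey())) continue;
                for (List<String> rhs : e.getValue()) {
                    boolean allKnown = true;
                    for (String sym : rhs)
                        if (!terminals.contains(sym) && !w.contains(sym)) { allKnown = false; break; }
                    if (allKnown) { w.add(e.getKey()); changed = true; break; }
                }
            }
        }
        return w;
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> p = new HashMap<>();
        p.put("S", Arrays.asList(Arrays.asList("A", "b"), Arrays.asList("a")));
        p.put("A", Arrays.asList(Arrays.asList("A", "a")));   // A never generates a terminal string
        System.out.println(generating(p, new HashSet<>(Arrays.asList("a", "b"))));  // [S]
    }
}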

Removal of Null Productions

Given a grammar G = (NT, T, P, S), a production A → ε (where A is in NT) generates the null string and is called a null-production. To remove these types of productions from the grammar, the following algorithm identifies the nullable variables.

Algorithm 3: CompNullable(G)
Input: A grammar G(NT, T, P, S)
Output: The set of variables which can generate the null string.
1. Wnull ← ∅
2. do
3. Wold ← Wnull
4. for A ∈ NT do
5. for (A → w) ∈ P do
6. if w = ε or w ∈ (Wnull)* then
7. Wnull ← Wnull ∪ {A}
8. while (Wnull ≠ Wold)
Return Wnull
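The sketch below mirrors Algorithm 3: a variable is nullable if it has an ε-production or a production whose right-hand side consists only of nullable variables. Here an empty right-hand-side list stands for ε; this encoding is an assumption made for the example.

import java.util.*;

// Fixed-point computation of nullable variables (Algorithm 3 style).
public class Nullable {
    public static Set<String> nullable(Map<String, List<List<String>>> prods) {
        Set<String> wNull = new HashSet<>();
        Set<String> wOld;
        do {
            wOld = new HashSet<>(wNull);
            for (Map.Entry<String, List<List<String>>> e : prods.entrySet())
                for (List<String> rhs : e.getValue())
                    // rhs.isEmpty() encodes an epsilon production; otherwise every
                    // symbol of the rhs must already be known to be nullable.
                    if (rhs.isEmpty() || wNull.containsAll(rhs)) wNull.add(e.getKey());
        } while (!wNull.equals(wOld));
        return wNull;
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> p = new HashMap<>();
        p.put("S", Arrays.asList(Arrays.asList("A", "B"), Arrays.asList("a")));
        p.put("A", Arrays.asList(Collections.<String>emptyList()));   // A -> epsilon
        p.put("B", Arrays.asList(Arrays.asList("A"), Arrays.asList("b")));
        System.out.println(nullable(p));   // A, B and S, in some order
    }
}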
Removal of Unit Productions

A unit production is a production of the form X → Y, where X, Y ∈ NT.

Algorithm 4: Identification of unit productions




Input: A grammar G
Output: The set of unit productions (unit pairs) of G.
1. Pnew ← {(X → Y) | (X → Y) ∈ P, X, Y ∈ NT}
2. do
3. Pold ← Pnew
4. for (X → Y) ∈ Pnew do
5. for (Y → Z) ∈ Pnew do
6. Pnew ← Pnew ∪ {X → Z}
7. while (Pold ≠ Pnew)
Return Pnew

Algorithm 5: Removal of unit productions
Input: A grammar G
Output: A grammar whose productions contain no unit productions.
1. U ← CompUnitPairs(G)
2. P ← P \ U
3. for (X → A) ∈ U do
4. for each non-unit production (A → w) ∈ Pold do
5. P ← P ∪ {X → w}
Return (G)

Making Productions with Two Variables on the Right Side

It is possible that the grammar contains productions of the form
X → V1 V2 V3 .......... Vn
Such rules are converted into CNF as below:
X → V1 Z1
Z1 → V2 Z2
......
Z(n-3) → V(n-2) Z(n-2)
Z(n-2) → V(n-1) Vn
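A small sketch of this binarization step is given below: right-hand sides longer than two symbols are split by introducing fresh variables Z1, Z2, and so on. The naming scheme for the fresh variables is an illustrative choice.

import java.util.*;

// Breaks productions with more than two symbols on the right-hand side
// into a chain of binary productions, introducing fresh variables Z1, Z2, ...
public class Binarize {
    public static List<String[]> binarize(List<String[]> prods) {
        List<String[]> out = new ArrayList<>();
        int fresh = 0;
        for (String[] p : prods) {                     // p[0] = lhs, p[1..] = rhs symbols
            if (p.length <= 3) { out.add(p); continue; }   // rhs already has at most 2 symbols
            String lhs = p[0];
            // X -> V1 Z1, Z1 -> V2 Z2, ..., Z(n-2) -> V(n-1) Vn
            for (int i = 1; i < p.length - 2; i++) {
                String z = "Z" + (++fresh);
                out.add(new String[]{lhs, p[i], z});
                lhs = z;
            }
            out.add(new String[]{lhs, p[p.length - 2], p[p.length - 1]});
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> g = Collections.singletonList(
                new String[]{"X", "V1", "V2", "V3", "V4"});    // X -> V1 V2 V3 V4
        for (String[] p : binarize(g))
            System.out.println(p[0] + " -> " + String.join(" ", Arrays.copyOfRange(p, 1, p.length)));
    }
}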

The following steps show the conversion of an example CFG into CNF.

Step 1: Removal of Useless Symbols
For the given grammar there is no production which is unreachable from the start symbol S.

Step 2: Removal of Null Productions
For the given grammar there exists a null production V1 → ε; it is removed and V1 is substituted accordingly. After null production removal, grammar G is
S → V1SV2 | aV2 | a | SV1 | V1S
V1 → V2 | S
V2 → b

Step 3: Removal of Unit Productions
After removal of unit productions, grammar G is
S → V1SV1 | aS | a | SV1 | V1S
V1 → b | V1SV1 | V1b | a | SV1 | V1S
V2 → b

Step 4: Making Productions with Two Variables on the Right Side
The productions which consist of more than two variables on the right hand side, such as S → V1SV1 and V1 → V1SV1, are broken into productions with at most two variables. The final CNF form is
S → V1V3 | V4V2 | a | SV1 | V1S
V1 → b | V1V3 | V4V2 | a | SV1 | V1S
V2 → b
V3 → SV1
V4 → a

2. Computation of First and Last Functions

The following functions are used in the algorithm:
Production(): returns each production in the grammar.
Terminal(T): adds T to the set of terminals; an error occurs if T is already a non-terminal symbol.
Nonterminal(): returns the set of non-terminals.
Isterminal(T): returns true if T is a terminal; otherwise returns false.
RHS(P): returns an iterator over the symbols on the RHS of production P.




LHS(P): returns the non-terminal defined by production P.
InternalFirst(V): returns an iterator that visits each production for non-terminal V.
VisitedFirst(X): returns an iterator that visits each occurrence of X in the RHS of all rules.
SymbolDerivesEmpty(): checks whether the non-terminal symbol derives the empty string or not.

Computation of the First Function

First function: the First function of a non-terminal is the set of symbols that can appear first in the strings derived from the right hand sides of its productions. The First function is computed by scanning the production from left to right.

Algorithm 6: ComputeFirst(α)
Input: Grammar G'
Output: First sets corresponding to grammar G'.
First(α): Set
1. For each A ∈ Nonterminal() do VisitedFirst(A) ← false
2. First ← InternalFirst(α)
Return (First)
End

Function InternalFirst(Xβ): Set
1. If Xβ is empty then return (∅)
2. If X is a terminal
3. then return ({X})
4. First ← ∅            // X is a non-terminal
5. If not VisitedFirst(X) then
6. VisitedFirst(X) ← true
7. For each rhs ∈ ProductionsFor(X) do
8. First ← First ∪ InternalFirst(rhs)
9. If SymbolDerivesEmpty(X)
10. then First ← First ∪ InternalFirst(β)
Return (First)

First(α) is computed by invoking ComputeFirst(α). Before any sets are computed, VisitedFirst(X) is set to false for each non-terminal X; VisitedFirst(X) indicates that the productions of X already participate in the computation of First(α).
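The sketch below is a simplified iterative version of the First computation described above. Instead of the recursive InternalFirst with visited flags, it iterates to a fixed point, which yields the same sets; the data layout, and the assumption that every non-terminal has at least one production, are illustrative choices for this sketch.

import java.util.*;

// Iterative (fixed-point) computation of First sets for every variable.
// prods maps a variable to its alternative right-hand sides; an empty rhs means epsilon.
public class FirstSets {
    public static Map<String, Set<String>> first(Map<String, List<List<String>>> prods,
                                                 Set<String> terminals) {
        Map<String, Set<String>> first = new HashMap<>();
        Set<String> nullable = new HashSet<>();
        for (String v : prods.keySet()) first.put(v, new HashSet<>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : prods.entrySet()) {
                for (List<String> rhs : e.getValue()) {
                    boolean allNullable = true;
                    for (String sym : rhs) {
                        Set<String> add = terminals.contains(sym)
                                ? Collections.singleton(sym) : first.get(sym);
                        if (first.get(e.getKey()).addAll(add)) changed = true;
                        // stop at the first symbol that cannot derive epsilon
                        if (terminals.contains(sym) || !nullable.contains(sym)) { allNullable = false; break; }
                    }
                    if (allNullable && nullable.add(e.getKey())) changed = true;
                }
            }
        }
        return first;
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> p = new LinkedHashMap<>();
        p.put("S", Arrays.asList(Arrays.asList("A", "b")));
        p.put("A", Arrays.asList(Arrays.asList("a"), Collections.<String>emptyList()));
        System.out.println(first(p, new HashSet<>(Arrays.asList("a", "b"))));
        // prints {A=[a], S=[a, b]} (order may vary)
    }
}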
Computation of the Last Function

The Last function is computed by reversing the productions and calculating the First function of the reversed grammar, which gives the Last function for the given productions.

Detection of Vertically Ambiguous Productions

Given a grammar G' in CNF form, first of all check all the productions which have a possibility of ambiguity. All productions are checked with the function CheckVProduction(), which returns the LHS non-terminals that have more than one alternative.

Algorithm 7: CheckVProduction()
Input: Grammar G' containing productions P1, P2, ..., Pi
Output: Productions P ∈ {P1, ..., Pi} containing vertical ambiguity

Procedure CheckVProduction()
1. For each production P
2. Visited LHS(P) ← true
3. Count RHS(P) > 1
Return LHS(P)

// This method checks the vertical ambiguity in the input grammar
Procedure CheckVambiguity()
1. For each LHS(P) with productions of the form A → P1 | P2
2. RHS(P) ← FirstLast(P1, P2)
3. Call FirstLast(P1)
4. Call FirstLast(P2)
5. If FirstLast(P1) ∩ FirstLast(P2) ≠ ∅
then LHS(P) contains vertical ambiguity
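A minimal sketch of this vertical check is shown below. It approximates each alternative of a variable by the set of terminals that can begin or end the strings it derives (its First and Last symbols, read off a CNF grammar by a fixed point), and flags the variable when two alternatives share such a symbol. Since this is an over-approximation, a flagged variable is only a candidate for ambiguity; all helper names here are illustrative.

import java.util.*;

// Coarse vertical-ambiguity check for a CNF grammar: a variable is flagged when two of
// its alternatives can start or end with a common terminal (First/Last overlap).
public class VerticalCheck {
    // CNF productions: each rhs is either one terminal or two variables.
    static Map<String, List<List<String>>> prods = new LinkedHashMap<>();

    static Set<String> symbolSet(String var, Map<String, Set<String>> sets) {
        return sets.computeIfAbsent(var, k -> new HashSet<>());
    }

    // Fixed point: First(X) includes a for X->a and First(Y) for X->YZ (Last is symmetric).
    static Map<String, Set<String>> compute(boolean firstNotLast) {
        Map<String, Set<String>> sets = new HashMap<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : prods.entrySet())
                for (List<String> rhs : e.getValue()) {
                    Set<String> target = symbolSet(e.getKey(), sets);
                    if (rhs.size() == 1) changed |= target.add(rhs.get(0));
                    else changed |= target.addAll(symbolSet(rhs.get(firstNotLast ? 0 : 1), sets));
                }
        }
        return sets;
    }

    static Set<String> firstLast(List<String> rhs, Map<String, Set<String>> first, Map<String, Set<String>> last) {
        Set<String> fl = new HashSet<>();
        if (rhs.size() == 1) fl.add(rhs.get(0));
        else { fl.addAll(first.getOrDefault(rhs.get(0), Collections.<String>emptySet()));
               fl.addAll(last.getOrDefault(rhs.get(1), Collections.<String>emptySet())); }
        return fl;
    }

    public static void main(String[] args) {
        // S -> SS | a is in CNF and genuinely ambiguous ("aaa" has two parse trees).
        prods.put("S", Arrays.asList(Arrays.asList("S", "S"), Arrays.asList("a")));
        Map<String, Set<String>> first = compute(true), last = compute(false);
        List<List<String>> alts = prods.get("S");
        Set<String> fl1 = firstLast(alts.get(0), first, last), fl2 = firstLast(alts.get(1), first, last);
        fl1.retainAll(fl2);                       // overlap of First/Last symbols
        System.out.println(fl1.isEmpty() ? "no vertical-ambiguity candidate for S"
                                         : "S is a vertical-ambiguity candidate: " + fl1);
    }
}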
Detection of Horizontally Ambiguous Productions

Given a grammar in CNF form, first check all the productions which have a possibility of ambiguity and then detect the horizontal ambiguity in these productions.

Algorithm 8: CheckHProduction()




Input: Grammar G' containing productions P1, P2, ..., Pi
Output: Productions P ∈ {P1, ..., Pi} containing horizontal ambiguity

Procedure CheckHProduction()
1. For each production P
2. Visited LHS(P) ← true
3. Corresponding RHS(P) ← true
4. RHS(P)[Nonterminal(P) ← 1]
5. Production(Nonterminal(P) > 1)
Return LHS(P)

// This method checks the horizontal ambiguity in the input grammar
Procedure CheckHambiguity()
1. For each production of the form A → P1P2
2. Call Nonterminal()
3. For each Nonterminal(P1, P2)
4. Call First(P1, P2)
5. Call Last(P1, P2)
6. FirstLast(P1) = First(P1) ∪ Last(P1)
7. FirstLast(P2) = First(P2) ∪ Last(P2)
8. Production(P1)h = FirstLast(P1)
9. Production(P2)h = FirstLast(P2)
10. If [Production(P1)h ∩ Production(P2)h] ≠ ∅
then LHS(P) contains horizontal ambiguity
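The horizontal test asks whether the two parts of a right-hand side A → P1 P2 can overlap, approximated here by intersecting the First/Last symbol sets of P1 and P2. The sketch below assumes those FirstLast sets have already been computed (for instance as in the vertical-check sketch above) and only shows the overlap test itself; all names are illustrative.

import java.util.*;

// Horizontal-ambiguity candidate test for a CNF production A -> P1 P2:
// if the First/Last symbol sets of P1 and P2 share a terminal, the split point
// between the two parts may not be unique, so the rule is flagged as a candidate.
public class HorizontalCheck {
    public static Set<String> overlap(Set<String> firstLastP1, Set<String> firstLastP2) {
        Set<String> common = new HashSet<>(firstLastP1);
        common.retainAll(firstLastP2);
        return common;                    // an empty set means no horizontal-ambiguity candidate
    }

    public static void main(String[] args) {
        // Assumed precomputed FirstLast sets for the two rhs variables of some rule A -> P1 P2.
        Set<String> p1 = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> p2 = new HashSet<>(Arrays.asList("b", "c"));
        Set<String> common = overlap(p1, p2);
        System.out.println(common.isEmpty() ? "no horizontal-ambiguity candidate"
                                            : "horizontal-ambiguity candidate, overlap: " + common);
    }
}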
Implementation

This section describes the implementation of the proposed algorithms on various grammars. The proposed algorithms have been applied to a set of grammars in CNF form, which can easily be obtained from a given CFG. The test grammars included are of different sizes: ambiguous grammars which contain horizontal or vertical ambiguity, and unambiguous grammars.

Structure of Input Grammar

Grammars are stored as text files which should end with the .txt extension. They are specified as follows:
- Each line consists of space-separated strings which represent the productions for a common variable.
- The first string of any line must have length one and is a variable.
- The remaining strings on a given line are the right hand sides of the productions for that line's variable.
- The first variable on the first line is always the start variable.
- Any character which is not a variable is a terminal.

Example 6.1: The language {a^n b^n | n > 0} ∪ {b^n a^n | n > 0} has the context free grammar
S → X | Y
X → aXb | ab
Y → bYa | ba
Grammar G is stored in the txt file as follows:
S XY
X aXb ab
Y bYa ba

The proposed technique for ambiguity detection in context free grammars is implemented in Java. The output is shown in the following snapshots.

Figure 4: Demonstration of the input grammar converted into CNF

Figure 4 shows the grammar information of the input grammar file gg.txt and its CNF form. If the input grammar is already in CNF, the unchanged grammar is shown. The grammar information, such as the number of variables, the start symbol and the number of productions, is also shown in the output.
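Given the file format just described, a reader for such grammar files might look like the following sketch. The parsing rules implemented here (first token is the variable, remaining tokens are alternative right-hand sides, and the first line's variable is the start symbol) follow the description above; everything else, including the class name, is an illustrative assumption.

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

// Reads a grammar stored in the .txt format described in the text:
// each line is "V rhs1 rhs2 ...", and the variable of the first line is the start symbol.
public class GrammarFileReader {
    public static void main(String[] args) throws IOException {
        Map<String, List<String>> productions = new LinkedHashMap<>();
        String startSymbol = null;
        for (String line : Files.readAllLines(Paths.get("gg.txt"))) {
            if (line.trim().isEmpty()) continue;
            String[] tokens = line.trim().split("\\s+");
            String variable = tokens[0];                       // first token: the variable
            if (startSymbol == null) startSymbol = variable;   // first line: start symbol
            productions.computeIfAbsent(variable, v -> new ArrayList<>())
                       .addAll(Arrays.asList(tokens).subList(1, tokens.length));
        }
        // Every character of a right-hand side that is not a known variable is a terminal.
        Set<String> variables = productions.keySet();
        System.out.println("Start symbol: " + startSymbol);
        System.out.println("Variables: " + variables);
        productions.forEach((v, alts) -> System.out.println(v + " -> " + String.join(" | ", alts)));
    }
}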



Figure 5: First function of the input grammar

Figure 5 shows the First and Last symbols of the input grammar, which is now in CNF. With the help of the First and Last symbols of the given grammar, the horizontal and vertical ambiguity will be calculated.

Figure 6: Vertical or horizontal ambiguity type detection

Figure 6 shows whether the input grammar contains ambiguity and of which type; the input grammar could contain horizontal or vertical ambiguity.

Figure 7: Another scenario of the ambiguity test (vertical or horizontal)

Figure 7 shows another grammar taken as input for checking horizontal ambiguity.

The algorithm proposed here was tested with some known grammars, i.e., grammars whose ambiguity or non-ambiguity is known in advance, and it gave correct results for them. Detecting ambiguity is nevertheless undecidable for an arbitrary context-free grammar. That means that even though the proposed algorithm halts and outputs whether a given arbitrary grammar is ambiguous or not, the result is not provable in general.

Conclusion and Future Work

We have presented a technique for statically analyzing the ambiguity of context-free grammars, which is based on a linguistic characterization. This paper gives the ambiguity detection technique together with its background work. The presented algorithms, implemented in Java, identify the horizontal and vertical ambiguity for a context free grammar after converting it into CNF, because the CNF form is easily handled by the parsing steps and is beneficial in terms of computation. The presented algorithm is less complex than others and applicable to simple grammars.

Further, there is scope to characterize formally the sub-class of context free grammars for which the proposed algorithm works; in this way the problem of detecting ambiguity in CFGs can be studied in more depth. A mathematical proof, or a model which can show the working and correctness of the proposed algorithm, can be developed. The algorithm currently works only for limited sets of grammars, so a larger set of well known grammars should be used to verify the correctness of the given method. More improvement and optimization may also be done.




REFERENCES
[1] J. E. Hopcroft, R. Motwani, and J. D. Ullman, "Introduction to Automata Theory, Languages, and Computation", Pearson Education Asia, Delhi, India, 2001.
[2] D. I. A. Cohen, "Introduction to Computer Theory", John Wiley & Sons, Canada, 1986.
[3] S. C. Johnson, B. W. Kernighan, and M. D. McIlroy, "YACC: Yet Another Compiler-Compiler", UNIX Programmer's Manual, 7th edition, Bell Laboratories, Murray Hill, 1978.
[4] D. Parks, Appalachian State University, Lecture Notes, "Chapter 16: Non-Context-Free Languages", 2004.
[5] W. Kuich and A. Salomaa, "Semirings, Automata, Languages", Springer-Verlag, London, UK, 1985.
[6] S. Ginsburg and J. Ullian, "Ambiguity in Context Free Languages", Journal of the ACM (JACM), vol. 13, no. 1, pp. 62-89, 1966.
[7] A. Frank, T. H. King, J. Kuhn, and J. Maxwell, Stanford University, "Optimality Theory Style Constraint Ranking in Large-Scale LFG Grammars", 1998.
[8] L. Paulson, Stanford University, "A Compiler Generator for Semantic Grammars", Ph.D. dissertation, Department of Computer Science, 2005.
[9] A. Gacek, University of Minnesota, IT Lab Forums, "Ambiguous Method Call in Java", 2005.
[10] J. C. Cleaveland and R. C. Uzgalis, "Grammars for Programming Languages", Elsevier, New York, 1977.
[11] B. Wilson, The University of New South Wales, "Grammars and Parsing", 2004.
[12] S. Gorn, "Detection of generative ambiguities in context-free mechanical languages", J. ACM, vol. 10, no. 2, pp. 196-208, 1963.
[13] B. S. N. Cheung and R. C. Uzgalis, "Ambiguity in context-free grammars", in SAC '95: Proceedings of the 1995 ACM Symposium on Applied Computing, New York, USA, pp. 272-276, 1995.
[14] F. W. Schroer, "AMBER, an ambiguity checker for context-free grammars", Technical report, 2001.
[15] J. Earley, "An efficient context-free parsing algorithm", Communications of the ACM, vol. 13, no. 2, pp. 94-102, 1970.
[16] S. Jampana, "Exploring the problem of ambiguity in context-free grammars", Master's thesis, Oklahoma State University, 2005.
[17] D. E. Knuth, "On the translation of languages from left to right", Information and Control, vol. 8, no. 6, pp. 607-639, 1965.
[18] C. Brabrand, R. Giegerich, and A. Møller, "Analyzing ambiguity of context-free grammars", in Proc. 12th International Conference on Implementation and Application of Automata (CIAA), 2007.
[19] M. Mohri and M.-J. Nederhof, "Regular approximation of context-free grammars through transformation", in Robustness in Language and Speech Technology, chapter 9, Kluwer Academic Publishers, pp. 153-163, 2001.
[20] H. J. S. Basten, "Ambiguity detection methods for context-free grammars", Master's thesis, University of Amsterdam, 2007.

