Context-Free Grammar: Sojharo Mangi BS-2

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

CONTEXT-FREE GRAMMAR 1

27 June,2011

Sojharo Mangi BS-2

CONTEXT-FREE GRAMMAR
The grammar whose productions have the form X-->a, where X is a nonterminal and is a nonempty string of terminals and non-terminals. Most programming languages can be approximated by context-free grammar and compilers for them have been developed based on properties of context-free languages.

Logic and Discrete Structure


Instructor: Hishaam bin Zubair

CONTEXT-FREE GRAMMAR 2 Context-Free Language The languages generated by context-free grammars are called context-free languages. The language of the grammar represented above i-e G= (V, T, S, P) is the set L (G) = {v T : P => p} Proper Context-Free Grammars A context-free grammar is said to be proper, if it has following properties. A CFG has no inaccessible symbols. It has not any unproductive symbol. It has no -productions It has no cycles.

A context-free grammar is a grammar in which every production rule is of the form V p Where V is a single non terminal symbol and p is a string of terminals or non-terminals i.e. p can be empty. A context-free grammar provides a simple precise mechanism for describing the methods. Through these methods, phrases in natural languages are built from smaller blocks, by capturing the block structure of sentences. The formalism and classification of context-free grammars, which was named as phrase-structure grammar, was developed in 1950s by Noam Chomsky. The context-free grammars are important in computer science fields for describing the structure of programming and other artificial languages. Context-Free Grammar Definition A context-free grammar denoted by G is defined by the 4-tuple: G = (V, T, S, P) where: V is called a non terminal character, which represents a different type of phrase in the sentence. Each variable denotes the sub-language of the particular language defined by the context-free Grammar G.

CONTEXT-FREE GRAMMAR 3 T is a finite set of terminals dislodge from V which makes the actual content of the sentence. R is a finite relation from V to (V U T) such that p (V U T) (P, p) S. here the members of S are called the re write productions of grammar. P is the starts Symbol, used to represent the whole sentence. It should be an element of V. Note: The asterisk represents the Kleene star operation. Notation A popular notation of a context-free grammar is Backus-Naur Form. The notation was named after the scientist John Backus. Later, Peter Naur refined it for the use of the programming language ALGOL. However, the Backus Naur Form is used to specify the syntactic rules of many computer languages, including Java. In the Backus Naur Form, we can combine all the productions with the same non-terminal symbol on the left hand side into single statement rather than using an arrow i.e. (implies or tends to) in a production. Furthermore, we enclose all non-terminal symbols in brackets, and enlisted right hand side productions in single statement, parting them through bars. For Example: The production, A Aa, A a, A AB can be combined into (A):= (A)a | (A) (B). A context-free grammar is a description of a language in a set of rules. Each rule says how a single symbol can be expanded into a sequence of other symbols. Those symbols may, in turn, have rules and can be expanded again. Some symbols are terminals and have no rules. A legal sentence in the described language is a sequence of terminal symbols that can be produced starting with the start symbol of the grammar and proceeding via some sequence of expansions. Grammars are designed to describe languages, where in our context a language is just a set of strings. Abstractly, we think of strings as a sequence of so-called

CONTEXT-FREE GRAMMAR 4 terminal symbols. Inside a compiler, these terminal symbols are most likely lexical tokens, produced from a bare character string by lexical analysis. (Pfenning, 2009)

Tree Interpretation Apart from rewrite interpretation of context-free grammars, there is a more basic method called as parse-tree. Parsing can be viewed as a search problem, using a finite and ordered rooted tree. In which the starting symbol is considered the root of the tree. The internal vertices of this represent the non-terminal symbols that arise during the derivation, where as terminal symbols are represented by the leaves. If there arises the production in the derivation of the form B w Then B has children vertices that denote every symbol in the word w from left to right. A basic structure of parse tree is as following:

Sentence

Noun phrase

Verb phrase

Article

Adjective

noun

Verb

Adverb

Quick

Fox

Jumps

wildly

CONTEXT-FREE GRAMMAR 5 Some Examples Of Context-Free Grammar Example No: 1 Well formed nested parentheses and square brackets: In this form two different types of nested parenthesis are described by the productions: SSS S () S(S) S [] S[S] With terminal symbols [] and () and the non-terminal symbol S. In such grammar following sequences of parenthesis can be derived: ([[[()()[ ] [ ] ] ] ([ ]) ]) However following pattern is wrong. [ [ [ [ (((( ] ] ] ] )))) (([ )) ([) (]) (]) (]) Example No: 2 S a S aS S bS In above examples terminals are a and b and the non terminal is S. Every regular grammar corresponds directly to a non deterministic finite automaton, so we know that this is a regular language. Example No: 3 Matching pairs; in a context-free grammar we can pair up the character in the same way as we did with brackets. Such as below S aSb S ab Example No: 4 The following is a form of context-free grammar for syntactically correct algebraic expressions in the variables x, y and z.

CONTEXT-FREE GRAMMAR 6 S x S y S z S S+S S S-S S S*S S S/S S (S) This grammar can generate the string as below (x +y) * x- z * y / (x + x) Solution: S S-S S*SS

(S)*SS/S (S)*SS/(S) ( S + S )* S S/ (S ) ( S + S )* S S *S/ (S ) ( S + S )* S S *S/ (S + S) ( x+ S )* S S *S/ (S + S ) ( x + y)* S S *S/ (S + S ) ( x+ y )* x S *y/ (S + S) ( x+ y )* x S *y/ (x + S) ( x+ y )* x z *y/ (x + S) ( x+ y )* x z *y/ (x + x) We can observe the regular grammar in more detail through the parse tree. A parse tree is an ordered rooted tree that represents the syntactic structure of the string in the contextfree grammar.

CONTEXT-FREE GRAMMAR 7 Derivations And Syntax Trees A derivation of a string for a grammar is a sequence of grammar rule applications that transforms the start symbol into the string. A derivation is determined by giving the rule applied in each step and the occurrence of its right hand side to which it is applied. For instance: 1. S S+S 2. S 1 3. S a Therefore the string 1 + 1 + a can be derived with the help of following steps. S S+S (rule 1 on first S) (rule 1 on second S)

S+S+S (rule 2 on second S) S+1+S (rule 3 on second S) S+1+a (rule 2 on first S) 1+1+a The steps given above can be summarized as: Rule 1, rule 2, rule 1, rule 2, rule 3. It is necessary to distinguish between the left most and right most derivation. As the pattern denotes the order in which the chunks of the code will be executed. Context-Free Grammar Versus Regular Expressions Context-free grammars are more powerful than that of the regular expressions. Anything that can be generated by regular expressions can be generated by context-free grammar. However, there are languages that can be generated by context-free grammar but not by any regular expression. The method to stimulate a RE using a context-free grammar is more powerful than a regular expression. A context-free grammar is build into pieces, where each piece matches to the operand and operators in the regular expression. Let RE is a single operand. Then if RE is a character in the alphabet, add to G the production <RE> RE

CONTEXT-FREE GRAMMAR 8 If RE is null then dont add a production Let RE is R1R2. Add G to the production <RE> <R1> <R2> and create the productions for regular expressions R1 and R2. Let Re is R1|R2. Add G to the production <RE> <R1> | <R2> and create the productions for regular expressions R1 and R2.

Example Of RE To CFG Is Given Below: The CFG G is built here for the RE (0|1)*111 Firstly the operands are: <0> 0 <1> 1 Secondly, the inner most operation will be like this <R1> <0> | <1> Now the closure operator will be <R2> <R1> <R2> | epsilon After that, the concatenation operators are: <RE> R2, R3, R4, R5 <R3> <1>

<R4> <1> <R5> <1> Finally the grammar is <RE> R2, R3, R4, R5 <R2> < R1> <R2> | epsilon <R1> <0> <1> <R3> <1>

<R4> <1> <R5> <1> <0>0 <1> 1

CONTEXT-FREE GRAMMAR 9 Conclusion In the area of software engineering, context-free grammars are of supreme importance for defining the syntactic component of programming languages. Grammars are increasingly being used in various software development scenarios, and recent efforts seek to shape an engineering discipline for grammars. Hence, we can conclude that this discipline will emerge as fundamental for programming languages as far as computer science is concerned.

CONTEXT-FREE GRAMMAR 10 Bibliography

H.Rosen, K. (1999). Discrete Mathamatics and its Applications. AT&T Laboratories. LeBlanc, T. (1996, January 16). Retrieved June 23, 2011, from

http://www.cs.rochester.edu/~nelson/courses/csc_173/grammars/ McCallum, A. (2007, Fall). Introduction to Natural Language Processing. Retrieved May 10th, 2011, from Umass: http://www.cs.umass.edu/~mccallum/courses/inlp2007/lect5-cfg.pdf

You might also like