Documentation

Download as doc, pdf, or txt
Download as doc, pdf, or txt
You are on page 1of 36

1

Contents
S. NO. 1. TOPICS NAMES PAGE NO.

INTRODUCTION

2.

EXISTING SYSTEM

3.

HARDWARE REQUIREMENTS

&

SOFTWARE

4.

TECHNOLOGIES USED

5.

S.D.L.C.

6.

SCREENS

32

7.

REFERENCES

44

0506CS081037

0506CS081037

PARSER GENERATOR
In computer science, a compiler-compiler or compiler generator is a tool that creates a parser, interpreter, or compiler from some form of formal description of a language and machine. The earliest and still most common form of compiler-compiler is a parser generator, whose input is a grammar of a programming language, and whose generated output is the source code of a parser often used as a component of a compiler. A typical parser generator associates executable code with each of the rules of the grammar that should be executed when these rules are applied by the parser. Parser generator is an extensive, visual, interactive tool for designing and experimenting with grammar. These pieces of code are sometimes referred to as semantic action routines since they define the semantics of the syntactic structure that is analyzed by the parser. Depending upon the type of parser that should be generated, these routines may construct a parse tree (or abstract syntax tree), or generate executable code directly. Parser generator reads a grammar and generates a recognizer for the language defined by the grammar (i.e. a program that reads an input stream and generates an error if the input stream does not conform to the syntax specified by the grammar). If there are no syntax errors, then parser generator creates a parse tree.

0506CS081037

EXISTING SYSTEM: Yacc


The computer program Yacc is a parser generator developed by Stephen C. Johnson at AT&T for the UNIX operating system. The name is an acronym for "Yet Another Compiler Compiler." It generates a parser (the part of a compiler that tries to make syntactic sense of the source code) based on an analytic grammar written in a notation similar to BNF. The parser generated by Yacc requires a lexical analyzer. Lexical analyzer generators, such as Lex or Flex are widely available. Some versions of AT&T Yacc have become open source. For example, source code (for different implementations) is available with the standard distributions of Plan 9 and Open Solaris.

ANTLR
In computer-based language recognition, ANTLR (pronounced Antler), or Another Tool for Language Recognition, is a parser generator that uses LL(*) parsing. ANTLR is the successor to the Purdue Compiler Construction Tool Set (PCCTS), first developed in 1989, and is under active development. ANTLR takes as input a grammar that specifies a language and generates as output source code for a recognizer for that language. A language is specified using a context-free grammar which is expressed using Extended BackusNaur Form (EBNF). ANTLR allows generating parsers, lexers, tree parsers, and combined lexer-parsers. Parsers can automatically generate abstract syntax trees which can be further processed with tree parsers. ANTLR provides a single consistent notation for specifying lexers, parsers, and tree parsers. This is in contrast with other parser/lexer generators and adds greatly to the tool's ease of use. By default, ANTLR reads a grammar and generates a recognizer for the language defined by the grammar (i.e. a program that reads an input stream and generates an error if the input stream does not conform to the syntax specified by the grammar). If there are no syntax errors, then the default action is to simply exit without printing any message. In order to do something useful with the language, actions can be attached to grammar elements in the grammar. These actions are written in the programming language in which the recognizer is being generated. When the recognizer is being generated, the actions are embedded in the source code of the recognizer at the appropriate points. Actions can be used to build and check symbol tables and to emit instructions in a target language, in the case of a compiler. As well as lexers and parsers, ANTLR can be used to generate tree parsers. These are recognizers that process abstract syntax trees which can be automatically generated by parsers. These tree parsers are unique to ANTLR and greatly simplify the processing of abstract syntax trees.

0506CS081037

Goals of Project: -

0506CS081037

0506CS081037

MINIMUM HARDWARE REQUIREMENTS

256 MB RAM. Pentium II 800 MHz Processor. Hard Disk 20 GB. Monitor. Key Board - 101 Keys. Mouse.

MINIMUM SOFTWARE REQUIREMENTS

0506CS081037

0506CS081037

Java
Java is a programming language originally developed by James Gosling at Sun Microsystems (which has since merged into Oracle Corporation) and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities. Java applications are typically compiled to byte code (class file) that can run on any Java Virtual Machine (JVM) regardless of computer architecture. Java is a general-purpose, concurrent, class-based, object-oriented language that is specifically designed to have as few implementation dependencies as possible. It is intended to let application developers "write once, run anywhere" (WORA), meaning that code that runs on one platform does not need to be recompiled to run on another. Java is currently one of the most popular programming languages in use, particularly for client-server web applications, with a reported 10 million users. It promised "Write Once, Run Anywhere" (WORA), providing no-cost run-times on popular platforms. Fairly secure and featuring configurable security, it allowed network- and file-access restrictions. Major web browsers soon incorporated the ability to run Java applets within web pages, and Java quickly became popular. With the advent of Java 2 (released initially as J2SE 1.2 in December 19981999), new versions had multiple configurations built for different types of platforms. For example, J2EE targeted enterprise applications and the greatly stripped-down version J2ME for mobile applications (Mobile Java). J2SE designated the Standard Edition Principles There were five primary goals in the creation of the Java language: 1. 2. 3. 4. 5. It should be "simple, object-oriented and familiar" It should be "robust and secure" It should be "architecture-neutral and portable" It should execute with "high performance" It should be "interpreted, threaded, and dynamic"

Java platform One characteristic of Java is portability, which means that computer programs written in the Java language must run similarly on any hardware/operating-system platform. This is achieved by compiling the Java language code to an intermediate representation called Java byte code, instead of directly to platform-specific machine code. Java byte code instructions are analogous to machine code, but are intended to be interpreted by a virtual machine (VM) written specifically for the host hardware. End-users commonly use a Java Runtime Environment (JRE) installed on their own machine for standalone Java applications, or in a Web browser for Java applets.

0506CS081037

10

FRONT END:

Lexical Analyzer:
Lexical Analyzer reads the source program character by character and returns the tokens of the source program. A token describes a pattern of characters having same meaning in the source program. (such as identifiers, operators, keywords, numbers, delimeters and so on) Puts information about identifiers into the symbol table. Regular expressions are used to describe tokens (lexical constructs). A (Deterministic) Finite State Automaton can be used in the implementation of a lexical analyzer. The main job of a lexical analyzer (scanner) is to break up an input stream into more usable elements (tokens) Lexical analyzer is an utility to help you rapidly generate your scanners Ex: newval := oldval + 12 tokens:Newval = identifier := = assignment operator Oldval = identifier + = add operator 12 = a number The Input: Read string input Might be sequence of characters Might be sequence of lines Character set ASCII

0506CS081037

11 The Output: ISO Latin-1 ISO 10646 (16-bit = Unicode) Others (EBCDIC, JIS, etc)

A series of tokens Punctuation ( ) ; , [ ] Operators + - ** := Keywords begin end if Identifiers Square Root Character literals x Numeric literals 123 4_5.23e+2 16#ac# Syntax Analyzer A Syntax Analyzer creates the syntactic structure (generally a parse tree) of the given program. A syntax analyzer is also called as a parser. A parse tree describes a syntactic structure.

Ex:

The Input:

= E -> T E E-> +T E | T-> F T T-> *F T | F-> (E) | id

The Output:

0506CS081037

12

From text to abstract syntax

0506CS081037

13

BACK END:
PARSING TECHNIQUES:

0506CS081037

14 TOP DOWN PARSING Top down parsing is constructing a parse tree for the input starting from the root and create nodes of the parse tree in preorder (depth first). A general form of top down parsing is the recursive descent parsing. A recursive descent parsing is a top down parsing technique that execute a set of recursive procedures to process the input, that involves backtracking(means scanning the input repeatedly). Backtracking is time consuming and therefore, inefficient. Top down parsing is constructing a parse tree for the input starting from the root and create nodes of the parse tree in preorder(depth first). A general form of top down parsing is the recursive descent parsing.A recursive descent parsing is a top down parsing technique that execute a set of recursive procedures to process the input, that involves backtracking(means scanning the input repeatedly). Backtracking is time consuming and therefore, inefficient. Thats why a special case of top down parsing was developed, called predictive parsing, where no backtracking is required. A dilemma can occur if there is a left recursive grammar. Even with backtracking, you can find the parser to go into an infinite loop. There are two types of recursion, left recursive and right recursive, based on its name, a left recursive grammar build trees that grows down to the left, while right recursive is vice versa.

Types of top down parsing


Top Down Parsing Techniques

LL 1 Parsing

Brute Force Parsing

Top down parsing: a*b+c

0506CS081037

15

LL(1)

It is also known as Predictive Parsing-a parsing technique that uses a look ahead symbol to determine if the current input arguments matches the look ahead symbol. First and Follow Construction of Predictive Parsing Tables LL(1) Grammars Error Recovery

1. First and Follow: First and Follow aids the construction of a predictive parser. They allow us to fill in the entries of a predictive parsing table. a is any string of terminals , then First(a) is the set of terminals that begin the strings derived from a. If a is an empty string (), then is also in First (a). Follow (A), for a non terminal A , to be the set of terminals a that can appear immediately to the right of A in a sentential form. Rules in computing FIRST (X) where X can be a terminal or nonterminal, or even (empty string). 1) If X is a terminal, then FIRST(X) = X. 2) If X is , then FIRST (X) = . 3) If X is a non terminal and Y and Z are non terminals, with a production of

0506CS081037

16 Consider again grammar G: E -> T E E -> +T E | T -> F T T -> *F T | F -> ( E ) | id ANSWER FIRST(E) = FIRST(T) = FIRST(F) = { ( , id } FIRST (E) = { + , } FIRST (T) = { *, }

Rules in computing FOLLOW ( X) where X is a no terminal 1) If X is a part of a production and is succeeded by a terminal, for example: A -> Xa; then Follow(X) = { a } 2) If X is the start symbol for a grammar, for ex: X -> AB A -> a B -> b then add $ to FOLLOW (X); FOLLOW(X)= { $ } 3) If X is a part of a production and followed by another non terminal, get the FIRST of that succeeding no terminal. Consider again grammar G: 1) E -> T E E -> +T E | T -> F T T -> *F T | F -> ( E ) | id ANSWERS FOLLOW(E) = FOLLOW(E)= { ) , $} FOLLOW (T)= FOLLOW(T)= { +, ), $} FOLLOW (F) = { +, * , ), $}

LL(1) Grammars
What does LL(1) mean?

The first L in LL(1) stands for scanning the input from left to right, the second L is for producing a leftmost derivation, and the 1 for using one input symbol of look ahead at each step to make parsing action decisions .No ambiguous or left recursive grammar is LL (1).

What should be done when a parsing table has multiple-defined entries .

One solution is to transform the grammar by eliminating all left recursion and then left factoring when possible, but not all grammars can yield an LL(1) grammar at all.

0506CS081037

17 The main difficulty in using a predictive parsing is in writing a grammar for the source language such that a predictive parser can be constructed from the grammar.To alleviate some of the difficulty, one can use a operator precedence, or even better the LR parser, that provides both the benefits of predictive parsing and operator precedence automatically.

Error Recovery
When does an error possibly occur?

An error is detected when the terminal on the top of the stack does not match the next input symbol or when the non terminal A is on the top of the stack, a is the next input symbol, and the parsing table entry M[A, a] is empty. How can we deal with errors?

Panic-mode error recovery is based on the idea of skipping symbols on the input until a token in a selected set of synch tokens appears. How does it work?

Using follow and first symbols as synchronizing tokens works well. The parsing table will be filled with synch tokens obtained from the FOLLOW set of the non terminal. When a parser looks up entry M [ A ,a ] and finds it blank, then a is skipped. If the entry is synch, then the non terminal is popped in an attempt to resume parsing.

2. Brute Force Parsing


A recursive-descent parsing program consists of a set of procedures, one for each non terminal. Execution begins with the procedure for the start symbol, which halts and announces success if its procedure body scans the entire input string. General recursive-descent may require backtracking; that is, it may require repeated scans over the input. Consider the grammar with input string cad: S -> c A d A -> a b | a

BOTTOM-UP PARSING Bottom-up parsing is a strategy for analyzing unknown information that attempts to identify the most fundamental units first, and then to infer higher-order structures from them.

0506CS081037

18 The name comes from the concept of a parse tree, in which the most fundamental units are at the bottom, and structures composed of them are in successively higher layers, until at the top of the tree a single unit, or start symbol, comprises all of the information being analyzed. Bottom-up parsing is a strategy for analyzing unknown information that attempts to identify the most fundamental units first, and then to infer higher-order structures from them. The name comes from the concept of a parse tree, in which the most fundamental units are at the bottom, and structures composed of them are in successively higher layers, until at the top of the tree a single unit, or start symbol, comprises all of the information being analyzed. Bottom-up parsing occurs in the analysis of both natural languages and computer languages. It is contrasted with top-down parsing, in which complex units are divided into simpler parts until all of the information is parsed. The language for a bottom-up parser is not as restrictive as that for a top-down parser. Left-recursion is not a problem because the tree is built from the leaves up. Strictly speaking, right-recursion, where the left-hand non terminal is repeated at the end of the right-hand side, is not a problem. However, as we will see, the bottom-up parsing method described here pushes terminals and non terminals on a stack until an appropriate right-hand side is found, and right-recursion can be somewhat inefficient in that many symbols must be pushed before a right-hand side is found

Types of bottom-up parsing


Bottom-up Parsing

Shift reduce parser

Operator precedence parser

L R parser

Bottom-up parsing:

0506CS081037

19

SLR parser
In computer science, a simple LR or SLR parser is an LR parser which uses a follow set to remove conflicts from its action table. It has fewer conflict states than an LR (0) Parser, but will typically have more conflict states than an LALR parser. For real-world computer languages, an SLR parser is usually not adequate, but for student projects in a compiler class it is a good learning tool. A grammar that has no conflicts reported by an SLR parser generator is an SLR grammar. The parsing algorithm for an SLR parser is the same as for an LR parser; the only difference between the two is how their action tables are constructed.

0506CS081037

20

0506CS081037

21 SDLC stands for Software Development Life Cycle. What is it all about? What are the various phases of it? Why should every client be aware of it? Here is the explanation to all these questions. As the name implies, it is primarily used to provide methodologies to develop computer software. It is a must to be aware of the guidelines and models in order to efficiently develop computer software. Various phases of SDLC are to be pursued in order to gain a better view of how a project could be handled. The phases of SDLC include requirements gathering and analysis, system designing, development, testing, operations and maintenance.

Requirements gathering and analysis: Under this phase, the project's goal is determined and the functions are brought to focus. Gathering of information and analysis of the user's requirement is also done in this phase. System design: A sample structure of the entire project is created in this phase and all necessary data are gathered. Development: Coding of the whole project is done in this phase. Codes are easily generated if proper design is performed in the previous stage. According to the needs of the application, programming language is selected.

0506CS081037

22 System testing: After the generation of codes, testing of all the modules is performed. All modules are integrated together and proper testing tools are selected for error checking. Operations and maintenance: Under the final stage of SDLC, the developed software is given to the users. Maintenance is necessary after the development of a successful project. It is obvious that changes occur once the project is handed over to the end user. The developers must develop the project in such a way that it is adaptable to those changes. The main operation of the software must not be affected by those changes. It is necessary for the client to be aware of all the phases of SDLC so that they can get to know about the status of the project under proposal. The client must also be aware of various operations performed under various stages. This would enable them to handle the project with ease. This also helps them to collect good quality products. Only when the clients are aware of the stages of SDLC, they can sort out the problems that arise while using the product. This would thus produce a good quality product.

System Requirement Specification:It is the study of user information requirement that is needed before the design of new information system can be completed Information gathering: Information gathering is an important tool in any analysis method. The information gathering is done through various like as follows. For information gathering we met to our project leader he provides as most of the information, which is required for the development of project. We also take some help from Internet and also from project guide, which is provided by the company.

Feasibility Study: All projects are feasible given unlimited resources and infinite time. Unfortunately, the development of a computer-based system is more likely to be plagued by a scarcity of resources and difficult delivery date. It is both necessary and prudent to evaluate the feasibility of a project at the earliest possible time. In the development of the present project, no such limitation was imposed that was not feasible, as all the resources were easily available and the time given was enough. The result of feasibility study is a formal proposal. This is simply a report- A formal document detailing the nature and scope of a proposed system.

0506CS081037

23

Feasibility Considerations:
Three key Considerations are involved in the feasibility analysis. Economic Feasibility:

Among the most important information contained in feasibility study is cost benefit analysis, an assessment of the economic justification for a computer based system project. Cost-benefit analysis delineates Costs for project development and weights them against tangible and intangible benefits of a system. The cost incurred in the development of the present project was outweighed by the benefit offered. Benefits of a new system were determined relative to the existing system. The existing system was not efficient; the flaws in that system have been mentioned previously.

Technical Feasibility: During technical analysis, the analyst evaluates the technical merits of the system concept, while at the time collecting additional information about performance, reliability, maintainability and reducibility. Technical analysis begins with an assessment of the technical viability of the proposed system, it is analyzed what kind of development environment is required. Technical feasibility centers on the existing computer system hardware, software etc. and to what extent it can support the proposed additions. What new methods and processes are required to accomplish system function and performance? A model is created based on the observation at the word or approximation based on the goals.

Behavioral Feasibility: An estimate should be made of new strong reaction; the user is likely to have toward the development of a computerized. It is common knowledge that computer installations have something to do with turnovers. Our system follows behavioral feasibility because of its friendliness in nature. Anyone can operate easily for this we have developed user interface and userfriendly system.

0506CS081037

24 We have provided with the maximum information and instruction to the users and company going to use the system. Our main emphasis is to create a users service environment.

System Analysis: It is truly said A machine can perform job of several men, so is the case with computerized system because under manual system, several people are engaged to perform the same task, which can be handled by a single machine. First, information is now recognized as a vital resource and must be managed. It is equal in importance to cash, physical facilities, and personnel. Second, more and more financial resources are committed to information systems. As computer systems are Becoming integral to business operations, top management is paying more attention to their development. Third, there is a growing need for formal longrange planning with information systems that are complex, require months or years to build, use common data bases, or have a greater competitive edge. The objectives are to map out the development of major systems and reduce the number of small, isolated systems to be developed and maintained. Proper planning for information systems. Ensures that the roe played by the systems will be congruent with that of the organization. In this project we have understood the importance of the analysis. Without a proper analysis we cannot develop proper software.

SYSTEM DESIGN: As we move towards implementation, documentation of the models becomes more precise and closer to the code that will eventually be produced from them. System Design is a solution, a how to approach to the creation of a new system .It provides the understanding and procedural details necessary for implementing the system.

A design phase is the transition from a user-oriented document (system proposal) to a document oriented to the programmers or database personnel. It emphasizes on translating performance specification into design specifications. System design goes through two phases of development: Logical design

0506CS081037

25 Physical design The Data flow diagrams and Entity-relationship Diagrams in the System Analysis section have represented logical design of the Training and Placement cell. The logical Design defines the system boundaries; it describes the inputs (source), output (destinations), databases (data stores) and procedure (data flow) all in a format that meets the user requirements. Following the logical design is the physical design. This produces the working by defining the design specifications that tell programmers what the system must do. Physical model of the Training and Placement specifies how the system will behave, when put in use.

DESIGN OBJECTIVES: The following goals are kept in mind while designing the new system: To make the system completely user friendly. The system acts as a catalyst in achieving the required objectives. To make the system compatible that it should fit in the total integrated system. To design the system in such a way that future maintenance and enhancements are negligible. To make the system understandable, reliable and cost effective. To improve the management of information by keeping it in properly structured tables.

0506CS081037

26

Screens Shots
0506CS081037

27 You can enter the grammar

0506CS081037

28

Screens Shots
You can enter the grammar (shown in Enter Grammar tutorial) then you can click on Input, followed by Brute Force Parse.

0506CS081037

29

Screens Shots
After clicking Brute Force Parse, you should see a window similar to this.

0506CS081037

30

Screens Shots
Now you can input a string and check whether it is accepted. Enter the string aabbbbbccc (a2b5c3) in the input text field. Then either click Start or press Enter key.

Screens Shots

0506CS081037

31 Brute Force Parser notifies you that the string you entered is accepted. Now, we can click on the Step button to see how we can derive that string using our productions. After clicking the Step button twice, your window should look like this.

Screens Shots

0506CS081037

32 Notice that at the bottom of the window there is a message, Derived bBc from B. This message tells us about the last production that was applied. We can also view the Derivation Table to see how the string is derived by the grammar. To change to Derivation Table mode, click on No inverted Tree and select Derivation Table. The Brute Force Parse window should change to:

Screens Shots

0506CS081037

33 The derivation table shows the productions that are applied (shown under Production) and the result of applying those productions to the start variable (shown under Derivation). It gives a clear picture of how Brute Force Parsing is working. After clicking Step until we reach our input string, the Brute Force Parse is finished and the Derivation Table and No inverted Tree should look like these:

0506CS081037

34

Screens Shots

0506CS081037

35

Presser Pereira, Luis Carlos, Jos Maria Marvell, and Adam Przeworski, 1993. Economic reforms in new democracies. Cambridge: Cambridge University Press.

0506CS081037

36 Dubois, Hans F.W. 2002. Harmonization of the European vaccination policy and the role TQM and reengineering could play. Quality Management in Health Care 10(2): 4757. J. A. Estes, M. T. Tinker, T. M. Williams, D. F. Dock "Killer Whale Predation on Sea Otters Linking Oceanic and near shore Ecosystems", Science, 16 October 1998: Vol. 282. no. 5388, pp. 473 476 Malone, T. C., D. J. Conley, T. R. Fisher, P. M. Gilbert, L.W. Harding & K.G. Sellner, 1996. Scales of nutrient-limited phytoplankton productivity in Chesapeake Bay. Estuaries, 19: 371385. Galati, K. (2008). Cognitive Psychology: In and out of the laboratory. USA: Wadsworth. Goldstein, E.B. (2010). Sensation and Perception. USA: Wadsworth. Compiler textbook references A collection of references to mainstream Compiler Construction Textbooks Aho, Alfred V.; Sethi, Ravi; and Ullman, Jeffrey D., Compilers: Principles, Techniques and Tools (ISBN 0-201-10088-6) link to publisher. Also known as The Dragon Book. Allen, Frances E., "A History of Language Processor Technology in IBM", IBM Journal of Research and Development, v.25, no.5, September 1981. Allen, Randy; and Kennedy, Ken, Optimizing Compilers for Modern Architectures, Morgan Kaufmann Publishers, 2001. ISBN 1-55860-286-0 Apple, Andrew Wilson o Modern Compiler Implementation in Java, 2nd edition. Cambridge University Press, 2002. ISBN 0-521-82060-X o Modern Compiler Implementation in ML, Cambridge University Press, 1998. ISBN 0-521-58274-1 Born at, Richard, Understanding and Writing Compilers: A Do It Yourself Guide, Macmillan Publishing, 1979. ISBN 0-333-21732-2 Cooper, Keith D., and Torsion, Linda, Engineering a Compiler, Morgan Kaufmann, 2004, ISBN 1-55860-699-8. Leveret; Cattle; Hobbs; Newcomer; Reiner; Schatz; Wolf, An Overview of the Production Quality Compiler-Compiler Project, in Computer 13(8):38-49 (August 1980) McKee man, William Marshall; Horning, James J.; Workman, David B., A Compiler Generator, Englewood Cliffs, N.J. : Prentice-Hall, 1970. ISBN 0-13155077-2 Much nick, Steven, Advanced Compiler Design and Implementation, Morgan Kaufmann Publishers, 1997. ISBN 1-55860-320-4 Scott, Michael Lee, Programming Language Pragmatics, Morgan Kaufmann, 2005, 2nd edition, 912 pages. ISBN 0-12-633951-1 (The author's site on this book). Srikant, Y. N.; Shankar, Priti, The Compiler Design Handbook: Optimizations and Machine Code Generation, CRC Press, 2003. ISBN 0-8493-1240-X

0506CS081037

You might also like