Programming Language Syntax - Exam Notes
1. Introduction to Syntax
● Syntax refers to the structure and rules governing valid program statements.
● It is essential for defining the correct form of a programming language.
● Key Components:
○ Alphabets: Characters allowed in a language.
○ Tokens: Smallest units of meaning (e.g., keywords, identifiers, operators).
○ Grammars: Rules defining how tokens combine to form valid constructs.
2. Lexical Analysis
● The process of converting source code into tokens.
● Lexical Rules: Defined using regular expressions.
● Lexical Errors: Occur due to invalid character sequences.
● Tools Used: Lexical analyzers like Lex/Flex.
3. Context-Free Grammars (CFGs)
● Used to define syntax rules formally.
● Components:
○ Terminals: Actual symbols from the language.
○ Non-terminals: Abstract symbols representing groups of terminals.
○ Production Rules: Define how non-terminals expand into terminals.
○ Start Symbol: The root of the grammar tree.
Example CFG Rule:
expr -> expr + term | expr - term | term
term -> term * factor | term / factor | factor
●
4. Parse Trees & Abstract Syntax Trees (ASTs)
● Parse Tree: Shows the full structure as per grammar.
● AST: Simplified tree focusing on essential constructs, ignoring unnecessary details.
● Used for syntax analysis and semantic processing.
5. Parsing Techniques
● Top-Down Parsing: Constructs the parse tree from root to leaves.
○ Recursive Descent Parsing: Uses recursive procedures.
○ LL(k) Parsers: Predictive parsers reading left-to-right.
● Bottom-Up Parsing: Constructs the parse tree from leaves to root.
○ LR(k) Parsers: More powerful than LL parsers, used in compiler design.
○ Shift-Reduce Parsing: Used in LALR parsers (e.g., Yacc/Bison).
6. Ambiguity in Grammars
● A grammar is ambiguous if a string has multiple valid parse trees.
Example:
expr -> expr + expr | expr * expr | id
●
● Solution:
○ Define operator precedence & associativity explicitly.
○ Use transformations to remove ambiguity.
7. Syntax-Directed Translation (SDT)
● Links syntax with semantic processing.
● Annotated Parse Trees: Parse trees with additional semantic rules.
● Synthesized vs. Inherited Attributes:
○ Synthesized: Computed from child nodes.
○ Inherited: Derived from parent or sibling nodes.
8. Regular Expressions vs. CFGs
● Regular Expressions: Define lexical structure (tokenization).
● CFGs: Define syntactic structure (grammar rules).
● Example Comparison:
○ Regular Expression: [a-zA-Z]+ (identifiers)
○ CFG Rule: identifier -> letter identifier | letter
9. Compiler Phases Related to Syntax
● Lexical Analysis: Tokenization.
● Syntax Analysis: Parsing using CFGs.
● Semantic Analysis: Ensures meaningful constructs.
● Code Generation: Translates AST to machine code.
10. Key Exam Tips
● Understand CFGs and be able to construct simple grammars.
● Practice parsing techniques (LL, LR, Shift-Reduce).
● Be clear on ambiguity resolution and operator precedence.
● Revise lexical vs. syntactic analysis concepts.
● Learn differences between Parse Trees & ASTs.