What Do You Mean by Code Optimization

----What do you mean by code optimization? Why is the code optimized?
Code optimization refers to the process of improving the efficiency of a program by making it run faster, use less memory, or consume fewer resources while maintaining the same functionality.
Why is code optimized? 1. Performance: Faster execution of programs (e.g., reduced CPU time). 2. Resource efficiency: Lower memory usage and reduced disk space. 3. Energy efficiency: In some systems, optimized code reduces power consumption, which is important for mobile devices or embedded systems. 4. Scalability: Optimized code can handle larger inputs or more concurrent tasks more effectively.
The principal sources of optimization in code are: 1. Time Optimization: Reducing execution time, often through faster algorithms or by minimizing unnecessary computations (e.g., loop unrolling, parallelization). 2. Space Optimization: Reducing memory usage, typically by using more efficient data structures or eliminating redundant data storage. 3. Algorithmic Optimization: Choosing more efficient algorithms (e.g., using quicksort instead of bubble sort) to solve the same problem in less time or space. 4. Compiler Optimizations: Leveraging compiler features like instruction reordering, constant folding, and loop optimization to enhance performance without changing the source code. 5. Memory Access Optimization: Improving the way memory is accessed (e.g., cache optimization, memory pooling) to reduce latency and improve throughput. 6. Parallelization: Breaking the program into concurrent tasks that can be executed simultaneously, making use of multiple processors or cores.

------Code Generation:
Code generation refers to the process in a compiler or programming toolchain where intermediate representations (IR) of a program are translated into machine code, assembly language, or an executable format that can be run directly by a computer's hardware.
Key Aspects of Code Generation: 1. Intermediate Representation (IR): After parsing and semantic analysis, a compiler typically produces an intermediate representation (e.g., abstract syntax tree, control flow graph). Code generation takes this IR and converts it into low-level code. 2. Target Code: The generated code is typically specific to the target machine architecture (e.g., x86, ARM) or a virtual machine (e.g., Java bytecode for the JVM). 3. Optimization: During code generation, the compiler may also perform optimizations to improve the generated code, such as register allocation or instruction selection.

------Various issues in code generation:
Here are the key issues in code generation: 1. Target Architecture Dependence: Different processors have different instruction sets and features, making it challenging to generate efficient code for various platforms. 2. Register Allocation: Limited registers require careful allocation of variables and intermediate values to minimize memory access and improve performance. 3. Instruction Selection: Not all high-level operations map directly to a single machine instruction, requiring optimal instruction sequences. 4. Instruction Scheduling: Reordering instructions to avoid delays due to dependencies and maximize processor pipeline efficiency. 5. Memory Management: Efficient handling of memory (e.g., stack vs. heap) and minimizing memory accesses.

-------LOOP OPTIMISATION:
Loop optimization in compiler design focuses on improving the performance of loops by reducing execution time and resource usage. Key techniques include: 1. Loop Unrolling: Expanding the loop body to reduce loop-control overhead. 2. Loop Fusion: Merging multiple loops that iterate over the same range to reduce overhead. 3. Loop Tiling (Blocking): Breaking loops into smaller blocks to improve cache locality and memory access. 4. Loop Parallelization: Executing loop iterations concurrently across multiple processors. 5. Strength Reduction: Replacing costly operations (e.g., multiplication) with simpler ones (e.g., addition). 6. Dead Code Elimination: Removing unnecessary computations within loops. 7. Loop Inversion: Reversing loop conditions to simplify and speed up execution.

----Peephole Optimization:
Peephole optimization is a local optimization technique in compilers that improves a small part (or "window") of the generated code, typically a few consecutive instructions. The goal is to replace a sequence of instructions with a more efficient sequence, often using patterns or simplifications.
Characteristics of Peephole Optimization: 1. Local Optimization: Focuses on optimizing a small region of code (a "peephole"), usually just a few consecutive instructions. 2. Pattern Matching: Identifies and replaces common instruction patterns with more efficient ones (e.g., replacing ADD R1, R2, #0 with MOV R1, R2). 3. Simple & Fast: It is a simple and fast optimization technique, typically applied at the final stages of code generation. 4. Reduces Redundancy: Eliminates redundant or unnecessary instructions, for example by folding constants or removing no-op instructions. 5. Improves Efficiency: Can reduce code size and improve performance by optimizing common subexpressions, eliminating dead code, or simplifying instructions.

------Symbol Table:
A symbol table is a data structure used by compilers to store information about variables, functions, objects, and other symbols in a program. It holds details such as each symbol's name, type, scope, memory location, and other attributes required for code generation and semantic analysis.
Key Functions of a Symbol Table: 1. Tracking Identifiers: Stores symbols (e.g., variable names, function names). 2. Scope Management: Keeps track of the scope (e.g., local, global) of each symbol. 3. Type Checking: Ensures the correct types are assigned to variables and expressions. 4. Memory Allocation: Helps in assigning memory addresses to variables.
Data Structures Used for Symbol Tables: 1. Hash Table: Most common; provides fast lookups and insertions (average O(1) time complexity) and is efficient for handling a large number of symbols. 2. Binary Search Tree (BST): Symbols are stored in sorted order, enabling efficient searching, insertion, and deletion; O(log n) complexity for lookups and insertions. 3. Linked List: Simple to implement but inefficient for large datasets (O(n) for search operations); used in simple or small-scale implementations. 4. AVL Tree / Red-Black Tree: Self-balancing binary search trees that provide O(log n) time complexity for all operations; useful for maintaining sorted order while ensuring fast lookup. 5. Array of Buckets: A hash table with a predefined array, where each array element (bucket) contains a list of symbols; suitable for handling collisions in hash tables.

------What is meant by syntax-directed definition? Explain the order of syntax-directed definitions in detail.
A Syntax-Directed Definition (SDD) is a formal method used in compilers to associate semantic rules with the grammar of a programming language. It specifies how the semantic values (such as types, values, or attributes) of constructs in a language are derived from the syntax of the language.
Order of Syntax-Directed Definitions: SDDs are classified based on the order in which the semantic rules are applied: 1. Syntactic Order: The order of semantic rules is determined by the structure of the grammar; for each production, the rules are applied in the same order as the grammar (top-down, left to right). 2. Attribute Propagation: Inherited Attributes are propagated from parent to child in the parse tree (top-down); Synthesized Attributes are computed from child nodes up to parent nodes (bottom-up). 3. SDD Classes: An SDD is L-attributed if the inherited attributes of a non-terminal in a production depend only on the production's left-hand side and the symbols preceding that non-terminal; it is S-attributed if all attributes are synthesized, i.e., computed bottom-up.
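The pattern-replacement idea from the Peephole Optimization section can be sketched in a few lines. This is a minimal illustration over symbolic instruction tuples, not a real instruction set; the two patterns (rewriting an ADD with #0 into MOV, and removing NOPs) mirror the examples given above.

```python
def peephole(instructions):
    """Slide a one-instruction window over the code, rewriting known patterns."""
    optimized = []
    for ins in instructions:
        op = ins[0]
        # Pattern 1: ADD Rd, Rs, #0  ->  MOV Rd, Rs  (adding zero is a copy)
        if op == "ADD" and ins[3] == "#0":
            optimized.append(("MOV", ins[1], ins[2]))
        # Pattern 2: drop no-op instructions entirely
        elif op == "NOP":
            continue
        else:
            optimized.append(ins)
    return optimized

code = [("ADD", "R1", "R2", "#0"), ("NOP",), ("SUB", "R3", "R1", "R2")]
print(peephole(code))  # [('MOV', 'R1', 'R2'), ('SUB', 'R3', 'R1', 'R2')]
```

A real peephole pass would run repeatedly until no pattern fires, since one rewrite can expose another.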

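The symbol-table operations described above (tracking identifiers, scope management, O(1) average hash-table lookups) can be sketched as a stack of dictionaries, one per scope. The variable names and attributes used here are illustrative assumptions.

```python
class SymbolTable:
    """A scoped symbol table: a stack of hash tables (Python dicts)."""

    def __init__(self):
        self.scopes = [{}]              # global scope at the bottom

    def enter_scope(self):
        self.scopes.append({})          # e.g., on entering a function body

    def exit_scope(self):
        self.scopes.pop()               # local symbols are discarded

    def declare(self, name, **attrs):
        self.scopes[-1][name] = attrs   # O(1) average insertion

    def lookup(self, name):
        # Search from the innermost scope outward, as scope rules require.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None                     # undeclared identifier

table = SymbolTable()
table.declare("x", type="int", addr=0)
table.enter_scope()
table.declare("x", type="float", addr=4)   # shadows the global x
print(table.lookup("x"))   # {'type': 'float', 'addr': 4}
table.exit_scope()
print(table.lookup("x"))   # {'type': 'int', 'addr': 0}
```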
-------Global Data Flow Analysis:
Global data flow analysis is a technique used in compilers to track how data values flow through the entire program or across multiple basic blocks. It helps optimize code by analyzing variable usage, dependencies, and lifetimes throughout the entire control flow of a program.
Key Concepts: 1. Data Flow Facts: Information about how values are assigned, modified, and used by variables. 2. Reaching Definitions: Tracks where each variable is assigned a value and whether that value "reaches" a particular point in the program. 3. Live Variable Analysis: Determines which variables are "live" (i.e., hold values that will be used in the future). 4. Available Expressions: Identifies expressions that have already been computed and can be reused, avoiding redundant computation. 5. Constant Propagation: Propagates constant values throughout the program to optimize expressions.
Purpose: 1. Optimization: Identifies opportunities for optimizations such as dead code elimination, constant folding, and common subexpression elimination. 2. Correctness: Ensures correctness by maintaining valid data flow relationships across the program.

-------Simple Code Generator:
A simple code generator in a compiler converts an intermediate representation (IR) of a program into target machine code or assembly language. The design of a simple code generator involves the following key steps:
1. Input (Intermediate Representation): The input to the code generator is typically an intermediate representation such as three-address code (TAC) or an abstract syntax tree (AST).
2. Instruction Selection: The code generator selects machine or assembly instructions corresponding to each operation in the IR.
3. Register Allocation: Variables and temporary values are assigned to processor registers to minimize memory access. The code generator must decide which variables will be stored in registers and which will stay in memory.
4. Instruction Scheduling: Instructions are ordered to minimize delays and maximize processor efficiency, considering dependencies between instructions.
5. Code Emission: The final output is the generated target machine code or assembly instructions, ready for execution.

------Run-Time Storage Allocation Strategies:
Run-time storage allocation refers to how memory is managed for variables, functions, and data during the execution of a program. Several strategies are used: 1. Stack Allocation: Fast and scope-based, but limited in size. 2. Heap Allocation: Flexible, but slower; requires manual deallocation or garbage collection. 3. Static Allocation: Fixed at compile time; fast, but lacks flexibility. 4. Data Segment Allocation: Fixed; stores global variables and constants. 5. Memory Pool: Efficient for managing specific kinds of memory allocations.

------General Structure of an Activation Record:
An activation record (also called a stack frame) is a data structure used by a compiler to manage function calls and local variables during the execution of a program. It is pushed onto the call stack when a function is called and popped when the function returns. Each activation record corresponds to a single function invocation.
| Return Address     | <- Return point (where to continue after the function call)
| Parameters (Args)  | <- Arguments passed to the function
| Local Variables    | <- Variables declared within the function
| Saved Registers    | <- Registers to restore upon return
| Control Link       | <- Points to the caller's activation record
| Access Link        | <- Points to the activation record of the enclosing function
| Return Value       | <- Value returned by the function
Purpose of Each Component: 1. Return Address: Ensures control returns to the correct point after function execution. 2. Parameters: Stores the arguments passed to the function for use within the function. 3. Local Variables: Allocates memory for variables that are local to the function and are discarded after execution. 4. Saved Registers: Preserves registers that the function might alter, so the caller can resume execution correctly. 5. Control Link: Allows the function to return to the correct location in the caller's activation record. 6. Access Link: Provides access to non-local variables (used in nested functions or closures). 7. Return Value: Holds the result that will be returned to the caller.

-------Algorithm to Convert an NFA to a DFA:
The process of converting a Non-deterministic Finite Automaton (NFA) to a Deterministic Finite Automaton (DFA) involves the following steps:
1. Convert the NFA to a DFA (subset construction): (a) Initialize: Start with the initial state of the NFA and its ε-closure (the set of states reachable from the initial state through ε-transitions), and create a new DFA state representing this ε-closure. (b) Process Each Input Symbol: For each DFA state and each input symbol, compute the set of NFA states that can be reached (including any ε-closures). (c) Create New DFA States: If the set of states computed for a particular input symbol has not been encountered before, create a new DFA state and add it to the DFA. (d) Accepting States: Any DFA state that includes an accepting state of the NFA is an accepting state of the DFA. (e) Repeat: Repeat steps (b) and (c) until no new DFA states are created.
2. Minimize the DFA: After the conversion, the DFA can be minimized using a state-minimization technique such as partition refinement: (a) Group States: Initially, divide the states into two groups: accepting and non-accepting. (b) Refine Groups: Iteratively refine the groups by splitting them based on transitions for each input symbol; states that transition to different groups on the same input symbol must be placed in different groups. (c) Final States: Continue refining until no further splits are possible. Each group represents a single state in the minimized DFA.

------Role of Lexical Analysis:
Lexical analysis is the first phase of a compiler, responsible for converting the source code (a stream of characters) into a sequence of tokens. A token is a meaningful sequence of characters that corresponds to a basic language construct, such as a keyword, operator, identifier, constant, or punctuation symbol.
Error Recovery in the Lexical Phase of a Compiler: In the lexical phase, error recovery handles issues like unrecognized tokens or invalid characters. Common strategies: 1. Panic Mode: Skip characters until a valid token is found (quick, but may miss other errors). 2. Error Productions: Use special error rules to create <ERROR> tokens for invalid input. 3. Input Buffering: Look ahead at characters to detect and recover from errors based on context. 4. Backtracking: Re-scan and try different lexical rules if an error is found. 5. Error Reporting: Provide detailed error messages to help debug the source code.

-----Intermediate Code Generation:
Intermediate code generation is a phase in a compiler where the high-level source code is translated into an intermediate representation (IR): a lower-level, machine-independent code. This code serves as a bridge between the front end (parsing) and back end (code generation) of the compiler.
Key Points: 1. Purpose: The main goal is to abstract away the details of the target machine while retaining enough information for optimization and final machine code generation. 2. Machine-Independent: The intermediate code is independent of any specific hardware architecture, allowing the same intermediate code to be translated into machine code for different platforms. 3. Simplicity: It simplifies further optimization and code generation tasks, as the intermediate code is usually easier to manipulate and optimize than high-level source code.
Examples of Intermediate Code: Three-Address Code (TAC): a common form of IR where each instruction has at most one operation and two operands. Quadruples: a tuple of (operator, operand1, operand2, result). Abstract Syntax Trees (ASTs): some compilers generate IR as a tree-like structure.

-------Syntax Trees:
A syntax tree (or parse tree) is a tree representation of the syntactic structure of a program according to its grammar. Each node in the tree corresponds to a grammatical construct (either terminal or non-terminal), and the tree structure represents how the program's components are hierarchically organized.
Construction of a Syntax Tree:
1. Grammar Definition: Define the grammar rules (e.g., E → E + T or T → id).
2. Parsing the Input: Parse the source code based on the grammar using a parser (top-down or bottom-up).
3. Apply Grammar Rules: For each expression, apply the appropriate grammar rule, expanding non-terminals into further sub-expressions.
4. Build the Tree: Create a node for each non-terminal and connect the nodes according to the grammar rules, with leaf nodes for terminals.

------Phases of a Compiler:
A compiler typically works in several phases: 1. Lexical Analysis: Converts the source code into tokens (keywords, operators, identifiers, etc.). Output: tokens. 2. Syntax Analysis: Builds a syntax tree (parse tree) from the tokens based on the language's grammar. Output: syntax tree. 3. Semantic Analysis: Checks for logical errors (e.g., type mismatches, undeclared variables). Output: annotated syntax tree or intermediate representation. 4. Intermediate Code Generation: Converts the syntax tree into an intermediate representation (IR), often a low-level code. Output: intermediate code (e.g., three-address code). 5. Optimization: Optimizes the intermediate code to improve performance or reduce resource usage. Output: optimized intermediate code. 6. Code Generation: Converts the intermediate code to the target machine code or assembly. Output: machine code or assembly code. 7. Code Linking and Assembly: Links the generated machine code with libraries and other modules to create the final executable. Output: executable program.

----Bottom-Up Parsing:
Bottom-up parsing starts from the input tokens and reduces them to the start symbol of the grammar.
Operations: 1. Shift: Push the next input symbol onto the stack. 2. Reduce: Replace matched symbols on the stack with the corresponding non-terminal from the production rule.
Advantages of Bottom-Up Parsing: 1. Handles Complex Grammars: Can parse a larger set of context-free grammars. 2. Error Recovery: Easier to detect and recover from errors. 3. Efficiency: In many cases, more efficient than top-down parsers.
Types of LR Parsers: 1. SLR (Simple LR): Description: Uses simpler parsing tables, deciding reductions with the FOLLOW sets of the productions' left-hand sides. Advantages: Simple to implement. Limitations: Cannot handle all grammars (conflicts arise in some cases). 2. LR(1): Description: Uses a 1-token lookahead to build more precise parsing tables. Advantages: Powerful; handles more grammars. Limitations: More complex and memory-intensive. 3. LALR (Look-Ahead LR): Description: Merges states of the LR(1) automaton to reduce table size. Advantages: Compact like SLR, but more powerful. Limitations: Can still encounter conflicts in some grammars.

----Top-Down Parsing:
Top-down parsing is a strategy where the parser starts from the start symbol of the grammar and tries to derive the input string by applying production rules, recursively expanding non-terminals into terminals. It constructs the parse tree from the top (root) down to the leaves.
Problems with Top-Down Parsing: 1. Left Recursion: Problem: If a grammar contains left recursion (e.g., A → Aα | β), top-down parsers can fall into infinite recursion. Solution: Left recursion must be eliminated before parsing. 2. Backtracking: Problem: If the parser makes an incorrect choice at some step, it may need to backtrack and try another production, making the process inefficient. Solution: Predictive parsers like LL(1) avoid backtracking by using lookahead tokens. 3. Ambiguity: Problem: Some grammars are ambiguous, so multiple parse trees can be derived for the same input. Solution: Ambiguity must be resolved, usually by changing the grammar or adding constraints. 4. Limited Grammars: Problem: Top-down parsers (like LL parsers) can only handle LL(k) grammars, which limits the grammars they can parse. Solution: More powerful parsers, such as LR parsers, can handle a broader range of grammars.

-----Linkers and Loaders:
1. Linker: A linker combines multiple object files generated by the compiler into a single executable program. It resolves references between functions or variables across different object files, ensuring that the addresses of functions, variables, and libraries are correctly assigned.
2. Loader: A loader loads the executable program into memory for execution. It assigns memory addresses to the program and its data and prepares it for execution by the CPU.
Static and Dynamic Linking: Static Linking: Definition: All external references (e.g., functions, variables) are resolved before execution, at link time. How it works: The linker directly includes libraries and resolves addresses before execution. Advantage: Faster execution (no linking overhead at runtime). Disadvantage: Larger executable size, as all code is embedded. Dynamic Linking: Definition: External references are resolved at runtime. How it works: The program links to libraries when it is executed, often using shared libraries. Advantage: Smaller executable size; allows code updates without recompilation. Disadvantage: Slight overhead during execution due to runtime linking.
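The subset-construction steps from the NFA-to-DFA section can be sketched compactly. The transition-table encoding (a dict mapping state → symbol → set of states, with the empty string standing for ε) and the example NFA, which accepts strings over {a, b} ending in "ab", are illustrative assumptions.

```python
def eps_closure(nfa, states):
    """All states reachable from `states` through epsilon ('') transitions."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get(s, {}).get("", set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def nfa_to_dfa(nfa, start, accepts, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = eps_closure(nfa, {start})
    dfa, worklist = {}, [start_set]
    while worklist:
        current = worklist.pop()
        if current in dfa:                      # already processed
            continue
        dfa[current] = {}
        for a in alphabet:
            move = set()
            for s in current:                   # union of moves on symbol a
                move |= nfa.get(s, {}).get(a, set())
            target = eps_closure(nfa, move)
            dfa[current][a] = target
            if target not in dfa:
                worklist.append(target)         # a newly discovered DFA state
    # A DFA state accepts iff it contains an accepting NFA state.
    finals = {st for st in dfa if st & accepts}
    return dfa, start_set, finals

nfa = {0: {"a": {0, 1}, "b": {0}}, 1: {"b": {2}}}
dfa, start, finals = nfa_to_dfa(nfa, 0, {2}, "ab")
print(len(dfa))   # 3 reachable DFA states: {0}, {0,1}, {0,2}
```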

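As a sketch of three-address code generation, the following walks an assumed tuple-based AST (a string is a name; a tuple is (op, left, right)) and emits one operator per instruction, inventing temporaries t1, t2, … as it goes.

```python
temp_count = 0

def new_temp():
    """Invent a fresh temporary name for an intermediate result."""
    global temp_count
    temp_count += 1
    return f"t{temp_count}"

def gen_tac(node, code):
    """Return the place holding node's value, appending TAC lines to `code`."""
    if isinstance(node, str):
        return node                          # a name is its own place
    op, left, right = node
    l = gen_tac(left, code)
    r = gen_tac(right, code)
    t = new_temp()
    code.append(f"{t} = {l} {op} {r}")       # at most one operator per line
    return t

code = []
result = gen_tac(("*", ("+", "a", "b"), "c"), code)   # (a + b) * c
print(code)    # ['t1 = a + b', 't2 = t1 * c']
```

Each emitted line maps directly onto a quadruple (operator, operand1, operand2, result), the other IR form mentioned above.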
-----Compiler:
A compiler is a program that translates the entire source code of a high-level programming language into machine code (or intermediate code) in one go, creating an executable file. The translation is done before execution, and once compiled, the program can be run multiple times without recompiling.
DiYerence Between Compiler and Interpreter:
Feature | Compiler | Interpreter
Translation | Translates the entire source code at once. | Translates line by line or statement by statement.
Execution | Produces an executable file. | Executes the code directly without generating an intermediate file.
Speed | Generally faster at runtime (after compilation). | Slower execution, as translation happens during runtime.
Error Detection | Detects all errors at once (after full compilation). | Stops at the first error and reports it.
Memory Usage | Requires more memory for generating machine code. | Requires less memory; no need to generate an intermediate file.

---General Loader Schemes:
A loader is a system program that loads an executable file into memory for execution. It prepares the machine code and necessary data for the CPU to execute. There are several loader schemes based on how they manage memory and load the program:
1. Absolute Loader: Definition: An absolute loader loads the program into a specific, predefined memory address without any modification. How it works: The program's machine code is loaded directly into the specified memory locations. Advantages: Simple and efficient for small systems. Disadvantages: It does not support relocation (moving programs to different memory locations), making it less flexible.
2. Relocating Loader: Definition: A relocating loader adjusts memory references in the program so it can be loaded at any address in memory. How it works: It modifies the addresses in the code (such as function calls or variable references) to reflect the correct memory locations after the program is loaded. Advantages: Allows programs to be loaded into any available memory space, improving flexibility. Disadvantages: Requires extra time to adjust addresses, leading to slightly higher overhead.
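The relocating-loader scheme above can be sketched with a toy word-addressed memory image: the object code is assembled as if loaded at address 0, and a relocation table lists which words hold addresses that must be patched with the real load address. All values are illustrative.

```python
def relocate_and_load(object_code, relocation_table, load_address):
    """Return a copy of the code adjusted for its actual load address."""
    memory_image = list(object_code)
    for index in relocation_table:       # patch only the address words
        memory_image[index] += load_address
    return memory_image

# Assume word 1 is a jump target and word 3 a variable address; words 0
# and 2 are opcodes/constants and must NOT be adjusted.
code = [10, 4, 99, 6]
relocs = [1, 3]
print(relocate_and_load(code, relocs, 5000))   # [10, 5004, 99, 5006]
```

An absolute loader is the degenerate case where the load address is fixed in advance and the relocation step is skipped entirely.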
3. Direct Linking Loader: Definition: This loader is used when the program's external references (e.g., function calls) are already resolved during linking, so no further adjustment is needed during loading. How it works: It directly loads the program; the necessary addresses were already determined during the linking phase. Advantages: Faster, as no relocation or address adjustments are necessary during loading. Disadvantages: Requires that all references be resolved before loading.
4. Dynamic Linking Loader: Definition: This loader uses dynamic linking, where libraries and modules are linked at runtime rather than at compile time. How it works: The loader loads the program and dynamically links shared libraries or modules into memory when the program runs. Advantages: Reduces the size of the executable and allows shared libraries to be updated without recompiling the program. Disadvantages: Slight performance overhead at runtime due to dynamic linking.
5. Bootstrap Loader: Definition: A bootstrap loader is a small program that loads the full operating system or other loaders. How it works: It is typically the first program to run when the computer is powered on, initiating the operating system or other system components. Advantages: Essential for starting up the system. Disadvantages: Extremely limited in functionality, as it is very simple and small.
6. Load-and-Go Loader: Definition: A load-and-go loader loads a program into memory and immediately starts execution. How it works: The program is loaded directly into memory, and the CPU begins executing the code as soon as it is loaded. Advantages: Simple and fast for small programs. Disadvantages: Limited flexibility, especially for complex systems or large programs.
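A toy simulation of the dynamic-linking-loader idea: the "executable" carries unresolved external names that are bound to addresses from a shared-library export table only at load time. All symbol names and addresses here are invented for illustration.

```python
shared_library = {"printf": 0x7F001000, "malloc": 0x7F002000}   # export table
local_symbols = {"local_fn": 0x00400100}                        # resolved at link time

program = [
    ("CALL", "printf"),     # external reference, unresolved until load time
    ("CALL", "local_fn"),
    ("CALL", "malloc"),
]

def dynamic_link(program, local_symbols, library):
    """Bind every CALL target to a concrete address at load/run time."""
    resolved = []
    for op, name in program:
        address = local_symbols.get(name) or library.get(name)
        if address is None:
            raise RuntimeError(f"unresolved symbol: {name}")
        resolved.append((op, address))
    return resolved

print(dynamic_link(program, local_symbols, shared_library))
```

Because the binding happens at load time, replacing the library's export table (e.g., after a library update) changes the addresses the program uses without the program itself being recompiled.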

---------System Software Tools:
System software tools help manage and optimize system functions, aiding in software development, execution, and system maintenance. Key tools include:
1. Compilers: Convert high-level code into machine code (e.g., GCC, Clang).
2. Linkers: Combine object files into executables and resolve references.
3. Loaders: Load executables into memory for execution (e.g., the Windows loader).
4. Assemblers: Convert assembly code to machine code (e.g., MASM, NASM).
5. Debuggers: Help find and fix bugs in code (e.g., GDB, Visual Studio Debugger).
6. IDEs/Editors: Provide coding, debugging, and project management features.
7. Profilers: Analyze program performance (e.g., gprof, Valgrind).
8. Version Control Systems: Track code changes and support collaboration.
9. System Monitors: Track system performance (e.g., Task Manager, top).
10. File Management: Organize and manage files (e.g., File Explorer, ls).
11. Virtual Machines/Emulators: Simulate environments and hardware.
12. OS Utilities: Tools for system maintenance (e.g., Disk Cleanup, backup tools).

-----Subroutine Linkage:
Subroutine linkage refers to the process of transferring control to a subroutine (function or procedure) and ensuring a proper return after the subroutine executes. It involves:
1. Saving the state (e.g., registers, program counter) of the calling function.
2. Passing parameters to the subroutine, typically via a stack or registers.
3. Calling the subroutine, transferring control to its entry point.
4. Returning from the subroutine by restoring the saved state and returning control to the caller.
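The four linkage steps above can be modeled with an explicit call stack. This is a deliberately simplified simulation, not how any particular machine implements calls; the return-address value is an arbitrary illustration.

```python
call_stack = []

def call(subroutine, args, return_address):
    # Steps 1-2: save the caller's state and pass parameters via the stack.
    call_stack.append({"return_address": return_address, "args": args})
    # Step 3: transfer control to the subroutine's entry point.
    result = subroutine(*args)
    # Step 4: restore the saved state and return control to the caller.
    frame = call_stack.pop()
    return result, frame["return_address"]

def add(a, b):
    return a + b

result, resume_at = call(add, (2, 3), return_address=0x0042)
print(result, hex(resume_at))   # 5 0x42
```

On real hardware the frame pushed here corresponds to the activation record described earlier, and the "transfer of control" is a jump to the subroutine's entry address.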
-----Error Handling in LR Parsing:
In LR parsing, error handling is crucial for managing invalid input or unexpected situations during the parsing process. It involves:
1. Error Detection: The parser checks whether the current input token matches any valid action in the parsing table. If not, an error is detected.
2. Error Recovery: Panic Mode: the parser discards input symbols until it finds a synchronizing token (such as a semicolon or closing parenthesis) that lets it resume parsing. Phrase-Level Recovery: the parser applies specific error productions or shifts to recover from common errors, often by replacing a mismatched token with a valid one. Error Productions: some grammars include special error rules to guide the parser in handling mistakes.
3. Error Reporting: The parser typically reports the error location (line, column) and provides a descriptive message, helping the programmer identify the issue.
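Panic-mode recovery, the first strategy above, can be sketched as: discard tokens until a synchronizing token appears, then resume just past it. The token stream and the synchronizing set are illustrative assumptions.

```python
SYNC_TOKENS = {";", ")"}   # assumed synchronizing tokens

def panic_mode_recover(tokens, error_index):
    """Return the index at which parsing can safely resume."""
    i = error_index
    while i < len(tokens) and tokens[i] not in SYNC_TOKENS:
        i += 1                                   # discard unparsable symbols
    return i + 1 if i < len(tokens) else i       # skip past the sync token

tokens = ["x", "=", "@", "#", ";", "y", "=", "2", ";"]
resume = panic_mode_recover(tokens, 2)           # error detected at '@'
print(tokens[resume:])   # ['y', '=', '2', ';']
```

The trade-off is visible here: everything between the error and the semicolon is thrown away, so any further errors inside the discarded span go unreported.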
