Compile Farm – A Technique in Distributed Compilation
[Term Paper/Journal]
by
Tanmay Baranwal
RK2202B42-11202766
School of Computer Science and Engineering
Lovely Professional University, IN
This paper represents my own work and is formatted based upon “Standard IEEE
Format for Research Journals” in accordance with University regulations.
Contents
1. Introduction
   a. Introduction to Compiler
   b. Compilation
2. Structure of Compiler
3. Compiler Construction
4. Cluster Computing
5. Server Farm
6. Distributed Compilation
7. Cross-Platform Environments
8. Compile Farm
   a. GCC Based
9. Implementations
10. References
Chapter – 1
Introduction:
Introduction to Compiler:
Compilation:
Compilers enabled the development of programs that are machine-independent. Before the first higher-level languages appeared in the 1950s, machine-dependent assembly language was widely used. While assembly language produces more reusable and relocatable programs than machine code on the same architecture, it has to be modified or rewritten if the program is to be executed on a different computer hardware architecture.
Structure:
A compiler consists of three main parts: the front-end, the middle-end, and the
back-end.
Front End:
Programs are checked here for conformance to the syntax and semantics of the respective programming language. Type checking is also performed by collecting type information. The front-end then generates an intermediate representation of the source code for processing by the middle-end.
Middle End:
Here optimization takes place. Typical transformations are removal of useless or unreachable code and relocation of computations to less frequently executed places (for example, hoisting a loop-invariant computation out of a loop).
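As a minimal sketch of one such middle-end transformation (the instruction format here is hypothetical, not any real compiler's intermediate representation), the following Python function drops unreachable code that follows an unconditional return:

```python
# Toy "middle end" pass: delete instructions that can never execute
# because they follow an unconditional return.
def remove_unreachable(instructions):
    optimized = []
    for instr in instructions:
        optimized.append(instr)
        if instr[0] == "return":  # nothing after a return is reachable
            break
    return optimized

# The final "add" can never run, so the pass removes it.
body = [("load", "x"), ("return", "x"), ("add", "x", "1")]
print(remove_unreachable(body))  # [('load', 'x'), ('return', 'x')]
```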
Back End:
It gives the output as assembly code from optimized code from middle-end.
Memory allocation and register allocation are also performed here for process
register (required for some program variables).
Chapter – 2
Structure of Compiler:
In a compiler,
• linear analysis is called LEXICAL ANALYSIS or SCANNING and is performed by the LEXICAL ANALYZER or LEXER;
• hierarchical analysis is called SYNTAX ANALYSIS or PARSING and is performed by the SYNTAX ANALYZER or PARSER.
• During the analysis, the compiler manages a SYMBOL TABLE by recording the identifiers of the source program and collecting information (called ATTRIBUTES) about them: storage allocation, type, scope, and (for functions) signature.
• When an identifier such as x is found, the lexical analyzer generates the token id, enters the lexeme x in the symbol table (if it is not already there), and associates with the generated token a pointer to the symbol-table entry for x. This pointer is called the LEXICAL VALUE of the token (see the sketch after this list).
• During the analysis or synthesis, the compiler may DETECT ERRORS and report them. However, after detecting an error, compilation should proceed, allowing further errors to be detected. The syntax and semantic phases usually handle a large fraction of the errors detectable by the compiler.
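To make the scanner/symbol-table interaction concrete, here is a minimal Python sketch. The token classes and table layout are illustrative choices, not a fixed standard:

```python
import re

# Token classes for a toy scanner; real lexers recognize far more.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[=+\-*/]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(source, symbol_table):
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ID":
            # Enter the lexeme in the symbol table if it is not already
            # there; the token carries the lexeme as its lexical value,
            # identifying the table entry.
            symbol_table.setdefault(lexeme, {"attributes": {}})
            tokens.append(("id", lexeme))
        else:
            tokens.append((kind.lower(), lexeme))
    return tokens

table = {}
print(scan("x1 = x1 + 42", table))  # 'x1' is entered once, referenced twice
print(table)                        # {'x1': {'attributes': {}}}
```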
Chapter – 3
Compiler Construction:
All but the smallest of compilers have more than two phases. However, these phases are
usually regarded as being part of the front end or the back end. The point at which these
two ends meet is open to debate. The front end is generally considered to be where
syntactic and semantic processing takes place, along with translation to a lower level of
representation (than source code).
Lexical Analysis:
This phase involves grouping the characters that make up the source program into meaningful sequences called lexemes. Lexemes belong to token classes such as "integer", "identifier", or "whitespace". A token is produced for each lexeme. Lexical analysis is also called scanning.
Syntax Analysis:
The output of the lexical analyser is used to create a representation that shows the grammatical structure of the tokens. Syntax analysis is also called parsing.
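As a minimal sketch of parsing (the grammar and tree shape are illustrative), the following Python function turns a flat token list into a tree that exposes its grammatical structure:

```python
# Tiny parser for the toy grammar: expr -> NUMBER ('+' NUMBER)*
def parse(tokens):
    pos = 0

    def expect_number():
        nonlocal pos
        tok = tokens[pos]
        assert tok.isdigit(), f"expected a number, got {tok!r}"
        pos += 1
        return ("num", int(tok))

    tree = expect_number()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1  # consume '+'
        tree = ("add", tree, expect_number())
    return tree

print(parse(["1", "+", "2", "+", "30"]))
# ('add', ('add', ('num', 1), ('num', 2)), ('num', 30))
```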
Due to the extra time and space needed for compiler analyses and optimizations, some compilers skip them by default. Users have to pass compilation options (for example, -O2 with GCC) to explicitly tell the compiler which optimizations should be enabled.
Chapter – 4
Cluster Computing:
Today a wide range of applications are eager for higher computing power and faster execution than a single processor can deliver; parallel and distributed computing is the solution.
An application may desire more computational power for many reasons, but the following are the most common:
Real-time constraints: a requirement that the computation finish within a certain period of time. Weather forecasting is an example. Another is processing data produced by an experiment; the data must be processed (or stored) at least as fast as it is produced.
Memory: some of the most challenging applications require huge amounts of data as part of the simulation.
Architecture: [cluster architecture figure omitted]
Cluster Applications:
Cluster computing is rapidly becoming the architecture of choice for Grand Challenge Applications: problems of high complexity in processing time, memory space, and communication bandwidth. Typical examples include:
• Scientific computing
• Movie making
Cluster Development:
The main components of a cluster are the personal computers and the interconnection network. The computers can be built out of commercial off-the-shelf (COTS) components and are available economically. A cluster has four parts:
1. Network
2. Compute nodes
3. Master server
4. Gateway
Each part has a specific function that is needed for the cluster to do its work.
Network: Provides communication between nodes, server, and gateway. It consists of a fast Ethernet switch, cables, and other networking hardware.
Compute Nodes: Serve as the processors of the cluster. Each node is interchangeable; there are no functional differences between nodes. The compute nodes are all the computers in the cluster other than the gateway and the master server.
Master Server: Provides network services to the cluster. It runs the parallel programs and spawns processes on the compute nodes (a sketch follows this list), while itself needing only minimal hardware.
Gateway: Acts as a bridge/firewall between the outside world and the cluster, and should have two Ethernet cards.
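A minimal sketch of a master server spawning one process per compute node over ssh (node names and the command are hypothetical placeholders):

```python
import subprocess

NODES = ["node01", "node02", "node03"]  # hypothetical compute nodes

def spawn_on_nodes(command):
    # Start one remote process per node, then wait for all of them.
    procs = [subprocess.Popen(["ssh", node, command]) for node in NODES]
    return [p.wait() for p in procs]

if __name__ == "__main__":
    # Trivial job: each node prints its own host name.
    exit_codes = spawn_on_nodes("hostname")
    print("exit codes:", exit_codes)
```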
Chapter – 5
Server Farm:
A server farm, also called a computer cluster, is a group of servers that is kept in a single
location. These servers are networked together, making it possible for them to meet
server needs that are difficult or impossible to handle with just one server. With a server
farm, the workload is distributed among multiple server components, expediting computing processes. In the past, these farms were most frequently used by academic and research institutions. Today, they are commonly employed in companies of all types, providing a way to streamline heavy computerized tasks.
A server farm or cluster might perform such services as providing centralized access
control, file access, printer sharing, and backup for workstation users. The servers may
have individual operating systems or a shared operating system and may also be set up to
provide load balancing when there are many server requests.
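As a toy illustration of such load balancing (server names and the dispatch policy are illustrative), a dispatcher can rotate incoming requests across the servers of a farm round-robin:

```python
from itertools import cycle

SERVERS = ["srv-a", "srv-b", "srv-c"]  # hypothetical farm members

def dispatch(requests):
    # Round-robin: each request goes to the next server in rotation.
    rotation = cycle(SERVERS)
    return [(req, next(rotation)) for req in requests]

# Nine requests spread evenly over three servers.
for req, srv in dispatch(range(9)):
    print(f"request {req} -> {srv}")
```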
Applications:
Server farms are commonly used for cluster computing. Many modern supercomputers
comprise giant server farms of high-speed processors connected by either Gigabit
Ethernet or custom interconnects such as InfiniBand or Myrinet. Web hosting is a common
use of a server farm; such a system is sometimes collectively referred to as a web farm.
Chapter – 6
Distributed Compilation:
Distributed Computing:
Distributed computing is a method of computer processing in which different parts of a
program are run simultaneously on two or more computers that are communicating with
each other over a network.
Distributed computing is a type of segmented or parallel computing, but the latter term is
most commonly used to refer to processing in which different parts of a program run
simultaneously on two or more processors that are part of the same computer.
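To make the contrast concrete, here is a minimal sketch of parallel computing on a single machine, where parts of one program run simultaneously on several processors of the same computer (the work function is a placeholder):

```python
from multiprocessing import Pool

def square(n):  # stand-in for a real unit of work
    return n * n

if __name__ == "__main__":
    with Pool(processes=4) as pool:            # four worker processes
        results = pool.map(square, range(10))  # work split across processors
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```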
Architecture:
Client/Server System:
The Client-server architecture is a way to provide a service from a central source. There is
a single server that provides a service, and many clients that communicate with the server
to consume its products.
Peer-to-Peer System:
The term peer-to-peer is used to describe distributed systems in which labour is divided
among all the components of the system. All the computers send and receive data, and
they all contribute some processing power and memory. As a distributed system increases in size, so does its pool of computational resources.
• distcc, a widely used tool for distributed compilation of C and C++ code, farms compile jobs out to machines on the network; it can optionally use strongly encrypted and authenticated ssh channels for communication, as sketched below.
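A minimal sketch of driving a distcc build from Python (host names are hypothetical; in the DISTCC_HOSTS list, a leading '@' asks distcc to reach that host over ssh rather than the plain distccd TCP protocol):

```python
import os
import subprocess

# Hypothetical volunteer hosts; '@' selects the ssh transport.
os.environ["DISTCC_HOSTS"] = "localhost @build1.example.com @build2.example.com"

# Run a parallel make with distcc wrapping the compiler.
subprocess.run(["make", "-j8", "CC=distcc gcc"], check=True)
```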
Chapter – 7
Cross-Platform Environments:
Cross-platform software is software that can run on more than one hardware platform or operating system.
For example, a cross-platform application may run on Microsoft Windows on the x86
architecture, Linux on the x86 architecture and Mac OS X on either the PowerPC or x86
based Apple Macintosh systems. Cross-platform programs may run on as many as all
existing platforms, or on as few as two platforms.
Programs written in scripting languages or run on virtual machines must be translated into native executable code each time the application is executed, imposing a performance penalty. This penalty can be alleviated using advanced techniques like just-in-time compilation.
Different platforms require the use of native package formats such as RPM and MSI. Multi-platform installers such as InstallAnywhere address this need.
Continuous Integration Techniques:
• Automate deployment.
• Integration bugs are detected early and are easy to track down due to small change sets. This saves both time and money over the lifespan of a project (a sketch of a minimal CI loop follows this list).
• Frequent code check-ins push developers to create modular, less complex code.
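A minimal sketch of a continuous-integration loop (the repository path, poll interval, and build command are hypothetical placeholders):

```python
import subprocess
import time

REPO = "/srv/ci/project"  # hypothetical working copy

def latest_commit():
    out = subprocess.run(["git", "-C", REPO, "rev-parse", "HEAD"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def ci_loop(poll_seconds=60):
    seen = None
    while True:
        # Fetch the latest version of the source tree.
        subprocess.run(["git", "-C", REPO, "pull", "--ff-only"], check=True)
        head = latest_commit()
        if head != seen:  # a new change set arrived: rebuild and report
            result = subprocess.run(["make", "-C", REPO])
            status = "passed" if result.returncode == 0 else "FAILED"
            print(f"build of {head[:8]} {status}")
            seen = head
        time.sleep(poll_seconds)

if __name__ == "__main__":
    ci_loop()
```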
Chapter – 8
Compile Farm:
Objective:
A compile farm is a server installation used to build and produce releases of programs easily and uniformly. Compile farms are composed of machines of various architectures running various operating systems, and are intended to allow developers to test and use their programs on a variety of platforms.
Application:
When software targets multiple processor architectures and operating systems, each developer would otherwise need a machine for every environment. In this scenario, a compile farm can be used instead: a build is configured for the target OS and architecture, and developers build and run their software patches on the matching farm machine.
To catch changes that break the software on a particular CPU/OS combination, CI (Continuous Integration) scripts automatically build the latest version of the source tree from a version control repository.
Distributed compilation, in which a software build is executed in parallel, can be performed by using the farm's separate machines as compile servers; a sketch of such a dispatcher follows below.
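A minimal sketch of distributed compilation across farm machines: a queue of source files is drained by one worker thread per build host, each compiling remotely over ssh (host names, file names, and the compiler command are hypothetical):

```python
import queue
import subprocess
import threading

HOSTS = ["farm1", "farm2", "farm3"]                   # hypothetical build hosts
SOURCES = ["a.c", "b.c", "c.c", "d.c", "e.c", "f.c"]  # units to compile

def worker(host, jobs):
    while True:
        try:
            src = jobs.get_nowait()
        except queue.Empty:
            return  # no work left
        # Compile one translation unit on the remote host.
        subprocess.run(["ssh", host, f"cc -c {src}"], check=True)
        print(f"{host}: compiled {src}")

jobs = queue.Queue()
for src in SOURCES:
    jobs.put(src)

threads = [threading.Thread(target=worker, args=(h, jobs)) for h in HOSTS]
for t in threads:
    t.start()
for t in threads:
    t.join()
```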
The DCF (Distributed Compile Farm) mediator is invoked by a cron job and publishes the builds to SDCC [7].
Resources:
The GCC Compile Farm project maintains a set of machines of various architectures and provides ssh access to free software developers working on GCC and other free software projects (GPL, BSD, MIT, ...). Once your account application (see the project page [3]) is approved, you get full ssh access to all the farm machines, current and new, across the architectures currently available.
Compile Farm Structure:
For parallel compilation and queuing of software patches, the NI LabVIEW FPGA Compile Farm is used.
NI LabVIEW FPGA Compile Farm:
FPGA stands for Field-Programmable Gate Array, a reprogrammable chip that implements the functionality of a system in hardware. The compile farm queues FPGA compile jobs and runs them in parallel on dedicated worker machines.
Architecture: [architecture diagram omitted]
Benefits of NI LabVIEW FPGA Compile Farm:
FPGAs are field upgradable, eliminating the expense of custom ASIC redesign and maintenance.
Chapter – 9
Implementations:
One example of a compile farm was the service provided by SourceForge until 2006. The
SourceForge compile farm was composed of twelve machines of various computer
architectures running a variety of operating systems, and was intended to allow
developers to test and use their programs on a variety of platforms before releasing them
to the public. After a power spike destroyed several of the machines it became non-
operational some time in 2006, and was officially discontinued on February 8, 2007.
• The FreeBSD package building service, which lets package maintainers test their own changes, is another example.
REFERENCES
[1]. http://en.wikipedia.org/wiki/Compile_farm
[2]. http://www.ieee.li/pdf/viewgraphs/introduction_to_LabVIEW_FPGA.pdf
[3]. https://gcc.gnu.org/wiki/CompileFarm
[4]. http://nas.nasa.gov/SC10/PDF/Datasheets/Duffy_ClusterComputing_demo.pdf
[5]. http://www.labviewpro.net/...../Whats_new_in_LabVIEW_RT&FPGA_%28NI%29.pdf
[6]. http://en.wikipedia.org/wiki/Computer_cluster
[7]. http://sdcc.sourceforge.net/mediawiki/index.php/Distributed_Compile_Farm
[8]. http://archlinuxarm.org/developers/distcc-cross-compiling
[9]. http://distcc.googlecode.com/svn/trunk/doc/web/compared.html