
Compile Farm – A Technique in Distributed Compilation

[Term Paper/Journal]

Submitted in Fulfillment of the
Requirement for the Completion of

COMPILER DESIGN – CSE415

by

Tanmay Baranwal (B.Tech.CSE)


[email] : tanmay[dot]11202766[at]lpu[dot]in

RK2202B42-11202766
School of Computer Science and Engineering

Under the Guidance of

Asst. Prof. Harshpreet Singh

Department of Computer Science


Lovely Professional University
Punjab-India

Submission Date : 6th April, 2015

Compile Farm – A Technique in Distributed Compilation
by
Tanmay Baranwal
School of Computer Science and Engineering
Lovely Professional University, IN

This paper represents my own work and is formatted based upon “Standard IEEE
Format for Research Journals” in accordance with University regulations.

/s/. Tanmay Baranwal

Contents

1. Introduction
   a. Introduction to Compiler
   b. Compilation
2. Structure of Compiler
3. Compiler Construction
4. Cluster Computing and Development
5. Server Farm
6. Distributed Compilation
7. Cross Platform Development
   a. Continuous Integration Techniques
8. Compile Farm
   a. GCC Based
   b. NI LabVIEW FPGA Based
9. Implementations
10. References

Chapter - 1
Introduction:

A compile farm is a server cluster, essentially a collection of computer servers usually maintained by an enterprise, that has been set up to compile computer programs remotely for various reasons.

Introduction To Compiler:

A compiler is a set of programs that transforms, or converts, source code written in a programming language into a target machine language, typically object code in binary form. Its main objective is to create an executable program. A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis (syntax-directed translation), code generation, and code optimization.

Compilation:
Compilers enabled the development of programs that are machine-independent. Before the first higher-level languages appeared in the 1950s, machine-dependent assembly language was widely used. While assembly language produces more reusable and relocatable programs than machine code on the same architecture, it has to be modified or rewritten if the program is to be executed on a different computer hardware architecture.

Structure:
A compiler consists of three main parts: the front-end, the middle-end, and the
back-end.

Front End:
Programs are checked here in terms of the syntax and semantics of the respective programming language. Type checking is also performed by collecting type information. The front end then generates an intermediate representation of the source code for processing by the middle end.

Middle End:
Here optimization takes place. Typical transformations are the removal of useless or unreachable code and the discovery and relocation of computations.

Back End:
It produces assembly code from the optimized intermediate code received from the middle end. Memory allocation and register allocation (assigning processor registers to program variables) are also performed here.
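To make this three-part structure concrete, the following short Python sketch (a hypothetical toy example, not taken from any real compiler) pushes a tiny arithmetic expression through a front end that builds an intermediate representation, a middle end that folds constant sub-expressions, and a back end that emits pseudo-assembly:

    # Toy three-stage compiler pipeline (hypothetical illustration).
    import ast

    def front_end(source):
        """Front end: parse arithmetic source text into a nested-tuple IR."""
        def lower(node):
            if isinstance(node, ast.BinOp):
                op = {ast.Add: "+", ast.Mult: "*"}[type(node.op)]
                return (op, lower(node.left), lower(node.right))
            if isinstance(node, ast.Constant):
                return ("const", node.value)
            if isinstance(node, ast.Name):
                return ("var", node.id)
            raise SyntaxError(f"unsupported construct: {ast.dump(node)}")
        return lower(ast.parse(source, mode="eval").body)

    def middle_end(ir):
        """Middle end: fold sub-expressions whose operands are all constants."""
        if ir[0] in ("+", "*"):
            left, right = middle_end(ir[1]), middle_end(ir[2])
            if left[0] == "const" and right[0] == "const":
                folded = left[1] + right[1] if ir[0] == "+" else left[1] * right[1]
                return ("const", folded)
            return (ir[0], left, right)
        return ir

    def back_end(ir):
        """Back end: emit pseudo-assembly for a simple register machine."""
        lines, counter = [], 0
        def emit(node):
            nonlocal counter
            reg, counter = f"r{counter}", counter + 1
            if node[0] == "const":
                lines.append(f"LOADI {reg}, {node[1]}")
            elif node[0] == "var":
                lines.append(f"LOAD  {reg}, {node[1]}")
            else:
                left, right = emit(node[1]), emit(node[2])
                lines.append(f"{'ADD' if node[0] == '+' else 'MUL'} {reg}, {left}, {right}")
            return reg
        emit(ir)
        return lines

    if __name__ == "__main__":
        ir = front_end("2 * 3 + rate")   # front end
        ir = middle_end(ir)              # middle end: ('+', ('const', 6), ('var', 'rate'))
        print("\n".join(back_end(ir)))   # back end: pseudo-assembly

In a production compiler such as GCC each stage is of course far more elaborate, but the division of labour is the same.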

Chapter – 2

Structure of Compiler:

In a compiler (a small scanner/symbol-table sketch follows this list):

• Linear analysis is called LEXICAL ANALYSIS or SCANNING and is performed by the LEXICAL ANALYZER or LEXER.
• Hierarchical analysis is called SYNTAX ANALYSIS or PARSING and is performed by the SYNTAX ANALYZER or PARSER.
• During the analysis, the compiler manages a SYMBOL TABLE by
  • recording the identifiers of the source program, and
  • collecting information (called ATTRIBUTES) about them: storage allocation, type, scope, and (for functions) signature.
• When an identifier x is found, the lexical analyzer
  • generates the token id,
  • enters the lexeme x in the symbol table (if it is not already there), and
  • associates with the generated token a pointer to the symbol-table entry for x. This pointer is called the LEXICAL VALUE of the token.
• During analysis or synthesis, the compiler may DETECT ERRORS and report them.
  • However, after detecting an error, compilation should proceed so that further errors can be detected.
  • The syntax and semantic phases usually handle a large fraction of the errors detectable by the compiler.
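As a concrete illustration of the scanner/symbol-table interaction described in the list above, here is a hypothetical Python sketch: the lexer groups characters into lexemes, emits an id token whose lexical value is an index into the symbol table, and records attributes such as the line on which each identifier first appears.

    # Hypothetical scanner cooperating with a symbol table.
    import re

    TOKEN_SPEC = [
        ("number", r"\d+"),
        ("id",     r"[A-Za-z_]\w*"),
        ("op",     r"[+\-*/=]"),
        ("skip",   r"\s+"),
    ]
    MASTER_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

    class SymbolTable:
        def __init__(self):
            self.entries = []      # attribute records, one per identifier
            self.index = {}        # lexeme -> position in self.entries

        def insert(self, lexeme, line):
            """Return the entry index for lexeme, inserting it if not already there."""
            if lexeme not in self.index:
                self.index[lexeme] = len(self.entries)
                self.entries.append({"lexeme": lexeme, "first_line": line, "type": None})
            return self.index[lexeme]

    def scan(source, table):
        """Yield (token, value) pairs; an id token carries its lexical value."""
        for lineno, text in enumerate(source.splitlines(), start=1):
            for match in MASTER_RE.finditer(text):
                kind, lexeme = match.lastgroup, match.group()
                if kind == "skip":
                    continue
                if kind == "id":
                    yield ("id", table.insert(lexeme, lineno))   # pointer into the table
                elif kind == "number":
                    yield ("number", int(lexeme))
                else:
                    yield ("op", lexeme)

    if __name__ == "__main__":
        table = SymbolTable()
        print(list(scan("position = initial + rate * 60", table)))
        print(table.entries)   # attributes collected for position, initial, rate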

Chapter – 3

Compiler Construction:

All but the smallest of compilers have more than two phases. However, these phases are
usually regarded as being part of the front end or the back end. The point at which these
two ends meet is open to debate. The front end is generally considered to be where
syntactic and semantic processing takes place, along with translation to a lower level of
representation (than source code).

Lexical Analysis Phase:

This phase involves grouping the characters that make up the source program into
meaningful sequences called lexemes. Lexemes belong to token classes such as "integer",
"identifier", or "whitespace". A token is produced for each lexeme. Lexical analysis is also
called scanning.

Syntax Analysis:

The output of the lexical analyser is used to create a representation that shows the grammatical structure of the tokens. Syntax analysis is also called parsing.

Due to the extra time and space needed for compiler analysis and optimizations, some compilers skip them by default. Users have to use compilation options to explicitly tell the compiler which optimizations should be enabled; for example, GCC applies most of its optimizations only when a flag such as -O2 or -O3 is passed.

Chapter – 4

Cluster Computing:

Cluster computing is defined as a type of parallel or distributed processing system which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource. These computers are linked together using high-speed network interfaces, and the actual binding together of all the individual computers in the cluster is performed by the operating system and the software used.

Today a wide range of applications demand higher computing power and faster execution, and parallel and distributed computing is the solution.

An application may desire more computational power for many reasons, but the following
three are the most common:

Real-time constraints: That is, a requirement that the computation finish within a
certain period of time. Weather forecasting is an example. Another is processing
data produced by an experiment; the data must be processed (or stored) at least
as fast as it is produced.

Throughput: A scientific or engineering simulation may require many computations. A cluster can provide the resources to process many related simulations. An example of using a Linux Beowulf cluster for throughput is Google [13], which uses over 15,000 commodity PCs with fault-tolerant software to provide a high-performance Web search service.

Memory: Some of the most challenging applications require huge amounts of data
as part of the simulation.

Architecture:

A cluster is a type of parallel or distributed processing system which consists of a collection of interconnected stand-alone computers cooperatively working together as a single resource.

Cluster Applications:
Cluster computing is rapidly becoming the architecture of choice for Grand Challenge Applications, which involve a high scale of complexity in terms of processing time, memory space, and communication bandwidth. Typical uses include:

• Scientific computing
• Movie making
• Commercial servers (web/database, etc.)
• Distributed compilation/cross compilation

Cluster Development:
The main components of a cluster are the personal computers and the interconnection network. The computers can be built out of commercial off-the-shelf (COTS) components and are available economically.

A cluster mainly consists of four major parts:

1. Network
2. Compute nodes
3. Master server
4. Gateway

Each part has a specific function that is needed for the hardware to do its job.

Network: Provides communication between nodes, server, and gateway. It consists of a fast Ethernet switch, cables, and other networking hardware.

Compute Nodes: Serve as the processors of the cluster. Each node is interchangeable; there are no functionality differences between nodes. The compute nodes are all the computers in the cluster other than the gateway and the master server.

Master Server: Provides network services to the cluster, runs the parallel programs, and spawns processes on the nodes; its hardware requirements are modest.

Gateway: Acts as a bridge/firewall between the outside world and the cluster and should have two Ethernet cards.

Chapter – 5

Server Farm:

A server farm, also called a computer cluster, is a group of servers that is kept in a single
location. These servers are networked together, making it possible for them to meet
server needs that are difficult or impossible to handle with just one server. With a server
farm, workload is distributed among multiple server components, providing for expedited
computing processes. In the past, these farms were most frequently used by institutions
that were academic or research-based. In today’s world, they are commonly employed in
companies of all types, providing a way to streamline weighty computerized tasks.

A server farm or cluster might perform such services as providing centralized access
control, file access, printer sharing, and backup for workstation users. The servers may
have individual operating systems or a shared operating system and may also be set up to
provide load balancing when there are many server requests.

Applications:
Server farms are commonly used for cluster computing. Many modern supercomputers
comprise giant server farms of high-speed processors connected by either Gigabit
Ethernet or custom interconnects such as Infiniband or Myrinet. Web hosting is a common
use of a server farm; such a system is sometimes collectively referred to as a web farm.

[Figure: Architecture of a server farm]

Chapter – 6
Distributed Compilation:

Distributed Computing:
Distributed computing is a method of computer processing in which different parts of a
program are run simultaneously on two or more computers that are communicating with
each other over a network.
Distributed computing is a type of segmented or parallel computing, but the latter term is
most commonly used to refer to processing in which different parts of a program run
simultaneously on two or more processors that are part of the same computer.

Architecture:

Client/Server System:
The Client-server architecture is a way to provide a service from a central source. There is
a single server that provides a service, and many clients that communicate with the server
to consume its products.
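A minimal, purely illustrative Python sketch of the client/server model follows: one server thread accepts a request over a local socket and replies, standing in for a central service that many clients would consume. The port number and the "processing" performed are arbitrary placeholders.

    # Minimal client/server illustration (hypothetical; port and workload are placeholders).
    import socket
    import threading

    HOST, PORT = "127.0.0.1", 5050

    def serve_once(listener):
        """Handle a single client request on an already-listening socket."""
        conn, _ = listener.accept()
        with conn:
            request = conn.recv(1024).decode()
            reply = f"server processed {len(request.split())} tokens"
            conn.sendall(reply.encode())

    def client(message):
        """Connect to the central server and consume its service."""
        with socket.create_connection((HOST, PORT)) as sock:
            sock.sendall(message.encode())
            return sock.recv(1024).decode()

    if __name__ == "__main__":
        listener = socket.create_server((HOST, PORT))   # bind before the client connects
        worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
        worker.start()
        print(client("int main(void) { return 0; }"))
        worker.join()
        listener.close()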

Peer-to-Peer System:
The term peer-to-peer is used to describe distributed systems in which labour is divided
among all the components of the system. All the computers send and receive data, and
they all contribute some processing power and memory. As a distributed system increases in size, the computational resources available to it increase as well.

Tools for Distributed Compilation:

In software development, distcc is a tool for speeding up compilation of source code by using distributed computing over a computer network. With the right configuration, distcc can dramatically reduce a project's compilation time.

distcc is a program that distributes builds of C, C++, Objective-C, or Objective-C++ code across several machines on a network. distcc should always generate the same results as a local build, is simple to install and use, and is usually much faster than a local compile. It can also be used together with pacman/srcpac.

Some advantages of distcc over other tools (a small driver sketch follows this list):

• distcc doesn't require a shared filesystem.

• distcc can optionally use strongly encrypted and authenticated ssh channels for
communication.
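As a rough sketch of how distcc might be driven from a build script, the following Python example compiles several C files in parallel through distcc. It assumes distcc and gcc are installed and that the listed hosts run distccd; the host names and file list are placeholders.

    # Hypothetical driver for distcc-based distributed compilation.
    # Assumes distcc/gcc are installed and the hosts below run distccd.
    import os
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    SOURCES = ["main.c", "parser.c", "lexer.c", "codegen.c"]    # placeholder file list
    DISTCC_HOSTS = "localhost build-node1 build-node2"          # placeholder hosts

    def compile_one(source):
        obj = source.replace(".c", ".o")
        env = dict(os.environ, DISTCC_HOSTS=DISTCC_HOSTS)
        # distcc ships the preprocessed source to a free host and runs the compiler there.
        result = subprocess.run(["distcc", "gcc", "-O2", "-c", source, "-o", obj],
                                env=env, capture_output=True, text=True)
        return source, result.returncode, result.stderr.strip()

    if __name__ == "__main__":
        # Run roughly one job per available host, as `make -jN` would.
        with ThreadPoolExecutor(max_workers=3) as pool:
            for source, code, err in pool.map(compile_one, SOURCES):
                print(f"{source}: {'ok' if code == 0 else 'failed: ' + err}")

In everyday use one would more commonly just run make -jN with CC="distcc gcc" and DISTCC_HOSTS exported; the sketch only makes the distribution step explicit.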

Chapter – 7

Cross Platform Development:

In computing, cross-platform, or multi-platform, is an attribute given to computer software or computing methods and concepts that are implemented and inter-operate on multiple computer platforms.

For example, a cross-platform application may run on Microsoft Windows on the x86
architecture, Linux on the x86 architecture and Mac OS X on either the PowerPC or x86
based Apple Macintosh systems. Cross-platform programs may run on as many as all
existing platforms, or on as few as two platforms.

Approaches to Cross-platform Development:

There are different ways of approaching the problem of writing a cross-platform application program. One such approach is simply to create multiple versions of the same program in different source trees; in other words, the Windows version of a program might have one set of source code files and the Macintosh version might have another, while a FOSS *nix system might have another.
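The paragraph above describes the multiple-source-tree approach; a contrasting and equally common approach keeps a single source tree and isolates the platform-specific pieces behind a small abstraction, as in this hypothetical Python sketch:

    # Hypothetical single-source-tree approach: platform differences are confined
    # to one small function instead of separate per-platform code bases.
    import platform
    import subprocess

    def open_in_default_browser(url):
        """Open a URL using the native mechanism of the current platform."""
        system = platform.system()
        if system == "Windows":
            subprocess.run(["cmd", "/c", "start", "", url], check=False)
        elif system == "Darwin":          # macOS
            subprocess.run(["open", url], check=False)
        else:                             # Linux and other *nix desktops
            subprocess.run(["xdg-open", url], check=False)

    if __name__ == "__main__":
        open_in_default_browser("https://gcc.gnu.org/wiki/CompileFarm")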

Challenges in Cross-platform Development:

Testing cross-platform applications may be considerably more complicated, since different platforms can exhibit slightly different behaviors or subtle bugs.

Programs written in scripting languages or targeting virtual machines must be translated into native executable code each time the application is executed, imposing a performance penalty. This penalty can be alleviated using advanced techniques such as just-in-time compilation.

Different platforms require the use of native package formats such as RPM and MSI. Multi-platform installers such as InstallAnywhere address this need.

Cross-platform execution environments may also suffer from cross-platform security flaws, creating a fertile environment for cross-platform malware.

Cross-platform environments:

AppearIQ, fpGUI, Eclipse, Mono, Ultimate++, Xpower++

Continuous Integration Techniques:

Continuous integration (CI) is the practice, in software engineering, of merging all developer working copies into a shared mainline several times a day. It was first named and proposed by Grady Booch in his 1991 method.

The main practices are as follows (a minimal automation sketch follows this list):

- Maintain a code repository

- Automate the build/compilation

- Commit to the baseline regularly

- Test in a clone of the production environment

- Make the results of the latest test build available

- Automate deployment
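As a rough sketch of what automating the build, testing, and reporting can look like, the following hypothetical Python script pulls the latest mainline, builds it, and runs the test suite, stopping and reporting at the first failing step. The directory and commands are placeholders for whatever a real project uses.

    # Hypothetical body of one continuous-integration cycle.
    import subprocess
    import sys

    REPO_DIR = "/srv/ci/project"    # placeholder: local clone of the shared mainline

    def run(step, cmd):
        print(f"--- {step}: {' '.join(cmd)}")
        result = subprocess.run(cmd, cwd=REPO_DIR)
        if result.returncode != 0:
            print(f"CI FAILED during '{step}'")
            sys.exit(result.returncode)

    if __name__ == "__main__":
        run("update", ["git", "pull", "--ff-only"])   # fetch the latest mainline
        run("build",  ["make", "-j4"])                # automate the build
        run("test",   ["make", "check"])              # self-testing build
        print("CI PASSED: a 'current' good build is available for testing, demo, or release")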

Continuous integration offers benefits such as:

• Integration bugs are detected early and are easy to track down due to small change
sets. This saves both time and money over the lifespan of a project.

• Constant availability of a "current" build for testing, demo, or release purposes

• Frequent code check-in pushes developers to create modular, less complex code

• Enforces discipline of frequent automated testing

• Immediate feedback on system-wide impact of local changes

Chapter – 8

Compile Farm:

A compile farm is a server cluster, essentially a collection of computer servers usually maintained by an enterprise, that has been set up to compile computer programs remotely for various reasons.

Objective:

A compile farm is typically part of a project's main server infrastructure and is used to produce releases of programs easily and uniformly. Compile farms are composed of machines of various architectures running various operating systems, and they are intended to allow developers to test and use their programs on a variety of platforms.

Application:

Compile farms are widely used and are applicable in the following areas:

1. Cross Platform Development

2. Continuous Integration in Cross Platform Development

3. Distributed Compile Farms

Cross Platform Development :

When multiple processor architectures and operating systems must be supported, each developer would otherwise need a machine for every environment. In this scenario, a compile farm configured for each target OS and architecture can be used to build and run software patches.

Continuous Integration in Cross Platforms :

To overcome the problem in cross-platform development where a change breaks the software on a particular CPU/OS combination, continuous integration (CI) scripts automatically build the latest version of the source tree from a version-control repository for each target platform.

Distributed Compile Farms :

Distributed compilation, in which software builds require parallel execution, can be performed by using separate machines of the server farm.

A DCF (distributed compile farm) mediator is invoked by a cron job to check for pending work and publishes the builds to SDC.

Resources:

To use a compile farm to build software patches, we need the following (a minimal job sketch follows this list):

Source code, typically stored on a Git server or in a Mercurial (hg) repository

An SSH daemon, allowing users to log in

One or more cross compilers

Storage, to keep builds in the client's directory
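Putting these resources together, a minimal hypothetical farm job might pull the source tree and build it once per target with the matching cross compiler. The repository URL, target names, and compiler commands below are placeholders, and a plain Makefile project is assumed.

    # Hypothetical compile-farm build job: fetch the source, then build it once
    # per target architecture with the matching cross compiler.
    import os
    import subprocess

    REPO = "https://example.org/project.git"          # placeholder Git server
    WORKDIR = os.path.expanduser("~/farm-builds")     # client's build directory
    TARGETS = {                                       # placeholder target -> cross compiler
        "arm-linux-gnueabihf": "arm-linux-gnueabihf-gcc",
        "aarch64-linux-gnu":   "aarch64-linux-gnu-gcc",
        "powerpc64-linux-gnu": "powerpc64-linux-gnu-gcc",
    }

    def sh(cmd, cwd=None):
        print("$", " ".join(cmd))
        subprocess.run(cmd, cwd=cwd, check=True)

    if __name__ == "__main__":
        os.makedirs(WORKDIR, exist_ok=True)
        src = os.path.join(WORKDIR, "project")
        if not os.path.isdir(src):
            sh(["git", "clone", REPO, src])           # source from the Git server
        else:
            sh(["git", "pull", "--ff-only"], cwd=src)

        for target, cross_cc in TARGETS.items():
            # One clean build per cross toolchain; a real job would then copy the
            # resulting binaries into a per-target directory under WORKDIR.
            sh(["make", f"CC={cross_cc}", "clean", "all"], cwd=src)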

Some Compile Farm Tools:

GCC Compile Farm:

The GCC Compile Farm project maintains a set of machines of various architectures and provides SSH access to free software developers working on GCC and other free software (GPL, BSD, MIT, ...). Once an account application is approved, the developer gets full SSH access to all the farm machines (current and new). The architectures currently available include:

GCC Compile Farm Architecture:


• i686 – 32-bit x86 architecture
• x86_64 – 64-bit x86 architecture
• armv5tel
• armv7l (with vfp and neon FPU instructions)
• powerpc
• powerpc64 (including POWER7 and POWER8)
• Cell SPE (Sony Playstation 3)
• sparc
• sparc64 (sparcv9)
• alpha (currently offline)
• mipsel
• mips64el
• ia64
• hppa

Compile Farm Structure:

Based upon our needs for computational resources in computer science, using a compile farm built on a clustered system is one of the most promising ways to bridge the gap between a variety of programs, compiling individual source codes on various systems with various architectures.

There are various compile farms available, of which the GNU GCC compile farm is the most widely used and Debian's is considered the most efficient.

For parallel compilation and queuing of software builds, the NI LabVIEW FPGA Compile Farm is used.

NI LabVIEW FPGA Compile Farm:

The NI LabVIEW FPGA Compile Farm, built by National Instruments, targets software-defined hardware in which no operating system is needed to execute the logic.

FPGA stands for Field-Programmable Gate Array, a reprogrammable chip that implements the functionality of a system directly in hardware rather than on a processor.

Architecture:

[Figure: NI LabVIEW FPGA Compile Farm architecture]

Benefits of NI LabVIEW FPGA Compile Farm:

• Faster I/O response times and specialized facilities
• Computing power exceeding that of digital signal processors
• The ability to implement custom functionality
• Field upgradability, eliminating the expense of custom ASIC redesign and maintenance

Chapter – 9

Implementations:

One example of a compile farm was the service provided by SourceForge until 2006. The SourceForge compile farm was composed of twelve machines of various computer architectures running a variety of operating systems, and was intended to allow developers to test and use their programs on a variety of platforms before releasing them to the public. After a power spike destroyed several of the machines, it became non-operational some time in 2006 and was officially discontinued on February 8, 2007.

Other examples are:

- GCC Compile Farm: http://gcc.gnu.org/wiki/CompileFarm

- OpenSUSE Build Service

- FreeBSD ports build service, which lets package maintainers test their own changes on a variety of versions and architectures

- Launchpad Build Farm: https://launchpad.net/builders

- Mozilla has a build farm, but it is not public: https://wiki.mozilla.org/ReleaseEngineering

- Debian has a build farm: https://buildd.debian.org/

REFERENCES

[1]. http://en.wikipedia.org/wiki/Compile_farm

[2]. http://www.ieee.li/pdf/viewgraphs/introduction_to_LabVIEW_FPGA.pdf

[3]. https://gcc.gnu.org/wiki/CompileFarm

[4]. http://nas.nasa.gov/SC10/PDF/Datasheets/Duffy_ClusterComputing_demo.pdf

[5]. http://www.labviewpro.net/...../Whats_new_in_LabVIEW_RT&FPGA_%28NI%29.pdf

[6]. http://en.wikipedia.org/wiki/Computer_cluster

[7]. http://sdcc.sourceforge.net/mediawiki/index.php/Distributed_Compile_Farm

[8]. http://archlinuxarm.org/developers/distcc-cross-compiling

[9]. http://distcc.googlecode.com/svn/trunk/doc/web/compared.html
